We use IP ACLs to serve web content out of AFS via a pool of load-balanced servers. We've had a problem lately where, after a reboot, the IP ACL for the rebooted web server/AFS client sometimes stops working on a random AFS file server. It still works on all the rest of the AFS servers, just not the one deciding to be difficult.
While tracing this down, I noticed that in every server/client pair where there was a problem (verified for three or four cases), the client IP address was not in the "hosts.dump" file (generated via "kill -XCPU <fileserver process>"). I'm pretty sure this is why the IP ACL fails to work. What I can't figure out is how to get the client back into that table. I thought a client was added automatically when it contacted the server, but this isn't happening. I tried changing the cache UUID via "fs uuid -generate" (a shot in the dark), wondering if perhaps there was some internal UUID caching going on. I also tried using the "flushcps" program to tickle the table, but that didn't help either.

Here are the particulars for a current client/server problem:

Example Server: Solaris 8, OpenAFS v1.4.7
Example Client: RHEL 5, OpenAFS v1.4.10

So far, all of the problems have been on v1.4.7 servers, but I don't have a large enough sample to know whether that's a coincidence.

Is there something I'm just missing, or is this possibly a bug? It's really starting to cause us problems, so I'd appreciate any hints as to which direction to explore. Workarounds to let us bring our web servers back online would be most welcome as well.

Thanks in advance for your help.

William Setzer
OIT Systems and Hosted Services
NC State University
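
For anyone who wants to repeat the "is this client in hosts.dump?" check on their own file server, here is a minimal Python sketch. It is not the exact procedure used for the cases above, and it rests on an assumption about the dump format: that each host record carries an "ip:<value>" token, where the value is either a dotted quad or an unpadded hex word (possibly byte-swapped, depending on the server's byte order). Check that against your own dump before trusting the result. The names `IP_TOKEN`, `candidate_values`, and `host_in_dump` are made up for illustration.

```python
import ipaddress
import re

# ASSUMPTION: host records contain an "ip:<value>" token, with <value> either
# a dotted quad or an unpadded hexadecimal word. Verify against a real dump.
IP_TOKEN = re.compile(r"\bip:([0-9A-Fa-f.]+)")


def candidate_values(dotted):
    """Return the integer forms a hex-printed address could take."""
    addr = ipaddress.IPv4Address(dotted)
    big_endian = int(addr)                                # 192.168.1.1 -> 0xc0a80101
    little_endian = int.from_bytes(addr.packed, "little")  # byte-swapped form
    return {big_endian, little_endian}


def host_in_dump(dump_text, dotted):
    """Return True if the client address appears as an ip: token in the dump."""
    wanted = candidate_values(dotted)
    for token in IP_TOKEN.findall(dump_text):
        if "." in token:
            # Dotted-quad form: compare as text.
            if token == dotted:
                return True
        elif int(token, 16) in wanted:
            # Hex form: compare numerically so missing leading zeros don't matter.
            return True
    return False


if __name__ == "__main__":
    # Made-up sample records, not taken from a real file server.
    sample = "ip:c0a80101 port:7001 hidx:3\nip:10.0.0.5 port:7001 hidx:4\n"
    for client in ("192.168.1.1", "10.0.0.5", "10.0.0.6"):
        status = "present" if host_in_dump(sample, client) else "MISSING"
        print(client, status)
```

Because the hex comparison accepts both byte orders, an unrelated host whose address happens to be the byte-swapped form of the client's could produce a false match, so a "present" result is worth confirming by eye. A "MISSING" result under these assumptions corresponds to the symptom described above.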
