On Tue, Apr 3, 2012 at 10:25 AM, Andrew Deason <[email protected]> wrote:
> On Mon, 2 Apr 2012 19:04:19 -0700
> Ken Elkabany <[email protected]> wrote:
>
> > Over time these errors become more and more frequent. The problem is
> > that the client who hits this issue will experience a 5-10s delay in
> > accessing a file, which hurts performance significantly. The clients
> > are 1.6pre1, and the server is 1.4.14
>
> 1.6.0pre1? Or 1.6.1pre1?

1.6.0pre1, which was packaged with Ubuntu 11.10. Should we make it a
priority to upgrade?

> > Using afsmonitor, I do see that one of the clients hitting this issue
> > (I haven't checked whether all clients have the problem, but many seem
> > to) has 17M callbacks alloced. Could that be suspect?
>
> Yes; that should not be possible unless the client is within a certain
> narrow range of versions. The client could be tied up trying to clear up
> that queue of GUCB messages, which is why everything would appear to
> freeze for a short time, and you get that ProbeUuid failure.

What are GUCB messages? Why would they pile up, and in which circumstances?

> --
> Andrew Deason
> [email protected]
>
> _______________________________________________
> OpenAFS-info mailing list
> [email protected]
> https://lists.openafs.org/mailman/listinfo/openafs-info

I traced the ProbeUuid failure to the OpenAFS fileservers using the
incorrect IP address for certain clients. The clients each have one
interface, but are reachable via two IP addresses (one external/internet/WAN,
one internal/local). The fileservers would contact the clients on their
external IP addresses, which the firewall blocked. After opening the
ports on the external IP addresses, the ProbeUuid errors disappeared.

Has anyone seen this problem before? The servers are sitting in Amazon EC2,
so there's additional complexity in how the fileserver resolves the
client IP address.
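
For anyone wanting to check their own setup for the same mismatch, here is a minimal sketch, not something taken from OpenAFS itself. It assumes you have already collected the addresses a fileserver might use for a client, and it only sorts them into private and public with Python's standard ipaddress module. The function name split_by_scope and the two sample addresses are made up for illustration.

```python
import ipaddress


def split_by_scope(addresses):
    """Split address strings into (private, public) lists.

    Hypothetical helper: "private" follows the definition used by
    ipaddress.ip_address(...).is_private (RFC 1918 and other
    reserved ranges).
    """
    private, public = [], []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        (private if addr.is_private else public).append(text)
    return private, public


if __name__ == "__main__":
    # Made-up addresses standing in for one client's internal and
    # external IPs; substitute the ones your fileserver reports.
    client_addrs = ["10.0.0.5", "54.0.0.1"]
    private, public = split_by_scope(client_addrs)
    for addr in public:
        print(f"{addr}: public address, check firewall rules for it")
    for addr in private:
        print(f"{addr}: private address")
```

Any address that lands in the public list is one the fileserver could only reach through the firewall, so that is where I would look first.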
