On Dec 19, 2011, at 16:28 , [email protected] wrote:

> Are you testing this in "lab" conditions? I'm curious as to how you are 
> replicating the issue.

I think it's described fairly accurately in 
https://rt.central.org/rt/Ticket/Display.html?id=130327 . In short: have a few 
dozen clients write large files to the same fileserver, then wait on the order 
of 30 minutes. The 1.4 clients, and 1.6 clients with idledead disabled, 
succeed, while unmodified 1.6 clients fail and hang.
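
For anyone who wants to generate comparable write load, the following is a 
minimal sketch only, not the procedure from the ticket. The function names, 
file names, sizes and writer count are all made up for illustration, and the 
target directory is assumed to be a path in a volume on the busy fileserver. 
It would have to run on each client machine with much larger sizes than in the 
demo, which only writes to a temporary local directory.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1 << 20  # 1 MiB per write call (arbitrary choice)


def write_large_file(path, size_bytes, chunk=CHUNK):
    """Write size_bytes of zero bytes to path; return the number written."""
    block = b"\0" * chunk
    written = 0
    with open(path, "wb") as f:
        while written < size_bytes:
            n = min(chunk, size_bytes - written)
            f.write(block[:n])
            written += n
        # Ask the OS to push the data out instead of leaving it buffered.
        f.flush()
        os.fsync(f.fileno())
    return written


def run_writers(target_dir, n_writers, size_bytes):
    """Start n_writers concurrent writers, one file each, in target_dir."""
    paths = [
        os.path.join(target_dir, f"stress-{os.getpid()}-{i}.dat")
        for i in range(n_writers)
    ]
    with ThreadPoolExecutor(max_workers=n_writers) as pool:
        sizes = list(pool.map(lambda p: write_large_file(p, size_bytes), paths))
    return dict(zip(paths, sizes))


if __name__ == "__main__":
    # Demo only: a temporary local directory stands in for the AFS path.
    with tempfile.TemporaryDirectory() as demo_dir:
        result = run_writers(demo_dir, n_writers=4, size_bytes=3 * CHUNK + 123)
        print(len(result), "files,", sum(result.values()), "bytes written")
```

Threads on one machine only approximate the scenario; the report involves a 
few dozen separate clients, so the script would be started on each of them.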

> Also if you can, can you try this by running it on a single core or disabling 
> threads and getting the same results?

If it helps shed more light on the issue and find a solution more satisfactory 
than disabling idledead (which I'd be absolutely happy with), I will.

But according to this thread 
https://lists.openafs.org/pipermail/openafs-devel/2011-November/018583.html , 
what's going on already seems to be well understood.

> Quoting Stephan Wiesand <[email protected]>:
> 
>> OS: EL6.1
>> arch: amd64
>> kernels: 2.6.32-131.21.1.el6, 2.6.32-220.1.1.el6 (module built against 
>> 2.6.32-71.el6)
>> 
>> It builds, and it basically works.
>> 
>> It seems to partially address the nat ping issue, but servers still get 
>> pinged more often than intended.
>> 
>> It fails to fix RT #130327. If a fileserver is very busy, clients fail 
>> writing to it and then hang, making AFS unusable on the client machine until 
>> it's rebooted.

-- 
Stephan Wiesand
DESY - DV -
Platanenallee 6
15738 Zeuthen, Germany

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
