On 02/25/2011 11:39 AM, Andreas Dilger wrote:
> On 2011-02-25, at 6:28, "Brian J. Murrell" <[email protected]> wrote:
>> On 11-02-25 06:18 AM, Francois wrote:
>>>
>>> I continue to parse debug logs and keep them posted.
>>
>> I don't understand why you don't just fix your application to handle a
>> perfectly valid and expected condition (that it's currently not
>> handling) instead of wasting time trying to find the cause of the
>> expected condition. Even if you find it, it's likely not a bug and not
>> something that can/will be fixed. It's your application that needs to
>> be fixed.
>
> In all fairness Brian, it isn't always possible to fix an application like
> you suggest. It might be commercial (binary only), it might be complex code
> using 3rd party libraries to do the IO that would lose support if modified,
> etc.
>
> I think the first action to debug this is to run on the client with "lctl
> set_param debug=+trace" or "=~0" which will enable function entry/exit
> tracing in Lustre. Then when the problem is hit, run "lctl dk /tmp/debug" to
> dump the Lustre debug log, and search for -4 (which is -EINTR) to see where
> this error is first appearing.
>
> At that point we can make a determination where the source of the error is,
> and if it is Lustre's fault. I know at one time there was a related problem
> in the l_wait_event() macro that was improperly masking signals, but I
> thought it was fixed by 1.8.5.
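The search suggested above (look through the dumped debug log for the first -4) could be scripted roughly as follows. This is only a sketch: it assumes the dump is plain text with one record per line, and the helper name first_eintr_line and the "standalone -4" pattern are my own choices, not anything Lustre provides.

```python
import re

# Matches "-4" as a standalone token, so values like "-40" or "x-4" are skipped.
# Assumption: the return code shows up in the dump as a bare "-4".
EINTR_TOKEN = re.compile(r"(?<![\w.-])-4(?![\w.])")


def first_eintr_line(lines):
    """Return (line_number, text) of the first line mentioning -4, or None."""
    for lineno, line in enumerate(lines, start=1):
        if EINTR_TOKEN.search(line):
            return lineno, line.rstrip("\n")
    return None
```

To use it, open the file written by "lctl dk /tmp/debug" and pass the file object to first_eintr_line(); the trace lines just before the reported line number are the place to start reading.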
Setting aside the moral question of which calls should be interruptible,
I think that the handling of the LUSTRE_FATAL_SIGS (defined in
lustre_lib.h to be SIGKILL, SIGINT, SIGTERM, SIGQUIT, SIGALRM) is
slightly broken. Under certain situations, Lustre will return -EINTR
although no signals were delivered. That's probably not the end of the
world for most applications, but OTOH I don't think anybody assumes that
-EINTR will be delivered spuriously.
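For applications that can be changed, the usual defence is to retry the interrupted call. A minimal sketch of that pattern, where retry_on_eintr and max_retries are hypothetical names introduced only for illustration:

```python
def retry_on_eintr(func, *args, max_retries=5):
    """Call func(*args), retrying while it fails with EINTR.

    InterruptedError is the exception Python raises for errno EINTR.
    After max_retries retries the error is re-raised to the caller.
    """
    for attempt in range(max_retries + 1):
        try:
            return func(*args)
        except InterruptedError:
            if attempt == max_retries:
                raise
```

Note that Python 3.5 and later already retry most system calls on EINTR by themselves (PEP 475), so this mainly illustrates what a C program would write as a loop around write() that checks for errno == EINTR. Retrying is only safe when repeating the call has no side effects beyond the intended one.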
Consider the following sequence:

1) Process P has a Lustre file F open.
2) P has SIGALRM pending (but blocked).
3) P starts writing to F and ends up sleeping in (something like):

     sys_write()
     ...
     ll_extent_lock()
     ...
     osc_enqueue()
     ...
     ptlrpc_queue_wait().

4) The OST does not respond to the request before the deadline, so
l_wait_event() replaces the signal mask of P with the LUSTRE_FATAL_SIGS,
notices that SIGALRM is now deliverable, restores the signal mask of P,
and ptlrpc_queue_wait() returns -EINTR.
5) P is exiting from sys_write(), SIGALRM is blocked (but still pending)
so it doesn't get delivered.
6) P spuriously returns -EINTR from sys_write().
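The sequence above can be modelled in user space. The sketch below is not Lustre code: it only sets up a blocked, pending SIGALRM in the current thread (step 2) and then applies the mask logic described in steps 4 and 5 to the sets the OS reports. All function names are hypothetical, and it assumes a Unix system with Python 3.5 or later.

```python
import signal
import threading

# The signals the message lists for LUSTRE_FATAL_SIGS (lustre_lib.h).
LUSTRE_FATAL_SIGS = frozenset({
    signal.SIGKILL, signal.SIGINT, signal.SIGTERM,
    signal.SIGQUIT, signal.SIGALRM,
})


def wait_would_report_eintr(pending):
    """Step 4 model: during the wait only LUSTRE_FATAL_SIGS are unblocked,
    so any pending signal in that set looks deliverable."""
    return bool(set(pending) & LUSTRE_FATAL_SIGS)


def deliverable_to_process(pending, blocked):
    """Step 5 model: once the original mask is restored, only pending
    signals that the process has not blocked can actually be delivered."""
    return set(pending) - set(blocked)


def is_spurious_eintr(pending, blocked):
    """True when the wait reports -EINTR but no fatal signal gets delivered."""
    really_deliverable = deliverable_to_process(pending, blocked)
    return (wait_would_report_eintr(pending)
            and not (really_deliverable & LUSTRE_FATAL_SIGS))


def snapshot_with_blocked_alarm():
    """Step 2 in user space: block SIGALRM, make it pending, and return the
    (pending, blocked) sets as reported by the OS for this thread."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGALRM})
    try:
        # Send SIGALRM to this thread; it stays pending because it is blocked.
        signal.pthread_kill(threading.get_ident(), signal.SIGALRM)
        pending = signal.sigpending()
        # Blocking an empty set changes nothing and returns the current mask.
        blocked = signal.pthread_sigmask(signal.SIG_BLOCK, set())
        return pending, blocked
    finally:
        # Consume the pending signal so restoring the mask delivers nothing.
        if signal.SIGALRM in signal.sigpending():
            signal.sigwait({signal.SIGALRM})
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

Calling is_spurious_eintr(*snapshot_with_blocked_alarm()) reproduces the mismatch in the model: the wait sees SIGALRM as deliverable, while the process itself never receives it.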
I can reproduce this on 1.8.5/RHEL 5.5. If the goal is to emulate NFS's
interruptibility during congestion then returning -ERESTARTSYS would be
more appropriate. Also, it might be worthwhile to make this extra
interruptibility a mount flag, as NFS does.
Best,
John
--
John L. Hammond, Ph.D.
TACC, The University of Texas at Austin
[email protected]