On 02/25/2011 11:39 AM, Andreas Dilger wrote:
> On 2011-02-25, at 6:28, "Brian J. Murrell" <[email protected]> wrote:
>> On 11-02-25 06:18 AM, Francois  wrote:
>>>
>>> I continue to parse debug logs and keep them posted.
>>
>> I don't understand why you don't just fix your application to handle a
>> perfectly valid and expected condition (that it's currently not
>> handling) instead of wasting time trying to find the cause of the
>> expected condition.  Even if you find it, it's likely not a bug and not
>> something that can/will be fixed.  It's your application that needs to
>> be fixed.
> 
> In all fairness Brian, it isn't always possible to fix an application like 
> you suggest. It might be commercial (binary only), it might be complex code 
> using 3rd party libraries to do the IO that would lose support if modified, 
> etc. 
> 
> I think the first action to debug this is to run on the client with "lctl 
> set_param debug=+trace" or "=~0" which will enable function entry/exit 
> tracing in Lustre. Then when the problem is hit run "lctl dk /tmp/debug" to 
> dump the Lustre debug log, and search for -4 (which is -EINTR) to see where 
> this error is first appearing. 
> 
> At that point we can make a determination where the source of the error is, 
> and if it is Lustre's fault. I know at one time there was a related problem 
> in the l_wait_event() macro that was improperly masking signals, but I 
> thought it was fixed by 1.8.5. 

Setting aside the moral question of which calls should be interruptible,
I think that the handling of the LUSTRE_FATAL_SIGS (defined in
lustre_lib.h to be SIGKILL, SIGINT, SIGTERM, SIGQUIT, SIGALRM) is
slightly broken.  In certain situations, Lustre will return -EINTR
although no signal was delivered.  That's probably not the end of the
world for most applications, but OTOH I don't think anybody expects
-EINTR to be returned spuriously.

Consider the following sequence:

1) Process P has a Lustre file F open.

2) P has SIGALRM pending (but blocked).

3) P starts writing to F and ends up sleeping in (something like):

  sys_write()
   ...
    ll_extent_lock()
     ...
      osc_enqueue()
       ...
        ptlrpc_queue_wait().

4) The OST does not respond to the request before the deadline, so
l_wait_event() replaces the signal mask of P with one that blocks
everything except the LUSTRE_FATAL_SIGS, notices that SIGALRM is now
deliverable, restores the original signal mask of P, and
ptlrpc_queue_wait() returns -EINTR.

5) P is exiting from sys_write(), SIGALRM is blocked (but still pending)
so it doesn't get delivered.

6) P spuriously returns -EINTR from sys_write().

I can reproduce this on 1.8.5/RHEL 5.5.  If the goal is to emulate NFS's
interruptibility during congestion then returning -ERESTARTSYS would be
more appropriate.  Also, it might be worthwhile to make this extra
interruptibility a mount flag, as NFS does.
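
For applications that can be rebuilt, restart semantics can be
approximated in userspace with the usual retry-on-EINTR loop.  A minimal
sketch follows; write_all_retry is a hypothetical helper name, not
anything from Lustre or libc.  Note that in the scenario above I would
expect the retry to hit -EINTR again for as long as SIGALRM stays pending
and the OST stays unresponsive, so this is a workaround rather than a fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes to fd, retrying when write()
 * fails with EINTR and continuing after short writes.
 * Returns 0 on success, -1 with errno set on any other error. */
int write_all_retry(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted, or reported as such: retry */
            return -1;           /* real error, errno left for the caller */
        }
        p += n;                  /* short write: advance and keep going */
        len -= (size_t)n;
    }
    return 0;
}
```

This does nothing for the binary-only case Andreas mentions, which is why
I think the kernel side is the better place to address it.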

Best,

John

-- 
John L. Hammond, Ph.D.
TACC, The University of Texas at Austin
[email protected]