We saw an issue on a production server in a customer deployment where
DLM 4.0.7 gets "stuck" and is unable to join new lockspaces.

See - https://lists.clusterlabs.org/pipermail/users/2019-January/016054.html

This was forwarded off-list to David Teigland, who responded as follows:

Hi, thanks for the debugging info.  You've spent more time looking at
this than I have, but from a first glance it seems to me that the
initial problem (there may be multiple) is that in the kernel,
lockspace.c do_event() does not sensibly handle the ERESTARTSYS error
from wait_event_interruptible().  I think do_event() should continue
waiting for a uevent result from userspace until it gets one, because
the kernel can't do anything sensible until it gets that.


This change does that. We have it running in automation with no problems
so far, but comments are welcome.
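
For readers without the patch at hand, the retry idea can be sketched as a
small userspace simulation. This is not the patch itself and not the real
do_event() code: struct fake_event, fake_wait_interruptible() and
wait_for_result() are hypothetical names made up for illustration, and
ERESTARTSYS is defined locally because it is a kernel-internal error code
that userspace headers do not export.

```c
/* Kernel-internal error code (512 in include/linux/errno.h); defined
 * here only so that this userspace simulation compiles. */
#define ERESTARTSYS 512

/* Hypothetical stand-in for the state do_event() waits on. */
struct fake_event {
	int interruptions_left;	/* times the wait gets interrupted */
	int result;		/* result userspace eventually reports */
};

/* Hypothetical stub playing the role of wait_event_interruptible():
 * returns -ERESTARTSYS while "signals" are pending, then 0. */
static int fake_wait_interruptible(struct fake_event *ev)
{
	if (ev->interruptions_left > 0) {
		ev->interruptions_left--;
		return -ERESTARTSYS;
	}
	return 0;
}

/* Keep waiting until the wait completes without being interrupted,
 * then hand back the result reported by "userspace". */
static int wait_for_result(struct fake_event *ev, int *retries)
{
	int error;

	*retries = 0;
	do {
		error = fake_wait_interruptible(ev);
		if (error == -ERESTARTSYS)
			(*retries)++;
	} while (error == -ERESTARTSYS);

	if (error)
		return error;	/* some other failure: pass it up */

	return ev->result;
}
```

With interruptions_left set to 3, wait_for_result() loops three times and
then returns the stored result instead of giving up on the first
interruption, which is the behaviour suggested in the reply above.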

Mark Syms (1):
  Retry wait_event_interruptible in event of ERESTARTSYS

 fs/dlm/lockspace.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

