On Fri, Nov 14, 2008 at 10:00:13AM +0000, Nuno Fernandes wrote:
> 22236 [dlm_recoverd]              dlm_wait_function
> 25097 [dlm_recoverd]              dlm_wait_function

dlm recovery appears to be stuck; this is usually due to a problem at the
network level.  The recovery seems to be caused by a node starting clvmd.

sysrq-t backtraces from all the nodes could confirm some of this, and
adding <dlm log_debug="1"/> to cluster.conf would give us more information
the next time it happens.
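
As a rough sketch of where that line goes, assuming the usual layout in
which the dlm element is a direct child of the top-level cluster element.
The cluster name and config_version below are placeholders, not values
from this thread; config_version normally has to be incremented whenever
cluster.conf is edited.

```xml
<?xml version="1.0"?>
<!-- name and config_version are placeholder values for illustration -->
<cluster name="example" config_version="2">
  <!-- enable dlm debug logging, as suggested above -->
  <dlm log_debug="1"/>
  <!-- existing sections (clusternodes, fencedevices, rm, ...) stay unchanged -->
</cluster>
```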

Dave

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
