Bryn M. Reeves wrote:

I have a 2-node cluster with Open Shared Root on GFS on DRBD. A single
node mounts GFS OK and works, but after a while seems to just block
for disk.
[...]
State:  D (disk sleep)
SleepAVG:       97%
[...]

You can find out what it's sleeping on either via a sysrq or by
getting ps to display the wchan field of the stat data, e.g.:

ps ax -opid,comm,wchan

And see what symbol appears in the 3rd field.
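
To narrow the listing to just the blocked processes, the ps output can be filtered on its state column. The following is only a rough sketch: the helper name blocked_procs is made up for illustration, and it assumes procps-style ps output with the columns ordered PID, STAT, COMMAND, WCHAN, so that the state is always the second field.

```shell
# Hypothetical helper (not part of any package): reads ps-style rows on stdin
# and keeps the header plus every row whose STAT column starts with "D"
# (uninterruptible sleep, shown as "disk sleep" in /proc/<pid>/status).
# Assumed column order: PID STAT COMMAND WCHAN.
blocked_procs() {
    awk 'NR == 1 && $1 == "PID" { print; next }
         $2 ~ /^D/              { print }'
}

# Intended use on a live system (assumes procps ps):
#   ps -eo pid,stat,comm,wchan | blocked_procs
```

If many of the rows that remain show the same symbol in the last column, that symbol is a reasonable place to start looking.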

And this is what comes out for all the stuck processes:

ps ax -opid,comm,wchan | grep ssh
 9250 sshd            -
 9507 sshd            gdlm_plock
 9642 sshd            gdlm_plock

They are all stuck in the gdlm_plock function. I figured it'd be something like this. How do I debug this further?

Gordan

--
Linux-cluster mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-cluster
