Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-03-28 Thread Kadlecsik Jozsef
Hi, On Fri, 27 Mar 2009, Wendy Cheng wrote: I should get some sleep, but could it be that I hit the potential deadlock mentioned here: commit 4787e11dc7831f42228b89ba7726fd6f6901a1e3 gfs-kmod: workaround for potential deadlock. Prefault user pages [...] file.

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-03-28 Thread Wendy Cheng
Kadlecsik Jozsef wrote: I don't see strong evidence of a deadlock in the thread backtraces (though it could be one). However, assuming the cluster worked before, you could have overloaded the e1000 driver in this case. There are suspicious page faults, but memory looks fine. So one possibility is that GFS

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-03-28 Thread Wendy Cheng
Wendy Cheng wrote: [snip] There are many footprints of spin_lock, which is worrisome. Next time you have a hang, hit sysrq-w a couple of times in addition to sysrq-t. This should give traces of the threads that are actively on CPUs at that time. Also check your kernel change log (to see
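
For anyone who cannot reach the console keyboard during a hang, the sysrq-w and sysrq-t requests above can also be issued by writing the command key to /proc/sysrq-trigger as root. The following is only a minimal sketch, not something posted in the thread: the helper name trigger_sysrq and its repeat/delay parameters are made up for illustration, and which key dumps which set of tasks depends on the kernel version, so check the sysrq documentation shipped with your kernel first.

```python
import time
from pathlib import Path

# Standard procfs entry for magic SysRq; writing to it requires root.
SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")


def trigger_sysrq(keys, repeat=1, delay=0.0, trigger=SYSRQ_TRIGGER):
    """Write each SysRq command key in `keys` to the trigger file.

    Hypothetical helper: `repeat` rounds are issued with `delay` seconds
    between them, so that successive dumps can be compared.
    Returns the list of keys written, in order.
    """
    sent = []
    for _ in range(repeat):
        for key in keys:
            # One key per write; each write is a separate SysRq request.
            with open(trigger, "w") as f:
                f.write(key)
            sent.append(key)
        time.sleep(delay)
    return sent


# Example (as root, during a hang): three rounds of w and t, 5 s apart.
#   trigger_sysrq("wt", repeat=3, delay=5.0)
# The resulting traces go to the kernel log (dmesg or the serial console).
```

Taking several dumps a few seconds apart makes it easier to tell a thread that is truly stuck on a lock from one that merely happened to be running when the trace was taken.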