/proc/sys/kernel/panic_on_oops was already set to 1. We configured netconsole on all nodes and a syslog-ng server to log the messages.
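For anyone who wants to verify the same two settings on each node before the next reboot, here is a minimal sketch. The helper names are hypothetical, and it assumes netconsole is built as a loadable module: a driver compiled into the kernel does not appear in /proc/modules, so a False result there is not proof that netconsole is inactive.

```python
from pathlib import Path


def panic_on_oops_enabled(sysctl_path="/proc/sys/kernel/panic_on_oops"):
    """Return True when the sysctl file holds a non-zero value."""
    return int(Path(sysctl_path).read_text().strip()) != 0


def module_loaded(name, modules_path="/proc/modules"):
    """Return True when `name` is the first field of a line in /proc/modules.

    Only loadable modules are listed there, so a built-in driver
    is reported as not loaded.
    """
    for line in Path(modules_path).read_text().splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False
```

On a node, calling panic_on_oops_enabled() and module_loaded("netconsole") with the default paths reports the live state. The path arguments exist only so the helpers can be tried against sample files.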
I will come back with the traces the next time the problem occurs.

Thank you

Sunil Mushran wrote:
> Set up a netconsole server to catch the oops trace. Have you set
> /sys/kernel/panic_on_oops to 1?
>
> On Tue, Mar 24, 2009 at 06:09:22PM +0200, Cristian Gae wrote:
>> Hello
>>
>> We have a 9-node ocfs2 cluster used for http serving.
>>
>> Sometimes when we want to reboot one of the nodes, it kernel panics
>> at ocfs2 unmount, and after this on all the other nodes some httpd
>> processes go to "D" state, and the system gets loaded.
>>
>> ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN
>>
>> 10187 D httpd ocfs2_wait_for_mask
>> 10241 D httpd ocfs2_wait_for_mask
>> 10254 D httpd ocfs2_wait_for_mask
>> 10255 D httpd ocfs2_wait_for_mask
>> 10272 D httpd ocfs2_wait_for_mask
>> 10273 D httpd ocfs2_wait_for_mask
>> 10274 D httpd ocfs2_wait_for_mask
>> 10398 D httpd ocfs2_wait_for_mask
>> 10441 D httpd ocfs2_wait_for_mask
>> 10452 D httpd ocfs2_wait_for_mask
>>
>> Please tell me what info you need in order to help us.
>>
>> All nodes are CentOS 5.2, ocfs2 1.4.1. We use a DS4700 IBM Storage,
>> with Qlogic HBA and MPP driver.
>>
>> --
>> Cristian Gae
>> Director IT
>> Netbridge Services
>> cristian....@netbridge.ro
>> 0749 018 817
>>
>> --
>> This message, together with the transmitted files, constitutes
>> confidential information and is addressed only to the natural or
>> legal person(s) named as recipient. If you are not the recipient of
>> this message and have received the e-mail by mistake, please notify
>> the system administrator. Please be advised that the opinions expressed
>> in this e-mail represent the author's point of view and not that of
>> the company as a whole. The recipient must check this e-mail and the
>> contents of the attached files for viruses. Netbridge Services SRL is
>> not responsible for improper transmission of information caused by
>> a virus.

--
Cristian Gae
Director IT
Netbridge Services
cristian....@netbridge.ro
0749 018 817

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users