On Oct 27, 2008 10:16 -0600, Craig Tierney wrote:
> Andreas Dilger wrote:
>> Note that soft lockups are only a warning. It shouldn't mean that the
>> node is completely dead, only that some thread was hogging the CPU.
>
> The two soft lockup messages (one in kswapd0 and the other in the user
> process convert_emiss) repeated their messages for 6 hours before I rebooted
> the node. I don't recall if I could login to the node or not.

Ah, then the spewing of the "warning" messages is likely what caused the
node to be unusable :-(. Console messages are printed with all interrupts
disabled and can be a problem in such cases. Unfortunately, this printing
is outside of the Lustre code so we can't fix it without patching the kernel.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
