We have added "options libcfs libcfs_panic_on_lbug=1" to modprobe.conf so that the server kernel panics as soon as an LBUG occurs. Is there a way to make the server go down a few seconds after the LBUG instead? We are also puzzled that log messages were lost when the LBUG happened.
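
For reference, a sketch of the line as it sits in our configuration. The file path in the comment is an assumption for a RHEL 5-style system; the option is only read when the libcfs module is next loaded.

```
# /etc/modprobe.conf (path assumed)
# Panic the node as soon as libcfs hits an LBUG.
options libcfs libcfs_panic_on_lbug=1
```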

On Mon, Nov 22, 2010 at 10:42 AM, Kevin Van Maren <[email protected]> wrote:
> Sure, but I think for engineering to make progress on this bug, they are
> going to want a crash dump. If you can enable crash dumps and panic on lbug
> (and if HA, increase dead timeout so it can complete the dump before being
> shot in the head) it would provide more info for the bug report.
>
> That being said, there are quite a few other bugs that have been fixed since
> 1.8.0, so you really should upgrade ASAP to 1.8.4.
>
> Kevin
>
>
> On Nov 21, 2010, at 6:59 PM, Larry <[email protected]> wrote:
>
>> We had a LBUG several days ago on our lustre 1.8.0. One OSS reported
>>
>> kernel: LustreError:
>> 24669:0:(service.c:1311:ptlrpc_server_handle_request())
>> ASSERTION(atomic_read(&(export)->exp_refcount) < 0x5a5a5a) failed
>> kernel: LustreError:
>> 24669:0:(service.c:1311:ptlrpc_server_handle_request()) LBUG
>> kernel: Lustre: 24669:0:(linux-debug.c:222:libcfs_debug_dumpstack())
>> showing stack for process 24669
>> ......
>>
>> I google for this, and find little information about it. It seems to
>> be a race condition on OSS, right? Should I open a bugzilla for this
>> LBUG?
>> Thanks.
>> _______________________________________________
>> Lustre-discuss mailing list
>> [email protected]
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
