For the record:

This was triggered by one job script that used a working directory in a directory tree under MDT0 while redirecting the stderr of gzip commands to a directory in a tree under MDT1.
Once the user moved everything into one tree or the other, the problem
disappeared.
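For illustration, the triggering pattern looked roughly like the sketch below. The real paths were subtrees served by the two different MDTs; here two throwaway directories stand in for them, and all names are made up:

```shell
# Stand-ins for the two directory trees; on the real system these were
# subtrees served by MDT0 and MDT1 respectively (paths hypothetical).
MDT0_TREE=$(mktemp -d)
MDT1_TREE=$(mktemp -d)
mkdir -p "$MDT0_TREE/job" "$MDT1_TREE/logs"

# The job ran with its working directory in the MDT0 tree...
cd "$MDT0_TREE/job"
printf 'some data\n' > data.txt

# ...but sent gzip's stderr into a file in the MDT1 tree,
# so a single command touched both MDTs.
gzip data.txt 2> "$MDT1_TREE/logs/gzip.err"
```

Keeping both the working directory and the stderr target within one tree avoided the cross-MDT operation and made the crashes stop.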

Thomas

On 03/12/2018 09:54 AM, Thomas Roth wrote:
Hi all,

our production system running Lustre 2.5.3 has broken down, and I'm quite 
clueless.

The second of our two MDTs crashed, and after reboot and recovery it LBUGs again with:


Mar 11 20:02:37 lxmds15 kernel: Lustre: nyx-MDT0001: Recovery over after 1:36, of 720 clients 720 recovered and 0 were evicted.

Mar 11 20:02:37 lxmds15 kernel: LustreError: 6705:0:(osp_precreate.c:719:osp_precreate_cleanup_orphans()) nyx-OST0001-osc-MDT0001: cannot cleanup orphans: rc = -108

Mar 11 20:02:37 lxmds15 kernel: LustreError: 6705:0:(osp_precreate.c:719:osp_precreate_cleanup_orphans()) Skipped 74 previous similar messages

Mar 11 20:02:37 lxmds15 kernel: LustreError: 6574:0:(mdt_handler.c:2706:mdt_object_lock0()) ASSERTION( !(ibits & (MDS_INODELOCK_UPDATE | MDS_INODELOCK_PERM)) ) failed: nyx-MDT0001: wrong bit 0x2 for remote obj [0x5100027c70:0x17484:0x0]

Mar 11 20:02:37 lxmds15 kernel: LustreError: 6574:0:(mdt_handler.c:2706:mdt_object_lock0()) LBUG



This looks like LU-6071, but I am wondering what actually triggers it; as far as I know, no client should currently be attempting to create a directory on the second MDT.


After running e2fsck on the MDT, it mounts and then crashes with a different FID each time. (If mounted without the fsck, the crashing FID stays the same.)


Is there any way we can find out more about the cause?

If the trouble comes from a finite number of inodes, is there a trick to
manipulate or clear them?
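For what it's worth, mapping the FID from the assertion message back to a pathname might narrow things down. Lustre provides lfs fid2path for this; the mount point below is an assumed example:

```shell
# Resolve the FID from the LBUG to a pathname on a client.
# /lustre/nyx is a hypothetical client mount point for the nyx filesystem.
lfs fid2path /lustre/nyx [0x5100027c70:0x17484:0x0]
```

This requires a mounted client and only works while the MDT stays up, so it may not help if the crash follows immediately after recovery.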


Regards,
Thomas


_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org