Hello Stephane,

Thanks for your feedback.

Why did you run e2fsck?
I suspected some errors, but e2fsck didn't find anything.

Did e2fsck fix something?
No.

What version of e2fsprogs are you using?
e2fsprogs-1.46.2.wc3-0.el7.x86_64

The device had no free inodes left, so I mounted the device with

mount -t ldiskfs mdtdevice /mnt

to be able to free up some space. But afterwards we still could not mount the MDT:

Mar 6 11:23:51 mds01 kernel: LDISKFS-fs (dm-19): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
Mar 6 11:23:52 mds01 kernel: LustreError: 11-0: data-MDT0001-osp-MDT0000: operation mds_connect to node 0@lo failed: rc = -114
Mar 6 11:23:52 mds01 kernel: LustreError: Skipped 9 previous similar messages
Mar 6 11:23:52 mds01 kernel: LustreError: 79765:0:(genops.c:556:class_register_device()) data-OST0000-osc-MDT0000: already exists, won't add
Mar 6 11:23:52 mds01 kernel: LustreError: 79765:0:(obd_config.c:1835:class_config_llog_handler()) MGC1@tcp14: cfg command failed: rc = -17
Mar 6 11:23:52 mds01 kernel: Lustre: cmd=cf001 0:data-OST0000-osc-MDT0000 1:osp 2:data-MDT0000-mdtlov_UUID
Mar 6 11:23:52 mds01 kernel: LustreError: 15c-8: MGC@tcp14: The configuration from log 'data-MDT0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Mar 6 11:23:52 mds01 kernel: LustreError: 79753:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server data-MDT0000: -17
Mar 6 11:23:52 mds01 kernel: LustreError: 79753:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -17
Mar 6 11:23:52 mds01 kernel: Lustre: Failing over data-MDT0000
Mar 6 11:23:52 mds01 kernel: Lustre: data-MDT0000: Not available for connect from @o2ib4 (stopping)
Mar 6 11:23:53 mds01 kernel: Lustre: server umount data-MDT0000 complete
Mar 6 11:23:53 mds01 kernel: LustreError: 79753:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount (-17)

Robin

On Sun, Mar 5, 2023 at 2:07 AM Stephane Thiell <[email protected]> wrote:

> Hi Robin,
>
> Sorry to hear about your problem.
>
> A few questions…
>
> Why did you run e2fsck?
> Did e2fsck fix something?
> What version of e2fsprogs are you using?
>
> errno 28 is ENOSPC, what does dumpe2fs say about available space?
>
> You can check the values of "Free blocks" and "Free inodes" using this
> command:
>
> dumpe2fs -h /dev/mapper/****-MDT0000
>
>
> Best,
> Stephane
>
>
>
> On Mar 2, 2023, at 2:08 AM, Teeninga, Robin via lustre-discuss <[email protected]> wrote:
> >
> > Hello,
> >
> > I ran e2fsck on my MDT, and after that I could not mount the MDT anymore.
> > It gives me this error when I try to mount the filesystem.
> > Any ideas how to resolve this?
> >
> > We are running Lustre server 2.12.7 on CentOS 7.9
> > mount.lustre: mount /dev/mapper/****-MDT0000 at /lustre/****-MDT0000 failed: File exists
> >
> >
> > Mar 2 10:58:35 mds01 kernel: LDISKFS-fs (dm-19): mounted filesystem with ordered mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
> > Mar 2 10:58:35 mds01 kernel: LustreError: 160060:0:(llog.c:1398:llog_backup()) MGC****@tcp14: failed to open backup logfile ****-MDT0000T: rc = -28
> > Mar 2 10:58:35 mds01 kernel: LustreError: 160060:0:(mgc_request.c:1879:mgc_llog_local_copy()) MGC****@tcp14: failed to copy remote log ****-MDT0000: rc = -28
> > Mar 2 10:58:35 mds01 kernel: LustreError: 137-5: ****-MDT0001_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
> > Mar 2 10:58:35 mds01 kernel: LustreError: Skipped 4 previous similar messages
> > Mar 2 10:58:35 mds01 kernel: LustreError: 160127:0:(genops.c:556:class_register_device()) *****-OST0000-osc-MDT0000: already exists, won't add
> > Mar 2 10:58:35 mds01 kernel: LustreError: 160127:0:(obd_config.c:1835:class_config_llog_handler()) MGC****@tcp14: cfg command failed: rc = -17
> > Mar 2 10:58:36 mds01 kernel: Lustre: cmd=cf001 0:****-OST0000-osc-MDT0000 1:osp 2:****-MDT0000-mdtlov_UUID
> > Mar 2 10:58:36 mds01 kernel: LustreError: 15c-8: MGC****@tcp14: The configuration from log '****-MDT0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
> > Mar 2 10:58:36 mds01 kernel: LustreError: 160060:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server ****-MDT0000: -17
> > Mar 2 10:58:36 mds01 kernel: LustreError: 160060:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -17
> > Mar 2 10:58:36 mds01 kernel: Lustre: Failing over ****-MDT0000
> > Mar 2 10:58:37 mds01 kernel: Lustre: server umount ****-MDT0000 complete
> > Mar 2 10:58:37 mds01 kernel: LustreError: 160060:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount (-17)
> >
> >
> > Regards,
> >
> > Robin
> > _______________________________________________
> > lustre-discuss mailing list
> > [email protected]
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> >
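
For anyone reading the logs in this thread: the "rc = -N" values are negated Linux errno numbers, as Stephane's "errno 28 is ENOSPC" remark implies. Below is a minimal sketch for decoding them, assuming Python 3 on a Linux host (errno numbering is platform-specific, and 114 in particular only maps to EALREADY on Linux). The helper name decode_rc is hypothetical and not part of any Lustre tooling.

```python
import errno
import os


def decode_rc(rc: int) -> str:
    """Turn a kernel-style return code (negative errno) into 'NAME: description'."""
    code = abs(rc)
    # errno.errorcode maps the numeric value to its symbolic name on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the platform's human-readable message for the code.
    return f"{name}: {os.strerror(code)}"


# Return codes that appear in the logs of this thread.
for rc in (-28, -17, -114):
    print(rc, decode_rc(rc))
```

On Linux this should report ENOSPC for -28, EEXIST for -17 (which matches the "File exists" text from mount.lustre), and EALREADY for -114.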
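
The "Free blocks" and "Free inodes" check suggested earlier in the thread can also be scripted. This is a rough sketch, assuming the header from dumpe2fs -h is made of "Key: value" lines; the function names are hypothetical, and the SAMPLE text is made-up illustration data, not output from the MDT discussed here.

```python
import subprocess


def parse_superblock(text: str) -> dict:
    """Collect 'Key: value' pairs from dumpe2fs -h style output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def free_counts(text: str) -> dict:
    """Return the free block and inode counts as integers, if present."""
    fields = parse_superblock(text)
    wanted = ("Free blocks", "Free inodes")
    return {key: int(fields[key]) for key in wanted if key in fields}


def read_header(device: str) -> str:
    """Run dumpe2fs -h on a device (usually needs root). Not called in this sketch."""
    result = subprocess.run(
        ["dumpe2fs", "-h", device], capture_output=True, text=True, check=True
    )
    return result.stdout


# Made-up sample header, for illustration only.
SAMPLE = """\
Filesystem OS type:       Linux
Inode count:              1000
Block count:              4000
Free blocks:              1200
Free inodes:              0
"""

counts = free_counts(SAMPLE)
print(counts)
if counts.get("Free inodes") == 0:
    print("No free inodes: new files cannot be created (ENOSPC).")
```

To use it on a real target, pass the output of read_header() for the device path to free_counts().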
