Hi,

A reboot of the MDS stalled and the node had to be force-reset. After that the MDS would not start: mounting astro-MDT0000 fails with -17. The syslog is attached.

I'm not sure what the "class_register_device() ... astro-OST0002-osc-MDT0000: already exists, won't add" step is supposed to do, but astro-OST0002 is not mounted at this time. I guess this entry comes from the configuration log on the MGS.
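
As far as I can tell, the -17 in the log is a negated Linux errno, which would make it EEXIST and would match the "already exists, won't add" message. A minimal sketch to decode it, assuming the rc values really are negated errno numbers (describe_rc is just a helper name I made up):

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Translate a negative kernel-style return code into errno name and text.

    Assumption: the rc printed in the log is a negated Linux errno value.
    """
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # rc taken from the "cfg command failed: rc = -17" line in the syslog
    print(describe_rc(-17))
```

If that reading is right, the failure is the duplicate astro-OST0002-osc-MDT0000 device entry rather than a communication problem with the MGS.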

Cheers,
Hans Henrik

Mar 10 10:03:49 mds02 kernel: Lustre: MGS: Connection restored to 
d8787407-db0d-ccfb-e5ab-adeb41b86c1d (at 0@lo)
Mar 10 10:03:49 mds02 kernel: Lustre: Skipped 197 previous similar messages
Mar 10 10:03:59 mds02 kernel: LustreError: 137-5: astro-MDT0000_UUID: not 
available for connect from 10.21.207.78@o2ib (no target). If you are running an 
HA pair check that the target is mounted on the other server.
Mar 10 10:03:59 mds02 kernel: LustreError: Skipped 155 previous similar messages
Mar 10 10:04:00 mds02 kernel: LustreError: 
8923:0:(genops.c:556:class_register_device()) astro-OST0002-osc-MDT0000: 
already exists, won't add
Mar 10 10:04:00 mds02 kernel: LustreError: 
8923:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.21.10.102@o2ib: 
cfg command failed: rc = -17
Mar 10 10:04:00 mds02 kernel: Lustre:    cmd=cf001 0:astro-OST0002-osc-MDT0000  
1:osp  2:astro-MDT0000-mdtlov_UUID  
Mar 10 10:04:00 mds02 kernel: LustreError: 15c-8: MGC10.21.10.102@o2ib: The 
configuration from log 'astro-MDT0000' failed (-17). This may be the result of 
communication errors between this node and the MGS, a bad configuration, or 
other errors. See the syslog for more information.
Mar 10 10:04:00 mds02 kernel: LustreError: 
7016:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server 
astro-MDT0000: -17
Mar 10 10:04:00 mds02 kernel: LustreError: 
7016:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: 
-17
Mar 10 10:04:00 mds02 kernel: Lustre: Failing over astro-MDT0000
Mar 10 10:04:01 mds02 kernel: Lustre: astro-MDT0000: Not available for connect 
from 10.21.208.26@o2ib (stopping)
Mar 10 10:04:01 mds02 kernel: Lustre: Skipped 129 previous similar messages
Mar 10 10:04:15 mds02 kernel: LustreError: 137-5: astro-MDT0000_UUID: not 
available for connect from 172.20.2.101@tcp1 (no target). If you are running an 
HA pair check that the target is mounted on the other server.
Mar 10 10:04:15 mds02 kernel: LustreError: 137-5: astro-MDT0000_UUID: not 
available for connect from 172.20.2.101@tcp1 (no target). If you are running an 
HA pair check that the target is mounted on the other server.
Mar 10 10:04:15 mds02 kernel: LustreError: Skipped 35 previous similar messages
Mar 10 10:04:15 mds02 kernel: LustreError: Skipped 1 previous similar message
Mar 10 10:04:20 mds02 kernel: Lustre: server umount astro-MDT0000 complete
Mar 10 10:04:20 mds02 kernel: LustreError: 
7016:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-17)
Mar 10 10:04:37 mds02 kernel: Lustre: MGS: Connection restored to  (at 
10.21.207.58@o2ib)
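
In case it helps anyone reading the log: the mail client has wrapped the syslog lines, so the call sites are split over several lines. Below is a rough sketch that rejoins the entries and pulls out the "(file.c:line:function())" part of each LustreError. The regular expressions are my own guess at the format based only on the lines in this mail, and the function names are hypothetical helpers, not part of any Lustre tool.

```python
import re

# Assumption: every syslog entry starts with a "Mar 10 10:04:00 "-style timestamp.
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")
# Assumption: call sites look like "(genops.c:556:class_register_device()) message".
CALL_SITE = re.compile(r"\((\w+\.c):(\d+):(\w+)\(\)\)\s*(.*)")


def unwrap(lines):
    """Join mail-wrapped continuation lines back onto their syslog entry."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def call_sites(entries):
    """Return (function, file, line, message) for entries that name a call site."""
    found = []
    for entry in entries:
        match = CALL_SITE.search(entry)
        if match:
            src, lineno, func, msg = match.groups()
            found.append((func, src, int(lineno), msg))
    return found


SAMPLE = """\
Mar 10 10:04:00 mds02 kernel: LustreError:
8923:0:(genops.c:556:class_register_device()) astro-OST0002-osc-MDT0000:
already exists, won't add
Mar 10 10:04:00 mds02 kernel: LustreError:
8923:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.21.10.102@o2ib:
cfg command failed: rc = -17
"""

if __name__ == "__main__":
    for site in call_sites(unwrap(SAMPLE.splitlines())):
        print(site)
```

Run over the whole log, this lists the failing functions in order, which makes it easier to see that class_register_device() is the first one to report an error.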

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
