Hello,

replying to myself: no, we could not get Lustre up again and had to reinstall from scratch. :-(
Keeping our fingers crossed now that we are running the production system ...
What bugs us is this part of the message on the MDS:

Aug 13 11:18:54 sadosrd20 LustreError: 15c-8: [EMAIL PROTECTED]: The configuration from log 'scia-OST0004' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.

Unfortunately, there is no further information in the syslog.

Regards
Heiko


Hello again,

any idea what can be done in such a case?

Regards
Heiko


Hello,

after a crash (hardware failure) of an OST with two Lustre partitions, one partition (/dev/sdb) cannot be remounted after restart. The second partition (/dev/sdc) mounts fine.
What needs to be done in such a case?
I tried moving the mount point because of the "file exists" message, but that did not help.

Any pointers welcome.
Heiko

OST messages after mount command:

mount -t lustre /dev/sdb /mnt/data/ost3

<snip>
Aug 13 11:18:53 sadosrd20 kjournald starting. Commit interval 5 seconds
Aug 13 11:18:53 sadosrd20 LDISKFS FS on sdb, internal journal
Aug 13 11:18:53 sadosrd20 LDISKFS-fs: mounted filesystem with ordered data mode.
Aug 13 11:18:53 sadosrd20 LDISKFS-fs: file extents enabled
Aug 13 11:18:53 sadosrd20 LDISKFS-fs: mballoc enabled
Aug 13 11:18:54 sadosrd20 LustreError: 7247:0:(genops.c:246:class_newdev()) Device scia-OST0004 already exists, won't add
Aug 13 11:18:54 sadosrd20 LustreError: 7247:0:(obd_config.c:180:class_attach()) Cannot create device scia-OST0004 of type obdfilter : -17
Aug 13 11:18:54 sadosrd20 LustreError: 7247:0:(obd_config.c:1070:class_config_llog_handler()) Err -17 on cfg command:
Aug 13 11:18:54 sadosrd20 Lustre: cmd=cf001 0:scia-OST0004 1:obdfilter 2:scia-OST0004_UUID
Aug 13 11:18:54 sadosrd20 LustreError: 15c-8: [EMAIL PROTECTED]: The configuration from log 'scia-OST0004' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Aug 13 11:18:54 sadosrd20 LustreError: 7247:0:(obd_mount.c:1091:server_start_targets()) failed to start server scia-OST0004: -17
Aug 13 11:18:54 sadosrd20 LustreError: 7247:0:(obd_mount.c:1597:server_fill_super()) Unable to start targets: -17
Aug 13 11:18:54 sadosrd20 LustreError: 7247:0:(obd_mount.c:1382:server_put_super()) no obd scia-OST0004
Aug 13 11:18:55 sadosrd20 LDISKFS-fs: mballoc: 1 blocks 1 reqs (0 success)
Aug 13 11:18:55 sadosrd20 LDISKFS-fs: mballoc: 1 extents scanned, 0 goal hits, 1 2^N hits, 0 breaks, 0 lost
Aug 13 11:18:55 sadosrd20 LDISKFS-fs: mballoc: 1 generated and it took 7512
Aug 13 11:18:55 sadosrd20 LDISKFS-fs: mballoc: 256 preallocated, 0 discarded
Aug 13 11:18:55 sadosrd20 Lustre: server umount scia-OST0004 complete
Aug 13 11:18:55 sadosrd20 LustreError: 7247:0:(obd_mount.c:1951:lustre_fill_super()) Unable to mount (-17)
<snap>

OST parameter:

mkfs.lustre --param="failover.mode=failout" --fsname scia --ost --mkfsoptions='-i 2097152 -E stride=16 -b 4096' [EMAIL PROTECTED] /dev/sdb
mkfs.lustre --param="failover.mode=failout" --fsname scia --ost --mkfsoptions='-i 2097152 -E stride=16 -b 4096' [EMAIL PROTECTED] /dev/sdc

MDS parameter:

mkfs.lustre --fsname=scia --mdt --mgs --failnode=mds2 /dev/drbd0

Just for your info, the OST output of the OK partition after mounting:

Aug 13 11:26:58 sadosrd20 (fs/jbd/recovery.c, 255): journal_recover: JBD: recovery, exit status 0, recovered transactions 72449 to 74105
Aug 13 11:26:58 sadosrd20 (fs/jbd/recovery.c, 257): journal_recover: JBD: Replayed 7548 and revoked 0/0 blocks
Aug 13 11:27:00 sadosrd20 kjournald starting. Commit interval 5 seconds
Aug 13 11:27:00 sadosrd20 LDISKFS FS on sdc, internal journal
Aug 13 11:27:00 sadosrd20 LDISKFS-fs: recovery complete.
Aug 13 11:27:00 sadosrd20 LDISKFS-fs: mounted filesystem with ordered data mode.
Aug 13 11:27:01 sadosrd20 kjournald starting. Commit interval 5 seconds
Aug 13 11:27:01 sadosrd20 LDISKFS FS on sdc, internal journal
Aug 13 11:27:01 sadosrd20 LDISKFS-fs: mounted filesystem with ordered data mode.
Aug 13 11:27:01 sadosrd20 LDISKFS-fs: file extents enabled
Aug 13 11:27:01 sadosrd20 LDISKFS-fs: mballoc enabled
Aug 13 11:27:01 sadosrd20 Lustre: 7267:0:(filter.c:1732:filter_common_setup()) scia-OST0005: recovery disabled
Aug 13 11:27:01 sadosrd20 Lustre: 7267:0:(filter.c:744:filter_init_server_data()) scia-OST0005: recovery support OFF
Aug 13 11:27:01 sadosrd20 Lustre: OST scia-OST0005 now serving dev (scia-OST0005/ca6d322c-65d4-968c-4f25-3f37937678a8) with recovery disabled
Aug 13 11:27:01 sadosrd20 Lustre: Server scia-OST0005 on device /dev/sdc has started
Aug 13 11:27:06 sadosrd20 Lustre: scia-OST0005: received MDS connection from [EMAIL PROTECTED]
Aug 13 11:27:06 sadosrd20 Lustre: 6414:0:(filter.c:2774:filter_destroy_precreated()) scia-OST0005: deleting orphan objects from 3073 to 3180

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss

-------------------------------------------------------

-- 
-----------------------------------------------------------------------
Dipl.-Ing. Heiko Schröter
Institute of Environmental Physics (IUP)   phone: ++49-(0)421-218-4080
Institute of Remote Sensing (IFE)          fax:   ++49-(0)421-218-4555
University of Bremen (FB1)
P.O. Box 330440                            email: [EMAIL PROTECTED]
Otto-Hahn-Allee 1
28359 Bremen
Germany
-----------------------------------------------------------------------
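
A side note on the error code, in case it helps others who find this thread: the -17 in the log lines above looks like a negated Linux errno value, which would match the "already exists" and "file exists" wording. The following is only a minimal sketch for decoding such codes with the Python standard library; the helper name describe_kernel_rc is made up for illustration and is not part of Lustre or any of its tools.

```python
import errno
import os


def describe_kernel_rc(rc: int) -> str:
    """Map a negative kernel-style return code (e.g. -17) to its errno name and text.

    Hypothetical helper for illustration only; it assumes the code is a
    negated errno value, as kernel log messages commonly report.
    """
    code = abs(rc)
    # errno.errorcode maps numeric values to symbolic names such as "EEXIST".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message text for the errno value.
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    print(describe_kernel_rc(-17))
```

On a Linux node this should report EEXIST for -17. It only names the error; it does not explain why the device scia-OST0004 was still registered when the mount was attempted.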
