Hi everyone,

We have 2 OSSes, each with 5 1TB OSTs that share LUNs on our SAN.

OST0-4 are on server3
OST5-9 are on server4
Each OST is 1TB with an external journal.

Server3 crashed HARD (as in it wouldn't POST after power off, wait 30 seconds, power on), and we were told by the vendor that the motherboard died. In the meantime I attempted to mount server3's OSTs on server4. Server3 was powered off before I attempted this (STONITH theory, right?).

I ended up with lots of problems and hit a few LBUGs, specifically:

LustreError: 11283:0: (tracefile.c:431:libcfs_assertion_failed()) LBUG
LustreError: 8095:0: (tracefile.c:431:libcfs_assertion_failed()) LBUG

We are running an older Lustre version (lustre-1.6.4.3-2.6.18_53.1.13.el5_lustre.1.6.4.3smp) on CentOS 5.2 boxes, with the matching e2fsck, utilities, etc. from the appropriate download page on the Sun website.

I had major problems getting the remaining Lustre server to mount server3's OSTs because of apparent journal problems. I kept hitting "LDISKFS: failed to claim external journal device" when trying to mount the OSTs as type ldiskfs. Trying to mount them as type lustre gave me an error -22.

The way I fixed it was by taking the following steps:

* fsck /path/to/block/device/of/ost-data
  (this seemed to pick up the journal correctly)
* ls -la /path/to/block/device/of/journal-dev
  (the journal device of ost-data), which gives output such as:
  brw-rw---- 1 root disk 253, 7 Feb 24 20:31 /path/to/block/device/of/journal-dev
* mount -t ldiskfs -o journal_dev=0xFD07 /path/to/block/device/of/ost-data /mnt/tmp-mt-pt
  (FD = 253 in hex, 07 = 7 in hex, i.e. the major and minor numbers from the ls output)
* umount /mnt/tmp-mt-pt
* mount -t lustre /path/to/block/device/of/ost-data /mnt/normal-mountpoint-of-ost

My questions:

1) Since the MDS did not crash, but half the OSTs did, do I need to make any changes to the MDS?

2) Any idea why e2fsck can figure out the journal device automatically but Lustre cannot?
(At least until I manually mount/unmount as type ldiskfs and manually specify the journal major/minor device numbers.)

3) Is the LBUG above fixed in a newer version of Lustre? If there is not enough information here, what steps should I take next time to get you everything you need?

Thanks,
Rob
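
The hex conversion in the journal_dev step of the workaround above (253, 7 becoming 0xFD07) can be sketched in Python. This is only an illustration of the arithmetic used in the post, with the major number in the high byte and the minor number in the low byte. The function names are hypothetical, and it assumes both numbers fit in one byte, as they do in the example; it makes no claim about larger device numbers.

```python
import os
import stat


def journal_dev_option(major: int, minor: int) -> str:
    """Pack major/minor as in the post: major in the high byte, minor in the low byte."""
    # Assumption: both values fit in a single byte, as in the 253, 7 example.
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("this sketch only handles major/minor values in 0..255")
    return f"0x{major:02X}{minor:02X}"


def journal_dev_for_path(path: str) -> str:
    """Read major/minor from a block device node instead of parsing ls output."""
    st = os.stat(path)
    if not stat.S_ISBLK(st.st_mode):
        raise ValueError(f"{path} is not a block device")
    return journal_dev_option(os.major(st.st_rdev), os.minor(st.st_rdev))


if __name__ == "__main__":
    # The values shown by ls in the post: "253, 7"
    print(journal_dev_option(253, 7))
```

The resulting string is what goes after journal_dev= in the temporary ldiskfs mount; journal_dev_for_path would be pointed at the journal device node, not at the OST data device.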
