I noticed the following behaviour on a test system while setting up init.d scripts to allow a clean shutdown and restart of an OSS (Lustre 1.8.4).

I formatted an OST with an external journal. The external journal is located on a RAID10 array attached over one of the network interfaces (AoE). The OST was tested in a filesystem and appears to operate normally. I can cleanly mount and unmount the OST with mount / umount. However, the following sequence leads to an unmountable filesystem on startup:

  1. sync
  2. umount the OST
  3. sync (again, for certainty)
  4. stop the internal and external RAID arrays
  5. stop the network interface
  6. restart the network interface
  7. bring up the internal and external RAID arrays

An attempt to re-mount the OST now fails with:

  LDISKFS-fs (md14): failed to open journal device unknown-block(152,225): -6

Running e2fsck fixes the external superblock hint:

  [root@OST2 ~]# e2fsck -j /dev/etherd/e9.24p1 /dev/md14
  e2fsck 1.41.10.sun2 (24-Feb-2010)
  Superblock hint for external superblock should be 0x409802. Fix<y>? yes

After that the OST is marked clean and can be re-mounted.

I am not sure why this is happening. The relevant external RAID partitions are all presented under the same /dev/ device names after they are stopped and restarted.

Any ideas? The workaround is maybe to put an e2fsck into the startup scripts (maybe not a bad idea anyway?), but it seems like a kludge.

Professor Samuel Aparicio BM BCh PhD FRCPath
Nan and Lorraine Robertson Chair
UBC/BC Cancer Agency
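
P.S. For what it is worth, the e2fsck workaround mentioned above could be sketched roughly as follows. This is only a sketch under assumptions: the device names are the ones from the e2fsck run above, but the mount point /mnt/ost, the function names and the variable names are hypothetical, and -y accepts every fix e2fsck proposes, which may be more than is wanted in an unattended script.

```shell
#!/bin/sh
# Hypothetical helper for an init.d start script (a sketch, not tested on an OSS).
# Device names are taken from the e2fsck run in this post; everything else
# (mount point, function and variable names) is an assumption.

JOURNAL_DEV=${JOURNAL_DEV:-/dev/etherd/e9.24p1}
OST_DEV=${OST_DEV:-/dev/md14}
OST_MNT=${OST_MNT:-/mnt/ost}        # assumed mount point
E2FSCK=${E2FSCK:-e2fsck}
MOUNT=${MOUNT:-mount}

# e2fsck exit status: 0 = no errors, 1 = errors corrected.
# Anything else means a reboot is needed or problems remain.
fsck_status_ok() {
    [ "$1" -eq 0 ] || [ "$1" -eq 1 ]
}

start_ost() {
    # Refuse to continue until both block devices have appeared.
    if [ ! -b "$JOURNAL_DEV" ]; then
        echo "journal device $JOURNAL_DEV not present" >&2
        return 1
    fi
    if [ ! -b "$OST_DEV" ]; then
        echo "OST device $OST_DEV not present" >&2
        return 1
    fi

    # -j names the external journal; -y answers "yes" to every prompt,
    # including the superblock hint question shown above.
    "$E2FSCK" -y -j "$JOURNAL_DEV" "$OST_DEV"
    rc=$?
    if ! fsck_status_ok "$rc"; then
        echo "e2fsck exited with status $rc, not mounting $OST_DEV" >&2
        return 1
    fi

    "$MOUNT" -t lustre "$OST_DEV" "$OST_MNT"
}
```

The start) branch of the init script would then call start_ost once the RAID arrays and the AoE interface are up. Checking for the block devices first keeps e2fsck from running against a device that has not reappeared yet.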
