Hi, you can use ll_recover_lost_found_objs to recover the files in lost+found to their original location. I think this should be the first step.
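Roughly, the recovery step could look like the sketch below. The device name and mount point are placeholders I made up, not taken from your report, and the script only prints the commands unless DRY_RUN=0 is set. It assumes the OST is not mounted as type lustre at the time and that ll_recover_lost_found_objs accepts -v and -d as described in the Lustre manual; please check against your installed version.

```shell
#!/bin/sh
# Sketch only: OST_DEV and MNT are hypothetical placeholders.
OST_DEV="${OST_DEV:-/dev/sdX}"      # assumed OST block device
MNT="${MNT:-/mnt/ost_ldiskfs}"      # assumed temporary mount point
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_lost_found() {
    run mkdir -p "$MNT"
    # Mount the OST backing filesystem directly as ldiskfs.
    run mount -t ldiskfs "$OST_DEV" "$MNT"
    # Ask the tool to move objects out of lost+found back into the object tree.
    run ll_recover_lost_found_objs -v -d "$MNT/lost+found"
    run umount "$MNT"
}

recover_lost_found
```

Run it once as is to see what would be executed, then with DRY_RUN=0 and the real device once the commands look right.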
Also, these messages look a bit scary to me:

Oct 7 13:02:04 OSS50 kernel: LustreError: 0-0: Trying to start OBD Lustre-OST003b_UUID using the wrong disk <85>. Were the /dev/ assignments rearranged?
...
Oct 7 13:02:04 OSS50 kernel: LustreError: 15b-f: MGC172.16.0.251@tcp: The configuration from log 'Lustre-OST003b'failed from the MGS (-22). Make sure this client and the MGS are running compatible versions of Lustre.
Oct 7 13:02:05 OSS50 kernel: LustreError: 15c-8: MGC172.16.0.251@tcp: The configuration from log 'Lustre-OST003b' failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.

Before actually instructing tunefs.lustre to do the writeconf, I would check the configuration, parameters etc. with --dryrun. Maybe you also have to use --erase-params and re-configure the OST. It is also possible that other files in CONFIGS/ (e.g. mountdata) got screwed up on this OST, or were moved to lost+found by e2fsck.

If you have lost some important ones, a copy of some of that data exists on the MGT (basically, writeconf is the mechanism that transfers it to the MGS).

It's a bit difficult to give good advice by looking at the syslog messages only. Anyhow, recovering the files from lost+found should be the first step, maybe followed by a closer look at the OST on the ldiskfs level.

Regards,
Martin
_______________________________________________ lustre-discuss mailing list [email protected] http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
