John R. Dunning wrote:
    From: Nathaniel Rutman <[EMAIL PROTECTED]>
    Date: Fri, 17 Nov 2006 12:11:25 -0800
John R. Dunning wrote:
    >     From: Nathaniel Rutman <[EMAIL PROTECTED]>
    >     Date: Fri, 17 Nov 2006 11:39:59 -0800
> > Deactivate the device on the MDT side for a currently-running server
    >     e.g. 13 UP osc lustre-OST0001-osc lustre-mdtlov_UUID 5
    >     lctl --device 13 deactivate
    >
    > Ok, did that.  It still shows as UP when I lctl dl, though.
> Yes, it does. Your question prompted me to take a look at changing that... For now, you can get to it here:
    cfs21:~# cat /proc/fs/lustre/lov/lustre-mdtlov/target_obd
    0: lustre-OST0000_UUID ACTIVE
    1: lustre-OST0001_UUID INACTIVE

Ok.
> > To start a client or MDT with a known down OST:
    >     mount -t lustre -o exclude=lustre-OST0001 ...
> > Ah, ok. So there isn't any way to say "Remove all traces of this OST from the
    > system so that nobody knows it was ever there" ?
> That is an eventual planned feature, but isn't implemented yet.

Ok.
You could --writeconf the MDT to nuke the config logs, then restart the servers,
Example?
See the wiki:
https://mail.clusterfs.com/wikis/lustre/MountConf#head-18c689130e5184035dcec1e6e2b49597afdab189
I just noticed a regression in my current code (and updated the wiki) - you'll have to tunefs.lustre --writeconf every server disk, not only the MDT, to regen the logs. I have now fixed that so you only need to --writeconf the MDT,
but it is always safe to do them all.  (Not sure when that regressed.)

             and
that will truly erase all traces of OSTs that don't restart. Beware, any file that has
    stripes on such an erased OST will be very confusing to Lustre...

Sure, of course.  I suppose to do it really right, you'd want some kind of
tool that could examine the MD and gripe about anything that had stripes on
the OST in question.  But that would be pretty slow.

    Beware #2: I don't claim to have tried this myself.
Understood. Perhaps I'll try this next week, or perhaps I'll just blow it
away and rebuild it without the offending unit.
I just tried it myself, and it works like a charm.
Files on lost OSTs don't actually seem to confuse Lustre at all, they just act corrupted:
cfs21:~/cfs/b1_5/lustre/tests# ll /mnt/lustre
total 4
?---------  ? ?    ?       ?            ? p2
-rw-r--r--  1 root root 1699 Nov 17 12:53 passwd

Adding a new OST that reuses the old index results in a valid but truncated file:
cfs21:~/cfs/b1_5/lustre/tests# ll /mnt/lustre
total 4
-rw-r--r--  1 root root    0 Nov 17 13:31 p2
-rw-r--r--  1 root root 1699 Nov 17 12:53 passwd





_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss

Reply via email to