Hi Malcolm, thanks for replying.
On Tue, Jun 19, 2018 at 08:54:53PM +0000, Cowe, Malcolm J wrote:
>Would using hard links work, instead of mv?

hmm, interesting idea, but no:

  # ln some_file /lustre/shadow/some_file
  ln: failed to access 'some_file' Cannot send after transport endpoint shutdown

ln is trying to lstat() which fails. I think almost all client operations
are going to fail with a deactivated/down OST. things like 'lfs getstripe'
(pure MDS ops) work ok.

or did you mean doing hard links on the MDT?

unless there's a purely MDS lustre tool to do a mv/rename operation on the
MDT, I think the only option is to mess around with the low level stuff on
the MDT when it's mounted as ldiskfs and hope I don't break too much...

there used to be a 'lfs mv' (now 'lfs migrate') but that isn't quite the
mv operation I'm after.

any advice or war stories (especially "this is a waste of your time - it
will never work because of X,Y,Z") would be much appreciated :)

time to read more of the lustre manual now...

cheers,
robin

>Malcolm.
>
>
>On 20/6/18, 1:34 am, "lustre-discuss on behalf of Robin Humble"
><[email protected] on behalf of
>[email protected]> wrote:
>
>    Hi,
>
>    so we've maybe lost 1 OST out of a filesystem with 115 OSTs. we may
>    still be able to get the OST back, but it's been a month now so
>    there's pressure to get the cluster back and working and leave the
>    files missing for now...
>
>    the complication is that because the OST might come back to life we
>    would like to avoid the users rm'ing their broken files and potentially
>    deleting them forever.
>
>    lustre is 2.5.41 ldiskfs centos6.x x86_64.
>
>    ideally I think we'd move all the ~2M files on the OST to a root access
>    only "shadow" directory tree in lustre that's populated purely with
>    files from the dead OST.
>    if we manage to revive the OST then these can magically come back to
>    life and we can mv them back into their original locations.
>
>    but currently
>      mv: cannot stat 'some_file': Cannot send after transport endpoint
>      shutdown
>    the OST is deactivated on the client. the client hangs if the OST isn't
>    deactivated. the OST is still UP & activated on the MDS.
>
>    is there a way to mv files when their OST is unreachable?
>
>    seems like mv is an MDT operation so it should be possible somehow?
>
>
>    the only thing I've thought of seems pretty out there...
>    mount the MDT as ldiskfs and mv the affected files into the shadow
>    tree at the ldiskfs level.
>    ie. with lustre running and mounted, create an empty shadow tree of
>    all dirs under eg. /lustre/shadow/, and then at the ldiskfs level on
>    the MDT:
>      for f in <list_of_2m_files>; do
>         mv /mnt/mdt0/ROOT/$f /mnt/mdt0/ROOT/shadow/$f
>      done
>
>    would that work?
>    maybe we'd also have to rebuild OIs and lfsck - something along the
>    lines of the MDT restore procedure in the manual. hopefully that would
>    all work with an OST deactivated.
>
>
>    alternatively, should we just unlink all the currently dead files from
>    lustre now, and then if the OST comes back can we reconstruct the paths
>    and filenames from the FID in xattrs on the revived OST?
>    I suspect unlink is final though and this wouldn't work... ?
>
>    we can also take an lvm snapshot of the MDT and refer to that later I
>    suppose, but I'm not sure how that might help us.
>
>    as you can probably tell I haven't had to deal with this particular
>    situation before :)
>
>    thanks for any help.
>
>    cheers,
>    robin
>    _______________________________________________
>    lustre-discuss mailing list
>    [email protected]
>    http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
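for what it's worth, the ldiskfs-level loop from the original message could be previewed as a dry run before touching anything. this is only a sketch under assumptions: it prints commands rather than running them, the function names and the MDT_ROOT / CLIENT_SHADOW variables are made up for illustration, the paths are the ones from the original message, and it assumes one path per line relative to the filesystem root with no newlines or double quotes in filenames. it says nothing about whether an ldiskfs-level mv is actually safe, which is still the open question.

```shell
#!/bin/sh
# Dry-run sketch: print (never execute) the commands for the shadow-tree move.
# MDT_ROOT and CLIENT_SHADOW are hypothetical variable names; their defaults
# are the paths mentioned in the original message.
MDT_ROOT="${MDT_ROOT:-/mnt/mdt0/ROOT}"
CLIENT_SHADOW="${CLIENT_SHADOW:-/lustre/shadow}"

# Print one "mkdir -p" per unique parent directory, to be run on a client
# so the empty shadow tree is created through lustre itself.
plan_dirs() {
    while IFS= read -r f; do
        [ -n "$f" ] || continue        # skip blank lines
        dirname -- "${f#/}"            # tolerate a leading slash
    done | sort -u | while IFS= read -r d; do
        printf 'mkdir -p -- "%s"\n' "$CLIENT_SHADOW/$d"
    done
}

# Print one "mv" per listed file, to be reviewed before anything is run
# on the ldiskfs-mounted MDT.
plan_moves() {
    while IFS= read -r f; do
        [ -n "$f" ] || continue
        f="${f#/}"
        printf 'mv -- "%s" "%s"\n' "$MDT_ROOT/$f" "$MDT_ROOT/shadow/$f"
    done
}
```

usage would be something like `plan_dirs < dead_ost_files.txt` and `plan_moves < dead_ost_files.txt` (the list file name is hypothetical). reading the list line by line avoids the word splitting that `for f in ...` would do on filenames containing spaces.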
