So, that is at least not a syntax for abort_recovery I'm familiar with. To take an example from last time I did this, I first determined which device wasn't completing the recovery, then logged in on the server (an OST in this case) and ran:
# lctl dl|grep obdfilter|grep fouo5-OST0000
  3 UP obdfilter fouo5-OST0000 fouo5-OST0000_UUID 629
# lctl --device 3 abort_recovery

Attached is a script that you can invoke as "lustre_watch_recovery
<servername>"; it will show you the status of recovery on the named
server, updated once per second. I find it useful for keeping track of
how things are working out while doing restarts.

Regards,
--
Peter Bortas, NSC

On Fri, Oct 19, 2018 at 4:42 PM Marion Hakanson <[email protected]> wrote:
>
> Thanks for the feedback. You're both confirming what we've learned so
> far: we had to unmount all the clients (which required rebooting most
> of them), then reboot all the storage servers, to get things unstuck
> until the problem recurred.
>
> I tried abort_recovery on the clients last night, before rebooting the
> MDS, but that did not help. It could well be that I'm not using it
> right:
>
> - Look up the MDT in the "lctl dl" list.
> - Run "lctl abort_recovery $mdt" on all clients.
> - Reboot the MDS.
>
> The MDS still reported recovering all 259 clients at boot time.
>
> BTW, we have a separate MGS from the MDS. Could it be that we need to
> reboot both the MDS and the MGS to clear things?
>
> Thanks and regards,
>
> Marion
>
>
> > On Oct 19, 2018, at 07:28, Peter Bortas <[email protected]> wrote:
> >
> > That should fix it, but I'd like to advocate for using abort_recovery.
> > Compared to unmounting thousands of clients, abort_recovery is a quick
> > operation that takes a few minutes. I wouldn't say it gets used a lot,
> > but I've done it on NSC's live environment six times since 2016,
> > solving the deadlock each time.
> >
> > Regards,
> > --
> > Peter Bortas
> > Swedish National Supercomputer Centre
> >
> >> On Fri, Oct 19, 2018 at 3:04 PM Patrick Farrell <[email protected]> wrote:
> >>
> >> Marion,
> >>
> >> You note the deadlock recurs on server reboot, so you’re really
> >> stuck. This is most likely due to recovery, where operations from
> >> the clients are replayed.
> >>
> >> If you’re fine with letting any pending I/O fail in order to get
> >> the system back up, I would suggest a client-side action: unmount
> >> (-f, and be patient) and/or shut down all of your clients. That
> >> will discard the things the clients are trying to replay (causing
> >> pending I/O to fail). Then shut down your servers and start them up
> >> again. With no clients, there’s (almost) nothing to replay, and you
> >> probably won’t hit the issue on startup. (There’s also the
> >> abort_recovery option covered in the manual, but I personally think
> >> this is easier.)
> >>
> >> There’s no guarantee this avoids your deadlock happening again, but
> >> it’s highly likely it’ll at least get you running.
> >>
> >> If you need to save your pending I/O, you’ll have to install
> >> patched software with a fix for this (it sounds like WC has
> >> identified the bug) and then reboot.
> >>
> >> Good luck!
> >> - Patrick
> >> ________________________________
> >> From: lustre-discuss <[email protected]> on behalf
> >> of Marion Hakanson <[email protected]>
> >> Sent: Friday, October 19, 2018 1:32:10 AM
> >> To: [email protected]
> >> Subject: [lustre-discuss] LU-11465 OSS/MDS deadlock in 2.10.5
> >>
> >> This issue is really kicking our behinds:
> >> https://jira.whamcloud.com/browse/LU-11465
> >>
> >> While we're waiting for the issue to get some attention from Lustre
> >> developers, are there suggestions on how we can recover our cluster
> >> from this kind of deadlocked, stuck-threads-on-the-MDS (or OSS)
> >> situation? Rebooting the storage servers does not clear the
> >> hang-up: upon reboot the MDS quickly ends up with the same number
> >> of D-state threads (around the same number as we have clients). It
> >> seems to me that there is some state stashed away in the filesystem
> >> which restores the deadlock as soon as the MDS comes up.
> >>
> >> Thanks and regards,
> >>
> >> Marion
> >>
> >> _______________________________________________
> >> lustre-discuss mailing list
> >> [email protected]
> >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
lustre_watch_recovery
Description: Binary data
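[The attachment itself is archived only as binary data. A minimal sketch of a watcher along the lines Peter describes might look like the following; this is a hypothetical reconstruction, not the real script, and it assumes passwordless ssh to the server and that `lctl get_param -n *.*.recovery_status` prints the usual `status:` / `completed_clients:` / `time_remaining:` fields.]

```shell
#!/bin/sh
# lustre_watch_recovery <servername> -- show recovery status once per second.
# Hypothetical sketch; the real attached script may differ.

# Reduce recovery_status output (read from stdin) to a one-line summary
# of the fields that matter while watching a restart.
summarize_recovery() {
    awk '
        /^status:/            { status = $2 }
        /^completed_clients:/ { done = $2 }
        /^time_remaining:/    { left = $2 }
        END { printf "status=%s clients=%s time_remaining=%s\n", status, done, left }
    '
}

# Only start polling when a server name was actually given.
if [ -n "$1" ]; then
    while sleep 1; do
        ssh "$1" 'lctl get_param -n *.*.recovery_status' | summarize_recovery
    done
fi
```

During a restart this would print one summary line per second, e.g. showing `clients=12/259` climbing until recovery completes or stalls.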
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
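[Archive note: the server-side sequence Peter shows at the top of the thread -- find the target's device index in "lctl dl", then abort recovery on that device -- can be wrapped in a short helper. This is a hedged sketch; the function name is ours, "fouo5-OST0000" is just the example target from the thread, and it assumes "lctl dl" output in the format shown there.]

```shell
#!/bin/sh
# abort_target_recovery <target> -- abort Lustre recovery for one target.
# Run on the server hosting the target, e.g.:
#   abort_target_recovery fouo5-OST0000
# Sketch based on the sequence in the thread; not an official tool.

# Pull the device index (first column) for the named target out of
# "lctl dl" output, e.g. "  3 UP obdfilter fouo5-OST0000 ... 629".
device_index() {
    awk -v tgt="$1" '$4 == tgt { print $1; exit }'
}

abort_target_recovery() {
    idx=$(lctl dl | device_index "$1")
    if [ -z "$idx" ]; then
        echo "no local device for target $1" >&2
        return 1
    fi
    lctl --device "$idx" abort_recovery
}

# Only act when invoked with a target name.
if [ -n "$1" ]; then
    abort_target_recovery "$1"
fi
```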
