How long should I normally expect e2fsck --mdsdb to run on a 1 TB MDT? (So far it has been running for quite a few hours.)
Thanks,
Eli

On Thu, Aug 11, 2016 at 12:42 PM, E.S. Rosenberg <esr+lus...@mail.hebrew.edu> wrote:
> Hi all,
> Our MDT suffered a kernel panic (which I will post separately), the OSSs
> stayed alive but the MDT was out for some time while nodes still tried to
> interact with lustre.
>
> So I have several questions:
> a. what happens to processes/reading writing during such an event (if they
> already have handles on the OSS for instance that makes a difference)? I
> noticed several of our compute-nodes ended up filling their swap/RAM so I
> assume some level of caching is happening until the MDT returns....
>
> b. what is the best/proper procedure now to ensure filesystem integrity?
> Should I take the filesystem offline and run an lfsck first on the MDT
> then on the OSS?
>
> Most documents I can find with google on the subject are spread over the
> various old wikis so it is not clear to me how relevant they are....
> Thanks,
> Eli
>
> Specs:
> Server OS: CentOS 6.4 + lustre 2.5.3 from RPMs (1 MGS/MDS + 3 OSS)
> Clients: Debian testing/unstable, kernel 4.2.8 + lustre 2.8.0 built from
> source.
> Network: Infiniband FDR (o2ib)

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org