It is more of a precaution. The MDS/MGS kernel panicked a few times in as 
many days. The first couple of panics happened under heavy load caused by a 
user. When I was bringing it back up, I ran e2fsck on all the targets, which 
found and fixed some corruption. But then the MGS/MDS kernel panicked as soon 
as I mounted the MGT and MDT, before I had even mounted any OSTs.
So to be careful, I have taken the filesystem offline and started running 
e2fsck --mdsdb on the MDT.
It is writing to local disk, so the slowness shouldn't be due to that. It's 
even an SSD.
It is pretty confusing that it is taking so long, though. I see one CPU that 
is pretty much pegged at >90%, and the mdsdb file does grow, albeit very 
slowly (like 6 hours before a few bytes are written to it).
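
To put a number on "very slowly", the file size can be sampled at intervals and 
turned into a rate. This is only a minimal sketch: the path /tmp/mdsdb and the 
helper names growth_rate and watch are assumptions for illustration, not 
anything from e2fsck or Lustre.

```python
import os
import time


def growth_rate(samples):
    """Average growth in bytes per second over (timestamp, size) samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must increase")
    return (s1 - s0) / (t1 - t0)


def watch(path, interval_s=600, count=6):
    """Record the size of `path` every `interval_s` seconds, `count` times."""
    samples = []
    for i in range(count):
        samples.append((time.time(), os.path.getsize(path)))
        if i < count - 1:
            time.sleep(interval_s)
    return samples


if __name__ == "__main__":
    # Hypothetical path: substitute whatever file was passed to --mdsdb.
    samples = watch("/tmp/mdsdb")
    print(f"{growth_rate(samples):.3f} bytes/s")
```

A rate that stays at or near zero across several hours, while one CPU stays 
busy, would suggest the time is going into computation rather than disk writes.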

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238



-----Original Message-----
From: Dilger, Andreas [mailto:[email protected]] 
Sent: Monday, May 26, 2014 12:15 PM
To: Andrus, Brian Contractor
Cc: [email protected]
Subject: Re: [Lustre-discuss] E2fsck running for a week so far...

On 2014/05/26, 9:03 AM, "Andrus, Brian Contractor" 
<[email protected]<mailto:[email protected]>> wrote:

Is it normal for e2fsck running on an MDT with --mdsdb to take over a week?
The entire MDT is only 500GB.

This is limited by the performance of the database that e2fsck is using for the 
mdsdb.  If this is stored on e.g. NFS, and the database is large, then it will 
slow to a crawl.

Typically I don't recommend that users run the old lfsck unless there is a 
huge amount of corruption that needs to be fixed.  Most of the problems it 
fixes can also be fixed in a different manner.

What problem are you having?

Cheers, Andreas

So far it has only output:


e2fsck 1.42.7.wc2 (07-Nov-2013)
WORK=MDT0000 lustre database creation, check forced.
Pass 1: Checking inodes, blocks, and sizes
MDS: ost_idx 0 max_id 6351370
MDS: ost_idx 1 max_id 5766664
MDS: ost_idx 2 max_id 5821326
MDS: ost_idx 3 max_id 5720490
MDS: ost_idx 4 max_id 2889092
MDS: ost_idx 5 max_id 2654116
MDS: ost_idx 6 max_id 2805220
MDS: ost_idx 7 max_id 2895847
MDS: ost_idx 8 max_id 2932156
MDS: ost_idx 9 max_id 2777382
MDS: ost_idx 10 max_id 2764932
MDS: ost_idx 11 max_id 2655203
MDS: ost_idx 12 max_id 2742542
MDS: ost_idx 13 max_id 2856457
MDS: got 112 bytes = 14 entries in lov_objids
MDS: max_files = 32837426
MDS: num_osts = 14
mds info db file written
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Pass 6: Acquiring MDT information for lfsck
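
For a rough sense of the scale involved, the per-OST lines in the output above 
can be totalled. The sketch below only does arithmetic on the numbers printed 
in the log. Reading max_id as the highest object ID seen for each OST is an 
assumption based on the label, and parse_max_ids is a hypothetical helper name.

```python
import re

# Lines copied from the e2fsck output above.
LOG = """\
MDS: ost_idx 0 max_id 6351370
MDS: ost_idx 1 max_id 5766664
MDS: ost_idx 2 max_id 5821326
MDS: ost_idx 3 max_id 5720490
MDS: ost_idx 4 max_id 2889092
MDS: ost_idx 5 max_id 2654116
MDS: ost_idx 6 max_id 2805220
MDS: ost_idx 7 max_id 2895847
MDS: ost_idx 8 max_id 2932156
MDS: ost_idx 9 max_id 2777382
MDS: ost_idx 10 max_id 2764932
MDS: ost_idx 11 max_id 2655203
MDS: ost_idx 12 max_id 2742542
MDS: ost_idx 13 max_id 2856457
MDS: got 112 bytes = 14 entries in lov_objids
"""


def parse_max_ids(text):
    """Map each ost_idx to its max_id as printed in the log."""
    pattern = re.compile(r"ost_idx (\d+) max_id (\d+)")
    return {int(m.group(1)): int(m.group(2)) for m in pattern.finditer(text)}


max_ids = parse_max_ids(LOG)
total = sum(max_ids.values())
bytes_per_entry = 112 // len(max_ids)  # from "got 112 bytes = 14 entries"

print(f"OSTs: {len(max_ids)}")
print(f"sum of max_id: {total}")
print(f"bytes per lov_objids entry: {bytes_per_entry}")
```

The log lists 14 OSTs, which matches the "14 entries" line, and 112 bytes 
across 14 entries works out to 8 bytes per entry.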


Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238




Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel High Performance Data Division
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
