Hi Cliff,

no, the configuration as such did not change. The hardware is quite different,
though. The old box
had a RAID10 on 16 built-in 150GB Raptor SATA disks, the new one a RAID10 on 24
300GB Cheetah SAS disks in a
fibre-channel-attached external enclosure.
Actually, we are more concerned that the 48 AMD cores of the new box might not
have been the best idea.

But at the moment, the system is running fine and fast again!
After the last MDT restart, I started several ls jobs crawling through the
entire cluster. Obviously, after writeconf'ing all servers, Lustre really
has to "learn" again about the whereabouts of its files. And I found
experimentally that it is about the knowledge held by the OSTs: in fact, we
tried
our very old, repaired hardware as the MDT while copying the MDT to yet another,
third type of machine. The effect of "first very slow, then very fast
'ls'" was there. Then we shut down and started the third hardware, tried it on
new directories - same effect - and tried it on some already-checked
directories - very fast. So using the old hardware had refreshed the memory of
the OSTs about these directories.

All of this is to be expected to some degree, but the difference of minutes vs.
milliseconds is quite astonishing.
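For what it's worth, such a cache-warming crawl can be scripted as a one-off.
This is only a minimal sketch: the function name `warm_metadata_caches`, the
mount point, and the parallelism are my own choices, not Lustre tooling, and it
uses plain GNU find/stat (on a Lustre client, 'lfs find' could enumerate via
the MDT instead).

```shell
#!/bin/sh
# Hedged sketch: after a writeconf, paths have to be re-looked-up on the MDT
# and OSTs, so statting every file once pre-warms the metadata caches.
# Arguments: mount point to crawl, and an optional parallelism (default 8).
warm_metadata_caches() {
    mount_point=$1
    jobs=${2:-8}
    # Enumerate all regular files and stat them in parallel batches; each
    # stat forces the client to fetch attributes (and, for files, striping
    # and OST object info) rather than serving them from a cold cache.
    find "$mount_point" -type f -print0 |
        xargs -0 -P "$jobs" -n 64 stat --format='%n %s' > /dev/null
}
```

Usage would be something like `warm_metadata_caches /lustre 16`, run once per
client you want warmed.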

Ah well, this cluster is also full to the brim, and the last time we had to
writeconf the servers, there were certainly 20%-30% fewer files.

Cheers,
Thomas

On 04/04/2011 06:11 AM, Cliff White wrote:
> What is the underlying disk? Did that hardware/RAID config change 
> when you switched hardware? 
> The 'still busy' message is a bug; it may be fixed in 1.8.5
> cliffw
> 
> 
> On Sat, Apr 2, 2011 at 1:01 AM, Thomas Roth <[email protected] 
> <mailto:[email protected]>> wrote:
> 
>     Hi all,
> 
>     we are suffering from a severe metadata performance degradation on our 
> 1.8.4 cluster and are pretty clueless.
>     - We moved the MDT to a new hardware, since the old one was failing
>     - We increased the size of the MDT with 'resize2fs' (+ mounted it and saw 
> all the files)
>     - We found the performance of the new MDS dreadful
>     - We restarted the MDT on the old hardware with the failed RAID 
> controller replaced, but without doing anything with OSS or clients
>     The machine crashed three minutes after recovery was over
>     - Moved back to the new hardware, but the system was now pretty messed 
> up: persistent "still busy with N RPCs" and some "going back to sleep"
>     messages (by the way, is there no way to find out what these RPCs are, 
> and how to kill them? Of course I wouldn't mind switching off some clients or
>     even rebooting some OSSes if I only knew which ones...)
>     - Shut down the entire cluster, writeconf, restart without any client 
> mounts - worked fine
>     - Mounted Lustre and tried to "ls" a directory with 100 files: it takes 
> several minutes(!)
>     - Being patient and then trying the same on a second client: it takes 
> milliseconds.
> 
>     I have done complete shutdowns before, most recently to upgrade from 1.6 
> to 1.8, then without writeconf and without performance loss. Before that, to
>     change the IPs of all servers (moving into a subnet), with writeconf, but 
> without any recollection of the metadata behavior afterwards.
>     It is clear that after writeconf some information has to be regenerated, 
> but this is really extreme - is this also normal?
> 
>     The MDT now behaves more like an xrootd master which makes first contact 
> with its file servers and has to read in the entire database (that would be
>     nice to have in Lustre, to regenerate the MDT in case of disaster ;-) ).
>     Which caches are being filled now when I ls through the cluster? May I 
> expect the MDT to explode once it has learned about a certain percentage of 
> the
>     system? ;-) I mean, we have 100 million files now and the current MDT 
> hardware has just 32GB of memory...
>     In any case this is not the Lustre behavior we are used to.
> 
>     Thanks for any hints,
>     Thomas
> 
>     _______________________________________________
>     Lustre-discuss mailing list
>     [email protected] <mailto:[email protected]>
>     http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 
> 
> 
> 
> -- 
> cliffw
> Support Guy
> WhamCloud, Inc. 
> www.whamcloud.com <http://www.whamcloud.com>
> 
> 

-- 
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 1.262
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
64291 Darmstadt
www.gsi.de

Gesellschaft mit beschränkter Haftung
Sitz der Gesellschaft: Darmstadt
Handelsregister: Amtsgericht Darmstadt, HRB 1528

Geschäftsführung: Professor Dr. Dr. h.c. Horst Stöcker,
Dr. Hartmut Eickhoff

Vorsitzende des Aufsichtsrates: Dr. Beatrix Vierkorn-Rudolph
Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt
