Re: [Lustre-discuss] ost's reporting full

2010-09-11 Thread Malcolm Cowe
  On 11/09/2010 19:27, Robin Humble wrote:
> Hey Dr Stu,
>
> On Sat, Sep 11, 2010 at 04:27:43PM +0800, Stuart Midgley wrote:
>> We are getting jobs that fail due to no space left on device.
>> BUT none of our lustre servers are full (as reported by lfs df -h on a 
>> client and by df -h on the oss's).
>> They are all close to being full, but are not actually full (still have 
>> ~300gb of space left)
> sounds like a grant problem.
>
>> I've tried playing around with tune2fs -m {0,1,2,3} and tune2fs -r 1024 etc 
>> and nothing appears to help.
>> Anyone have a similar problem?  We are running 1.8.3
> there are a couple of grant leaks that are fixed in 1.8.4 eg.
>https://bugzilla.lustre.org/show_bug.cgi?id=22755
> or see the 1.8.4 release notes.
>
> however the overall grant revoking problem is still unresolved AFAICT
>https://bugzilla.lustre.org/show_bug.cgi?id=12069
> and you'll hit that issue more frequently with many clients and small
> OSTs, or when any OST starts getting full.
>
> in your case 300g per OST should be enough headroom unless you have
> ~4k clients now (assuming 32-64m grants per client), so it's probably
> grant leaks. there's a recipe for adding up client grants and comparing
> them to server grants to see if they've gone wrong in bz 22755.
>
Per BZ 22755, comment #96 
(https://bugzilla.lustre.org/show_bug.cgi?id=22755#c96), you can arrest 
the grant leak by changing the "grant shrink interval" to a large value 
(if you want to reset the server-side grant reservation, you will have 
to remount the OSTs). We have applied this workaround to our system 
with good results. We have been monitoring our file systems with Nagios 
and have not encountered a repeat of this problem.
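
Roughly, the steps look like this (whether the tunable is exposed as 
osc.*.grant_shrink_interval, and its units, can vary between releases, so 
treat the exact names and value below as assumptions to check against the 
bug report):

  # on each client: raise the grant shrink interval to a large value so
  # the OSCs stop cycling grants (the value here is only an example)
  lctl set_param osc.*.grant_shrink_interval=86400

  # resetting the server-side grant reservation means remounting the
  # OSTs on the OSSes; device and mount point are placeholders
  umount /mnt/ost0
  mount -t lustre /dev/ost0_dev /mnt/ost0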

Malcolm.



Re: [Lustre-discuss] Large directory performance

2010-09-11 Thread Michael Robbert

On Sep 10, 2010, at 5:32 PM, Bernd Schubert wrote:

> On Saturday, September 11, 2010, Andreas Dilger wrote:
>> On 2010-09-10, at 12:11, Michael Robbert wrote:
>>> Create performance is a flat line of ~150 files/sec across the board.
>>> Delete performance is all over the place, but no higher than 3,000
>>> files/sec... Then yesterday I was browsing the Lustre Operations Manual
>>> and found section 33.8, which says Lustre is tested with directories as
>>> large as 10 million files in a single directory while still getting
>>> lookups at a rate of 5,000 files/sec. That leaves me wondering two
>>> things: how can we get 5,000 files/sec for anything, and why is our
>>> performance dropping off so suddenly after 20k files?
>>> 
>>> Here is our setup:
>>> All IO servers are Dell PowerEdge 2950s: two sockets with quad-core
>>> X5355s @ 2.66GHz (8 cores total) and 16GB of RAM. The data is on DDN
>>> S2A 9550s with an 8+2 RAID configuration, connected directly over
>>> 4Gb/s Fibre Channel.
>> 
>> Are you using the DDN 9550s for the MDT?  That would be a bad
>> configuration, because they can only be configured with RAID-6, and would
>> explain why you are seeing such bad performance.  For the MDT you always
> 
> Unfortunately, we have so far failed to copy the scratch MDT in a reasonable 
> time. Copying several hundred million files turned out to take ages ;) But I 
> guess Mike ran the benchmarks for the other filesystem against an EF3010.

The benchmarks listed above are for our scratch filesystem, whose MDT is on the 
9550. I don't know why I didn't mention the benchmarks that I also ran on our 
home filesystem, whose MDT was recently moved to the EF3010 with RAID 1+0 on 6 
SAS disks. The other 6 disks in the EF3010 are waiting for when we can move the 
scratch MDT there. Anyway, the benchmarks on home were actually worse: create 
performance was about the same, but read performance was in the low hundreds. 
The command line was:
./bonnie++ -d $dir -s 0 -n $size:4:4:1
where $dir was a directory on the filesystem being tested and $size was the 
number of files in thousands (5, 10, 20, 30).
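
Roughly how the runs were driven, in case anyone wants to repeat them (the 
target directory below is a placeholder, not our real path):

  # -n num:max:min:dirs => num is in multiples of 1024 files, with 4-byte
  # min/max file sizes, all created in a single directory; -s 0 skips the
  # sequential IO tests so only the file create/stat/delete phases run
  for size in 5 10 20 30; do
      ./bonnie++ -d /lustre/home/benchdir -s 0 -n $size:4:4:1
  done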

A dd of the MDT wasn't possible because the original LUN was nearly 5TB (with 
only 35GB used), but the new LUN is just over 1TB.

> 
>>> We have as many as 1.4 million files in a single directory and we now
>>> have half a billion files that we need to deal with in one way or
>>> another.
> 
> Mike, is there a chance you could try acp and see what rate it reports?
> 
> http://oss.oracle.com/~mason/acp/
> 
> Also, could you please send me your exact bonnie command line or script? We 
> could try to reproduce it on an idle test 9550 with a 6620 for metadata (the 
> 6620 is slower for that than the EF3010).

I have downloaded and compiled acp and have started a copy of one of our 
1.6-million-file directories. After 1 hour it is still reading files from a 
top-level directory containing only 122k files and hasn't written anything. The 
only option used on the command line was -v, so I could see what it was doing.


Thanks,
Mike



Re: [Lustre-discuss] ost's reporting full

2010-09-11 Thread Robin Humble
Hey Dr Stu,

On Sat, Sep 11, 2010 at 04:27:43PM +0800, Stuart Midgley wrote:
>We are getting jobs that fail due to no space left on device.
>BUT none of our lustre servers are full (as reported by lfs df -h on a client 
>and by df -h on the oss's).
>They are all close to being full, but are not actually full (still have ~300gb 
>of space left)

sounds like a grant problem.

>I've tried playing around with tune2fs -m {0,1,2,3} and tune2fs -r 1024 etc 
>and nothing appears to help.
>Anyone have a similar problem?  We are running 1.8.3

there are a couple of grant leaks that are fixed in 1.8.4, e.g.
  https://bugzilla.lustre.org/show_bug.cgi?id=22755
or see the 1.8.4 release notes.

however, the overall grant-revoking problem is still unresolved AFAICT
  https://bugzilla.lustre.org/show_bug.cgi?id=12069
and you'll hit that issue more frequently with many clients and small
OSTs, or when any OST starts getting full.

in your case ~300GB per OST should be enough headroom unless you have
~4k clients now (assuming 32-64MB of grant per client), so it's probably
grant leaks. there's a recipe in bz 22755 for adding up client grants and
comparing them to the server-side grant totals to see if they've gone wrong.
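
roughly, the comparison looks like this (these are 1.8-era /proc names, so 
treat the exact parameter paths as an assumption and see the bz for the full 
recipe):

  # on every client: grant currently held by each OSC, in bytes
  lctl get_param osc.*.cur_grant_bytes

  # on each OSS: total grant each OST thinks it has handed out
  lctl get_param obdfilter.*.tot_granted

  # sum cur_grant_bytes across all clients for a given OST and compare it
  # with that OST's tot_granted - if the server total is much bigger than
  # the client sum, grant has leaked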

cheers,
robin


[Lustre-discuss] ost's reporting full

2010-09-11 Thread Stuart Midgley
We are getting jobs that fail due to no space left on device.

BUT none of our Lustre servers are full (as reported by lfs df -h on a client 
and by df -h on the OSSes).

They are all close to being full, but are not actually full (they still have 
~300GB of space left).

I've tried playing around with tune2fs -m {0,1,2,3} and tune2fs -r 1024, etc., 
and nothing appears to help.
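
In case it helps, the reserved-block settings those flags change can be checked 
on an OST device with something like (device name is only an example):

  tune2fs -l /dev/sdb1 | grep -i 'reserved block'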

Anyone have a similar problem? We are running 1.8.3.


-- 
Dr Stuart Midgley
sdm...@gmail.com


