Hi Stewart,

First, a good simple reproducible benchmark is fdtree:
https://computing.llnl.gov/?set=code&page=sio_downloads

Something simple like this should take a minute or two:
  bash fdtree.bash -l 3 -s 64

On a slow system, the exact same small run can take hours.
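
If you just want one number to compare runs between systems, wrapping it in time is enough:

  time bash fdtree.bash -l 3 -s 64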

For GPFS, since it's a clustered filesystem, first make sure you're looking at the aggregate performance, not just the view from one client. Your filesystem may be performing fine overall but be maxed out at the moment you run your test from that single client, so you need to be able to monitor the disk system itself.
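
One way to watch the aggregate numbers is mmpmon. A minimal sketch, run as root (add an "nlist add node1 node2 ..." request first, with your real node names, if you want to collect stats from the NSD servers rather than just the local node; check the mmpmon syntax for your release):

  # sample per-filesystem I/O statistics every 5 seconds, 12 times
  echo "fs_io_s" | mmpmon -r 12 -d 5000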

In general, the answer to your question is, roughly in order of simplicity: add more spindles; possibly also separate the metadata out onto its own storage; and possibly make your filesystem block size smaller.

The first you can do by adding more hardware; the second is easiest when you design the whole system, though it is possible on a running filesystem; the third can only be done at filesystem creation.
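
For the second and third options, the knobs live in the NSD stanzas and in mmcrfs. This is only a rough sketch; the NSD names, devices, servers, filesystem name, and block size below are placeholders, so check the mmcrnsd/mmcrfs docs for your release:

  # NSD stanzas putting metadata on dedicated (e.g. SSD) disks
  %nsd: nsd=meta01 device=/dev/sdx servers=nsd1,nsd2 usage=metadataOnly failureGroup=10 pool=system
  %nsd: nsd=data01 device=/dev/sdy servers=nsd1,nsd2 usage=dataOnly failureGroup=20 pool=system

  # block size can only be chosen at creation time; -m/-r 2 to match your two-way replication
  mmcrfs gpfs0 -F nsd.stanza -B 256K -m 2 -r 2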

For "small files", how small is "small"? Generally we mean files smaller than the filesystem block size.
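
You can check what that is on your filesystem (the device name gpfs0 is a placeholder):

  mmlsfs gpfs0 -B    # reports the block size in bytes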

Regards,
Alex


On 5/20/14, 7:17 AM, Howard, Stewart Jameson wrote:
Hi All,

My name is Stewart Howard and I work for Indiana University as an admin
on a two-site replicated GPFS cluster.  I'm a new member of this mailing
list and this is my first post  :)

Recently, we've discovered that small-file performance on our system is
pretty lack-luster.  For comparison, here are some numbers:

1)  When transferring large files (~2 GB), we get outstanding
performance and can typically saturate the client's network connection.
We generally see about 490 MB/s over a 10Gb line, which should be about
right, given that we lose half of our bandwidth to replication.

2)  When transferring a large number of small files, we get a very poor
transfer rate, generally on the order of 2 MB/s, writing from a client
node *inside* the GPFS cluster.

I'm wondering if anyone else has experience with similar performance
issues and what ended up being the cause/solution.  Also, I would be
interested in hearing any general rules-of-thumb that the group has
found helpful in balancing performance between large-file and small-file
I/O.

We have gathered some diagnostic information while performing various
small-file I/O operations, as well as a variety of metadata operations
in quick succession.  I'd be happy to share results of the diagnostics,
if it would help provide context.

Thank you so much for all of your help!

Stewart Howard
Indiana University






--
[email protected] 347-401-4860
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
