On Sun, Sep 12, 2010 at 3:20 AM, Henrique Haas <[email protected]> wrote:
> Hello Jacob,
>
> Greater block sizes gave me much much better results, about *58MB/s* on a
> 1GigE !!!!
> So.. my concern now is about smaller files be shared using Gluster.
> Any tunning tips for these kind of files (I'm using Ext4 and Gluster 3.0.2)?

"dd" won't give you accurate results for testing file copies. Your slow writes with small block sizes are more likely to high I/O and read starve on the client side than the server/write side. You should test something more real world instead. For instance: for i in `seq 1 1000000` ; do dd if=/dev/urandom of=$i bs=1K count=1 ; done That will create 1,000,000 1KB files (1GB of information) with random data on your local hard disk in the current directory. Most file systems store 4K blocks, so actual disk usage will be 4GB. Now copy/rsync/whatever these files to your Gluster storage. (use a command like "time cp /blah/* /mnt/gluster/" to wallclock it). Now tar up all the files, and do the copy again using the single large tar file. Compare your results. >From here, tune your performance translators: http://www.gluster.com/community/documentation/index.php/Translators/performance/stat-prefetch http://www.gluster.com/community/documentation/index.php/Translators/performance/quick-read http://www.gluster.com/community/documentation/index.php/Translators/performance/io-cache http://www.gluster.com/community/documentation/index.php/Translators/performance/quick-read http://www.gluster.com/community/documentation/index.php/Translators/performance/writebehind http://www.gluster.com/community/documentation/index.php/Translators/performance/readahead http://www.gluster.com/community/documentation/index.php/Translators/performance/io-threads Some of these translators will aggregate smaller I/Os into larger blocks to improve read/write performance. The links above explain what each one does. My advice is to take the defaults created by glusterfs-volgen and increment the values slowly on the relevant translators (note that bigger doesn't always equal better - you'll find a sweet spot where performance maxes out, and then most likely reduces again once values get too big). And then continue testing. 
Repeat for 4K, 16K, 32K files if you like (or a mix of them) to match the
sort of data you'd expect on your file system (or better yet, use real-world
data if you have it lying around already).

Also, if you don't need atime (last access time) information on your files,
consider mounting the ext4 file system on the storage bricks with the
"noatime" option. This can save unnecessary I/O on regularly accessed files
(I use this a lot on clustered file systems, as well as on virtual machine
disk images and database files that get touched all the time by multiple
systems, to reduce I/O).

Hope that helps.

-Dan

_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
