Hi,

Using glusterfs 3.1.1 with a 4-node striped volume:
# gluster volume info
 
Volume Name: testvol
Type: Stripe
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: node20.storage.xx.nl:/data1
Brick2: node30.storage.xx.nl:/data1
Brick3: node40.storage.xx.nl:/data1
Brick4: node50.storage.xx.nl:/data1

To do some performance testing, I copied /usr to the gluster volume:
[[email protected] ~]# time rsync -avzx --quiet  /usr /gluster
real    5m54.453s
user    2m1.026s
sys     0m9.979s
[[email protected] ~]#
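
To double-check the copy itself, I could re-run rsync in checksum mode as a
dry run; if the copy is complete, nothing should be left to transfer. A
sketch, untested on this volume:

# compare source and destination by checksum; a complete copy transfers nothing
rsync -avzx --checksum --dry-run /usr /gluster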

To see whether this operation was successful, I checked the number of files
and the used blocks on the storage bricks. I expected these to be the same on
all the bricks, because I use a striped configuration. The results are:

Number of files seen on the client:
[[email protected] ~]# find /gluster/usr -ls| wc -l
57517

Number of files seen on the storage bricks:
# mpssh -f s2345.txt 'find /data1/usr -ls | wc -l'                              
   
  [*] read (4) hosts from the list
  [*] executing "find /data1/usr -ls | wc -l" as user "root" on each
  [*] spawning 4 parallel ssh sessions
 
node20 -> 57517
node30 -> 55875
node40 -> 55875
node50 -> 55875

Why does node20 have all the files, while the others seem to be missing
quite a lot? It would help to see which names are missing; a sketch for that
follows.
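
A way to compare the file lists of two bricks (untested sketch, assuming
passwordless root ssh from the client to the bricks under their short names):

# list the files on each brick, relative to the brick root
ssh node20 'cd /data1 && find usr | sort' > node20.lst
ssh node30 'cd /data1 && find usr | sort' > node30.lst
# lines starting with "<" name files that exist only on node20
diff node20.lst node30.lst | head -20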

The same check, but now for the actually used storage blocks:
On the client:
[[email protected] ~]# du -sk /gluster/usr                                    
                                  
1229448 /gluster/usr

On the storage bricks:
# mpssh -f s2345.txt 'du -sk /data1/usr'                                        
   
  [*] read (4) hosts from the list
  [*] executing "du -sk /data1/usr" as user "root" on each
  [*] spawning 4 parallel ssh sessions
 
node20 -> 1067784       /data1/usr
node30 -> 535124        /data1/usr
node40 -> 437896        /data1/usr
node50 -> 405920        /data1/usr

In total: 2446724
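
That is almost exactly twice the 1229448 KB the client reports:

  2446724 / 1229448 = 1.99
  (2446724 - 1229448) / 2446724 = 49.8%

so roughly half of the blocks on the bricks look like striping overhead.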

My conclusions:
- All data is written to the first brick. If a file is smaller than the chunk
size, there is nothing left to stripe, so the first storage brick fills up
with all the small files. Question: does the filesystem stop working once the
first brick's volume is full?

- When using striping, the storage overhead seems to be almost 50%, and this
can get worse as the first node fills up. Question: what is the size of the
stripe chunk, and can it be tuned to match the average file size? (A guess at
the knob follows below.)
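
From what I can find in the docs, the stripe translator has a block-size
option with a default of 128KB. Something like the following might be the
right knob, but I have not tested it and the exact option name under 3.1.1
is a guess on my part:

# set the stripe chunk size to 2MB (option name unverified)
gluster volume set testvol cluster.stripe-block-size 2MB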

All in all, glusterfs striping seems to be better suited for "big" files. Is
there an "average" file size above which glusterfs becomes the better choice?

Greetings

Peter Gotwalt
