Re: [Gluster-users] Horrible performance with small files (DHT/AFR)

Benjamin Krein Thu, 04 Jun 2009 13:25:34 -0700

Here are some more details with different configs:

* Only AFR between cfs1 & cfs2:
r...@dev1# time cp -rp * /mnt/


real    16m45.995s
user    0m1.104s
sys     0m5.528s

* Single server - cfs1:
r...@dev1# time cp -rp * /mnt/

real    10m33.967s
user    0m0.764s
sys     0m5.516s

* Stats via bmon on cfs1 during above copy:

  #   Interface   RX Rate     RX #   TX Rate     TX #
  cfs1 (source: local)
  0   eth1        951.25KiB   1892   254.00KiB   1633

It gets progressively better, but that's still a *long* way from <2min times with scp & <1 min times with rsync! And I have no redundancy or distributed hash whatsoever.
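To put the wall-clock figures quoted in this thread side by side, here is a minimal Python sketch that converts the `real` values printed by `time` into seconds and expresses each as a multiple of the rsync run. The labels are informal shorthand for the configurations described in the thread, the helper name `to_seconds` is hypothetical, and the comparison assumes every run copied the same data set, which the thread implies but does not state outright. The 70-minute run is left out because it was taken during peak hours.

```python
import re

def to_seconds(real: str) -> float:
    """Convert a `time`-style value such as '16m45.995s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", real)
    if match is None:
        raise ValueError(f"unrecognised time format: {real!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# 'real' times copied from the thread; the labels are informal shorthand.
runs = {
    "rsync over ssh": "0m23.566s",
    "scp": "1m38.726s",
    "glusterfs, single server": "10m33.967s",
    "glusterfs, AFR only (cfs1 + cfs2)": "16m45.995s",
    "glusterfs, DHT + AFR, 2 servers": "21m11.505s",
    "glusterfs, DHT + AFR, 4 servers": "30m59.101s",
}

baseline = to_seconds(runs["rsync over ssh"])
slowdowns = {label: to_seconds(real) / baseline for label, real in runs.items()}

for label, factor in slowdowns.items():
    print(f"{label:36s} {to_seconds(runs[label]):9.1f} s  {factor:6.1f}x rsync")
```

Even the single-server run comes out at well over twenty times the rsync time, so the gap is not explained by replication or distribution alone.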


* Client config for the last test:
-----
# Webform Flat-File Cache Volume client configuration

volume srv1
        type protocol/client
        option transport-type tcp
        option remote-host cfs1
        option remote-subvolume webform_cache_brick
end-volume

volume writebehind
        type performance/write-behind
        option cache-size 4mb
        option flush-behind on
        subvolumes srv1
end-volume

volume cache
        type performance/io-cache
        option cache-size 512mb
        subvolumes writebehind
end-volume
-----

Ben

On Jun 3, 2009, at 4:33 PM, Vahriç Muhtaryan wrote:

To better understand the issue, did you try 4 servers with DHT only, 2 servers with DHT only, or two servers with replication only, to find the real problem? Maybe replication or DHT could have a bug?

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Benjamin Krein
Sent: Wednesday, June 03, 2009 11:00 PM
To: Jasper van Wanrooy - Chatventure
Cc: [email protected]

Subject: Re: [Gluster-users] Horrible performance with small files (DHT/AFR)


The current boxes I'm using for testing are as follows:

 * 2x dual-core Opteron ~2GHz (x86_64)
 * 4GB RAM
 * 4x 7200 RPM 73GB SATA - RAID1+0 w/3ware hardware controllers

The server storage directories live in /home/clusterfs where /home is
an ext3 partition mounted with noatime.

These servers are not virtualized.  They are running Ubuntu 8.04 LTS
Server x86_64.

The files I'm copying are all <2k javascript files (plain text) stored
in 100 hash directories in each of 3 parent directories:

/home/clusterfs/
  + parentdir1/
  |   + 00/
  |   | ...
  |   + 99/
  + parentdir2/
  |   + 00/
  |   | ...
  |   + 99/
  + parentdir3/
      + 00/
      | ...
      + 99/

There are ~10k of these <2k javascript files distributed throughout
the above directory structure totaling approximately 570MB.  My tests
have been copying that entire directory structure from a client
machine into the glusterfs mountpoint on the client.
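For anyone wanting to reproduce the file-count and size figures on their own tree, the following is a small Python sketch that counts only regular files and sums their sizes. The function name `summarize_tree` is hypothetical and the script is not part of the original tests; it is just one way to get numbers that exclude directories and symlinks.

```python
import os

def summarize_tree(root: str) -> tuple[int, int]:
    """Return (regular_file_count, total_bytes) for everything under root."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            count += 1
            total += os.path.getsize(path)
    return count, total
```

Calling `summarize_tree("/home/clusterfs")` returns the file count and the total byte count; dividing the second by the first gives the mean file size.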

Observing IO on both the client box & all the server boxes via iostat
shows that the disks are doing *very* little work.  Observing the CPU/
memory load with top or htop shows that none of the boxes are CPU or
memory bound.  Observing the bandwidth in/out of the network interface
shows <1MB/s throughput (we have a fully gigabit LAN!) which usually
drops down to <150KB/s during the copy.
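To show how little of the link those rates represent, here is a quick Python calculation. It assumes decimal units (1 MB = 10^6 bytes) and the raw gigabit line rate with no protocol overhead, so the percentages are rough upper bounds rather than measurements.

```python
# Raw gigabit line rate in bytes per second (125 MB/s), ignoring protocol overhead.
GIGABIT_BYTES_PER_S = 1_000_000_000 / 8

# Observed rates from the paragraph above, taken as upper bounds.
observed = {
    "upper bound seen (<1 MB/s)": 1_000_000,
    "typical during copy (<150 KB/s)": 150_000,
}

utilization = {label: rate / GIGABIT_BYTES_PER_S for label, rate in observed.items()}

for label, share in utilization.items():
    print(f"{label}: at most {share:.2%} of a gigabit link")
```

Both figures come out below 1% of the link. With disks, CPU, and network all mostly idle, that pattern would be consistent with the copy being limited by per-file round trips rather than by bandwidth, though nothing in these measurements confirms the cause.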

scp'ing the same directory structure from the same client to one of
the same servers will work at ~40-50MB/s sustained as a comparison.
Here are the results of copying the same directory structure using
rsync to the same partition:

# time rsync -ap * b...@cfs1:~/cache/
b...@cfs1's password:

real    0m23.566s
user    0m8.433s
sys     0m4.580s

Ben

On Jun 3, 2009, at 3:16 PM, Jasper van Wanrooy - Chatventure wrote:

Hi Benjamin,

That's not good news. What kind of hardware do you use? Is it
virtualised? Or do you use real boxes?
What kind of files are you copying in your test? What performance do
you have when copying it to a local dir?

Best regards Jasper

----- Original Message -----
From: "Benjamin Krein" <[email protected]>
To: "Jasper van Wanrooy - Chatventure" <[email protected]>
Cc: "Vijay Bellur" <[email protected]>, [email protected]
Sent: Wednesday, 3 June, 2009 19:23:51 GMT +01:00 Amsterdam /
Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Gluster-users] Horrible performance with small files
(DHT/AFR)

I reduced my config to only 2 servers (had to donate 2 of the 4 to
another project).  I now have a single server using DHT (for future
scaling) and AFR to a mirrored server.  Copy times are much better,
but still pretty horrible:

# time cp -rp * /mnt/

real    21m11.505s
user    0m1.000s
sys     0m6.416s

Ben

On Jun 3, 2009, at 3:13 AM, Jasper van Wanrooy - Chatventure wrote:

Hi Benjamin,

Did you also try with a lower thread-count? I'm actually using 3
threads.

Best Regards Jasper


On 2 jun 2009, at 18:25, Benjamin Krein wrote:

I do not see any difference with autoscaling removed.  Current
server config:

# webform flat-file cache

volume webform_cache
type storage/posix
option directory /home/clusterfs/webform/cache
end-volume

volume webform_cache_locks
type features/locks
subvolumes webform_cache
end-volume

volume webform_cache_brick
type performance/io-threads
option thread-count 32
subvolumes webform_cache_locks
end-volume

<<snip>>

# GlusterFS Server
volume server
type protocol/server
option transport-type tcp
subvolumes dns_public_brick dns_private_brick webform_usage_brick
webform_cache_brick wordpress_uploads_brick subs_exports_brick
option auth.addr.dns_public_brick.allow 10.1.1.*
option auth.addr.dns_private_brick.allow 10.1.1.*
option auth.addr.webform_usage_brick.allow 10.1.1.*
option auth.addr.webform_cache_brick.allow 10.1.1.*
option auth.addr.wordpress_uploads_brick.allow 10.1.1.*
option auth.addr.subs_exports_brick.allow 10.1.1.*
end-volume

# time cp -rp * /mnt/

real    70m13.672s
user    0m1.168s
sys     0m8.377s

NOTE: the above test was also done during peak hours, when the LAN and
dev server were in use, which would account for some of the extra time.
This is still WAY too much, though.

Ben


On Jun 1, 2009, at 1:40 PM, Vijay Bellur wrote:

Hi Benjamin,

Could you please try by turning autoscaling off?

Thanks,
Vijay

Benjamin Krein wrote:

I'm seeing extremely poor performance writing small files to a
glusterfs DHT/AFR mount point. Here are the stats I'm seeing:

* Number of files:
r...@dev1|/home/aweber/cache|# find |wc -l
102440

* Average file size (bytes):
r...@dev1|/home/aweber/cache|# ls -lR | awk '{sum += $5; n++;} END {print sum/n;}'
4776.47

* Using scp:
r...@dev1|/home/aweber/cache|# time scp -rp * b...@cfs1:~/cache/

real 1m38.726s
user 0m12.173s
sys 0m12.141s

* Using cp to glusterfs mount point:
r...@dev1|/home/aweber/cache|# time cp -rp * /mnt

real 30m59.101s
user 0m1.296s
sys 0m5.820s
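The stats above can be turned into a rough per-entry cost with a few lines of Python. This is a back-of-the-envelope sketch, not a measurement: the entry count from `find` includes directories, the awk average also counts non-file lines of the `ls -lR` output, and the variable names are made up for illustration.

```python
ENTRIES = 102_440     # output of `find | wc -l` (includes directories)
MEAN_BYTES = 4776.47  # output of the awk one-liner (approximate)

def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a minutes/seconds pair from `time` into seconds."""
    return minutes * 60 + seconds

# 'real' times from the two runs above.
timings = {
    "scp": to_seconds(1, 38.726),
    "cp to glusterfs": to_seconds(30, 59.101),
}

total_bytes = ENTRIES * MEAN_BYTES  # rough estimate of the data volume

stats = {}
for label, elapsed in timings.items():
    stats[label] = {
        "entries_per_s": ENTRIES / elapsed,
        "ms_per_entry": 1000 * elapsed / ENTRIES,
        "kb_per_s": total_bytes / elapsed / 1000,
    }
    s = stats[label]
    print(f"{label:16s} {s['entries_per_s']:8.1f} entries/s "
          f"{s['ms_per_entry']:7.2f} ms/entry {s['kb_per_s']:8.0f} kB/s")
```

Under these assumptions the glusterfs copy works out to roughly 55 entries per second, or about 18 ms per entry, against roughly 1 ms per entry for scp.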

Here is my configuration (currently, a single client writing to 4
servers, i.e. 2 DHT servers doing AFR):

SERVER:

# webform flat-file cache

volume webform_cache
type storage/posix
option directory /home/clusterfs/webform/cache
end-volume

volume webform_cache_locks
type features/locks
subvolumes webform_cache
end-volume

volume webform_cache_brick
type performance/io-threads
option thread-count 32
option max-threads 128
option autoscaling on
subvolumes webform_cache_locks
end-volume

<<snip>>

# GlusterFS Server
volume server
type protocol/server
option transport-type tcp
subvolumes dns_public_brick dns_private_brick webform_usage_brick
webform_cache_brick wordpress_uploads_brick subs_exports_brick
option auth.addr.dns_public_brick.allow 10.1.1.*
option auth.addr.dns_private_brick.allow 10.1.1.*
option auth.addr.webform_usage_brick.allow 10.1.1.*
option auth.addr.webform_cache_brick.allow 10.1.1.*
option auth.addr.wordpress_uploads_brick.allow 10.1.1.*
option auth.addr.subs_exports_brick.allow 10.1.1.*
end-volume

CLIENT:

# Webform Flat-File Cache Volume client configuration

volume srv1
type protocol/client
option transport-type tcp
option remote-host cfs1
option remote-subvolume webform_cache_brick
end-volume

volume srv2
type protocol/client
option transport-type tcp
option remote-host cfs2
option remote-subvolume webform_cache_brick
end-volume

volume srv3
type protocol/client
option transport-type tcp
option remote-host cfs3
option remote-subvolume webform_cache_brick
end-volume

volume srv4
type protocol/client
option transport-type tcp
option remote-host cfs4
option remote-subvolume webform_cache_brick
end-volume

volume afr1
type cluster/afr
subvolumes srv1 srv3
end-volume

volume afr2
type cluster/afr
subvolumes srv2 srv4
end-volume

volume dist
type cluster/distribute
subvolumes afr1 afr2
end-volume

volume writebehind
type performance/write-behind
option cache-size 4mb
option flush-behind on
subvolumes dist
end-volume

volume cache
type performance/io-cache
option cache-size 512mb
subvolumes writebehind
end-volume
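To make the layering of the client translators easier to see, here is a small Python sketch that reads a volfile and prints the stack from the top volume down. It is a simplified reader written for illustration, not GlusterFS's own parser: it only looks at `volume`, `type`, `subvolumes`, and `end-volume` lines, the function names are hypothetical, and the embedded volfile is the client config above with the `option` lines left out for brevity.

```python
def parse_volfile(text: str) -> dict[str, dict]:
    """Parse `volume ... end-volume` blocks into {name: {"type", "subvolumes"}}."""
    volumes: dict[str, dict] = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        words = line.split()
        if words[0] == "volume" and len(words) == 2:
            current = {"type": None, "subvolumes": []}
            volumes[words[1]] = current
        elif words[0] == "end-volume":
            current = None
        elif current is not None and words[0] == "type":
            current["type"] = words[1]
        elif current is not None and words[0] == "subvolumes":
            current["subvolumes"] = words[1:]
    return volumes

def render(volumes: dict[str, dict], name: str, depth: int = 0) -> list[str]:
    """Return indented lines describing the translator stack below `name`."""
    vol = volumes[name]
    lines = [f"{'  ' * depth}{name} ({vol['type']})"]
    for child in vol["subvolumes"]:
        lines.extend(render(volumes, child, depth + 1))
    return lines

# Client config from this message with the option lines omitted.
VOLFILE = """
volume srv1
type protocol/client
end-volume
volume srv2
type protocol/client
end-volume
volume srv3
type protocol/client
end-volume
volume srv4
type protocol/client
end-volume
volume afr1
type cluster/afr
subvolumes srv1 srv3
end-volume
volume afr2
type cluster/afr
subvolumes srv2 srv4
end-volume
volume dist
type cluster/distribute
subvolumes afr1 afr2
end-volume
volume writebehind
type performance/write-behind
subvolumes dist
end-volume
volume cache
type performance/io-cache
subvolumes writebehind
end-volume
"""

print("\n".join(render(parse_volfile(VOLFILE), "cache")))
```

The output lists `cache` at the top, then `writebehind`, then `dist`, with `afr1` (srv1, srv3) and `afr2` (srv2, srv4) beneath it, which matches the order in which the volumes reference each other in the config.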

Benjamin Krein
www.superk.org




_______________________________________________
Gluster-users mailing list
[email protected]
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users






