Is there a FAQ/document somewhere with optimal mkfs and mount options
for ext4 and xfs? Is xfs still the 'desired' filesystem for gluster bricks?
On 3/15/12 3:22 AM, Brian Candler wrote:
On Wed, Mar 14, 2012 at 11:09:28PM -0500, D. Dante Lorenso wrote:
get 50-60 MB/s transfer speeds tops when sending large files (> 2GB)
to gluster. When copying a directory of small files, we get <= 1
MB/s performance!
My question is ... is this right? Is this what I should expect from
Gluster, or is there something we did wrong? We aren't using super
expensive equipment, granted, but I was really hoping for better
performance than this given that raw drive speeds using dd show that
we can write at 125+ MB/s to each "brick" 2TB disk.
Unfortunately I don't have any experience with replicated volumes, but the
raw glusterfs protocol is very fast: a single brick which is a 12-disk raid0
stripe can give 500MB/sec easily over 10G ethernet without any tuning.
I would expect a distributed volume to work fine too, as it just sends each
request to one of N nodes.
Striped volumes are unfortunately broken on top of XFS at the moment:
http://oss.sgi.com/archives/xfs/2012-03/msg00161.html
Replicated volumes, from what I've read, need to touch both servers even for
read operations (for the self-healing functionality), and that could be a
major bottleneck.
But there are a few basic things to check:
(1) Are you using XFS for the underlying filesystems? If so, did you mount
them with the "inode64" mount option? Without this, XFS performance sucks
really badly for filesystems > 1TB.
Without inode64, even untarring files into a single directory will make XFS
distribute them between AGs, rather than allocating contiguous space for
them.
This is a major trip-up and there is currently talk of changing the default
to be inode64.
(2) I have this in /etc/rc.local:
for i in /sys/block/sd*/bdi/read_ahead_kb; do echo 1024 > "$i"; done
for i in /sys/block/sd*/queue/max_sectors_kb; do echo 1024 > "$i"; done
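For anyone wanting to check points (1) and (2) on their own bricks, here is a rough sketch. The helper names and default arguments are made up for illustration, and it assumes the kernel lists inode64 in /proc/mounts when the option is active. The second helper also clamps the request size, since the kernel refuses a max_sectors_kb larger than the device's max_hw_sectors_kb.

```shell
#!/bin/sh
# Sketch only: helper names and default arguments are illustrative.

# (1) Print the mount point of every XFS filesystem mounted without inode64.
# Reads a file in /proc/mounts format (default: /proc/mounts).
# To add the option: mount -o inode64 DEVICE MOUNTPOINT, or put inode64 in
# the options column of the brick's /etc/fstab entry.
xfs_without_inode64() {
    awk '$3 == "xfs" && $4 !~ /(^|,)inode64(,|$)/ { print $2 }' \
        "${1:-/proc/mounts}"
}

# (2) Set read-ahead and maximum request size for every sd* disk.
# usage: tune_disks [KB] [SYSFS_ROOT]   (defaults: 1024 and /sys)
tune_disks() {
    kb=${1:-1024}
    root=${2:-/sys}
    for q in "$root"/block/sd*/queue; do
        [ -d "$q" ] || continue       # the glob matched nothing
        dev=${q%/queue}
        # max_sectors_kb may not exceed max_hw_sectors_kb, so clamp it.
        hw=$(cat "$q/max_hw_sectors_kb")
        if [ "$kb" -gt "$hw" ]; then want=$hw; else want=$kb; fi
        echo "$want" > "$q/max_sectors_kb"
        echo "$kb" > "$dev/bdi/read_ahead_kb"
    done
}
```

Run "xfs_without_inode64" with no arguments to list affected bricks, and "tune_disks" as root (for example from /etc/rc.local) to apply the 1024 KB settings.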
If I can't get gluster to work, our fall-back plan is to convert
these 8 servers into iSCSI targets and mount the storage onto a
Win2008 head and continue sharing to the network as before.
Personally, I would rather us continue moving toward CentOS 6.2 with
Samba and Gluster, but I can't justify the change unless I can
deliver the performance.
Optimising replicated volumes I can't help with.
However if you make a simple RAID10 array on each server, and then join the
servers into a distributed gluster volume, I think it will rock. What you
lose is the high-availability, i.e. if one server fails a proportion of
your data becomes unavailable until you fix it - but that's no worse than
your iSCSI proposal (unless you are doing something complex, like drbd
replication between pairs of nodes and HA failover of the iSCSI target).
BTW, Linux md RAID10 with 'far' layout is really cool; for reads it performs
like a RAID0 stripe, and it reduces head seeking for random access.
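A rough sketch of what that setup could look like. The device names, server names, brick paths and volume name are all hypothetical, and the helpers only print the commands so they can be reviewed before anything is run. It assumes four disks per server and mdadm's "f2" (far, two copies) layout.

```shell
#!/bin/sh
# Sketch only: every device, host, path and volume name here is hypothetical.

# Print an mdadm command that creates a RAID10 array with the far-2 layout.
# usage: raid10_far_cmd MD_DEVICE DISK...
raid10_far_cmd() {
    md=$1; shift
    echo "mdadm --create $md --level=10 --layout=f2 --raid-devices=$# $*"
}

# Print a gluster command that creates a plain distributed volume
# (no replica count given, so files are spread across the bricks).
# usage: dist_volume_cmd VOLUME BRICK...
dist_volume_cmd() {
    vol=$1; shift
    echo "gluster volume create $vol $*"
}

# On each server: one RAID10 array over its four data disks.
raid10_far_cmd /dev/md0 /dev/sdb /dev/sdc /dev/sdd /dev/sde
# On one server: join the per-server bricks into a distributed volume.
dist_volume_cmd bigvol server1:/export/md0 server2:/export/md0
```

The array still has to be formatted and mounted at the brick path before the volume is created.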
Regards,
Brian.
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users