Hi All,
Just writing to pick everyone's brains about designing HA storage for virtual
machines using Gluster. What I'm hoping to achieve is a single namespace where
IOPS can be increased linearly by adding servers, whilst still being able to
survive a single storage node going down. Basically I want something like a
distributed RAID 10, where I can widen the stripe as I add storage servers and
so increase the IOPS.
What I've come up with so far is -
All Gluster communication done over 40Gbit InfiniBand using the RDMA transport
(all of our servers are already InfiniBand enabled)
Virtual hosts running KVM and storing VM images on a Gluster namespace using
the native client
Storage nodes contain 24x 7200rpm SATA disks with SSD read and write caches
- Disks configured as 2x RAID 6 arrays
- SSD cache to make up for the RAID 6 arrays' lack of write speed
(SSD caching is still highly experimental on Linux and I haven't tested it yet,
so RAID 10 may be a better option - rough sketch of what I have in mind below)
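Purely as a sketch of the storage node layering I have in mind - the device
names, array width and the choice of bcache are assumptions on my part, none of
this is tested:

# One of the two RAID 6 arrays (12 of the 24 disks; device names are examples)
mdadm --create /dev/md0 --level=6 --raid-devices=12 /dev/sd[b-m]
# SSD writeback cache in front of the array (assumes bcache-tools is available)
make-bcache -C /dev/sdy1 -B /dev/md0
echo writeback > /sys/block/bcache0/bcache/cache_mode
# Format the cached device and use it as a Gluster brick
mkfs.xfs /dev/bcache0
mkdir -p /brick1
mount /dev/bcache0 /brick1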
Initially I am starting with two storage servers, but I'd like to retain the
ability to scale out one at a time, so each server would need to contain two
bricks to maintain the replicas. The config would look something like -
Replicate sets
Server1:/brick1
Server2:/brick1
And
Server1:/brick2
Server2:/brick2
Then stripe the replica sets to improve read and write performance. The
namespace should have the read speed of 4x striped RAID 6 arrays and the write
speed of 2x striped RAID 6 arrays (since every write has to hit both replicas).
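I assume the volume creation and client mount would be something along these
lines (the volume name 'vmstore' and the libvirt path are just placeholders,
and I'm not certain stripe and replica can be combined on an rdma-only volume,
so treat it as a sketch):

# Striped-replicated volume over RDMA; bricks are listed so that adjacent
# pairs form the replica sets shown above
gluster volume create vmstore stripe 2 replica 2 transport rdma \
    Server1:/brick1 Server2:/brick1 Server1:/brick2 Server2:/brick2
gluster volume start vmstore
# On each KVM host, mount with the native client and point libvirt at it
mount -t glusterfs Server1:/vmstore /var/lib/libvirt/images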
The above seems to work in theory until you add another server. To do so you
would need to break the existing replication, re-replicate to different bricks,
and also increase the stripe count to 3. Is this possible to do on the fly, or
at all? (I'm guessing it would be a series of replace-brick and add-brick
operations - rough sketch after the layout below.) The config would look like -
Replicate sets
Server1:/brick1
Server2:/brick1
And
Server2:/brick2
Server3:/brick2
And
Server3:/brick1
Server1:/brick2
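To make the question concrete, I imagine the shuffle from the two-server layout
to the one above would be something like the following, although I have no idea
whether the stripe count of an existing volume can be changed at all - this is
untested guesswork:

# Re-pair Server2:/brick2 with the new server by migrating Server1:/brick2 off
gluster volume replace-brick vmstore Server1:/brick2 Server3:/brick2 start
gluster volume replace-brick vmstore Server1:/brick2 Server3:/brick2 commit
# Add a third replica pair, reusing the now-freed (and wiped) Server1:/brick2
gluster volume add-brick vmstore Server3:/brick1 Server1:/brick2
gluster volume rebalance vmstore start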
If I have to add servers in multiples of two it probably won't be the end of
the world. It just means that IOPS won't scale for a single VM (given that the
stripe is always set at 2), but they will scale for many VMs spread over the
cluster.
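For what it's worth, the pairwise expansion looks straightforward - something
like this, with Server3/Server4 and the brick paths as placeholders:

# Add two new servers as another stripe+replica set, then rebalance
gluster volume add-brick vmstore \
    Server3:/brick1 Server4:/brick1 Server3:/brick2 Server4:/brick2
gluster volume rebalance vmstore start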
Can anyone see any problems or improvements that could be made to the design?
Thanks
-Matt
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users