Re: [Gluster-users] How to correctly distribute OpenStack VM files...

Gowrishankar Rajaiyan Mon, 05 Aug 2013 06:02:40 -0700

On 08/02/2013 06:22 AM, Xavier Trilla wrote:

Hi,
We have been playing with GlusterFS for a while (now with version 3.4). We are running tests to check whether GlusterFS can really be used as the distributed storage for OpenStack block storage (Cinder), as new features in KVM, GlusterFS and OpenStack are pointing to GlusterFS as the future of OpenStack open source block and object storage.
But we found a problem as soon as we started playing with GlusterFS: the way the distribute translator (DHT) balances the load. We understand and see the benefits of a metadata-less setup. Using hashes based on filenames and assigning a hash range to each brick is clever, reliable and fast, but from our understanding there is a big problem when it comes to storing the VM images of an OpenStack deployment.
OpenStack Block Storage (Cinder) assigns a name (a GUID) to each volume it creates, so GlusterFS hashes the filename and decides in which brick it should be stored. But as in this scenario we don't have many files (we would have just one big file per VM), we may end up with really unbalanced storage.
Let's say we have a 4-brick setup with DHT distribute and we want to store 100 VMs there. The ideal scenario would be:
Brick1: 25 VMs

Brick2: 25 VMs

Brick3: 25 VMs

Brick4: 25 VMs
As VMs are I/O intensive, it's really important to balance the load correctly, as each brick has a limited amount of IOPS. But as DHT is based only on a filename hash, we could end up with something like the following scenario (or even worse):
Brick1: 50 VMs

Brick2: 10 VMs

Brick3: 35 VMs

Brick4: 5 VMs
And if we scale this out, things may get even worse. We may end up with almost all VM files in one or two bricks and all the other bricks almost empty. And if we use growing VM disk image files like qcow2, the option "min-free-disk" will not prevent all VM disk image files from being stored in the same brick. So, I understand DHT works well for a large number of small files, but for a few big, I/O-intensive files it doesn't seem to be a really good solution... (We are looking for a solution able to handle around 32 bricks and around 1500 VMs for the initial deployment, and able to scale up to 256 bricks and 12000 VMs :/ )
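A rough way to gauge how uneven purely hash-based placement gets is to simulate it. The sketch below is not GlusterFS's actual DHT code: it uses MD5 as a stand-in for the real hash function, assumes every brick owns an equal slice of a 32-bit hash range, and assumes image files are named `volume-<UUID>`. The helper names `brick_for` and `simulate` are made up for illustration.

```python
import hashlib
import random
import uuid


def brick_for(filename, num_bricks):
    """Pick a brick by splitting a 32-bit hash space into equal ranges.

    Assumption: MD5 stands in for the real DHT hash function.
    """
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    h = int.from_bytes(digest[:4], "big")  # 32-bit value, 0 .. 2**32 - 1
    return (h * num_bricks) >> 32          # index of the range containing h


def simulate(num_bricks, num_vms, seed=0):
    """Return how many VM image files land on each brick."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    counts = [0] * num_bricks
    for _ in range(num_vms):
        # Assumption: one image file per VM, named after a random UUID.
        name = "volume-%s" % uuid.UUID(int=rng.getrandbits(128), version=4)
        counts[brick_for(name, num_bricks)] += 1
    return counts


if __name__ == "__main__":
    # The two sizes discussed above: 4 bricks / 100 VMs and 32 bricks / 1500 VMs.
    for bricks, vms in [(4, 100), (32, 1500)]:
        counts = simulate(bricks, vms)
        print(bricks, "bricks,", vms, "VMs: min", min(counts), "max", max(counts))
```

Running it with several seeds gives a feel for the spread between the fullest and the emptiest brick. It only counts files per brick, so it says nothing about image size or the I/O load of each VM.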
So, does anybody have a suggestion about how to handle this? So far we see only two options: either use the legacy unify translator with the ALU scheduler, or use the cluster/stripe translator with a big block-size so that the load at least gets balanced across all bricks in some way. But obviously we don't like unify, as it needs a namespace brick, and striping seems to have an impact on performance and really complicates backup/restore/recovery strategies.

Another suggestion that you may want to try: have your GlusterFS nodes also serve as OpenStack Cinder nodes and use NUFA [1].


~shanks

[1] http://gluster.org/community/documentation/index.php/Translators/cluster/nufa
