I have 3 servers with replica 3 volumes: 4 bricks per server per volume, each brick on an LVM partition carved out of one of 4 hard drives. With 15 volumes, that works out to 60 bricks per server. One of my servers is also a KVM host running (only) 24 VMs.
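
To make that concrete, one of those volumes gets created along these lines (hostnames and brick paths here are just placeholders; each brick directory is a separately mounted LVM logical volume, one per drive):

    # placeholder hostnames/paths -- each brick dir is a mounted LVM LV, one per drive
    gluster volume create vol01 replica 3 \
        server1:/bricks/disk1/vol01 server2:/bricks/disk1/vol01 server3:/bricks/disk1/vol01 \
        server1:/bricks/disk2/vol01 server2:/bricks/disk2/vol01 server3:/bricks/disk2/vol01 \
        server1:/bricks/disk3/vol01 server2:/bricks/disk3/vol01 server3:/bricks/disk3/vol01 \
        server1:/bricks/disk4/vol01 server2:/bricks/disk4/vol01 server3:/bricks/disk4/vol01
    gluster volume start vol01

That gives a 4x3 distribute-replicate volume: four distribute subvolumes, each replicated across the three servers.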

Each VM image is only 6 gig, enough for the operating system and applications, and is hosted on one volume. The data for each application is hosted on its own GlusterFS volume.

For MySQL, I set up my InnoDB store to use 4 data files (I don't use one file per table), so the files distribute across the 4 replica subvolumes, one each. This balances the load pretty nicely.
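
If it helps, the relevant bit of my.cnf for that kind of layout looks roughly like this (file sizes and paths here are just placeholders, not what I actually run):

    [mysqld]
    # keep tables in the shared tablespace so the data spreads across the 4 files
    innodb_file_per_table = 0
    # four data files, autoextend on the last one
    innodb_data_file_path = ibdata1:2G;ibdata2:2G;ibdata3:2G;ibdata4:2G:autoextend
    # placeholder path -- in this setup the datadir sits on a GlusterFS mount
    datadir = /var/lib/mysql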

I don't really do anything special for anything else, other than the PHP app recommendations I make on my blog (http://joejulian.name), none of which have anything to do with the actual filesystem.

The thing that I think some people (even John Mark) miss is that this is just a tool. You have to engineer a solution using the tools you have available. If you feel the positives that GlusterFS provides outweigh the negatives, then you will simply have to engineer a solution that suits your end goal using this tool. It's not a question of whether it works; it's whether you can make it work for your use case.

On 12/27/2012 03:00 PM, Miles Fidelman wrote:
Ok... now that's the diametric opposite of Dan Cyr's response from a few minutes ago.

Can you say just a bit more about your configuration - how many nodes, whether you have storage and processing combined or separate, how your drives are partitioned, and so forth?

Thanks,

Miles


Joe Julian wrote:
Trying to return to the actual question: the way I handle those workloads is to mount the GlusterFS volumes that host their data from within the VM. I've done that successfully since 2.0 with all of those services.
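
For example, inside a guest that means an fstab entry like this, mounted with the native FUSE client (server and volume names here are invented):

    # /etc/fstab inside the VM -- data volume for the mail server, as an example
    server1:/mail-data   /var/vmail   glusterfs   defaults,_netdev   0 0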

The limitations that others are expressing have as much to do with constraints they've placed on their own designs as with their hardware. Sure, there are other systems that are faster but less stable and/or less scalable, but with proper engineering you should be able to build a system that meets those design requirements.

The one piece that wasn't there before but is now in 3.3 addresses the "locking and performance problems during disk rebuilds": self-heal now operates at a much more granular level, and I have successfully self-healed several VM images at the same time without any measurable delays.
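
In 3.3 you can also check on and kick off that healing from the CLI, something like this (the volume name is just an example):

    gluster volume heal vm-images info    # list entries that still need healing
    gluster volume heal vm-images         # heal anything pending
    gluster volume heal vm-images full    # force a full sweep of the volume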

Miles Fidelman <[email protected]> wrote:

    Joe Julian wrote:

        It would probably be better to ask this with end-goal questions instead of with an unspecified "critical feature" list and "performance problems".


    Ok... I'm running a 2-node cluster that's essentially a mini cloud stack, with storage and processing combined on the same boxes. I'm running a production VM that hosts a mail server, list server, web server, and database; another production VM providing a backup server for the cluster and for a bunch of desktop machines; and several VMs used for a variety of development and testing purposes. It's all backed by a storage stack consisting of Linux RAID10 -> LVM -> DRBD, and uses Pacemaker for high-availability failover of the production VMs. It all performs reasonably well under moderate load (mail flows, web servers respond, database transactions complete without notable user-level delays; queues don't back up; CPU and I/O loads stay within reasonable bounds).

    The goals are to:
    - add storage and processing capacity by adding two more nodes, each consisting of several CPU cores and 4 disks
    - maintain the flexibility to create/delete/migrate/failover virtual machines - across 4 nodes instead of 2
    - avoid having to play games with pairwise DRBD configurations by moving to a clustered filesystem
    - in essence, I'm looking to do what Sheepdog purports to do, except in a Xen environment

    Earlier versions of Gluster had reported problems with:
    - supporting databases
    - supporting VMs
    - locking and performance problems during disk rebuilds
    - and... most of the Gluster documentation implies that it's preferable to separate storage nodes from processing nodes

    It looks like Gluster 3.2 and 3.3 have addressed some of these issues, and I'm trying to get a general read on whether it's worth putting in the effort of moving forward with some experimentation, or whether this is a non-starter. Is there anyone out there who's tried to run this kind of mini-cloud with Gluster? What kind of results have you had?



        On 12/26/2012 08:24 PM, Miles Fidelman wrote:

            Hi Folks, I find myself trying to expand a 2-node high-availability cluster into a 4-node cluster. I'm running Xen virtualization, currently using DRBD to mirror data and Pacemaker to fail over cleanly. The thing is, I'm trying to add 2 nodes to the cluster, and DRBD doesn't scale. Also, as a function of rackspace limits and the hardware at hand, I can't separate storage nodes from compute nodes - instead, I have to live with 4 nodes, each with 4 large drives (but also with 4 GigE ports per server).

            The obvious thought is to use Gluster to assemble all the drives into one large storage pool, with replication. But... the last time I looked at this (6 months or so back), it looked like some of the critical features were brand new, and performance seemed to be a problem in the configuration I'm thinking of. Which leads me to my question: has the situation improved to the point that I can use Gluster this way?

            Thanks very much, Miles Fidelman

_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
