To answer Franco's post, my guess is that accesses will be pretty evenly 
balanced among the bricks.  There may be a slight bias toward newer data, 
but in our experience it's only slight.

I really appreciate all of the great posts from everyone today.  I will 
consider what you have all said.  I'll continue to watch to see if there are 
any more comments.

Thanks again

Scott

-----Original Message-----
From: Franco Broi [mailto:[email protected]] 
Sent: Wednesday, December 11, 2013 10:09 PM
To: Scott Smith
Cc: [email protected]
Subject: Re: [Gluster-users] Is Gluster the wrong solution for us?

If you are only adding disk space and don't necessarily need to increase 
bandwidth, then you won't need to rebalance. It's only a problem if you are 
adding clients and your most frequently accessed files are all on the same 
brick.
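
For what it's worth, if the aim is just to have new files start landing on 
the new bricks without moving anything that's already there, a layout-only 
pass should be enough, something like (volume name is a placeholder):

    gluster volume rebalance myvol fix-layout start

That only recalculates the directory hash ranges to cover the new bricks; 
existing files stay where they are and new files spread across the full set.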

On Thu, 2013-12-12 at 04:49 +0000, Scott Smith wrote: 
> Pretty much, our files are never deleted.  We just keep adding more 
> information.  Think of them as write once, read many, delete never.
> 
> -----Original Message-----
> From: Franco Broi [mailto:[email protected]]
> Sent: Wednesday, December 11, 2013 7:31 PM
> To: Scott Smith
> Cc: [email protected]
> Subject: Re: [Gluster-users] Is Gluster the wrong solution for us?
> 
> 
> How long-lived are your files? We have 400TB and are just about to double 
> that, but we have decided not to rebalance the data; instead, we are hoping 
> that the disks will rebalance naturally through attrition rather than waste 
> any valuable time or bandwidth moving data around.
> 
> On Thu, 2013-12-12 at 01:15 +0000, Scott Smith wrote: 
> > We are about to abandon GlusterFS as a solution for our object 
> > storage needs.  I’m hoping to get some feedback to tell me whether 
> > we have missed something and are making the wrong decision.  We’re 
> > already a year into this project after evaluating a number of 
> > solutions.  I’d hate to abandon GlusterFS over a simple misunderstanding 
> > of how it works.
> > 
> > Our use case is fairly straightforward.  We need to save a bunch of 
> > somewhat large files (1MB-100MB).  For the most part, these files 
> > are write once, read several times.  Our initial store is 80TB, but 
> > we expect to go to roughly 320TB fairly quickly.  After that, we 
> > expect to be adding another 80TB every few months.  We are using 
> > some COTS servers which we add in pairs; each server has 40TB of usable 
> > storage.  We intend to keep two copies of each file.  We currently run 
> > 4TB bricks.
> > 
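> > For reference, the volume is the distributed-replicated kind you would 
> > get from something like the following (hostnames, paths and the volume 
> > name are placeholders, not our real layout):
> > 
> >     gluster volume create objstore replica 2 \
> >         serverA:/bricks/b01 serverB:/bricks/b01 \
> >         serverA:/bricks/b02 serverB:/bricks/b02
> > 
> > With replica 2, each consecutive pair of bricks in the list becomes a 
> > mirrored set, which is why we add servers two at a time.
> > 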
> > In our somewhat limited test environment, GlusterFS seemed to work 
> > well.  And, our initial introduction of GlusterFS into our 
> > production environment went well.  We had our initial 2 server 
> > (80TB) cluster about 50% full and things seemed to be going well.
> > 
> > Then we added another pair of servers (for a total of 160TB).  This 
> > went fine until we did the rebalance.  We were running 3.3.1.  We 
> > ran into the handle leak problem (which unfortunately we didn’t know 
> > about beforehand).  We also found that if any of the bricks went 
> > offline while the rebalance was going on, then files were lost or 
> > they lost their permissions.  We still don’t know why some of the 
> > bricks went offline, but they did and we have verified in our test 
> > environment that this is sufficient to cause the corruption problem.
> > 
> > The good news is that we think both of these problems got fixed in 
> > 3.4.1.  So why are we leaving?
> > 
> > In trying to figure out what was going on with our GlusterFS system 
> > after the disastrous rebalance, we ran across two posts.  The first 
> > one was 
> > http://hekafs.org/index.php/2012/03/glusterfs-algorithms-distribution/.  If 
> > we understand it correctly, anytime you add new storage servers to your 
> > cluster, you have to do a rebalance and that rebalance will require a 
> > minimum of 50% of the data in the cluster to be moved to make the hashing 
> > algorithms work.  This means that when we have a 320TB cluster and add 
> > another 80TB, we have to move at least 160TB just to get things back into 
> > balance.  We estimate that alone will take months, and it probably won’t 
> > finish before we need to add another 80TB.
> > 
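> > To sanity-check that reading, here is a quick simulation.  It is only 
> > a sketch: md5 stands in for Gluster’s real hash, and we assume the 
> > layout splits the hash space into equal per-brick ranges.
> > 
> >     import hashlib
> > 
> >     HASH_SPACE = 2 ** 32
> > 
> >     def brick_for(name, nbricks):
> >         # DHT-style placement: split the hash space into nbricks equal
> >         # contiguous ranges and pick the range the name hashes into.
> >         h = int(hashlib.md5(name.encode()).hexdigest(), 16) % HASH_SPACE
> >         return h * nbricks // HASH_SPACE
> > 
> >     files = ["file-%d" % i for i in range(100000)]
> > 
> >     # Grow from 4 bricks to 5 and count the files that now hash to a
> >     # different brick, i.e. the data a rebalance would have to move.
> >     moved = sum(1 for f in files if brick_for(f, 4) != brick_for(f, 5))
> >     print("%.1f%% of files change bricks" % (100.0 * moved / len(files)))
> > 
> > With a uniform hash that comes out right around 50%, which matches our 
> > reading of the post.  (An ideal consistent-hashing scheme would only 
> > need to move about 1/5 of the data when going from 4 bricks to 5.)
> > 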
> > The other post we ran across was
> > http://www.gluster.org/community/documentation/index.php/Planning34/ElasticBrick.
> > This post seems to confirm our understanding: it discusses the rebalance 
> > problem and a possible solution, which was apparently considered for 3.4 
> > but didn’t make the cut.
> > 
> > I’d be happy to find out that we just got it wrong.  Tell me that 
> > rebalancing doesn’t work the way we think, or that we just need to 
> > configure things differently.
> > 
> > My problem is that if GlusterFS isn’t good for starting with a small 
> > cluster (80TB) and growing over time to half a petabyte, what is the 
> > use case it is intended for?  Do you really have to start out with 
> > the amount of storage you think you’ll need in the long run and just 
> > fill it up as you go?  That’s why I’m nervous about our 
> > understanding of the rebalance.  It’s hard to believe it works this 
> > way (at least from our perspective).
> > 
> > We have a lot of man-hours invested in writing code and putting 
> > infrastructure in place for GlusterFS.  We can likely reuse much of it 
> > for another system.  I would just like to know that we really do 
> > understand the rebalance, and that it really works the way I described, 
> > before we start evaluating other object store solutions.
> > 
> > Comments?
> > 
> > Scott
> > 
> 
> 


_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
