If you can handle the load without the slow machines, and you still meet 
your redundancy requirements, removing them may make your life easier. 
Otherwise you have to think of your cluster as being made up of machines with 
the worst parts of all the nodes (i.e. lowest memory, slowest CPU, etc.). 

Cheers
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/01/2012, at 5:42 AM, Daning Wang wrote:

> Thank you guys, much appreciated.
> 
> How about just pulling the slow machines out of the cluster? I think most 
> reads should already be served by the fast machines because of the dynamic 
> snitch, so removing two machines should not add much load to the remaining 
> nodes.
> 
> What do you think?
> 
> Thanks,
> 
> Daning
> 
> On Wed, Jan 11, 2012 at 1:34 PM, Antonio Martinez <antyp...@gmail.com> wrote:
> There is another possible approach, which I take from the original Dynamo 
> paper. Instead of trying to manage a heterogeneous cluster at the Cassandra 
> level, it might be possible to take the approach Amazon took. Find the 
> lowest common denominator of resources for your nodes (most likely your 
> smallest node) and virtualize the others to that level. For example, say you 
> have 3 physical computers: one with 1 processor and 2 GB of memory, one with 
> 2 processors and 4 GB, and one with 4 processors and 8 GB. You could make the 
> smallest one your basic block, then put two 1-processor, 2 GB VMs on the 
> second machine and four of those on the third and largest machine. Then, 
> instead of managing the three of them separately and worrying about their 
> differences, you manage a ring of 7 equal nodes with equal portions of the 
> ring. This gives the smaller machines a lighter load than the more powerful 
> ones. The Amazon paper on Dynamo has more information on how they did it and 
> some of the tricks they use for reliability.  
> http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
> 
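The sizing in the example above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not anything taken from the Dynamo paper: the machine names, the `vm_counts` helper and the rule "a VM needs one unit of CPU and one unit of memory" are all assumptions made for the sketch.

```python
def vm_counts(machines, unit):
    """Return how many unit-sized VMs fit on each machine.

    machines: dict of name -> (cpus, mem_gb)   (hypothetical inventory)
    unit:     (cpus, mem_gb) of the basic building block
    A machine is limited by whichever resource runs out first.
    """
    unit_cpus, unit_mem = unit
    return {
        name: min(cpus // unit_cpus, mem_gb // unit_mem)
        for name, (cpus, mem_gb) in machines.items()
    }


# The three machines from the example above (names are made up).
machines = {"small": (1, 2), "medium": (2, 4), "large": (4, 8)}

# Assumption: the building block is the smallest CPU count and the
# smallest memory size found anywhere in the inventory.
unit = (
    min(cpus for cpus, _ in machines.values()),
    min(mem for _, mem in machines.values()),
)

counts = vm_counts(machines, unit)
total = sum(counts.values())
print(counts, "total virtual nodes:", total)
```

With equal token ranges, each of the resulting virtual nodes would own 1/total of the ring, so a physical machine's share is proportional to the number of VMs it hosts.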
> Hope this helps somewhat
> 
> On Wed, Jan 11, 2012 at 2:00 PM, aaron morton <aa...@thelastpickle.com> wrote:
> I have good news and bad. 
> 
> The good news is I have a nice coffee. The bad news is it's pretty difficult 
> to have some nodes with less load. 
> 
> In a cluster with 5 nodes and RF 3 each node holds the following token 
> ranges. 
> 
> node 1: ranges of nodes 1, 5, 4
> node 2: ranges of nodes 2, 1, 5
> node 3: ranges of nodes 3, 2, 1
> node 4: ranges of nodes 4, 3, 2
> node 5: ranges of nodes 5, 4, 3
> 
> The load on each node is its own token range plus those of the preceding 
> RF-1 nodes, e.g. in a balanced ring of 5 nodes with RF 3 each node has 20% of 
> the token ring and 60% of the total load. 
> 
> If the token ring is split as shown below, each node has the total load 
> shown after the /
> 
> node 1: 12.5% / 50%
> node 2: 25% / 62.5%
> node 3: 25% / 62.5%
> node 4: 12.5% / 62.5%
> node 5: 25% / 62.5%
> 
> Only node 1 gets a small amount less. Try a different approach…
> 
> node 1: 12.5% / 62.5%
> node 2: 12.5% / 50%
> node 3: 25% / 50%
> node 4: 25% / 62.5%
> node 5: 25% / 75%
> 
> That's even worse. 
> 
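The load figures above can be checked with a short Python sketch. It assumes the simple placement described in this message (each node stores its own range plus the ranges of the RF-1 nodes before it in the ring); the function and variable names are made up for the illustration.

```python
from fractions import Fraction


def replica_load(ownership, rf):
    """Return each node's share of the total data.

    ownership: primary token-range share of each node, in ring order
    rf:        replication factor
    Assumes a node holds its own range plus those of the rf - 1 nodes
    that precede it in the ring (wrapping around at the start).
    """
    n = len(ownership)
    return [sum(ownership[(i - k) % n] for k in range(rf)) for i in range(n)]


def as_percent(shares):
    """Convert exact fractions to percentages for printing."""
    return [float(share * 100) for share in shares]


eighth, quarter = Fraction(1, 8), Fraction(1, 4)

layouts = {
    "balanced": [Fraction(1, 5)] * 5,
    "first split": [eighth, quarter, quarter, eighth, quarter],
    "second split": [eighth, eighth, quarter, quarter, quarter],
}

for name, ring in layouts.items():
    print(name, as_percent(replica_load(ring, rf=3)))
```

Whatever the layout, the loads always add up to RF x 100%, which is why shrinking one node's range mostly shifts load onto its neighbours rather than removing it.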
> David is right to use nodetool move. It's a good idea to update the initial 
> tokens in the yaml (or your ops config) after the fact, even though they are 
> not used. 
> 
> Hope that helps.
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 12/01/2012, at 8:41 AM, David McNelis wrote:
> 
>> Daning,
>> 
>> You can see how to do this basic sort of thing on the Wiki's operations page 
>> ( http://wiki.apache.org/cassandra/Operations )
>> 
>> In short, you'll want to run:
>> nodetool -h hostname move newtoken
>> 
>> Then, once you've updated each of the tokens that you want to move, you'll 
>> want to run:
>> nodetool -h hostname cleanup
>> 
>> That will remove the data each node no longer owns from your smaller machines.
>> 
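As a rough aid for picking the `newtoken` values, here is a Python sketch that turns a desired ownership split into tokens and prints the corresponding move commands. It assumes the RandomPartitioner (token space 0 to 2**127) and that a node owns the range ending at its own token; the host names and the `tokens_for_shares` helper are hypothetical.

```python
from fractions import Fraction

# Assumption: RandomPartitioner, whose tokens run from 0 to 2**127.
RING_SIZE = 2 ** 127


def tokens_for_shares(shares):
    """Return one token per node so that node i owns shares[i] of the ring.

    The first node gets token 0; every later node sits its own share
    further along, so the first node ends up with the wrap-around range.
    """
    if sum(shares) != 1:
        raise ValueError("shares must add up to exactly 1")
    tokens = [0]
    for share in shares[1:]:
        tokens.append(tokens[-1] + int(share * RING_SIZE))
    return tokens


# Hypothetical host names; replace with the real ones, in ring order.
hosts = ["node1", "node2", "node3", "node4", "node5"]
shares = [Fraction(1, 5)] * len(hosts)  # a balanced ring, as an example

for host, token in zip(hosts, tokens_for_shares(shares)):
    print(f"nodetool -h {host} move {token}")
```

Using `Fraction` keeps the split exact for values such as 1/8 or 1/4; the output is only a list of candidate commands, and cleanup still has to be run afterwards as described above.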
>> Please note that someone else may have better insights than I do into 
>> whether or not your strategy is going to be effective. On the surface I 
>> think what you are doing is logical, but I'm unsure of the actual 
>> performance gains you'll see.
>> 
>> David
>> 
>> On Wed, Jan 11, 2012 at 1:32 PM, Daning Wang <dan...@netseer.com> wrote:
>> Hi All,
>> 
>> We have a 5-node cluster (on 0.8.6), but two machines are slower and have 
>> less memory, so performance on those two machines is poor under heavy 
>> traffic. I want to move some data from the slower machines to the faster 
>> ones to ease the load, so the token ring will not be equally balanced.
>> 
>> I am thinking of the following steps:
>> 
>> 1. Modify cassandra.yaml to change the initial token.
>> 2. Restart Cassandra (no need to auto-bootstrap, right?)
>> 3. Then run nodetool repair (or nodetool move? Not sure which one to use)
>> 
>> 
>> Is there any doc that has detailed steps about how to do this?
>> 
>> Thanks in advance,
>> 
>> Daning
>> 
>> 
> 
> 
> 
> 
> -- 
> Antonio Perez de Tejada Martinez
> 
> 
