On Monday 04 of April 2011, Jonas Borgström wrote:
> I have a 6 node 0.7.4 cluster with replication_factor=3 where "nodetool
> repair keyspace" behaves really strange.
I think I am observing a similar issue.
I have three 0.7.4 nodes with RF=3. 

After compaction I see about 7 GB load per node, but after running repair a few 
times I see the load rising [1] and an enormous amount of data (several times 
larger than the whole data set) being transferred [2].
I also see a lot of compactions that compact heavily (down to 33% or less), which 
should not happen, as my data loading pattern is insertions only (no updates, no 
deletions).
I have about 200k rows representing observables, with columns representing daily 
values (currently around 250 days at most). Once per day I insert new values into 
a few CFs.
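
For reference, the daily load looks roughly like this (a minimal pycassa sketch; 
the keyspace, CF and key names are made up, not my real schema):

    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    pool = ConnectionPool('MyKeyspace', ['192.168.3.5:9160'])
    cf = ColumnFamily(pool, 'DailyValues')

    # one new column (today's value) per existing row, never updated or deleted
    todays_values = {'observable_00001': '42.0', 'observable_00002': '17.3'}  # example data
    for row_key, value in todays_values.items():
        cf.insert(row_key, {'2011-04-04': value})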

Not sure if it matters, but: two nodes are in DC1 while the third is in DC2. The 
DCs are separated by a slow internet/VPN link, so the amount of data transferred 
is very important to me.
I see no problems transferring the inserted data over that link during 
insertions; only repair makes the transfers so big that a repair of a single CF 
on a single node cannot finish within a day.

I do not know how the generation of the Merkle tree works. Is the Cassandra 
implementation able to detect and transfer only the newly added columns? Or must 
a row with a newly added column be transferred as a whole?
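
My (possibly wrong) mental model of why one new column could make whole rows look 
different, as a self-contained Python sketch (the hashing scheme here is my 
assumption, not Cassandra's actual implementation):

    import hashlib

    def row_digest(row_key, columns):
        # hash the whole row; any newly added column changes the digest
        h = hashlib.md5()
        h.update(row_key.encode())
        for name in sorted(columns):
            h.update(name.encode())
            h.update(columns[name].encode())
        return h.hexdigest()

    replica_a = {'2011-04-03': '41.0'}                        # before today's insert
    replica_b = {'2011-04-03': '41.0', '2011-04-04': '42.0'}  # after today's insert

    if row_digest('observable_00001', replica_a) != row_digest('observable_00001', replica_b):
        print('digests differ -> repair streams the data covered by this range')

If the Merkle tree leaves cover ranges of such row digests, then a single new 
column per row would make every leaf mismatch and force large transfers, which 
would match what I observe.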


[1]:
192.168.3.5     Up     Normal  7.85 GB         50.00%  0
10.20.1.66      Up     Normal  13.7 GB         25.00%  42535295865117307932921825928971026432
192.168.3.4     Up     Normal  12.13 GB        25.00%  85070591730234615865843651857942052864

[2]:
 twenty or more entries like:
         progress=0/2086059022 - 0%
         progress=0/2085191380 - 0%


-- 
Mateusz Korniak
