Hello,

We have a 5-node cluster running Cassandra 1.2.16, with a significant amount of
data:


Address        Rack   Status  State   Load     Owns    Token
                                                       6783174585269344219
10.198.xx.xx1  rack1  Up      Normal  2.59 TB  60.00%  -9223372036854775808
10.198.xx.xx2  rack1  Up      Normal  1.49 TB  40.00%  -5534023222112865485
10.198.xx.xx3  rack1  Up      Normal  2.18 TB  53.23%  -1844674407370955162
10.198.xx.xx4  rack1  Up      Normal  2.86 TB  80.00%  5534023222112865484
10.198.xx.xx5  rack1  Up      Moving  2.32 TB  66.77%  6783174585269344219



The first three nodes (.xx1 - .xx3 above) were at the desired tokens, so I 
issued a move on .xx4:

nodetool move 1844674407370955161


That was about 40 hours ago!
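
For reference, the target token in the move command appears to be the fourth of five evenly spaced tokens. Below is a minimal sketch of that arithmetic. It assumes the Murmur3Partitioner token range of -2^63 to 2^63-1, which the negative tokens in the ring output suggest but which is not stated here. The name balanced_tokens is a hypothetical helper, not part of Cassandra or nodetool.

```python
def balanced_tokens(node_count):
    """Return evenly spaced tokens over the signed 64-bit ring.

    Assumption: Murmur3Partitioner, whose tokens span -2**63 .. 2**63 - 1.
    """
    ring_size = 2 ** 64
    step = ring_size // node_count  # integer spacing between adjacent tokens
    return [-(2 ** 63) + i * step for i in range(node_count)]


if __name__ == "__main__":
    for index, token in enumerate(balanced_tokens(5), start=1):
        print(f"node {index}: {token}")
```

With five nodes, the first three values match the tokens already held by .xx1 through .xx3, the fourth is the 1844674407370955161 used in the move, and the fifth is 5534023222112865484.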


When I do nodetool netstats, I do see apparent progress:


jatyler@xx4:~$ nodetool netstats
Mode: MOVING
Not sending any streams.
Streaming from: /10.198.xx.xx2
   SyncCore: /var/cassandra/data/SyncCore/file-ic-31475-Data.db sections=1 progress=0/77699597 - 0%
…
   SyncCore: /var/cassandra/data/SyncCore/anotherFile-ic-32252-Data.db sections=1 progress=0/1254063427 - 0%
Read Repair Statistics:
Attempted: 8047367
Mismatch (Blocking): 97327
Mismatch (Background): 74369
Pool Name                    Active   Pending      Completed
Commands                        n/a         0      472255111
Responses                       n/a         1      749751322



I wrote 'apparent progress' because the node reports “MOVING” and the Pending
counts for Commands/Responses change over time.  However, I haven’t seen the
progress of any individual .db file go above 0%.
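
To track this over time instead of eyeballing each file, the progress fields can be totalled from saved netstats output. This is a rough sketch that assumes every streamed file is reported with a progress=<done>/<total> field, as in the output above. The name streaming_progress is a hypothetical helper, and the sample text is just the two file entries shown in this message.

```python
import re

# Matches the "progress=<bytes done>/<bytes total>" field of each file entry.
PROGRESS_RE = re.compile(r"progress=(\d+)/(\d+)")


def streaming_progress(netstats_text):
    """Return (file_count, bytes_done, bytes_total, percent) for all entries."""
    pairs = [(int(done), int(total))
             for done, total in PROGRESS_RE.findall(netstats_text)]
    bytes_done = sum(done for done, _ in pairs)
    bytes_total = sum(total for _, total in pairs)
    percent = 100.0 * bytes_done / bytes_total if bytes_total else 0.0
    return len(pairs), bytes_done, bytes_total, percent


if __name__ == "__main__":
    sample = (
        "SyncCore: /var/cassandra/data/SyncCore/file-ic-31475-Data.db "
        "sections=1 progress=0/77699597 - 0%\n"
        "SyncCore: /var/cassandra/data/SyncCore/anotherFile-ic-32252-Data.db "
        "sections=1 progress=0/1254063427 - 0%\n"
    )
    files, done, total, percent = streaming_progress(sample)
    print(f"{files} files, {done}/{total} bytes ({percent:.1f}%)")
```

Running it against netstats output captured every few minutes would show whether the byte total is moving at all, even while each individual file still displays 0%.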

Meanwhile, the disk appears to have plenty of spare I/O capacity, according to
'iostat -x -m 1':


Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    56.00 1338.00  171.00    57.59     0.89    79.36     0.57    0.38   0.17  25.30

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.77    1.82    2.35    0.20    0.00   72.86

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00  785.00    0.00    33.80     0.00    88.17     0.27    0.35   0.18  14.10

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          20.16    2.05    2.22    0.20    0.00   75.37




Is 40 hours too long for this move?  Should I be seeing the individual .db files
report more progress?  Should I have started with the first box (even though its
token appears correct)?


Any thoughts would be greatly appreciated.

THX


Cheers,

~Jason
*******
