We are running a pretty consistent load on our cluster and added a new node to
a 6-node cluster on Friday (QA worked great, but production not so much).  One
mistake that was made was starting up the new node and then disabling the
firewall :( which allowed the other nodes to discover it BEFORE the node had
bootstrapped itself.  We shut the node down and booted it back up, and it then
bootstrapped itself, streaming all the data in.

After that, though, all the nodes have really, really high load numbers.  We
are still trying to figure out what is going on.

Is there any way to get the number of reads/second and writes/second through
JMX or something?  The only way I can see of doing this is to calculate it
manually: sample the read count twice and divide the difference by the time
between the two samples (my own stopwatch start/stop times).
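
To make that manual calculation concrete, here is a minimal sketch in Python.
It assumes the read (or write) count is a cumulative counter that only goes
up; `read_counter` is a hypothetical callable standing in for however the
value is actually fetched (JMX or otherwise), not a real API.

```python
import time
from typing import Callable


def ops_per_second(count_start: int, count_end: int,
                   t_start: float, t_end: float) -> float:
    """Average operations/second between two samples of a cumulative counter."""
    elapsed = t_end - t_start
    if elapsed <= 0:
        raise ValueError("end time must be later than start time")
    return (count_end - count_start) / elapsed


def sample_rate(read_counter: Callable[[], int], interval_s: float = 10.0) -> float:
    """Sample a counter twice, interval_s seconds apart, and return the rate.

    read_counter is a placeholder (an assumption) for whatever returns the
    current cumulative count.
    """
    t0 = time.monotonic()
    c0 = read_counter()
    time.sleep(interval_s)
    t1 = time.monotonic()
    c1 = read_counter()
    return ops_per_second(c0, c1, t0, t1)
```

For example, a counter that goes from 1000 to 1600 over 60 seconds works out
to 10 reads/second.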

Also, while my load average is 20.31, 19.10, 19.72, what does a normal iostat
look like?  My iostat await time is 13.66 ms, which I think is kind of bad, but
not bad enough to cause a load of 20.31?

Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.02    0.07  11.70   1.96  1353.67  702.88    150.58      0.19  13.66   3.61   4.93
sdb        0.00    0.02   0.11   0.46    20.72   97.54    206.70      0.00   1.33   0.67   0.04
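
As a rough sanity check on those figures, the sketch below recomputes %util
and avgqu-sz from the other columns.  It assumes the usual iostat
relationships (utilization = IOPS x service time, and Little's law for the
average queue size); the function name is made up for illustration.

```python
def iostat_checks(r_s: float, w_s: float, await_ms: float, svctm_ms: float):
    """Recompute %util and avgqu-sz from iostat's per-device columns."""
    iops = r_s + w_s                       # requests completed per second
    util_pct = iops * svctm_ms / 10.0      # iops * (svctm / 1000 s) * 100 %
    avg_queue = iops * await_ms / 1000.0   # Little's law: rate * time in system
    return round(util_pct, 2), round(avg_queue, 2)


sda = iostat_checks(r_s=11.70, w_s=1.96, await_ms=13.66, svctm_ms=3.61)
sdb = iostat_checks(r_s=0.11, w_s=0.46, await_ms=1.33, svctm_ms=0.67)
print("sda %util, avgqu-sz:", sda)
print("sdb %util, avgqu-sz:", sdb)
```

For sda this comes out at 4.93 and 0.19, matching the %util and avgqu-sz
columns above, so the disk looks busy only about 5% of the time.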

Thanks,
Dean
