Re: Load balancing issue with virtual nodes

2014-04-29 Thread DuyHai Doan
Thank you Ben for the links




Re: Load balancing issue with virtual nodes

2014-04-28 Thread DuyHai Doan
Hello all

 An update on the issue.

 After completely wiping the sstable/commitlog/saved_caches folders and
restarting the cluster from scratch, we still see odd figures. After the
restart, nodetool status does not show an exact 50% balance of data
for each node :


Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address  Load Tokens Owns (effective) Host ID Rack
UN host1 48.57 KB 256 51.6%  d00de0d1-836f-4658-af64-3a12c00f47d6 rack1
UN host2 48.57 KB 256 48.4%  e9d2505b-7ba7-414c-8b17-af3bbe79ed9c rack1


As you can see, the % is very close to 50% but not exactly 50%

 What can explain that? Could it be a network connection issue during the
initial token shuffle phase?

P.S: both host1 and host2 are supposed to have exactly the same hardware

Regards

 Duy Hai DOAN


Re: Load balancing issue with virtual nodes

2014-04-28 Thread Ben Bromhead
Some imbalance is expected and considered normal:

See http://wiki.apache.org/cassandra/VirtualNodes/Balance

As well as

https://issues.apache.org/jira/browse/CASSANDRA-7032

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359
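
The imbalance Ben describes as expected can be illustrated with a quick simulation: give each of two nodes 256 random tokens (the default vnode count) and measure the effective ownership of the token ring. This is a rough sketch of the idea, not Cassandra's actual token-allocation code:

```python
import random

def simulate_ownership(num_nodes=2, vnodes=256, seed=1):
    """Assign random tokens to each node and compute the fraction of the
    token ring each node owns -- a sketch of how vnode ownership arises,
    not Cassandra's implementation."""
    ring_min, ring_max = -2**63, 2**63 - 1   # Murmur3 token range
    span = ring_max - ring_min
    rnd = random.Random(seed)
    tokens = []
    for node in range(num_nodes):
        for _ in range(vnodes):
            tokens.append((rnd.randint(ring_min, ring_max), node))
    tokens.sort()
    owned = [0] * num_nodes
    for i, (tok, node) in enumerate(tokens):
        # Each token owns the range since the previous token (wrapping around).
        prev = tokens[i - 1][0] if i > 0 else tokens[-1][0] - span
        owned[node] += tok - prev
    total = sum(owned)
    return [o / total for o in owned]

# Close to, but not exactly, a 50/50 split -- just as nodetool status shows.
print(simulate_ownership())
```

With purely random token placement the per-node ownership has a standard deviation of roughly 1/sqrt(vnodes) of the mean, so a 51.6/48.4 split on 256 vnodes is well within the expected range.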




Load balancing issue with virtual nodes

2014-04-24 Thread DuyHai Doan
Hello all

 I'm facing a rather weird issue with virtual nodes.

 My customer has a cluster with 2 nodes only. I've set virtual nodes so
future addition of new nodes will be easy.

 Now, after some benchmark tests with massive data inserts, I can see with
htop that one node's CPU usage is up to 60% while the other's is only
around 10%

 The massive insertion workload consists of random data with a random
string (20 chars) as partition key, so in theory the Murmur3 partitioner
should dispatch them evenly between both nodes...

 Is there any existing JIRA related to load balancing issue with vnodes ?

 Regards

 Duy Hai DOAN
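
The even dispatch expected above is easy to sanity-check. The sketch below uses MD5 as a stand-in for Murmur3 (an assumption: any uniform hash spreads random keys similarly for this purpose) and splits the signed 64-bit token space into two equal ranges:

```python
import hashlib
import random
import string

def token(key: str) -> int:
    """Map a partition key to a signed 64-bit token. MD5 stands in here for
    Cassandra's Murmur3 partitioner; both spread keys uniformly."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return h - 2**63  # shift into the signed 64-bit range

rnd = random.Random(42)
counts = [0, 0]
for _ in range(100_000):
    key = "".join(rnd.choices(string.ascii_letters + string.digits, k=20))
    # Two equal token ranges: negative tokens to node1, the rest to node2.
    counts[0 if token(key) < 0 else 1] += 1

print(counts)  # both counts land close to 50,000
```

So the data placement itself should be even; an uneven CPU load therefore points at something other than the partitioner.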


Re: Load balancing issue with virtual nodes

2014-04-24 Thread Michael Shuler

On 04/24/2014 09:14 AM, DuyHai Doan wrote:

  My customer has a cluster with 2 nodes only. I've set virtual nodes so
future addition of new nodes will be easy.


with RF=?


  Now, after some benching tests with massive data insert, I can see
with htop that one node has its CPU occupation up to 60% and the other
only around 10%


This would make sense if the keyspace is RF=1 and the client is 
connecting to one node. This also makes sense if the KS is RF=2 and the 
client is connecting to one node.



  The massive insertion workload consists of random data and random
string (20 chars) as partition key so in theory, the Murmur3 partitioner
should dispatch them evenly between both nodes...

  Is there any existing JIRA related to load balancing issue with vnodes ?


vnode != node

Clients connect to nodes - real servers with a network connection. 
Vnodes are internal to how C* distributes data in the cluster and have 
nothing to do with network connections of the nodes. Load balancing 
client connections is at the node/network level and how the 
client/driver is configured to distribute network connections - has 
nothing to do with vnodes.


--
Michael
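
Michael's point that vnodes govern data placement, not client connections, can be sketched with a toy token ring: each node contributes many virtual-node tokens, and a key's replica is fixed by the ring regardless of which node the client happens to connect to. This is illustrative only, not Cassandra's internals:

```python
import bisect
import random

class Ring:
    """A toy token ring: each node contributes `vnodes` tokens, and a key's
    replica is the owner of the first token at or after the key's token.
    Illustrative sketch -- not Cassandra's implementation."""
    def __init__(self, nodes, vnodes=256, seed=7):
        rnd = random.Random(seed)
        entries = [(rnd.randint(-2**63, 2**63 - 1), n)
                   for n in nodes for _ in range(vnodes)]
        entries.sort()
        self.tokens = [t for t, _ in entries]
        self.owners = [n for _, n in entries]

    def replica(self, key_token: int) -> str:
        i = bisect.bisect_left(self.tokens, key_token)
        return self.owners[i % len(self.owners)]  # wrap around the ring

ring = Ring(["node1", "node2"])
# Whichever node a client *connects* to, the replica for a given token is fixed:
print(ring.replica(-12345), ring.replica(987654321))
```

Connecting every request to node2 does not move any data onto node2; node2 simply coordinates and forwards writes whose tokens node1 owns.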



Re: Load balancing issue with virtual nodes

2014-04-24 Thread DuyHai Doan
Hello Michael

 RF = 1

 Client used = Hector 1.1-4
 Default Load Balancing connection policy
 Both nodes' addresses are provided to Hector, so according to its connection
policy the client should alternate between both nodes

 Regards

 Duy Hai DOAN
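
The alternating behaviour described above can be sketched with a minimal round-robin client (a hypothetical stand-in, not Hector's actual API):

```python
from itertools import cycle

class RoundRobinClient:
    """A toy client that alternates requests across the configured hosts --
    a sketch of a round-robin balancing policy, not Hector's API."""
    def __init__(self, hosts):
        self._hosts = cycle(hosts)

    def next_host(self) -> str:
        return next(self._hosts)

client = RoundRobinClient(["node1", "node2"])
picks = [client.next_host() for _ in range(4)]
print(picks)  # ['node1', 'node2', 'node1', 'node2']
```

Note that even perfect client-side alternation only balances which node coordinates each request; with RF=1 the node that owns the token still does the actual write.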



Re: Load balancing issue with virtual nodes

2014-04-24 Thread Michael Shuler

On 04/24/2014 10:29 AM, DuyHai Doan wrote:

  Client used = Hector 1.1-4
  Default Load Balancing connection policy
  Both nodes' addresses are provided to Hector so according to its
connection policy, the client should alternate between both nodes


OK, so is only one connection being established to one node for one bulk 
write operation? Or are multiple connections being made to both nodes 
and writes performed on both?


--
Michael


Re: Load balancing issue with virtual nodes

2014-04-24 Thread Batranut Bogdan
Htop is not the only tool for this. Cassandra will hit I/O bottlenecks before 
CPU (on faster CPUs). A simple check is the size of the data dir on the boxes: 
if they are approximately the same size, then Cassandra is writing across the 
whole cluster. Check how the data dir size changes while importing, and use 
iostat to see how loaded the HDDs are. In my experience I rarely look at CPU 
load; since I have HDDs, I/O is the thing I keep an eye on. Also try sequential 
strings when inserting, though this should behave approximately the same as for 
random ones.

Sent from Yahoo Mail for iPhone
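
Comparing data directory sizes across the boxes can be done with a short script. The path below is an assumption (the common default); adjust it to your data_file_directories setting:

```python
import os

def dir_size(path: str) -> int:
    """Total size in bytes of all files under path (a stand-in for `du -s`)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished mid-walk (e.g. compaction) -- skip it
    return total

# Hypothetical default location; adjust to your data_file_directories setting.
data_dir = "/var/lib/cassandra/data"
if os.path.isdir(data_dir):
    print(data_dir, dir_size(data_dir), "bytes")
```

Run it on each node and compare: roughly equal totals mean writes are landing across the whole cluster even if CPU load looks skewed.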

Re: Load balancing issue with virtual nodes

2014-04-24 Thread DuyHai Doan
I did some experiments.

 Let's say we have node1 and node2

First, I configured Hector with node1 & node2 as hosts and I saw that only
node1 has high CPU load

To eliminate the client connection issue, I re-test with only node2
provided as host for Hector. Same pattern. CPU load is above 50% on node1
and below 10% on node2.

It means that node2 is acting as coordinator and forwarding many write/read
requests to node1

 Why did I look at CPU load and not iostat et al.?

 Because I have a very write-intensive workload with a read-only-once
pattern. I've read here (
http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning)
that heavy writes in C* are more CPU-bound, but that info may be outdated
and no longer true

 Regards

 Duy Hai DOAN





Re: Load balancing issue with virtual nodes

2014-04-24 Thread Batranut Bogdan
I don't know about Hector, but the DataStax Java driver needs just one IP from 
the cluster and it will discover the rest of the nodes. Then by default it will 
do a round robin when sending requests. So if Hector does the same, the pattern 
will again appear.
Did you look at the size of the dirs?
That documentation is for C* 0.8. It's old. But depending on your boxes you 
might reach a CPU bottleneck. You might want to google for the write path in 
Cassandra. According to that, there is not much to do when writes come in...