What client Cassandra driver are you using? Java?
Java driver 2.1.8
Is there only a single thread in each client or are there multiple threads?
Multiple threads, running in parallel.
What does your connection code look like?
It’s a very large class based on config files, but I believe you’re interested
in this line
cluster.withLoadBalancingPolicy(
    new DCAwareRoundRobinPolicy(
        config.getString(ConfigurationKeys.CassandraDataCenterName),
        config.getInt(ConfigurationKeys.CassandraFailoverDataCenterNodesToLookAt),
        true))
with each of our applications having a different (local) datacenter name.
Steve
From: Jack Krupansky [mailto:[email protected]]
Sent: 04 December 2015 16:46
To: [email protected]
Subject: Re: cassandra reads are unbalanced
Thanks for the elaboration. A few more questions...
Is there only a single thread in each client or are there multiple threads
doing reading in parallel? IOW, does a read need to complete before the next
read is issued.
What client Cassandra driver are you using? Java?
What does your connection code look like, say compared to the example in the
doc:
http://docs.datastax.com/en/developer/java-driver/2.0/java-driver/quick_start/qsSimpleClientCreate_t.html
Just to make sure it really is connecting only to the local cluster and using
round robin and whether it is token aware.
-- Jack Krupansky
On Fri, Dec 4, 2015 at 10:51 AM, Walsh, Stephen
<[email protected]<mailto:[email protected]>> wrote:
Thanks for your input, but I think I’ve already answered most of your questions.
How many clients do you have performing reads?
------------------
On Wed, Dec 2, 2015 at 6:44 PM, Walsh, Stephen
<[email protected]<mailto:[email protected]>> wrote
….
There are 2 applications (1 for each DC) that read and write at the same rate to
their local DC
….
--------------------
Is your load balancer in front of your clients or between your clients and
Cassandra?
------------------
On Thu, Dec 3, 2015 at 4:58 AM, Walsh, Stephen
<[email protected]<mailto:[email protected]>> wrote:
…
our production applications are behind a round robin load balancer
…
------------------
No load balancers talk to Cassandra. I’m only mentioning this to show that the
writes / reads are evenly distributed over the 2 DCs
Does Node1 of DC2 have the exact same configuration of hardware of the other
nodes
Yes
Is it in the same rack
It’s in AWS, but we have the nodes configured via the GossipingPropertyFileSnitch
so that they are all on unique racks
Maybe your load balancer thinks that node is more capable and handles requests
faster so that it looks less loaded than the other two nodes
Unlikely, it’s all TCP SSL pass-through connections. It doesn’t balance on load,
it just round-robins each request
You might also check the read counts after a very short interval of time to see
if Node1 is uniformly getting more requests or just occasionally
------------------
On Wed, Dec 2, 2015 at 3:36 PM, Walsh, Stephen
<[email protected]<mailto:[email protected]>> wrote …
We monitor the number of reads / writes of every table via the cassandra JMX
metrics. (cassandra.db.read_count)
…
------------------
We can only monitor in a 1-hour moving window
Maybe the other two nodes are in a different rack that occasionally has net
connectivity issues
Unlikely, since it’s AWS
From: Jack Krupansky
[mailto:[email protected]<mailto:[email protected]>]
Sent: 03 December 2015 16:11
To: [email protected]<mailto:[email protected]>
Subject: Re: cassandra reads are unbalanced
How many clients do you have performing reads?
Is your load balancer in front of your clients or between your clients and
Cassandra?
Does Node1 of DC2 have the exact same configuration of hardware of the other
nodes? Is it in the same rack? Maybe your load balancer thinks that node is
more capable and handles requests faster so that it looks less loaded than the
other two nodes.
You might also check the read counts after a very short interval of time to see
if Node1 is uniformly getting more requests or just occasionally. Maybe the
other two nodes are in a different rack that occasionally has net connectivity
issues so that the requests get diverted by the client/load balancer to Node1
during those times.
-- Jack Krupansky
On Thu, Dec 3, 2015 at 4:58 AM, Walsh, Stephen
<[email protected]<mailto:[email protected]>> wrote:
Thanks, but keep in mind that both DCs should be getting the same load. Our
production applications are behind a round robin load balancer, so each of
our local applications talks to its local Cassandra datacenter.
It took about 4 hours but the nodetool cleanup eventually balanced all nodes
From: DuyHai Doan [mailto:[email protected]<mailto:[email protected]>]
Sent: 02 December 2015 16:27
To: [email protected]<mailto:[email protected]>
Subject: Re: cassandra reads are unbalanced
If you're using the Java driver with LOCAL_ONE and the default load balancing
strategy (TokenAware wrapped on DCAwareRoundRobin), the driver will always
select the primary replica. To change this behavior and introduce some
randomness so that non primary replicas get a chance to serve a read:
new TokenAwarePolicy(new DCAwareRoundRobinPolicy("local_DC"), true).
The second parameter (true) asks the TokenAware policy to "shuffle" the replicas
on each request to avoid always returning the primary replica first.
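
For anyone who wants to see the effect of that flag in isolation, here is a minimal, self-contained sketch. It is a toy model in plain Java, not the driver's implementation: the class name ReplicaShuffleSketch, the node names, the seed and the request count are all made up for illustration, and it only models which replica gets tried first for a single partition key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Toy model only: NOT the DataStax driver's code. All names here are hypothetical.
class ReplicaShuffleSketch {

    // Counts how often each replica is the first one tried over `requests` reads
    // of a single partition key. Index 0 of `replicas` plays the primary replica.
    static int[] firstChoiceCounts(List<String> replicas, boolean shuffle,
                                   int requests, long seed) {
        Random rng = new Random(seed);
        int[] counts = new int[replicas.size()];
        for (int i = 0; i < requests; i++) {
            // Copy the replica list so each request gets its own ordering.
            List<String> plan = new ArrayList<>(replicas);
            if (shuffle) {
                Collections.shuffle(plan, rng);
            }
            counts[replicas.indexOf(plan.get(0))]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> replicas = Arrays.asList("node1", "node2", "node3");
        System.out.println("no shuffle: "
                + Arrays.toString(firstChoiceCounts(replicas, false, 3000, 42L)));
        System.out.println("shuffle:    "
                + Arrays.toString(firstChoiceCounts(replicas, true, 3000, 42L)));
    }
}
```

Without the shuffle every request for that key goes to the first replica in the list; with it, the first choice is spread roughly evenly over the three replicas.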
On Wed, Dec 2, 2015 at 6:44 PM, Walsh, Stephen
<[email protected]<mailto:[email protected]>> wrote:
Very good questions.
We have reads and writes at LOCAL_ONE.
There are 2 applications (1 for each DC) that read and write at the same rate to
their local DC
(All reads / writes started all perfectly even and degraded over time)
We use DCAwareRoundRobin policy
An update on the nodetool cleanup: it has helped but hasn’t balanced all nodes.
Node 1 on DC2 is still quite high
Node 1 (DC1) = 1.35k (seeder)
Node 2 (DC1) = 1.54k
Node 3 (DC1) = 1.45k
Node 1 (DC2) = 2.06k (seeder)
Node 2 (DC2) = 1.38k
Node 3 (DC2) = 1.43k
From: DuyHai Doan [mailto:[email protected]<mailto:[email protected]>]
Sent: 02 December 2015 14:22
To: [email protected]<mailto:[email protected]>
Subject: Re: cassandra reads are unbalanced
Which Consistency level do you use for reads ? ONE ? Are you reading from only
DC1 or from both DC ?
What is the LoadBalancingStrategy you have configured for your driver ?
TokenAware wrapped on DCAwareRoundRobin ?
On Wed, Dec 2, 2015 at 3:36 PM, Walsh, Stephen
<[email protected]<mailto:[email protected]>> wrote:
Hey all,
Thanks for taking the time to help.
So we have 6 cassandra nodes in 2 Data Centers.
Both Data Centers have a replication factor of 3, so all nodes have all the data.
Over the last 2 days we’ve noticed that data reads / writes have shifted from
balanced to unbalanced
(Nodetool status still shows 100% ownership on every node, with similar sizes)
For Example
We monitor the number of reads / writes of every table via the cassandra JMX
metrics. (cassandra.db.read_count)
Over the last hour of this run
Reads
Node 1 (DC1) = 1.79k (seeder)
Node 2 (DC1) = 1.92k
Node 3 (DC1) = 1.97k
Node 1 (DC2) = 2.90k (seeder)
Node 2 (DC2) = 1.76k
Node 3 (DC2) = 1.19k
As you can see, on DC1 everything is pretty well balanced, but on DC2 the reads
favour Node 1 over Node 3.
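
To put a number on that imbalance, the figures above can be turned into per-node shares within each DC. The sketch below is plain arithmetic in Java on the read counts quoted in this mail (in thousands); the class and method names are made up for illustration and nothing here talks to Cassandra or JMX.

```java
// Illustrative arithmetic only; ReadShareSketch and its methods are hypothetical names.
class ReadShareSketch {

    // Each node's share of the DC's total reads, in percent.
    static double[] sharePercent(double[] reads) {
        double total = 0.0;
        for (double r : reads) {
            total += r;
        }
        double[] shares = new double[reads.length];
        for (int i = 0; i < reads.length; i++) {
            shares[i] = 100.0 * reads[i] / total;
        }
        return shares;
    }

    // Ratio of the busiest node's reads to the least busy node's reads.
    static double maxToMin(double[] reads) {
        double max = reads[0];
        double min = reads[0];
        for (double r : reads) {
            max = Math.max(max, r);
            min = Math.min(min, r);
        }
        return max / min;
    }

    public static void main(String[] args) {
        double[] dc1 = {1.79, 1.92, 1.97}; // reads in thousands, Node 1..3 of DC1
        double[] dc2 = {2.90, 1.76, 1.19}; // reads in thousands, Node 1..3 of DC2
        for (double[] dc : new double[][] {dc1, dc2}) {
            double[] shares = sharePercent(dc);
            System.out.printf("shares: %.1f%% %.1f%% %.1f%%  max/min: %.2f%n",
                    shares[0], shares[1], shares[2], maxToMin(dc));
        }
    }
}
```

On these numbers Node 1 of DC2 serves close to half of that DC's reads, where an even split would be about a third each.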
I ran a nodetool repair yesterday – ran for 6 hours and when completed didn’t
change the read balance.
Write levels are similar on DC2, but not as bad as reads.
Does anyone have any suggestions on how to rebalance? I’m thinking maybe running a
nodetool cleanup in case some of the keys have shifted?
Regards
Stephen Walsh
This email (including any attachments) is proprietary to Aspect Software, Inc.
and may contain information that is confidential. If you have received this
message in error, please do not read, copy or forward this message. Please
notify the sender immediately, delete it from your system and destroy any
copies. You may not further disclose or distribute this email or its
attachments.