Re: Cassandra Clients: Stale Cluster Topology

Jeff Jirsa Fri, 17 Oct 2025 17:32:20 -0700

Also: nodetool getendpoints just hashes the key you provide against the cluster 
topology / schema definition, which tells you which nodes WOULD own the data if 
it exists. It does NOT guarantee that it exists.
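
To illustrate the point above, here is a minimal sketch of that kind of lookup. It is not Cassandra's implementation: MD5 stands in for the Murmur3 partitioner, each node gets a single token, racks and datacenters are ignored, and the IPs, the `get_endpoints` helper and the `stored` dict are hypothetical. The owners are computed purely from the ring, so a key that was never written still gets three endpoints.

```python
import hashlib
from bisect import bisect_left

def token(key: str) -> int:
    # Stand-in hash for illustration; Cassandra's default partitioner is Murmur3.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

def get_endpoints(ring, key, rf=3):
    """ring: sorted list of (token, node). Walk clockwise from the key's token
    and collect rf distinct nodes (simplified, SimpleStrategy-like placement)."""
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect_left(tokens, token(key)) % len(ring)
    owners = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in owners:
            owners.append(node)
        if len(owners) == rf:
            break
    return owners

# Hypothetical four-node ring, one token per node.
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
ring = sorted((token(ip), ip) for ip in nodes)

stored = {}  # pretend data store: nothing has been written

owners = get_endpoints(ring, "item-that-was-never-written")
print(owners, "row exists:", "item-that-was-never-written" in stored)
```

The lookup never consults `stored`, which is why getting three IPs back says nothing about whether the row is actually on disk on any of them.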




On 2025/10/10 17:48:32 Jeff Jirsa wrote:
> You're using a 9-year-old release. There have been literally hundreds of 
> correctness fixes over those 9 years. You need to upgrade.
> 
> The rest of your answers inline.
> 
> 
> 
> On 2025/10/10 12:56:58 FMH wrote:
> > A few times a week, our developers report that Cassandra reads are
> > coming back with zero rows and no error messages.
> > 
> > Using the same item IDs, a cqlsh SELECT statement returns a single row as
> > expected. Furthermore, nodetool getendpoints returns three IPs, as we
> > expect.
> > 
> > This confirms these ItemIDs do exist in Cassandra; it is just that the Java
> > clients are not retrieving them.
> > 
> > We noticed that this issue appears more often when nodes are replaced in the
> > cluster as a result of EC2 node deprecation.
> 
> Are you using EBS or ephemeral disk? Don't use ephemeral disk unless you are 
> much better at running cassandra and know how to replace a node without data 
> loss (which you do not seem to know how to do).
> 
> 
> > 
> > Once the developers restarted the Java client apps, they were able to
> > retrieve these ItemIDs.
> 
> That sounds weird. It may be that the rows were read-repaired or fixed by a 
> normal repair in the meantime, or it may be that the Java apps were pointing 
> to the wrong thing/cluster. 
> 
> > 
> > 1- Is this what is called the 'empty read' behavior?
> > 2- Is this caused by the clients' topology metadata getting out of sync with
> > the cluster?
> 
> Could be the cluster scaling unsafely due to EC2 events. 
> Could be a low consistency level.
> Could be any number of the hundreds of topology bugs fixed since 2016.
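
On the consistency-level possibility: a read is only guaranteed to touch at least one replica that acknowledged the latest write when the number of replicas written plus the number of replicas read exceeds the replication factor. A minimal sketch of that arithmetic, assuming RF=3 (suggested by the three endpoints reported in the thread); the function names are made up for illustration:

```python
def quorum(rf: int) -> int:
    # QUORUM is a majority of the replicas.
    return rf // 2 + 1

def read_overlaps_write(rf: int, write_acks: int, read_replicas: int) -> bool:
    # True when every read set must intersect every acknowledged write set.
    return write_acks + read_replicas > rf

rf = 3
print(read_overlaps_write(rf, 1, 1))                    # ONE / ONE
print(read_overlaps_write(rf, quorum(rf), quorum(rf)))  # QUORUM / QUORUM
```

With ONE/ONE the read may land on a replica that never received the write (for example, a freshly replaced node), which returns zero rows without any error. QUORUM/QUORUM closes that gap as long as replicas are not replaced in a way that loses data.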
> 
> If it's a client bug, I assume it's an old client bug I've never seen before. 
> Well-functioning Cassandra clients shouldn't care about the topology; the 
> coordinating server will forward the request anyway. 
> 
> > 3- How can this be detected? Should we have client drivers return 'metadata
> > = cluster.metadata' and compare it to 'nodetool gossipinfo'?
> 
> Upgrade your cluster.
> Use EBS so when nodes change, they don't change data ownership. 
> 
> > 4- Other than restarting the clients, is there a way to force client apps to
> > refresh their ring metadata?
> > 
> > The client apps are using 'com.datastax.oss:java-driver-core:4.13.0'
> > driver.
> > 
> > Google returns little information about this, and GenAI chat models, even
> > though useful, tend to hallucinate with confidence often.
> > 
> > Thanks
> > 
> > ----------------------------------------
> > Thank you
> > 
>
