rustyrazorblade opened a new pull request, #216: URL: https://github.com/apache/cassandra-analytics/pull/216
During S3 bulk writes, `CassandraTopologyMonitor` polls the topology every 5 seconds and cancels the job if it differs from the topology captured at job start. The comparison bottoms out in `RingInstance.equals`, which included the node's IP address.

A node that goes down and rejoins with a different IP is the same logical instance with the same token ownership. This is routine in Kubernetes, where a rescheduled pod keeps its hostname, host ID, and data but gets a new IP. The write remains correct, but the monitor reported "Topology changed during bulk write" and failed the job. On clusters with hundreds of nodes across several DCs, this makes long-running jobs fail routinely.

The same equality is also used by `RecordWriter.validateTaskTokenRangeMappings` on every executor task (both transports) and by the per-instance consistency accounting in `ReplicaAwareFailureHandler` and `ImportCompletionCoordinator`, which counted the old-IP and new-IP instance as two distinct replicas.

This patch removes the IP address from `RingInstance.equals`/`hashCode`. Instance identity is now clusterId, token, fqdn, rack, port, and datacenter, which is sufficient to distinguish nodes, since two live nodes cannot share fqdn + port + datacenter. Sidecar resolves fqdn via reverse DNS and falls back to the IP string when resolution fails, so deployments without DNS see no behavior change. Real topology changes (nodes added, removed, joining, leaving) are still detected through instance membership and pending-state comparison.

The IP address was originally added to the equality in CASSANDRA-18852, the same change that introduced building `RingInstance` from `ReplicaMetadata`, which carries no token, leaving the IP as a stand-in discriminator.

JIRA: https://issues.apache.org/jira/browse/CASSANALYTICS-175
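
For readers who want the identity rule in concrete form, here is a minimal sketch of the equality described above. It is not the code from the patch: `RingIdentitySketch` is a hypothetical stand-in for `RingInstance`, the field types and constructor are assumptions, and the sample hostnames, tokens, ports, and IPs are made up for illustration. The only point it shows is that `equals`/`hashCode` cover clusterId, token, fqdn, rack, port, and datacenter while leaving the IP address out, so a set-based comparison of topology snapshots ignores an IP change but still notices a new member.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for RingInstance; names and types are assumptions,
// not the real class from cassandra-analytics.
final class RingIdentitySketch {
    private final String clusterId;
    private final String token;
    private final String fqdn;
    private final String rack;
    private final int port;
    private final String datacenter;
    private final String ipAddress; // kept for connectivity, excluded from identity

    RingIdentitySketch(String clusterId, String token, String fqdn, String rack,
                       int port, String datacenter, String ipAddress) {
        this.clusterId = clusterId;
        this.token = token;
        this.fqdn = fqdn;
        this.rack = rack;
        this.port = port;
        this.datacenter = datacenter;
        this.ipAddress = ipAddress;
    }

    String ipAddress() {
        return ipAddress;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof RingIdentitySketch)) {
            return false;
        }
        RingIdentitySketch that = (RingIdentitySketch) other;
        // ipAddress is deliberately not compared
        return port == that.port
               && Objects.equals(clusterId, that.clusterId)
               && Objects.equals(token, that.token)
               && Objects.equals(fqdn, that.fqdn)
               && Objects.equals(rack, that.rack)
               && Objects.equals(datacenter, that.datacenter);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals, so ipAddress is left out here too
        return Objects.hash(clusterId, token, fqdn, rack, port, datacenter);
    }

    public static void main(String[] args) {
        RingIdentitySketch before = new RingIdentitySketch(
            "cluster1", "-100", "cassandra-0.example.internal", "rack1", 9042, "dc1", "10.0.0.5");
        // Same logical node after a pod reschedule: only the IP differs
        RingIdentitySketch after = new RingIdentitySketch(
            "cluster1", "-100", "cassandra-0.example.internal", "rack1", 9042, "dc1", "10.0.3.17");
        // A genuinely new node
        RingIdentitySketch added = new RingIdentitySketch(
            "cluster1", "200", "cassandra-1.example.internal", "rack1", 9042, "dc1", "10.0.0.6");

        Set<RingIdentitySketch> initial = new HashSet<>();
        initial.add(before);
        Set<RingIdentitySketch> current = new HashSet<>();
        current.add(after);

        System.out.println(initial.equals(current)); // true: IP change alone is not a topology change
        current.add(added);
        System.out.println(initial.equals(current)); // false: membership changed
    }
}
```

Keeping `hashCode` aligned with `equals` matters here because the instances are used as keys in hash-based collections; if the IP stayed in `hashCode` but not in `equals`, the old-IP and new-IP instance could still land in different buckets and be counted twice.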
