[ 
https://issues.apache.org/jira/browse/CASSANALYTICS-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18088191#comment-18088191
 ] 

Yifan Cai commented on CASSANALYTICS-175:
-----------------------------------------

+1

> Bulk write jobs fail when a node returns with a different IP address
> --------------------------------------------------------------------
>
>                 Key: CASSANALYTICS-175
>                 URL: https://issues.apache.org/jira/browse/CASSANALYTICS-175
>             Project: Apache Cassandra Analytics
>          Issue Type: Bug
>          Components: Writer
>            Reporter: Jon Haddad
>            Assignee: Jon Haddad
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> During S3 bulk writes, CassandraTopologyMonitor polls the cluster topology 
> every 5 seconds and cancels the job if the current topology is not equal to 
> the topology captured at job start. The comparison is 
> TokenRangeMapping.equals, which compares instance sets using 
> RingInstance.equals — and RingInstance equality includes the node's IP 
> address.
> A node that goes down and rejoins with a different IP address (routine in 
> Kubernetes, where a rescheduled pod keeps its hostname and host ID but gets a 
> new IP) is the same logical instance, with the same token ownership. The 
> write remains correct and safe to continue. But because the IP participates 
> in equality, the monitor reports "Topology changed during bulk write" and 
> fails the job. On clusters with hundreds of nodes across multiple DCs, the 
> probability of at least one pod replacement during a long-running job makes 
> this a frequent, spurious failure mode.
> The monitor is not the only affected path:
>  - {{RecordWriter.validateTaskTokenRangeMappings}} performs the same 
> instance-set comparison on every executor task (both the direct and S3 
> transports), so an IP change mid-job also fails task-level validation.
>  - {{ReplicaAwareFailureHandler}} and {{ImportCompletionCoordinator}} key 
> per-instance state by {{RingInstance}}; an instance observed under an old 
> IP and a new IP is counted as two distinct replicas, skewing 
> consistency-level accounting.
> History: {{RingInstance.equals}} originally compared token, fqdn, port, and 
> datacenter. CASSANDRA-18852 added the IP address in the same change that 
> introduced building {{RingInstance}} from {{ReplicaMetadata}}, which 
> carries no token, leaving the IP as a stand-in discriminator.
> Fix: remove the IP address from RingInstance.equals/hashCode. Instance 
> identity becomes clusterId, token, fqdn, rack, port, and datacenter. The 
> remaining fields are sufficient to distinguish nodes: two live nodes cannot 
> share fqdn + port + datacenter. Real topology changes (nodes added, removed, 
> joining, leaving, or moving) are still detected through instance membership 
> and pending-state comparison, exactly as before.
> One thing to be aware of: Sidecar resolves fqdn via reverse DNS and falls 
> back to the IP string when resolution fails. In a DNS-less environment the 
> fqdn therefore still changes with the IP, so those deployments see no 
> behavior change; the fix only helps where reverse DNS gives stable names, 
> which K8s does.
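
The identity change described in the quoted issue can be illustrated with a minimal Java sketch. The `Instance` class below is a hypothetical stand-in, not the actual `RingInstance` from Cassandra Analytics; the host names, tokens, port, and IP addresses are made-up values. It assumes identity is exactly the six fields named in the description (clusterId, token, fqdn, rack, port, datacenter), with the IP address carried along but excluded from equals/hashCode.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical illustration only; this is NOT the real RingInstance class.
public class InstanceIdentityDemo {

    public static final class Instance {
        final String clusterId;
        final String token;
        final String fqdn;
        final String rack;
        final int port;
        final String datacenter;
        final String ipAddress; // kept for connectivity, deliberately not part of identity

        public Instance(String clusterId, String token, String fqdn, String rack,
                        int port, String datacenter, String ipAddress) {
            this.clusterId = clusterId;
            this.token = token;
            this.fqdn = fqdn;
            this.rack = rack;
            this.port = port;
            this.datacenter = datacenter;
            this.ipAddress = ipAddress;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof Instance)) {
                return false;
            }
            Instance that = (Instance) other;
            // ipAddress is intentionally absent from the comparison
            return port == that.port
                && Objects.equals(clusterId, that.clusterId)
                && Objects.equals(token, that.token)
                && Objects.equals(fqdn, that.fqdn)
                && Objects.equals(rack, that.rack)
                && Objects.equals(datacenter, that.datacenter);
        }

        @Override
        public int hashCode() {
            // Must use the same fields as equals so hash-based sets and maps agree
            return Objects.hash(clusterId, token, fqdn, rack, port, datacenter);
        }
    }

    // Helper with made-up cluster, rack, port, and datacenter values
    public static Instance node(String fqdn, String token, String ip) {
        return new Instance("cluster1", token, fqdn, "rack1", 9043, "dc1", ip);
    }

    public static void main(String[] args) {
        Set<Instance> atJobStart = new HashSet<>(Arrays.asList(
            node("cassandra-0.example", "-100", "10.0.0.5"),
            node("cassandra-1.example", "100", "10.0.0.6")));

        // cassandra-0 was rescheduled: same fqdn and token, new IP
        Set<Instance> afterReschedule = new HashSet<>(Arrays.asList(
            node("cassandra-0.example", "-100", "10.0.0.42"),
            node("cassandra-1.example", "100", "10.0.0.6")));

        // A genuine membership change: a third node joins
        Set<Instance> afterScaleUp = new HashSet<>(afterReschedule);
        afterScaleUp.add(node("cassandra-2.example", "200", "10.0.0.7"));

        System.out.println("IP change only, sets equal: " + atJobStart.equals(afterReschedule));
        System.out.println("Node added, sets equal: " + atJobStart.equals(afterScaleUp));

        // Per-instance state keyed by Instance: old and new IP map to one entry
        Map<Instance, Integer> acks = new HashMap<>();
        acks.merge(node("cassandra-0.example", "-100", "10.0.0.5"), 1, Integer::sum);
        acks.merge(node("cassandra-0.example", "-100", "10.0.0.42"), 1, Integer::sum);
        System.out.println("Distinct replicas tracked: " + acks.size());
    }
}
```

In this sketch a pure IP change leaves the instance sets equal, while an added node still makes them differ, and the per-instance map holds a single entry for the rescheduled node. If the fqdn were itself the IP string (the DNS-less fallback mentioned in the description), the two instances would still compare as different.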



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

