[
https://issues.apache.org/jira/browse/CASSANDRA-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tania S Engel updated CASSANDRA-13608:
--------------------------------------
Fix Version/s: (was: 3.10)
> Connection closed/reopened during join causes Cassandra stream to close
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-13608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13608
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Environment: Cassandra 3.10. Windows Server 2016, 32GB ram, 2TB hard
> disk, RAID10 with 4 spindles, 8 Cores
> Reporter: Tania S Engel
> Attachments: Cassandra 3.10 Join with lots GC collection leads to
> socket closure and join hang.mht, Cassandra 3.10 Join with lots GC collection
> leads to socket closure and join hang.pdf, Cassandra 3.10 Join with lots GC
> collection leads to socket closure and join hang.txt
>
>
> We start a JOIN bootstrap. The primary seed node streams to the replica. The
> replica requires some GC cleanup and experiences frequent pauses, including a
> 12-second old gen cleanup following a memtable flush. Both replica and
> primary show _MessagingService IOException: An existing connection was
> forcibly closed by the remote host_. The replica's MessagingService-Outgoing
> reestablishes the connection immediately, but the primary's
> StreamKeepAliveExecutor throws a _java.lang.RuntimeException: Outgoing stream
> handler has been closed_. From that point forward, the replica stays in JOIN
> mode, sending keep-alives to the primary. The primary receives the
> keep-alives but does not send its own, and it repeatedly fails to send a hints
> file to the replica. It seems this limping condition would continue
> indefinitely, but it stops when we stop the replica Cassandra. If we restart
> the replica Cassandra, the JOIN picks up again but fails with
> _java.io.IOException: Corrupt value length 355151036 encountered, as it
> exceeds the maximum of 268435456, which is set via max_value_size_in_mb in
> cassandra.yaml_. We have not increased this value because we do not have
> values that large in our data, so we presume the value is indeed corrupt and
> that moving past it would not be a good idea.
> Please see the attachment for details.
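
For reference, the two numbers in the IOException above can be related with a little arithmetic: 268435456 bytes is exactly 256 MB (256 * 1024 * 1024), while the reported length of 355151036 bytes is roughly 338.7 MB. The sketch below only reproduces that comparison. The class and method names are hypothetical and are not taken from the Cassandra code base, and the treatment of negative lengths as invalid is an assumption of this sketch.

```java
public class ValueSizeCheck {
    // Convert a max_value_size_in_mb setting to bytes (1 MB = 1024 * 1024 bytes).
    static long limitBytes(int maxValueSizeInMb) {
        return maxValueSizeInMb * 1024L * 1024L;
    }

    // True when a decoded value length falls outside [0, limit].
    // Hypothetical helper for illustration; not Cassandra's actual check.
    static boolean exceedsLimit(long valueLength, int maxValueSizeInMb) {
        return valueLength < 0 || valueLength > limitBytes(maxValueSizeInMb);
    }

    public static void main(String[] args) {
        long reportedLength = 355151036L; // length from the error message
        int configuredMb = 256;           // 268435456 bytes / (1024 * 1024)

        System.out.println(limitBytes(configuredMb));                   // 268435456
        System.out.println(exceedsLimit(reportedLength, configuredMb)); // true
    }
}
```

Raising max_value_size_in_mb enough to admit this length would require a setting of at least 339, which is consistent with the reporter's view that the length is a symptom of corruption and not a legitimately large value.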