[
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099346#comment-16099346
]
Stanislav Vishnevskiy commented on CASSANDRA-13687:
---------------------------------------------------
This cluster does not have any materialized views.
For the last few nights the repair has finished successfully, but heap and CPU
usage is still higher than on other nodes, and that seems to be the norm now.
> Abnormal heap growth and CPU usage during repair.
> -------------------------------------------------
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14cpu.png, 3.0.14heap.png, 3.0.14.png,
> 3.0.9heap.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, on 3 of the last 7 nights we have had to wake up due to Cassandra dying
> on us. We currently don't have any data to help reproduce this, but since
> there aren't many commits between the two versions, the cause might be obvious.
> We trigger a parallel incremental repair from a single node every night at
> 1 AM. That node will sometimes start allocating heavily, keeping the heap
> maxed out and triggering GC. Some of these GC pauses can last up to 2 minutes.
> This effectively takes down the whole cluster because requests to this node
> time out.
> The only solution we currently have is to drain the node and restart the
> repair. So far it has worked fine the second time, every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.