[
https://issues.apache.org/jira/browse/CASSANDRA-17787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812677#comment-17812677
]
Andres de la Peña commented on CASSANDRA-17787:
-----------------------------------------------
CASSANDRA-19336 has found the same issue for repairs targeting more replicas
than the replication factor. This can happen when the repair doesn't use
{{--partitioner-range}} or when using virtual nodes. That ticket proposes a
patch limiting the number of simultaneous repair jobs.
Reducing the Merkle tree depths seems risky for the case with multiple nodes
reported here because of over-streaming. However, I think smaller trees might
be used for the vnodes case reported by CASSANDRA-19336 because many small
ranges imply fewer rows per range.
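The intuition above can be sketched with a rough back-of-the-envelope Python
snippet (all numbers and the depth cap are hypothetical, for illustration
only, not Cassandra's actual sizing logic): a Merkle tree only needs enough
leaves to resolve to roughly one partition per leaf, so the useful depth grows
with log2 of the rows in the range, while memory grows with 2^depth. Splitting
the same data across many small vnode ranges therefore lets each per-range
tree be shallower.

```python
import math

def required_depth(rows_in_range: int, max_depth: int = 20) -> int:
    """Depth needed for roughly one partition per leaf, capped at a
    hypothetical max_depth (illustrative, not Cassandra's real limit)."""
    if rows_in_range <= 1:
        return 1
    return min(max_depth, math.ceil(math.log2(rows_in_range)))

def tree_nodes(depth: int) -> int:
    """A full binary Merkle tree of this depth has 2^(depth+1) - 1 nodes,
    so memory roughly doubles with each extra level."""
    return 2 ** (depth + 1) - 1

# One large primary range vs. the same data split across 256 vnode ranges:
rows = 10_000_000
print(required_depth(rows))          # → 20 (hits the cap for one big range)
print(required_depth(rows // 256))   # → 16 (shallower per small vnode range)
```

Under these assumptions a depth-16 tree has ~16x fewer nodes than a depth-20
one, which is the sense in which many small ranges could tolerate smaller
trees without the over-streaming risk that shallow trees pose on large ranges.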
> Full repair on a keyspace with a large amount of tables causes OOM
> ------------------------------------------------------------------
>
> Key: CASSANDRA-17787
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17787
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Repair
> Reporter: Brandon Williams
> Priority: Normal
> Labels: lhf
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> Running a nodetool repair -pr --full on a keyspace with a few hundred tables
> will cause a direct memory OOM, or lots of heap pressure with
> use_offheap_merkle_trees: false. Adjusting repair_session_space_in_mb does
> not seem to help. From an initial look at a heap dump, it appears the node
> is holding many _remote_ trees in memory.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)