[jira] [Commented] (CASSANDRA-14834) Avoid keeping StreamingTombstoneHistogramBuilder.Spool in memory during the whole compaction

Adam Holmberg (Jira) Mon, 14 Dec 2020 12:53:06 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249297#comment-17249297
 ]


Adam Holmberg commented on CASSANDRA-14834:
-------------------------------------------

I got a chance to look at this and I think the comment above may be 
overreaching. I was hoping that the buffer could be released on finalize, but 
it's not quite that simple. I did find that MetricsCollector#finalizeMetadata 
(which calls build on the histogram) is called several times in the lifecycle, 
which makes the "finalize" semantics a bit confusing. I looked into 
memoizing the result, but found that the metrics can actually be 
[updated|https://github.com/apache/cassandra/blob/4c103447af3c4829e3a1c733bed3952fd059af08/src/java/org/apache/cassandra/io/compress/CompressedSequentialWriter.java#L355]
 between the first and last call. I did not, however, find any instance of the 
histogram being updated after the first call to finalize/build. Therefore, my 
proposal is to drop the buffers as soon as we switch to a new writer (at which 
point the previous writer will receive no further samples), but leave the 
collector in a state where things can still be updated and metrics rebuilt 
repeatedly. The patch does this by adding a simple call chain to explicitly 
"release" the metadata overhead when writing is done.

[patch|https://github.com/aholmberg/cassandra/pull/24]
[ci|https://app.circleci.com/pipelines/github/aholmberg/cassandra?branch=CASSANDRA-14834]
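
To make the proposal above concrete, here is a minimal, self-contained sketch of the idea. It is not the code from the patch: the class and method names (ReleaseSketch, HistogramBuilder, Collector, Snapshot, releaseBuffers, release) are hypothetical stand-ins, and the spool is reduced to a plain int array. It only illustrates the intended behaviour: once the buffers are released, the histogram accepts no more samples, but finalizeMetadata can still be called repeatedly and other metrics can still change between calls.

```java
import java.util.Map;
import java.util.TreeMap;

public class ReleaseSketch {

    /** Hypothetical histogram builder: a staging buffer (spool) in front of compact bins. */
    public static final class HistogramBuilder {
        private int[] spool;      // large staging buffer; null once released
        private int spooled = 0;  // number of samples currently held in the spool
        private final TreeMap<Integer, Long> bins = new TreeMap<>();

        public HistogramBuilder(int spoolCapacity) {
            spool = new int[spoolCapacity];
        }

        public void update(int point) {
            if (spool == null)
                throw new IllegalStateException("buffers already released");
            if (spooled == spool.length)
                flush();
            spool[spooled++] = point;
        }

        /** Moves any staged samples into the bins; a no-op once the spool is empty. */
        private void flush() {
            for (int i = 0; i < spooled; i++)
                bins.merge(spool[i], 1L, Long::sum);
            spooled = 0;
        }

        /** Safe to call repeatedly, before or after releaseBuffers(). */
        public Map<Integer, Long> build() {
            flush();
            return new TreeMap<>(bins);
        }

        /** Drains the spool into the bins and drops the large array. */
        public void releaseBuffers() {
            flush();
            spool = null;
        }
    }

    /** Immutable view of the metadata at the time finalizeMetadata() was called. */
    public static final class Snapshot {
        public final Map<Integer, Long> histogram;
        public final double compressionRatio;

        Snapshot(Map<Integer, Long> histogram, double compressionRatio) {
            this.histogram = histogram;
            this.compressionRatio = compressionRatio;
        }
    }

    /** Hypothetical collector: one per writer, finalized more than once. */
    public static final class Collector {
        private final HistogramBuilder tombstones = new HistogramBuilder(1024);
        private double compressionRatio = -1.0; // may still change after the first finalize

        public void updateTombstone(int localDeletionTime) { tombstones.update(localDeletionTime); }

        public void setCompressionRatio(double ratio) { compressionRatio = ratio; }

        public Snapshot finalizeMetadata() {
            return new Snapshot(tombstones.build(), compressionRatio);
        }

        /** Called when the writer is done, e.g. when switching to a new writer. */
        public void release() { tombstones.releaseBuffers(); }
    }

    public static void main(String[] args) {
        Collector collector = new Collector();
        collector.updateTombstone(10);
        collector.updateTombstone(10);
        collector.updateTombstone(20);
        collector.release();                 // buffers dropped, bins kept
        collector.setCompressionRatio(0.5);  // non-histogram metrics can still change
        System.out.println(collector.finalizeMetadata().histogram); // prints {10=2, 20=1}
    }
}
```

In this sketch, release() is a separate step from finalizeMetadata() because finalize may run more than once, so it cannot double as the signal that writing is done.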

> Avoid keeping StreamingTombstoneHistogramBuilder.Spool in memory during the 
> whole compaction
> --------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14834
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14834
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction
>            Reporter: Marcus Eriksson
>            Assignee: Adam Holmberg
>            Priority: Low
>             Fix For: 4.0, 4.0-beta
>
>
> Since CASSANDRA-13444 {{StreamingTombstoneHistogramBuilder.Spool}} is 
> allocated to keep around an array with 131072 * 2 * 2 integers *per written 
> sstable* during the whole compaction. With LCS at times creating 1000s of 
> sstables during a compaction it kills the node.
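
For a sense of scale, the figure quoted in the description can be turned into bytes. This is a back-of-the-envelope sketch only: it assumes 4-byte Java ints, ignores array headers and any other per-object overhead, and the class name SpoolFootprint is made up for illustration.

```java
public class SpoolFootprint {

    /** 131072 * 2 * 2 integers, as quoted in the issue description, at 4 bytes each. */
    public static long bytesPerSSTable() {
        return 131072L * 2 * 2 * Integer.BYTES;
    }

    /** Total spool memory if every written sstable keeps its spool for the whole compaction. */
    public static long bytesFor(int sstables) {
        return bytesPerSSTable() * sstables;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerSSTable()); // prints 2097152 (2 MiB)
        System.out.println(bytesFor(1000));    // prints 2097152000 (roughly 1.95 GiB)
    }
}
```

Under those assumptions, each written sstable holds about 2 MiB of spool, so a compaction that writes 1000 sstables would hold roughly 2 GiB in spools alone.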



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

