[jira] [Comment Edited] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table

Dimitar Dimitrov (JIRA) Wed, 06 Dec 2017 11:34:17 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280752#comment-16280752 ]


Dimitar Dimitrov edited comment on CASSANDRA-13801 at 12/6/17 7:33 PM:
-----------------------------------------------------------------------

It turns out that reproducing the problem does not necessarily require 
altering the compaction strategy.
It seems to be rooted in a potential problem with counting the CF compaction 
requests, which can eventually lead to a skipped background compaction.

The miscount can happen if the counting multiset increment 
[here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L197]
 is delayed and runs only after the corresponding counting multiset decrement 
[here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L284]
 has already happened.
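
To illustrate the ordering problem described above, here is a minimal, single-threaded sketch. It is not the actual CompactionManager code: the {{CountingMultiset}} class, the method names and the "ks.tbl" key are hypothetical stand-ins, and the sketch assumes that removing an element whose count is already zero is a no-op. Under that assumption, a decrement that runs before its matching increment is lost, and the count stays at 1 even though nothing is running.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CompactionCountRace {
    /** Hypothetical stand-in for a counting multiset keyed by table name. */
    static final class CountingMultiset {
        private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

        /** Increments the count for the key. */
        void add(String key) {
            counts.merge(key, 1, Integer::sum);
        }

        /** Decrements the count; does nothing and returns false if the count is already zero. */
        boolean remove(String key) {
            boolean[] removed = {false};
            counts.computeIfPresent(key, (k, v) -> {
                removed[0] = true;
                return v == 1 ? null : v - 1; // returning null drops the entry
            });
            return removed[0];
        }

        int count(String key) {
            return counts.getOrDefault(key, 0);
        }
    }

    /** Expected order: increment on submission, decrement when the task finishes. */
    static int countAfterExpectedOrder() {
        CountingMultiset compacting = new CountingMultiset();
        compacting.add("ks.tbl");    // submitter records the request
        compacting.remove("ks.tbl"); // finished task clears it
        return compacting.count("ks.tbl");
    }

    /** Problematic order: the task finishes before the submitter's increment lands. */
    static int countAfterDelayedIncrement() {
        CountingMultiset compacting = new CountingMultiset();
        compacting.remove("ks.tbl"); // no-op, nothing has been counted yet
        compacting.add("ks.tbl");    // delayed increment is never undone
        return compacting.count("ks.tbl");
    }

    public static void main(String[] args) {
        System.out.println("expected order:    " + countAfterExpectedOrder());
        System.out.println("delayed increment: " + countAfterDelayedIncrement());
    }
}
```

In this sketch the second scenario leaves a count of 1 behind, so a later "is a compaction already running for this table?" check based on that count could skip a background compaction that should have been scheduled.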

Here are the branches with the proposed changes, as well as a Byteman test that 
can be used to demonstrate the issue.
testall results look good (3.0 and trunk each have 1 seemingly unrelated, flaky 
test failing).
dtest results will be added soon.

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13801-2.2] | [testall|^c13801-2.2-testall.png] |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13801-3.0] | [testall|^c13801-3.0-testall.png] |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13801-3.11] | [testall|^c13801-3.11-testall.png] |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13801-trunk] | [testall|^c13801-trunk-testall.png] |



> CompactionManager sometimes wrongly determines that a background compaction 
> is running for a particular table
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13801
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13801
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Dimitar Dimitrov
>            Assignee: Dimitar Dimitrov
>            Priority: Minor
>         Attachments: c13801-2.2-testall.png, c13801-3.0-testall.png, 
> c13801-3.11-testall.png, c13801-trunk-testall.png
>
>
> Sometimes, after writing different rows to a table and doing a blocking 
> flush, if you alter the compaction strategy, then run background compaction 
> and wait for it to finish, {{CompactionManager}} may decide that there's an 
> ongoing compaction for that same table.
> This may happen even though the logs don't indicate that to be the case 
> (compaction may still be running for system_schema tables).


