[ 
https://issues.apache.org/jira/browse/CASSANDRA-18507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721234#comment-17721234
 ] 

Tobias Lindaaker edited comment on CASSANDRA-18507 at 5/10/23 8:23 AM:
-----------------------------------------------------------------------

[~dcapwell] that's odd. When I run the test with the source unchanged 
(that is, with the fix reverted) I get:

{code:java}
[junit-timeout] Testcase: shouldNotRemoveTombstonesShadowingDataExcludedFromCompaction(org.apache.cassandra.db.compaction.PartialCompactionsTest): FAILED
[junit-timeout] remaining live rows after compaction expected:<100> but was:<105>
[junit-timeout] junit.framework.AssertionFailedError: remaining live rows after compaction expected:<100> but was:<105>
[junit-timeout] at org.apache.cassandra.db.compaction.PartialCompactionsTest.shouldNotRemoveTombstonesShadowingDataExcludedFromCompaction(PartialCompactionsTest.java:90)
[junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit-timeout] 
[junit-timeout] 
[junit-timeout] Test org.apache.cassandra.db.compaction.PartialCompactionsTest FAILED
{code}
 
And with the source changes in place:

{code:java}
[junit-timeout] INFO  [main] 2023-05-10 09:01:05,369 ColumnFamilyStore.java:2655 - Truncate of PartialCompactionsTest.shouldNotRemoveTombstonesShadowingDataExcludedFromCompaction is complete
[junit-timeout] ------------- ---------------- ---------------
   [delete] Deleting directory .../cassandra-4.1/build/test/cassandra/commitlog
   [delete] Deleting directory .../cassandra-4.1/build/test/cassandra/data
   [delete] Deleting directory .../cassandra-4.1/build/test/cassandra/saved_caches
   [delete] Deleting directory .../cassandra-4.1/build/test/cassandra/hints


BUILD SUCCESSFUL
Total time: 24 seconds
{code}
I get exactly the same result on both the {{cassandra-4.0}} branch and the 
{{cassandra-4.1}} branch.

Maybe I'm doing something non-standard when running tests? I've primarily been 
running with:

{code:bash}
ant test -Dtest.name=PartialCompactionsTest -Duse.jdk11=true
{code}
But in order to rule out the possibility of this being Java version related, I 
also tried with Java 8 now, and that had the exact same outcome as well.

So next I'm wondering if the difference you are experiencing is platform 
related. I've been running on macOS, so I tried running the test in a Docker 
container ({{eclipse-temurin:8-jdk-jammy}}). That also works in the same 
way as when running natively on my machine, and is able to detect the issue 
when the change to source is reverted.

Even if I run the test as part of a larger set of tests it works the way it is 
intended to.

Please let me know if I should be running the test in a different way, or what 
I should be doing to reproduce the failure you are experiencing.

 



> Partial compaction can resurrect deleted data
> ---------------------------------------------
>
>                 Key: CASSANDRA-18507
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18507
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction
>            Reporter: Tobias Lindaaker
>            Assignee: Tobias Lindaaker
>            Priority: Normal
>
> If there isn't enough disk space available to compact all existing sstables, 
> Cassandra will attempt to perform a partial compaction by removing sstables 
> from the set of candidate sstables to be compacted, starting with the largest 
> one. It is possible that the sstable removed from the set of sstables to 
> compact contains data for which there are tombstones in another (more recent) 
> sstable. Since the overlap between sstables is computed when the 
> {{CompactionController}} is created, and the {{CompactionController}} is 
> created before the removal of any sstables from the set of sstables to be 
> compacted, this computed overlap will be outdated when checking which sstables 
> are covered by certain tombstones. This leads to the faulty conclusion that 
> the tombstones can be pruned during the compaction, causing the data to be 
> resurrected.
> The issue is present in Cassandra 4.0 and 4.1. Cassandra 3.11 creates the 
> {{CompactionController}} after the set of sstables to compact has been 
> reduced, and is thus not affected. {{trunk}} does not appear to support 
> partial compactions at all, but instead refuses to compact when the disk is 
> full.
> This regression appears to have been introduced by CASSANDRA-13068.
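
The ordering problem described in the issue can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: the {{SSTable}} class, {{overlapping}}, {{reduceToFit}}, {{purgeable}} and {{tombstonePurged}} below are hypothetical names invented for illustration and are not Cassandra's actual classes or methods. The model only captures the idea that the overlap set is a snapshot, so it goes stale if the compaction set shrinks after the snapshot is taken.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PartialCompactionSketch {
    /** Hypothetical toy sstable: a name, an on-disk size, and the partition keys it touches. */
    static final class SSTable {
        final String name;
        final long sizeBytes;
        final Set<String> keys;

        SSTable(String name, long sizeBytes, Set<String> keys) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.keys = keys;
        }
    }

    /** Sstables outside the compaction that share at least one key with an sstable inside it. */
    static Set<SSTable> overlapping(List<SSTable> all, List<SSTable> compacting) {
        Set<SSTable> result = new HashSet<>();
        for (SSTable other : all) {
            if (compacting.contains(other)) continue;
            for (SSTable c : compacting) {
                if (other.keys.stream().anyMatch(c.keys::contains)) result.add(other);
            }
        }
        return result;
    }

    /** Drops the largest candidates until the remainder fits in the available space. */
    static List<SSTable> reduceToFit(List<SSTable> candidates, long availableBytes) {
        List<SSTable> kept = new ArrayList<>(candidates);
        kept.sort(Comparator.comparingLong((SSTable s) -> s.sizeBytes));
        while (!kept.isEmpty()
               && kept.stream().mapToLong(s -> s.sizeBytes).sum() > availableBytes) {
            kept.remove(kept.size() - 1); // the list is sorted, so the last element is the largest
        }
        return kept;
    }

    /** A tombstone for a key may be purged only if no overlapping sstable still holds that key. */
    static boolean purgeable(String key, Set<SSTable> overlaps) {
        return overlaps.stream().noneMatch(s -> s.keys.contains(key));
    }

    /** Returns whether the tombstone for key "k" would be purged under the given ordering. */
    public static boolean tombstonePurged(boolean computeOverlapBeforeReduction) {
        SSTable data = new SSTable("big-old-data", 100, Collections.singleton("k"));
        SSTable tombstone = new SSTable("small-new-tombstone", 10, Collections.singleton("k"));
        List<SSTable> all = Arrays.asList(data, tombstone);

        Set<SSTable> overlaps;
        if (computeOverlapBeforeReduction) {
            // Ordering described for 4.0/4.1: snapshot the overlap first, then shrink the set.
            overlaps = overlapping(all, all);   // nothing is outside the set yet, so this is empty
            reduceToFit(all, 50);               // "big-old-data" is dropped, the snapshot is now stale
        } else {
            // Ordering described for 3.11: shrink the set first, then compute the overlap.
            List<SSTable> reduced = reduceToFit(all, 50);
            overlaps = overlapping(all, reduced);
        }
        return purgeable("k", overlaps);
    }

    public static void main(String[] args) {
        System.out.println("overlap computed before reduction, tombstone purged: "
                           + tombstonePurged(true));
        System.out.println("overlap computed after reduction, tombstone purged: "
                           + tombstonePurged(false));
    }
}
```

In this model the first ordering reports the tombstone as purgeable even though the excluded sstable still holds the shadowed data, which corresponds to the resurrection described above. The second ordering keeps the tombstone.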


