[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
     [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-14861:
-----------------------------------------
         Severity: Critical  (was: Normal)
       Complexity: Challenging
    Discovered By: Fuzz Test
     Bug Category: Parent values: Correctness(12982); Level 1 values: Response Corruption / Loss(12987)
    Since Version: 3.0.0

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/SSTable
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Urgent
>             Fix For: 3.0.18, 3.11.4, 4.0
>
>
> There’s a bug in the way we filter sstables in the read path that can cause
> sstables containing relevant range tombstones to be excluded from reads. This
> can cause data resurrection for an individual read, and if compaction timing
> is right, permanent resurrection via read repair.
> We track the min and max clustering values when writing an sstable so we can
> avoid reading from sstables that don’t contain the clustering values we’re
> looking for in a given read. The min and max for each clustering column are
> updated for each row / RT marker we write. In the case of range tombstone
> markers though, we only update the min and max for the clustering values they
> contain, which is almost never the full set of clustering values. This leaves
> a min/max that is above/below (respectively) the real range covered by the
> range tombstone contained in the sstable.
> For instance, assume we’re writing an sstable for a table with 3 clustering
> columns. The current min clustering is 5:6:7. We write an RT marker for a
> range tombstone that deletes any row with the value 4 in the first clustering
> column, so the open marker is [4:]. This would make the new min clustering
> 4:6:7 when it should really be 4:. If we do a read for clustering values of
> 4:5 and lower, we’ll exclude this sstable and its range tombstone,
> resurrecting any data that this tombstone would have deleted.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
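The 5:6:7 / [4:] example in the issue description can be traced with a small Python sketch. This is not Cassandra code: `update_min_per_column` and `can_skip_sstable` are hypothetical names, clusterings are modelled as plain tuples of integers, and the skip check is a simplified stand-in for the real read-path filter. It only illustrates why a per-column min of 4:6:7 wrongly excludes the sstable while a min of 4: does not.

```python
def update_min_per_column(current_min, clustering):
    """Per-column update as described in the issue: only the columns
    present in `clustering` are touched, the rest keep their old value.
    Assumes `clustering` is no longer than `current_min`."""
    updated = list(current_min)
    for i, value in enumerate(clustering):
        updated[i] = min(updated[i], value)
    return tuple(updated)


def can_skip_sstable(slice_end, sstable_min):
    """Simplified filter (an assumption, not the real implementation):
    skip the sstable if the read's upper bound sorts strictly before the
    recorded min. Only the shared prefix is compared; if it is equal we
    cannot rule the sstable out."""
    for end_value, min_value in zip(slice_end, sstable_min):
        if end_value < min_value:
            return True
        if end_value > min_value:
            return False
    return False


# Current min is 5:6:7, then an RT open marker [4:] is written.
buggy_min = update_min_per_column((5, 6, 7), (4,))  # (4, 6, 7)
correct_min = (4,)                                  # what the issue says it should be

# A read for clustering values of 4:5 and lower.
read_upper_bound = (4, 5)

skipped_with_buggy_min = can_skip_sstable(read_upper_bound, buggy_min)
skipped_with_correct_min = can_skip_sstable(read_upper_bound, correct_min)
```

With the per-column min, the second component comparison (5 against 6) makes the read look like it ends before the sstable starts, so the sstable and its range tombstone are skipped. With the truncated min there is nothing past the first component to compare, so the sstable stays in the read.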
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
     [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-14861:
---------------------------------
    Component/s:     (was: Legacy/Local Write-Read Paths)
                 Local/SSTable

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/SSTable
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
     [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-14861:
----------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Ready to Commit)

committed as {{d60c78358b6f599a83f3c112bfd6ce72c1129c9f}}, thanks

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
     [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-14861:
-----------------------------------------
    Component/s: Local Write-Read Paths

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
     [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Tunnicliffe updated CASSANDRA-14861:
----------------------------------------
    Status: Ready to Commit  (was: Patch Available)

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
     [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Tunnicliffe updated CASSANDRA-14861:
----------------------------------------
    Reviewers: Benedict, Sam Tunnicliffe  (was: Benedict)

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
     [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-14861:
---------------------------------
    Reviewers: Benedict

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
     [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-14861:
----------------------------------------
    Status: Patch Available  (was: Open)

|[3.0|https://github.com/bdeggleston/cassandra/tree/14861-3.0]|[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/cci%2F14861-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/14861-3.11]|[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/cci%2F14861-3.11]|
|[trunk|https://github.com/bdeggleston/cassandra/tree/14861-trunk]|[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/14861-trunk]|

This adds a minor sstable version to 3.x and changes two behaviors. First, when
reading metadata for pre-md sstables, only the first clustering value is loaded
into the min/max values and the rest are discarded. Second, when writing new
sstables, the length of the min/max values written is limited by the length of
the shortest RT clustering.

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0
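The two behavior changes in the patch note can be sketched roughly in Python. This is an illustration under stated assumptions, not the committed patch: `load_legacy_min_max` and `truncate_to_shortest_rt` are hypothetical names, clusterings are plain tuples, and keeping the full-length value when the sstable contains no range tombstones is an assumption the note does not spell out.

```python
def load_legacy_min_max(stored_values):
    """Reading a pre-'md' sstable: trust only the first clustering value
    and discard the rest, since the later components may be wrong."""
    return tuple(stored_values[:1])


def truncate_to_shortest_rt(values, rt_clustering_lengths):
    """Writing a new sstable: limit the recorded min (or max) to the
    length of the shortest range tombstone clustering seen.

    `rt_clustering_lengths` holds the number of clustering values in each
    RT bound written. With no RTs the value is kept whole (assumption).
    """
    if not rt_clustering_lengths:
        return tuple(values)
    return tuple(values[:min(rt_clustering_lengths)])


# A legacy sstable that recorded the misleading min 4:6:7.
legacy_min = load_legacy_min_max((4, 6, 7))             # (4,)

# A new sstable whose RT bounds had 1 and 2 clustering values.
new_min = truncate_to_shortest_rt((4, 6, 7), [1, 2])    # (4,)

# A new sstable that contains rows only.
rows_only_min = truncate_to_shortest_rt((4, 6, 7), [])  # (4, 6, 7)
```

In both the legacy and the new-sstable case, the recorded min for the example in the description becomes 4:, the value the description says it should have been, so a read for 4:5 and lower can no longer exclude the sstable on the strength of the stale 6 in the second position.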
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
     [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-14861:
---------------------------------
    Summary: sstable min/max metadata can cause data loss  (was: Inaccurate sstable min/max metadata can cause data loss)

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0