[ 
https://issues.apache.org/jira/browse/CASSANDRA-15363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15363:
----------------------------------------
    Test and Documentation Plan: adds an in-jvm upgrade dtest
                         Status: Patch Available  (was: Open)

When reading from the memtable we create a SearchIterator to only grab the 
columns requested. If there is a range tombstone covering any of these columns, 
we create a fake RT which covers only that column: 
[AbstractBTreePartition|https://github.com/apache/cassandra/blob/93815db9853cb592edf13d82e91dc2e9d172f01f/src/java/org/apache/cassandra/db/partitions/AbstractBTreePartition.java#L158].
 problem is that if there is an active partition deletion, we also create an RT 
with the same timestamps as the partition deletion. We should only create these 
range tombstones if the partition deletion supersedes all the range tombstones.

Patch [here|https://github.com/krummas/cassandra/commits/marcuse/15363] (with 
dtest changes needed in 2.2 
[here|https://github.com/krummas/cassandra/commits/marcuse/15363-2.2]

[circleci|https://circleci.com/gh/krummas/workflows/cassandra/tree/marcuse%2F15363]

> Read repair in mixed mode between 2.1 and 3.0 on COMPACT STORAGE tables 
> causes unreadable sstables after upgrade
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15363
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15363
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>
> if we have a table like this:
> {{CREATE TABLE tbl (pk ascii, b boolean, v blob, PRIMARY KEY (pk)) WITH 
> COMPACT STORAGE}}
> with a cluster where node1 is 2.1 and node2 is 3.0 (during upgrade):
> * node2 coordinates a delete {{DELETE FROM tbl WHERE pk = 'something'}} which 
> node1 does not get
> * node1 coordinates a quorum read {{SELECT * FROM tbl WHERE id = 
> 'something'}} which causes a read repair
> * this makes node1 flush an sstable like this:
> {code}
> [
> {"key": "something",
>  "metadata": {"deletionInfo": 
> {"markedForDeleteAt":1571388944364000,"localDeletionTime":1571388944}},
>  "cells": [["b","b",1571388944364000,"t",1571388944],
>            ["v","v",1571388944364000,"t",1571388944]]}
> ]
> {code}
> (It has range tombstones which are covered by the partition deletion)
> Then, when we upgrade this node to 3.0 and try to read or run 
> upgradesstables, we get this:
> {code}
> ERROR [node1_CompactionExecutor:1] node1 2019-10-18 10:44:11,325 
> DebuggableThreadPoolExecutor.java:242 - Error in ThreadPoolExecutor
> java.lang.UnsupportedOperationException: null
>       at 
> org.apache.cassandra.db.LegacyLayout.extractStaticColumns(LegacyLayout.java:779)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.readStaticRow(SSTableSimpleIterator.java:120)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:57)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator$1.initializeIterator(BigTableScanner.java:362)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.isReverseOrder(LazilyInitializedUnfilteredRowIterator.java:65)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:103)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:94)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:442)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.hasNext(UnfilteredPartitionIterators.java:144)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:92)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:227)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:190)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:675)
>  ~[dtest-3.0.19.jar:na]
>       at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[dtest-3.0.19.jar:na]
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_121]
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_121]
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_121]
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_121]
>       at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:83)
>  [dtest-3.0.19.jar:na]
>       at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to