[jira] [Comment Edited] (CASSANDRA-15830) Invalid version value: 4.0~alpha4 during startup

2020-05-27 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118111#comment-17118111
 ] 

Michael Semb Wever edited comment on CASSANDRA-15830 at 5/27/20, 9:32 PM:
--

Reproduced with
{code}
cd /tmp
wget https://downloads.apache.org/cassandra/redhat/40x/cassandra-4.0~alpha4-1.noarch.rpm
rpm2cpio cassandra-4.0\~alpha4-1.noarch.rpm | cpio -idmv
jar xvf ./usr/share/cassandra/apache-cassandra-4.0~alpha4.jar org/apache/cassandra/config/version.properties
cat org/apache/cassandra/config/version.properties
{code}

The wrong version is used when building the project artifacts inside the 
[build-rpms.sh|https://github.com/apache/cassandra-builds/blob/master/docker/build-rpms.sh#L75]
 script; the culprit is either ${deb_release} or $CASSANDRA_VERSION.
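
To illustrate why a package-style version string ends up rejected, here is a minimal Python sketch. The pattern is a simplified assumption about what a semantic-version parser accepts; it is not the regular expression used by CassandraVersion. The helper to_project_version is hypothetical and only shows one way a packaging script might map the RPM-style '~' separator back to '-'.

```python
import re

# Assumed, simplified grammar: major.minor[.patch][-prerelease][+build].
# This is NOT copied from CassandraVersion; it only illustrates the idea.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:-([\w.-]+))?(?:\+([\w.]+))?$")


def parse_version(value: str) -> tuple:
    """Return (major, minor, patch, prerelease) or raise ValueError."""
    match = VERSION_RE.match(value)
    if match is None:
        raise ValueError(f"Invalid version value: {value}")
    major, minor, patch, prerelease, _build = match.groups()
    return int(major), int(minor), int(patch or 0), prerelease


def to_project_version(package_version: str) -> str:
    """Hypothetical helper: map the RPM-style '~' pre-release separator to '-'."""
    return package_version.replace("~", "-")
```

Under this assumed grammar, parse_version("4.0-alpha4") succeeds while parse_version("4.0~alpha4") raises ValueError, which mirrors the startup error reported in the ticket.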




> Invalid version value: 4.0~alpha4 during startup
> 
>
>        Key: CASSANDRA-15830
>        URL: https://issues.apache.org/jira/browse/CASSANDRA-15830
>    Project: Cassandra
> Issue Type: Bug
> Components: Build, Packaging
>   Reporter: Eric Wong
>   Assignee: Michael Semb Wever
>   Priority: Normal
>    Fix For: 4.0-alpha
>
>
> Hi:
> We are testing the latest cassandra-4.0 on CentOS 7 using a clean database.  
> When we start Cassandra the first time, everything is fine.  However, when 
> we stop and restart Cassandra, we get the following error and the database 
> refuses to start up:
> {code}
> ERROR [main] 2020-05-22 05:58:18,698 CassandraDaemon.java:789 - Exception encountered during startup
> java.lang.IllegalArgumentException: Invalid version value: 4.0~alpha4
>  at org.apache.cassandra.utils.CassandraVersion.<init>(CassandraVersion.java:64)
>  at org.apache.cassandra.io.sstable.SSTableHeaderFix.fixNonFrozenUDTIfUpgradeFrom30(SSTableHeaderFix.java:84)
>  at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:250)
>  at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:650)
>  at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:767)
> {code}
> The only way to get the node up and running again is by deleting all data 
> under /var/lib/cassandra.
>  
> Is that a known issue?
> Thanks, Eric
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15830) Invalid version value: 4.0~alpha4 during startup

2020-05-27 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-15830:
---
Platform: Linux  (was: All)







[jira] [Updated] (CASSANDRA-15830) Invalid version value: 4.0~alpha4 during startup

2020-05-27 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-15830:
---
 Bug Category: Parent values: Correctness(12982) Level 1 values: API / Semantic Implementation(12988)
   Complexity: Low Hanging Fruit
  Component/s: Packaging
               Build
Discovered By: User Report
Fix Version/s: 4.0-alpha
     Severity: Critical
       Status: Open  (was: Triage Needed)







[jira] [Commented] (CASSANDRA-15830) Invalid version value: 4.0~alpha4 during startup

2020-05-27 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118111#comment-17118111
 ] 

Michael Semb Wever commented on CASSANDRA-15830:


Reproduced with
{code}
cd /tmp
wget https://downloads.apache.org/cassandra/redhat/40x/cassandra-4.0~alpha4-1.noarch.rpm
rpm2cpio cassandra-4.0\~alpha4-1.noarch.rpm | cpio -idmv
jar xvf ./usr/share/cassandra/apache-cassandra-4.0~alpha4.jar org/apache/cassandra/config/version.properties
cat org/apache/cassandra/config/version.properties
{code}

The wrong version is used when building the project artifacts inside the 
[build-rpms.sh|https://github.com/apache/cassandra-builds/blob/master/docker/build-rpms.sh#L75]
 script; the culprit is either ${deb_release} or $CASSANDRA_VERSION.








[jira] [Commented] (CASSANDRA-15837) Enhance fqltool to be able to export the fql log into a format which doesn't depend on Cassandra

2020-05-27 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118024#comment-17118024
 ] 

David Capwell commented on CASSANDRA-15837:
---

Spoke to Marcus about this, dumping context here.

* "why not read the raw fql logs?"

Main reasons:
1) Cassandra brings in its dependencies, and these can be very old, causing 
conflicts.
2) Cassandra doesn’t have a stable API, so using different versions requires a 
rewrite [1].
3) Thrift is a lot faster [2], so we spend less time reading the files and more 
time trying to put load on the cluster.  Right now my bottleneck isn’t reading 
the files; it’s that Cassandra can’t keep up, so I need to throttle the 
operations.

[1] - It is less of an issue when reading the files, but there are no guarantees 
that QueryOptions won’t change the Java API at will.  The main issue I had in 
upgrading from 3.0 to 4.0 was the query parsing side used to classify the query.  
I added logic to annotate what the query does and touches, so tools could filter 
for specific tables or only replay specific types of queries (such as selects 
or updates, etc.); having this logic is very useful in tooling (I am actively 
using it right now) but this part is not compatible across releases.

[2] - Below are two consumers of the fql logs: one reading the raw logs and 
ignoring the output, the other reading the thrift version and collecting stats. 
Both tools read 100% of the data, sequentially.
$ time ./bin/fqltool thrift-stats ../query_logs/*/fql/fql.thrift.gz 1>/dev/null
       13.67 real        14.88 user         1.59 sys
$ time ./bin/fqltool dump-thrift -- ../query_logs/*/fql/
       56.97 real        76.70 user         2.47 sys
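
As a rough worked comparison of the two timings above, the following Python sketch computes how many times longer the dump-thrift run took than the thrift-stats run. It assumes both runs covered the same data on the same machine, as the comment states; the dictionary and function names are illustrative only.

```python
# Wall-clock ("real") and CPU ("user") seconds copied from the two runs above.
thrift_stats = {"real": 13.67, "user": 14.88}
dump_thrift = {"real": 56.97, "user": 76.70}


def ratio(slow: float, fast: float) -> float:
    """How many times longer the slow run took, rounded to two decimals."""
    return round(slow / fast, 2)


real_ratio = ratio(dump_thrift["real"], thrift_stats["real"])
user_ratio = ratio(dump_thrift["user"], thrift_stats["user"])
print(real_ratio, user_ratio)
```

With these numbers the wall-clock ratio is a little over 4x and the user CPU ratio a little over 5x, which is consistent with the claim that thrift is a lot faster to read.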

* "we just removed [thrift]"

I started in 3.0, so I used thrift because it was there; in doing so I grew to 
hate it...  I find protobuf to be better, so I will likely switch to that.  But, 
to avoid adding a dependency to Cassandra, I can take on the work to allow 
FQL to have a different set of dependencies.  

> Enhance fqltool to be able to export the fql log into a format which doesn't 
> depend on Cassandra
> 
>
>                Key: CASSANDRA-15837
>                URL: https://issues.apache.org/jira/browse/CASSANDRA-15837
>            Project: Cassandra
>         Issue Type: Improvement
>         Components: Tool/fql
>           Reporter: David Capwell
>           Assignee: David Capwell
>           Priority: Normal
>         Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently the fql log format uses Cassandra serialization within the message, 
> which means that reading the file also requires Cassandra classes. To make it 
> easier for outside tools to read the fql logs we should enhance the fqltool 
> to be able to dump the logs to a file format using thrift or protobuf.
> Additionally we should support exporting the original query with a 
> deterministic version allowing tools to have reproducible and comparable 
> results.






[jira] [Commented] (CASSANDRA-15821) Metrics Documentation Enhancements

2020-05-27 Thread Stephen Mallette (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117939#comment-17117939
 ] 

Stephen Mallette commented on CASSANDRA-15821:
--

I've pushed another batch of changes to my branch which now cover all the 
documented items in my spreadsheet (i.e. all the known metrics that I could 
identify after a simple Cassandra initialization are now in 
{{metrics.rst}}). 

A few odds and ends I noticed:

1. It seems a bit odd that {{DroppedMessageMetrics}} and {{MessagingMetrics}} 
aren't handled in a consistent fashion. The former uses the {{Verb}} as the 
scope (which is nice) but the latter appends the {{Verb}} to the metric name 
itself (which seems less nice). I'm not sure what decisions led to this 
situation but I'd be curious to hear if anyone thinks this is a concern at all.
2. ReadRepair.RepairedAsync does not appear to be in use. I could be missing 
something but it does not seem to be referenced in the code beyond its 
declaration. Could this be deleted?
3. {{DroppedMessageMetrics}} had PAGED_SLICE and RANGED_SLICE documented but 
they don't appear to be available. I couldn't quite isolate exactly when they 
were removed but I assume it's safe to say that happened?
4. A minor point, but I wonder what tolerance there is for making casing 
consistent throughout the metrics. For example, Client metrics has a mixture 
of casing: connectedNativeClients and AuthFailure. It would be nice to my 
eyes to change to "ConnectedNativeClients" in that example.
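
The casing point in item 4 could be checked mechanically. Below is a minimal Python sketch, assuming the desired convention is simply that every metric name starts with an upper-case letter; the function names are hypothetical and not part of Cassandra or its documentation tooling.

```python
from typing import Iterable, List


def to_upper_camel(name: str) -> str:
    """Upper-case the first character and leave the rest untouched."""
    return name[:1].upper() + name[1:]


def inconsistent_names(names: Iterable[str]) -> List[str]:
    """Return the metric names that do not already start with an upper-case letter."""
    return [name for name in names if name != to_upper_camel(name)]
```

For the two names mentioned above, inconsistent_names would flag only connectedNativeClients, and to_upper_camel would suggest ConnectedNativeClients.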


> Metrics Documentation Enhancements
> --
>
>        Key: CASSANDRA-15821
>        URL: https://issues.apache.org/jira/browse/CASSANDRA-15821
>    Project: Cassandra
> Issue Type: Improvement
> Components: Documentation/Website
>   Reporter: Stephen Mallette
>   Assignee: Stephen Mallette
>   Priority: Normal
>
> CASSANDRA-15582 involves quality around metrics and it was mentioned that 
> reviewing and [improving 
> documentation|https://github.com/apache/cassandra/blob/trunk/doc/source/operating/metrics.rst]
>  around metrics would fall into that scope. Please consider some of this 
> analysis in determining what improvements to make here:
> Please see [this 
> spreadsheet|https://docs.google.com/spreadsheets/d/1iPWfCMIG75CI6LbYuDtCTjEOvZw-5dyH-e08bc63QnI/edit?usp=sharing]
>  that itemizes almost all of Cassandra's metrics and whether they are 
> documented or not (and other notes).  That spreadsheet is "almost all" 
> because there are some metrics that don't seem to initialize as part of 
> Cassandra startup (I was able to trigger some to initialize, but not all were 
> immediately obvious). The missing metrics seem to be related to the following:
> * ThreadPool metrics - only some initialize at startup, the list of which 
> follows below
> * Streaming Metrics
> * HintedHandoff Metrics
> * HintsService Metrics
> Here are the ThreadPool scopes that get listed:
> {code}
> AntiEntropyStage
> CacheCleanupExecutor
> CompactionExecutor
> GossipStage
> HintsDispatcher
> MemtableFlushWriter
> MemtablePostFlush
> MemtableReclaimMemory
> MigrationStage
> MutationStage
> Native-Transport-Requests
> PendingRangeCalculator
> PerDiskMemtableFlushWriter_0
> ReadStage
> Repair-Task
> RequestResponseStage
> Sampler
> SecondaryIndexManagement
> ValidationExecutor
> ViewBuildExecutor
> {code}
> I noticed that Keyspace Metrics have this note: "Most of these metrics are 
> the same as the Table Metrics above, only they are aggregated at the Keyspace 
> level." I think I've isolated those metrics on table that are not on keyspace 
> to specifically be:
> {code}
> BloomFilterFalsePositives
> BloomFilterFalseRatio
> BytesAnticompacted
> BytesFlushed
> BytesMutatedAnticompaction
> BytesPendingRepair
> BytesRepaired
> BytesUnrepaired
> CompactionBytesWritten
> CompressionRatio
> CoordinatorReadLatency
> CoordinatorScanLatency
> CoordinatorWriteLatency
> EstimatedColumnCountHistogram
> EstimatedPartitionCount
> EstimatedPartitionSizeHistogram
> KeyCacheHitRate
> LiveSSTableCount
> MaxPartitionSize
> MeanPartitionSize
> MinPartitionSize
> MutatedAnticompactionGauge
> PercentRepaired
> RowCacheHitOutOfRange
> RowCacheHit
> RowCacheMiss
> SpeculativeSampleLatencyNanos
> SyncTime
> WaitingOnFreeMemtableSpace
> DroppedMutations
> {code}
> Someone with greater knowledge of this area might consider it worth the 
> effort to see if any of these metrics should be aggregated to the keyspace 
> level in case they were inadvertently missed. In any case, perhaps the 
> documentation could easily now reflect which metric names could be expected 
> on Keyspace.
> The DroppedMessage metrics have a much larger body of scopes than just what 
> were documented:
> {code}
> ASYMMETRIC_SYNC_REQ
> BATCH_REMOVE_REQ
> BATCH_REMOVE_RSP
> BATCH_STORE_REQ
> 

[jira] [Comment Edited] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-05-27 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117855#comment-17117855
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-12126 at 5/27/20, 3:39 PM:
--

The test cases I provided demonstrate several consistency violations during 
range movements.  I've just thought of another one, and am writing a test case 
for it.  Perhaps we could claim that range movements are always (potentially) 
consistency violations, but they are particularly keenly felt when you claim a 
linearisable history.

There are also (more debatably) issues with TTL on {{system.paxos}}, 
particularly when mixed with non-global commit; perhaps we could claim this is 
the user's problem, but it's not clear why we support global consensus that can 
be lost through local commit, and I don't think we communicate clearly the 
consistency implications to not call this a bug.

Also, mixing LOCAL_SERIAL and SERIAL is entirely unsafe, and even supporting 
them both is arguably a consistency violation without mechanisms to safely 
transition from one level to another.


was (Author: benedict):
The test cases I provided demonstrate several consistency violations during 
range movements.  I've just thought of another one, and am writing a test case 
for it.  Perhaps we could claim that range movements are always consistency 
violations, but they are particularly keenly felt when you claim a linearisable 
history.

There are also (more debatably) issues with TTL on {{system.paxos}}, 
particularly when mixed with non-global commit; perhaps we could claim this is 
the user's problem, but it's not clear why we support global consensus that can 
be lost through local commit, and I don't think we communicate clearly the 
consistency implications to not call this a bug.

Also, mixing LOCAL_SERIAL and SERIAL is entirely unsafe, and even supporting 
them both is arguably a consistency violation without mechanisms to safely 
transition from one level to another.

> CAS Reads Inconsistencies 
> --
>
>                Key: CASSANDRA-12126
>                URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
>            Project: Cassandra
>         Issue Type: Bug
>         Components: Feature/Lightweight Transactions, Legacy/Coordination
>           Reporter: Sankalp Kohli
>           Assignee: Sylvain Lebresne
>           Priority: Normal
>             Labels: LWT, pull-request-available
>         Time Spent: 10m
> Remaining Estimate: 0h
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3:
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in the accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
> value written in step 1. This step behaves as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the value 
> written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again and was never seen before. 
> Section 2.3 of Lamport's “Paxos Made Simple” paper talks about this issue, 
> which is how learners can find out if a majority of the acceptors have 
> accepted the proposal. 
> In step 3, it is correct that we propose the value again since we don't know 
> if it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and one or more acceptors (but not a majority) have something in 
> flight, we have no way of knowing if it was accepted by a majority of 
> acceptors. So this behavior is correct. 
> However, we need to fix step 2, since it causes reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that a majority 
> of acceptors have no in-flight commit, which means we have a majority 
> confirming that nothing was accepted by a majority. I think we should run a 
> propose step here with an empty commit, which will cause the write from 
> step 1 to never be visible afterwards. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read or never see it, which is what we want. 
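
Steps 1 to 3 of the description can be replayed with a toy model. The Python sketch below is an illustration under strong simplifying assumptions: it has no ballots, no promises and no timeouts, and the Replica class and serial_read function are hypothetical names, not Cassandra's Paxos implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    accepted: Optional[str] = None   # proposal accepted but not yet committed
    committed: Optional[str] = None  # last committed value


def serial_read(quorum: List[Replica]) -> Optional[str]:
    """Toy CAS read: complete any in-flight proposal seen in the quorum, then read."""
    in_flight = [r.accepted for r in quorum if r.accepted is not None]
    if in_flight:
        # Step 3 behaviour: re-propose and commit the in-flight value.
        for replica in quorum:
            replica.committed = in_flight[0]
            replica.accepted = None
    return next((r.committed for r in quorum if r.committed is not None), None)


a, b, c = Replica("A"), Replica("B"), Replica("C")
a.accepted = "v1"                  # step 1: only A accepted the failed write
first_read = serial_read([b, c])   # step 2: quorum without A sees nothing
second_read = serial_read([a, b])  # step 3: quorum with A completes the write
print(first_read, second_read)
```

In this model the first read returns nothing and the second returns the step 1 value with no write in between, which is the non-linearizable history the description refers to.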




[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-05-27 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117855#comment-17117855
 ] 

Benedict Elliott Smith commented on CASSANDRA-12126:


The test cases I provided demonstrate several consistency violations during 
range movements.  I've just thought of another one, and am writing a test case 
for it.  Perhaps we could claim that range movements are always consistency 
violations, but they are particularly keenly felt when you claim a linearisable 
history.

There are also (more debatably) issues with TTL on {{system.paxos}}, 
particularly when mixed with non-global commit; perhaps we could claim this is 
the user's problem, but it's not clear why we support global consensus that can 
be lost through local commit, and I don't think we communicate clearly the 
consistency implications to not call this a bug.

Also, mixing LOCAL_SERIAL and SERIAL is entirely unsafe, and even supporting 
them both is arguably a consistency violation without mechanisms to safely 
transition from one level to another.







[jira] [Updated] (CASSANDRA-15805) Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones interacts with collection tombstones

2020-05-27 Thread Sylvain Lebresne (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-15805:
-
      Fix Version/s:     (was: 3.11.x)
                         (was: 3.0.x)
                     3.11.7
                     3.0.21
      Since Version: 3.0 alpha 1
Source Control Link: [8358e19840d352475a5831d130ff3c43a11f2f4e|https://github.com/apache/cassandra/commit/8358e19840d352475a5831d130ff3c43a11f2f4e], [c8a2834606d683ba9945e9cc11bdb4207ce269d1|https://github.com/apache/cassandra/commit/c8a2834606d683ba9945e9cc11bdb4207ce269d1]
         Resolution: Fixed
             Status: Resolved  (was: Ready to Commit)

> Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones 
> interacts with collection tombstones
> --
>
>        Key: CASSANDRA-15805
>        URL: https://issues.apache.org/jira/browse/CASSANDRA-15805
>    Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Coordination, Local/SSTable
>   Reporter: Sylvain Lebresne
>   Assignee: Sylvain Lebresne
>   Priority: Normal
>    Fix For: 3.0.21, 3.11.7
>
>
> The legacy reading code ({{LegacyLayout}} and 
> {{UnfilteredDeserializer.OldFormatDeserializer}}) does not handle correctly 
> the case where a range tombstone covering multiple rows interacts with a 
> collection tombstone.
> A simple example of this problem is if one runs on 2.X:
> {noformat}
> CREATE TABLE t (
>   k int,
>   c1 text,
>   c2 text,
>   a text,
>   b set<text>,
>   c text,
>   PRIMARY KEY((k), c1, c2)
> );
> // Delete all rows where c1 is 'A'
> DELETE FROM t USING TIMESTAMP 1 WHERE k = 0 AND c1 = 'A';
> // Inserts a row covered by that previous range tombstone
> INSERT INTO t(k, c1, c2, a, b, c) VALUES (0, 'A', 'X', 'foo', {'whatever'}, 
> 'bar') USING TIMESTAMP 2;
> // Delete the collection of that previously inserted row
> DELETE b FROM t USING TIMESTAMP 3 WHERE k = 0 AND c1 = 'A' and c2 = 'X';
> {noformat}
> If the above is run on 2.X (with everything either flushed in the same 
> table or compacted together), then this will result in the inserted row being 
> duplicated (one part containing the {{a}} column, the other the {{c}} one).
> I will note that this is _not_ a duplicate of CASSANDRA-15789 and this 
> reproduces even with the fix to {{LegacyLayout}} of that ticket. That said, 
> the additional code added in CASSANDRA-15789 to force merging duplicated rows 
> if they are produced _will_ end up fixing this as a consequence (assuming 
> there is no variation of this problem that leads to other visible issues than 
> duplicated rows). Still, I "think" we'd rather fix the source of 
> the issue.






[cassandra] 01/01: Merge branch 'cassandra-3.11' into trunk

2020-05-27 Thread slebresne
This is an automated email from the ASF dual-hosted git repository.

slebresne pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 44c4e1f7874c962309a853fa2434a9dd4ae16aeb
Merge: e690e29 ebfd052
Author: Sylvain Lebresne 
AuthorDate: Wed May 27 17:18:43 2020 +0200

Merge branch 'cassandra-3.11' into trunk






[cassandra] branch trunk updated (e690e29 -> 44c4e1f)

2020-05-27 Thread slebresne
This is an automated email from the ASF dual-hosted git repository.

slebresne pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


    from e690e29  Add docs section on configuring the Jenkins master to create the "Cassandra" category throttle.
     new 8358e19  Fix legacy handling of RangeTombstone with collection ones
     new c8a2834  Fix LegacyLayout handling of non-selected collection tombstones
     new ebfd052  Merge commit 'c8a2834606d683ba9945e9cc11bdb4207ce269d1' into cassandra-3.11
     new 44c4e1f  Merge branch 'cassandra-3.11' into trunk

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:





[cassandra] 01/01: Merge commit 'c8a2834606d683ba9945e9cc11bdb4207ce269d1' into cassandra-3.11

2020-05-27 Thread slebresne
This is an automated email from the ASF dual-hosted git repository.

slebresne pushed a commit to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit ebfd05254f84000f71fa018650632d24d3761f07
Merge: 3cda9d7 c8a2834
Author: Sylvain Lebresne 
AuthorDate: Wed May 27 17:12:44 2020 +0200

Merge commit 'c8a2834606d683ba9945e9cc11bdb4207ce269d1' into cassandra-3.11

 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/db/LegacyLayout.java | 105 +
 .../cassandra/db/UnfilteredDeserializer.java   | 129 -
 .../upgrade/MixedModeRangeTombstoneTest.java   |  73 
 .../org/apache/cassandra/db/LegacyLayoutTest.java  | 102 +---
 5 files changed, 340 insertions(+), 70 deletions(-)

diff --cc CHANGES.txt
index 11515c4,46b3f56..a809016
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,7 -1,5 +1,8 @@@
 -3.0.21
 +3.11.7
 + * Fix CQL formatting of read command restrictions for slow query log 
(CASSANDRA-15503)
 + * Allow sstableloader to use SSL on the native port (CASSANDRA-14904)
 +Merged from 3.0:
+  * Fix duplicated row on 2.x upgrades when multi-rows range tombstones 
interact with collection ones (CASSANDRA-15805)
   * Rely on snapshotted session infos on StreamResultFuture.maybeComplete to 
avoid race conditions (CASSANDRA-15667)
   * EmptyType doesn't override writeValue so could attempt to write bytes when 
expected not to (CASSANDRA-15790)
   * Fix index queries on partition key columns when some partitions contains 
only static data (CASSANDRA-13666)
diff --cc src/java/org/apache/cassandra/db/LegacyLayout.java
index 4ec0c30,8492de5..b28c72a
--- a/src/java/org/apache/cassandra/db/LegacyLayout.java
+++ b/src/java/org/apache/cassandra/db/LegacyLayout.java
@@@ -1891,9 -1934,9 +1936,9 @@@ public abstract class LegacyLayou
  if ((start.collectionName == null) != (stop.collectionName == 
null))
  {
  if (start.collectionName == null)
- stop = new LegacyBound(stop.bound, stop.isStatic, null);
 -stop = new 
LegacyBound(Slice.Bound.inclusiveEndOf(stop.bound.values), stop.isStatic, null);
++stop = new 
LegacyBound(ClusteringBound.inclusiveEndOf(stop.bound.values), stop.isStatic, 
null);
  else
- start = new LegacyBound(start.bound, start.isStatic, 
null);
 -start = new 
LegacyBound(Slice.Bound.inclusiveStartOf(start.bound.values), start.isStatic, 
null);
++start = new 
LegacyBound(ClusteringBound.inclusiveStartOf(start.bound.values), 
start.isStatic, null);
  }
  else if (!Objects.equals(start.collectionName, 
stop.collectionName))
  {
@@@ -1920,11 -1963,21 +1965,21 @@@
  return new LegacyRangeTombstone(newStart, stop, deletionTime);
  }
  
 -public LegacyRangeTombstone withNewStart(Slice.Bound newStart)
++public LegacyRangeTombstone withNewStart(ClusteringBound newStart)
+ {
+ return withNewStart(new LegacyBound(newStart, start.isStatic, 
null));
+ }
+ 
  public LegacyRangeTombstone withNewEnd(LegacyBound newStop)
  {
  return new LegacyRangeTombstone(start, newStop, deletionTime);
  }
  
 -public LegacyRangeTombstone withNewEnd(Slice.Bound newEnd)
++public LegacyRangeTombstone withNewEnd(ClusteringBound newEnd)
+ {
+ return withNewEnd(new LegacyBound(newEnd, stop.isStatic, null));
+ }
+ 
  public boolean isCell()
  {
  return false;
diff --cc src/java/org/apache/cassandra/db/UnfilteredDeserializer.java
index cdcde2e,2d270bc..262b333
--- a/src/java/org/apache/cassandra/db/UnfilteredDeserializer.java
+++ b/src/java/org/apache/cassandra/db/UnfilteredDeserializer.java
@@@ -480,19 -480,9 +480,10 @@@ public abstract class UnfilteredDeseria
  this.helper = helper;
  this.grouper = new LegacyLayout.CellGrouper(metadata, helper);
  this.tombstoneTracker = new 
TombstoneTracker(partitionDeletion);
- this.atoms = new AtomIterator(atomReader);
- }
- 
- private boolean isRow(LegacyLayout.LegacyAtom atom)
- {
- if (atom.isCell())
- return true;
- 
- LegacyLayout.LegacyRangeTombstone tombstone = 
atom.asRangeTombstone();
- return tombstone.isCollectionTombstone() || 
tombstone.isRowDeletion(metadata);
+ this.atoms = new AtomIterator(atomReader, metadata);
  }
  
 +
  public boolean hasNext()
  {
  // Note that we loop on next == null because 
TombstoneTracker.openNew() could return null below or the atom might be 
shadowed.
@@@ -540,13 -530,57 +531,57 @@@
  

[cassandra] branch cassandra-3.11 updated (3cda9d7 -> ebfd052)

2020-05-27 Thread slebresne
This is an automated email from the ASF dual-hosted git repository.

slebresne pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 3cda9d7  Merge branch cassandra-3.0 into cassandra-3.11
 new 8358e19  Fix legacy handling of RangeTombstone with collection ones
 new c8a2834  Fix LegacyLayout handling of non-selected collection 
tombstones
 new ebfd052  Merge commit 'c8a2834606d683ba9945e9cc11bdb4207ce269d1' into 
cassandra-3.11

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/db/LegacyLayout.java | 105 +
 .../cassandra/db/UnfilteredDeserializer.java   | 129 -
 .../upgrade/MixedModeRangeTombstoneTest.java   |  73 
 .../org/apache/cassandra/db/LegacyLayoutTest.java  | 102 +---
 5 files changed, 340 insertions(+), 70 deletions(-)
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/upgrade/MixedModeRangeTombstoneTest.java





[cassandra] 01/02: Fix legacy handling of RangeTombstone with collection ones

2020-05-27 Thread slebresne
This is an automated email from the ASF dual-hosted git repository.

slebresne pushed a commit to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 8358e19840d352475a5831d130ff3c43a11f2f4e
Author: Sylvain Lebresne 
AuthorDate: Fri May 8 18:12:55 2020 +0200

Fix legacy handling of RangeTombstone with collection ones

When a multi-row range tombstone interacts with a collection tombstone
within one of the covered rows, the resulting range tombstone in the legacy
format will start in the middle of the row and extend past said row, and
it needs special handling.

Before this commit, the code deserializing that RT was making it
artificially start at the end of the row (in which the collection
tombstone is), but that means that when `LegacyLayout.CellGrouper`
encountered it, it decided the row was finished, even if it was not,
leading to potential row duplication.

The patch solves this by:
1. making that problematic tombstone start at the beginning of the row
instead of its end (to avoid code deciding the row is over).
2. modifying `UnfilteredDeserializer` to 'split' that range tombstone into
a row tombstone for the row it covers, which is handled as a normal row
tombstone, and to push the rest of the range tombstone (that starts after
the row and extends to the original end of the RT) to be handled after
that row is fully "grouped".

The patch also removes the possibility of getting an empty row from
`LegacyLayout#getNextRow` to avoid theoretical problems with that.

Patch by Sylvain Lebresne; reviewed by Marcus Eriksson & Aleksey
Yeschenko for CASSANDRA-15805
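
The 'split' described in point 2 can be sketched roughly as follows. This is a
simplified illustration, not the code from the patch: rows are modelled as
plain integers, and `SplitSketch`, `Tombstone` and `splitAtRow` are
hypothetical names introduced only for this example.

```java
final class SplitSketch
{
    /** Hypothetical simplified tombstone covering rows startRow..endRow (both inclusive). */
    static final class Tombstone
    {
        final int startRow;
        final int endRow;
        final long deletionTime;

        Tombstone(int startRow, int endRow, long deletionTime)
        {
            this.startRow = startRow;
            this.endRow = endRow;
            this.deletionTime = deletionTime;
        }
    }

    /**
     * Splits a tombstone that starts within a row and extends past it.
     * Index 0 holds a row tombstone covering only the first row; index 1 holds
     * the remainder starting after that row, or null if nothing remains.
     */
    static Tombstone[] splitAtRow(Tombstone rt)
    {
        Tombstone rowPart = new Tombstone(rt.startRow, rt.startRow, rt.deletionTime);
        Tombstone rest = rt.endRow > rt.startRow
                       ? new Tombstone(rt.startRow + 1, rt.endRow, rt.deletionTime)
                       : null;
        return new Tombstone[]{ rowPart, rest };
    }
}
```

In this model the row part would be grouped with the other atoms of its row,
and the remainder would only be handled once that row is complete.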
---
 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/db/LegacyLayout.java |  99 
 .../cassandra/db/UnfilteredDeserializer.java   | 129 -
 .../upgrade/MixedModeRangeTombstoneTest.java   |  73 
 4 files changed, 252 insertions(+), 50 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index cdb9ad0..46b3f56 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.21
+ * Fix duplicated row on 2.x upgrades when multi-rows range tombstones 
interact with collection ones (CASSANDRA-15805)
  * Rely on snapshotted session infos on StreamResultFuture.maybeComplete to 
avoid race conditions (CASSANDRA-15667)
  * EmptyType doesn't override writeValue so could attempt to write bytes when 
expected not to (CASSANDRA-15790)
  * Fix index queries on partition key columns when some partitions contains 
only static data (CASSANDRA-13666)
diff --git a/src/java/org/apache/cassandra/db/LegacyLayout.java 
b/src/java/org/apache/cassandra/db/LegacyLayout.java
index 37cc935..39dd54a 100644
--- a/src/java/org/apache/cassandra/db/LegacyLayout.java
+++ b/src/java/org/apache/cassandra/db/LegacyLayout.java
@@ -1115,7 +1115,7 @@ public abstract class LegacyLayout
 return true;
 }
 
-private static Comparator legacyAtomComparator(CFMetaData 
metadata)
+static Comparator legacyAtomComparator(CFMetaData metadata)
 {
 return (o1, o2) ->
 {
@@ -1373,8 +1373,24 @@ public abstract class LegacyLayout
 this.hasValidCells = false;
 }
 
+/**
+ * Try adding the provided atom to the currently grouped row.
+ *
+ * @param atom the new atom to try to add. This must be a "row" 
atom, that is either a cell or a legacy
+ * range tombstone that covers only one row (row deletion) 
or a subset of it (collection
+ * deletion). Meaning that legacy range tombstone covering 
multiple rows (that should be handled as
+ * legit range tombstone in the new storage engine) should 
be handled separately. Atoms should also
+ * be provided in proper clustering order.
+ * @return {@code true} if the provided atom has been "consumed" by 
this grouper (this does _not_ mean the
+ *  atom has been "used" by the grouper as the grouper will 
skip some shadowed atoms for instance, just
+ *  that {@link #getRow()} shouldn't be called just yet if 
there is more atom in the atom iterator we're
+ *  grouping). {@code false} otherwise, that is if the row 
currently built by this grouper is done
+ *  _without_ the provided atom being "consumed" (and so 
{@link #getRow()} should be called and the
+ *  grouper resetted, after which the provided atom should be 
provided again).
+ */
 public boolean addAtom(LegacyAtom atom)
 {
+assert atom.isRowAtom(metadata) : "Unexpected non in-row legacy 
range tombstone " + atom;
 return atom.isCell()
  ? addCell(atom.asCell())
  : addRangeTombstone(atom.asRangeTombstone());
@@ -1472,11 +1488,16 @@ public abstract class 

[cassandra] 02/02: Fix LegacyLayout handling of non-selected collection tombstones

2020-05-27 Thread slebresne
This is an automated email from the ASF dual-hosted git repository.

slebresne pushed a commit to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit c8a2834606d683ba9945e9cc11bdb4207ce269d1
Author: Sylvain Lebresne 
AuthorDate: Wed May 13 11:44:08 2020 +0200

Fix LegacyLayout handling of non-selected collection tombstones

If a collection tombstone is not included by a query, it can be ignored,
but it currently made `LegacyLayout.CellGrouper#addCollectionTombstone`
return `false`, which made it stop the current row, which is incorrect
(this can potentially lead to a duplicate row). This patch changes it
to return `true`.

Patch by Sylvain Lebresne; reviewed by Marcus Eriksson & Aleksey
Yeschenko for CASSANDRA-15805
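
To illustrate why the return value matters, here is a minimal sketch, loosely
modelled on the behaviour described above rather than on the real
`CellGrouper`. Atoms are plain strings, the literal "skip" stands for a
non-selected collection tombstone, and `GrouperSketch` and its
`skippedAtomEndsRow` flag are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;

final class GrouperSketch
{
    /**
     * Groups atoms into rows. When skippedAtomEndsRow is true, an ignored atom
     * closes the current row (analogous to returning false); when it is false,
     * the ignored atom is dropped and the row continues (analogous to returning true).
     */
    static List<List<String>> group(List<String> atoms, boolean skippedAtomEndsRow)
    {
        List<List<String>> rows = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String atom : atoms)
        {
            if (atom.equals("skip"))
            {
                if (skippedAtomEndsRow && !current.isEmpty())
                {
                    rows.add(current);
                    current = new ArrayList<>();
                }
                continue;
            }
            current.add(atom);
        }
        if (!current.isEmpty())
            rows.add(current);
        return rows;
    }
}
```

With the atoms "a", "skip", "b" all belonging to one row, ending the row on the
skipped atom yields two partial rows, while continuing yields a single row.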
---
 src/java/org/apache/cassandra/db/LegacyLayout.java |   6 +-
 .../org/apache/cassandra/db/LegacyLayoutTest.java  | 102 +
 2 files changed, 88 insertions(+), 20 deletions(-)

diff --git a/src/java/org/apache/cassandra/db/LegacyLayout.java 
b/src/java/org/apache/cassandra/db/LegacyLayout.java
index 39dd54a..8492de5 100644
--- a/src/java/org/apache/cassandra/db/LegacyLayout.java
+++ b/src/java/org/apache/cassandra/db/LegacyLayout.java
@@ -1537,8 +1537,12 @@ public abstract class LegacyLayout
 
 private boolean addCollectionTombstone(LegacyRangeTombstone tombstone)
 {
+// If the collection tombstone is not included in the query (which 
technically would only apply to thrift
+// queries since CQL one "fetch" everything), we can skip it (so 
return), but we're problably still within
+// the current row so we return `true`. Technically, it is 
possible that tombstone belongs to another row
+// that the row currently grouped, but as we ignore it, returning 
`true` is ok in that case too.
 if (!helper.includes(tombstone.start.collectionName))
-return false; // see CASSANDRA-13109
+return true; // see CASSANDRA-13109
 
 // The helper needs to be informed about the current complex 
column identifier before
 // it can perform the comparison between the recorded drop time 
and the RT deletion time.
diff --git a/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java 
b/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java
index 0bb2459..f0d2a02 100644
--- a/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java
+++ b/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java
@@ -24,18 +24,19 @@ import java.nio.file.Files;
 import java.nio.file.Path;
 import java.nio.file.Paths;
 
+import org.apache.cassandra.db.LegacyLayout.CellGrouper;
+import org.apache.cassandra.db.LegacyLayout.LegacyBound;
+import org.apache.cassandra.db.LegacyLayout.LegacyCell;
+import org.apache.cassandra.db.LegacyLayout.LegacyRangeTombstone;
 import org.apache.cassandra.db.filter.ColumnFilter;
 import org.apache.cassandra.db.marshal.MapType;
 import org.apache.cassandra.db.marshal.UTF8Type;
-import org.apache.cassandra.db.partitions.ImmutableBTreePartition;
 import org.apache.cassandra.db.rows.BufferCell;
 import org.apache.cassandra.db.rows.Cell;
 import org.apache.cassandra.db.rows.RowIterator;
-import org.apache.cassandra.db.rows.Rows;
 import org.apache.cassandra.db.rows.SerializationHelper;
 import org.apache.cassandra.db.rows.UnfilteredRowIterator;
 import org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer;
-import org.apache.cassandra.db.rows.UnfilteredRowIterators;
 import org.apache.cassandra.db.transform.FilteredRows;
 import org.apache.cassandra.exceptions.ConfigurationException;
 import org.apache.cassandra.io.util.DataInputBuffer;
@@ -62,10 +63,10 @@ import org.apache.cassandra.db.rows.BTreeRow;
 import org.apache.cassandra.db.rows.Row;
 import org.apache.cassandra.dht.Murmur3Partitioner;
 import org.apache.cassandra.schema.KeyspaceParams;
-import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.Hex;
 
 import static org.apache.cassandra.net.MessagingService.VERSION_21;
+import static org.apache.cassandra.utils.ByteBufferUtil.bytes;
 import static org.junit.Assert.*;
 
 public class LegacyLayoutTest
@@ -98,7 +99,7 @@ public class LegacyLayoutTest
 builder.addComplexDeletion(b, new DeletionTime(1L, 1));
 Row row = builder.build();
 
-ByteBuffer key = ByteBufferUtil.bytes(1);
+ByteBuffer key = bytes(1);
 PartitionUpdate upd = PartitionUpdate.singleRowUpdate(table, key, row);
 
 LegacyLayout.LegacyUnfilteredPartition p = 
LegacyLayout.fromUnfilteredRowIterator(null, upd.unfilteredIterator());
@@ -216,7 +217,7 @@ public class LegacyLayoutTest
 builder.addCell(new BufferCell(v, 1L, Cell.NO_TTL, 
Cell.NO_DELETION_TIME, Int32Serializer.instance.serialize(1), null));
 Row row = builder.build();
 
-DecoratedKey pk = table.decorateKey(ByteBufferUtil.bytes(1));
+DecoratedKey pk = 

[cassandra] branch cassandra-3.0 updated (a4b6deb -> c8a2834)

2020-05-27 Thread slebresne
This is an automated email from the ASF dual-hosted git repository.

slebresne pushed a change to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from a4b6deb  Rely on snapshotted session infos on 
StreamResultFuture.maybeComplete to avoid races
 new 8358e19  Fix legacy handling of RangeTombstone with collection ones
 new c8a2834  Fix LegacyLayout handling of non-selected collection 
tombstones

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/db/LegacyLayout.java | 105 +
 .../cassandra/db/UnfilteredDeserializer.java   | 129 -
 .../upgrade/MixedModeRangeTombstoneTest.java   |  73 
 .../org/apache/cassandra/db/LegacyLayoutTest.java  | 102 +---
 5 files changed, 340 insertions(+), 70 deletions(-)
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/upgrade/MixedModeRangeTombstoneTest.java





[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-05-27 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117833#comment-17117833
 ] 

Sylvain Lebresne commented on CASSANDRA-12126:
--

bq. But we do have other serious consistency violations that should also be 
fixed.

Could you expand on that?


> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Lightweight Transactions, Legacy/Coordination
>Reporter: Sankalp Kohli
>Assignee: Sylvain Lebresne
>Priority: Normal
>  Labels: LWT, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3:
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in the accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
> value written in step 1. This step behaves as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the value 
> written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again and was never seen before. 
> Section 2.3 of Lamport's “Paxos Made Simple” paper talks about this issue, 
> namely how learners can find out whether a majority of the acceptors have 
> accepted the proposal. 
> In step 3, it is correct that we propose the value again since we don't know 
> if it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and one or more acceptors but not a majority have something in 
> flight, we have no way of knowing if it was accepted by a majority of 
> acceptors. So this behavior is correct. 
> However, we need to fix step 2, since it causes reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that a majority 
> of acceptors have no in-flight commit, which means we have a majority 
> confirming that nothing was accepted by a majority. I think we should run a 
> propose step here with an empty commit, and that will cause the write from 
> step 1 to never be visible afterwards. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read or never see it, which is what we want. 
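
The difference between steps 2 and 3 of the quoted description can be shown
with a minimal sketch. It is not Cassandra's Paxos implementation: `Replica`,
`seesInFlight` and `scenario` are hypothetical names, and each replica holds at
most one accepted-but-uncommitted value.

```java
import java.util.Arrays;
import java.util.List;

final class CasReadSketch
{
    /** Hypothetical, heavily simplified replica state. */
    static final class Replica
    {
        final String name;
        final String acceptedNotCommitted; // null when nothing is in flight

        Replica(String name, String acceptedNotCommitted)
        {
            this.name = name;
            this.acceptedNotCommitted = acceptedNotCommitted;
        }
    }

    /** True if any replica contacted by the read reports an in-flight proposal. */
    static boolean seesInFlight(List<Replica> contacted)
    {
        for (Replica r : contacted)
            if (r.acceptedNotCommitted != null)
                return true;
        return false;
    }

    /** Returns { result of step 2 (B and C), result of step 3 (A and B) }. */
    static boolean[] scenario()
    {
        Replica a = new Replica("A", "v1"); // accepted the step 1 proposal
        Replica b = new Replica("B", null);
        Replica c = new Replica("C", null);
        return new boolean[]{ seesInFlight(Arrays.asList(b, c)),
                              seesInFlight(Arrays.asList(a, b)) };
    }
}
```

Both reads contact a quorum of two out of three replicas, yet only the one that
includes A learns about the step 1 value.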



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Commented] (CASSANDRA-15778) CorruptSSTableException after a 2.1 SSTable is upgraded to 3.0, failing reads

2020-05-27 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117827#comment-17117827
 ] 

Sylvain Lebresne commented on CASSANDRA-15778:
--

That patch looks like a reasonable solution to me, at least from my 
understanding of the issue.

Small comments on the code itself:
* I'd put a comment in {{AlterTableStatement}} pointing to this ticket (it may 
feel like a peculiar special case to future readers without context).
* In {{AbstractType}}, the changes to {{writeValue}}/{{writtenLength}} feel 
confusing to me, and if the new code is ever triggered, this would mean we 
silently drop a value on the floor (we get a non-empty value, but the type says 
the value should be empty, so we'd write nothing), and that doesn't feel like a 
good idea. Instead of special-casing the 0 size case, I'd just add an {{assert 
valueLengthIfFixed < 0 || value.remaining() == valueLengthIfFixed}} to 
basically ensure we're not going to write something we don't know how to read 
(and effectively forbid calling those methods for {{EmptyType}}, in 
conjunction with the existing assert).
* Assuming we agree on the previous point, I'd prefer not overriding the 
methods in {{EmptyType}}. For the write ones, it wouldn't add anything, and 
overriding {{readValue}} feels confusing when the rest of the code ensures we 
can never write an empty value through those methods.
* Nit: LegacySchemaMigrator has unused leftover imports 
({{java.io.InvalidClassException}} and 
{{net.bytebuddy.implementation.bytecode.Throw}}).
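
As an illustration of the check suggested in the second point, here is a
minimal standalone sketch. `FixedLengthCheckSketch` and `checkWritable` are
hypothetical names, a negative `valueLengthIfFixed` is assumed to mean
"variable length", and an explicit throw replaces the `assert` keyword so the
check also fires when assertions are disabled.

```java
import java.nio.ByteBuffer;

final class FixedLengthCheckSketch
{
    /**
     * Rejects a value whose size does not match the fixed length declared by its type.
     * A negative valueLengthIfFixed is treated as "no fixed length" (assumption).
     */
    static void checkWritable(ByteBuffer value, int valueLengthIfFixed)
    {
        if (!(valueLengthIfFixed < 0 || value.remaining() == valueLengthIfFixed))
            throw new AssertionError("Expected " + valueLengthIfFixed
                                     + " bytes but got " + value.remaining());
    }
}
```

Under this check, a non-empty value paired with a fixed length of 0 fails
loudly instead of being silently written as nothing.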


> CorruptSSTableException after a 2.1 SSTable is upgraded to 3.0, failing reads
> -
>
> Key: CASSANDRA-15778
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15778
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Local/SSTable
>Reporter: Sumanth Pasupuleti
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 3.0.x
>
>
> Below is the exception with stack trace. This issue is consistently 
> reproduce-able.
> {code:java}
> ERROR [SharedPool-Worker-1] 2020-05-01 14:57:57,661 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,main]ERROR [SharedPool-Worker-1] 2020-05-01 
> 14:57:57,661 AbstractLocalAwareExecutorService.java:169 - Uncaught exception 
> on thread 
> Thread[SharedPool-Worker-1,5,main]org.apache.cassandra.io.sstable.CorruptSSTableException:
>  Corrupted: 
> /mnt/data/cassandra/data//  at 
> org.apache.cassandra.db.columniterator.AbstractSSTableIterator$Reader.hasNext(AbstractSSTableIterator.java:349)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.columniterator.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:220)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.columniterator.SSTableIterator.hasNext(SSTableIterator.java:33)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:131)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:77)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] at 
> 

[jira] [Updated] (CASSANDRA-15665) StreamManager should clearly differentiate between "initiator" and "receiver" sessions

2020-05-27 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15665:
---
Reviewers: Benjamin Lerer, Sergio Bossa, Benjamin Lerer  (was: Benjamin 
Lerer, Sergio Bossa)
   Status: Review In Progress  (was: Patch Available)

> StreamManager should clearly differentiate between "initiator" and "receiver" 
> sessions
> --
>
> Key: CASSANDRA-15665
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15665
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0
>
>
> {{StreamManager}} currently does a suboptimal job of differentiating between 
> stream sessions (in the form of {{StreamResultFuture}}) which have been either 
> initiated or "received", for the following reasons:
> 1) Naming is IMO confusing: a "receiver" session could actually both send and 
> receive files, so technically an initiator is also a receiver.
> 2) {{StreamManager#findSession()}} assumes we should first look into 
> "initiator" sessions, then into "receiver" ones: this is a dangerous 
> assumption, in particular for test environments where the same process could 
> work as both an initiator and a receiver.
> I would recommend the following changes:
> 1) Rename "receiver" to "follower" everywhere the former is used.
> 2) Introduce a new flag into {{StreamMessageHeader}} to signal if the message 
> comes from an initiator or follower session, in order to correctly 
> differentiate and look for sessions in {{StreamManager}}.
> While my arguments above might seem trivial, I believe they will improve 
> clarity and save us from potential bugs/headaches at testing time, and now 
> that we're revamping streaming for 4.0 seems the right time for such changes.
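
The second recommendation can be pictured with a small sketch. It is an
assumption-laden illustration and not the actual {{StreamManager}} API:
`SessionLookupSketch`, its two maps and the `sentByInitiator` parameter are
hypothetical, and sessions are represented by plain strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class SessionLookupSketch
{
    private final Map<UUID, String> initiatorSessions = new HashMap<>();
    private final Map<UUID, String> followerSessions = new HashMap<>();

    void registerInitiator(UUID planId, String session) { initiatorSessions.put(planId, session); }

    void registerFollower(UUID planId, String session) { followerSessions.put(planId, session); }

    /**
     * Looks up the local session for an incoming message. A message sent by the
     * initiator is addressed to our follower session, and vice versa, so no
     * lookup order between the two maps has to be assumed.
     */
    String findSession(UUID planId, boolean sentByInitiator)
    {
        return sentByInitiator ? followerSessions.get(planId)
                               : initiatorSessions.get(planId);
    }
}
```

With an explicit flag, a single process that holds both sides of the same plan
(as can happen in tests) still resolves each message to the right session.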






[jira] [Commented] (CASSANDRA-15812) Submitting Validation requests can block ANTI_ENTROPY stage

2020-05-27 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117787#comment-17117787
 ] 

Benjamin Lerer commented on CASSANDRA-15812:


The patch looks good to me. I just wonder if it makes sense to allow users to 
use the blocking behavior for the {{ValidationExecutor}} as we know that this 
approach might lead to blocking the ANTI_ENTROPY stage. Is there a scenario in 
which using that behavior would be better?

> Submitting Validation requests can block ANTI_ENTROPY stage 
> 
>
> Key: CASSANDRA-15812
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15812
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 4.0-alpha
>
>
>  RepairMessages are handled on Stage.ANTI_ENTROPY, which has a thread pool 
> with core/max capacity of one, i.e. we can only process one message at a time. 
>  
> Scheduling validation compactions may however block the stage completely, by 
> blocking on CompactionManager's ValidationExecutor while submitting a new 
> validation compaction, in cases where there are already more validations 
> running than can be executed in parallel.
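
One way to keep a submitting thread from blocking is to queue excess work
instead of waiting for a free worker. The sketch below shows that general idea
with plain `java.util.concurrent` classes. It is not the patch under review:
`NonBlockingSubmitSketch` is a hypothetical name, and the single-worker pool
with an unbounded queue is an assumption chosen for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class NonBlockingSubmitSketch
{
    /** Returns how many tasks are queued while the only worker is busy. */
    static int queuedWhileBusy()
    {
        // Stand-in for a validation executor: one worker, unbounded queue.
        ThreadPoolExecutor validation =
            new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        final CountDownLatch release = new CountDownLatch(1);
        try
        {
            // The first task occupies the only worker until released.
            validation.execute(() -> awaitQuietly(release));
            // The second submission returns immediately: the task is queued
            // and the calling thread is free to process further messages.
            validation.execute(() -> {});
            return validation.getQueue().size();
        }
        finally
        {
            release.countDown();
            validation.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch)
    {
        try
        {
            latch.await();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
    }
}
```

The trade-off is that an unbounded queue gives no back-pressure, which may be
one reason to keep a blocking mode available as an option.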






[jira] [Updated] (CASSANDRA-15825) Fix flaky test incrementalSSTableSelection - org.apache.cassandra.db.streaming.CassandraStreamManagerTest

2020-05-27 Thread Berenguer Blasi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-15825:

Test and Documentation Plan: See PR
 Status: Patch Available  (was: In Progress)

> Fix flaky test incrementalSSTableSelection - 
> org.apache.cassandra.db.streaming.CassandraStreamManagerTest
> -
>
> Key: CASSANDRA-15825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15825
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-alpha
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Build link: 
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/287/workflows/06baf3db-7094-431f-920d-e8fcd1da9cce/jobs/1398
>  
> {code}
> java.lang.RuntimeException: java.nio.file.NoSuchFileException: 
> /tmp/cassandra/build/test/cassandra/data:2/ks_1589913975959/tbl-051c0a709a0111eab5fb6f52366536f8/na-4-big-Statistics.db
>   at 
> org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:55)
>   at 
> org.apache.cassandra.io.util.ChannelProxy.<init>(ChannelProxy.java:66)
>   at 
> org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:315)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:126)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:136)
>   at 
> org.apache.cassandra.io.sstable.format.SSTableReader.reloadSSTableMetadata(SSTableReader.java:2047)
>   at 
> org.apache.cassandra.db.streaming.CassandraStreamManagerTest.mutateRepaired(CassandraStreamManagerTest.java:128)
>   at 
> org.apache.cassandra.db.streaming.CassandraStreamManagerTest.incrementalSSTableSelection(CassandraStreamManagerTest.java:175)
> Caused by: java.nio.file.NoSuchFileException: 
> /tmp/cassandra/build/test/cassandra/data:2/ks_1589913975959/tbl-051c0a709a0111eab5fb6f52366536f8/na-4-big-Statistics.db
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
>   at java.nio.channels.FileChannel.open(FileChannel.java:287)
>   at java.nio.channels.FileChannel.open(FileChannel.java:335)
>   at 
> org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:51)
> {code}






[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-05-27 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117652#comment-17117652
 ] 

Benedict Elliott Smith commented on CASSANDRA-12126:


bq. I'm rather cold on that because, tbh. I think non-strict serializability is 
a theoretical notion that is useless in practice and that it is something we 
should not offer. And I'd rather avoid one more "feature" for which we spend 
our time saying "don't use it".

Yeah, I'm very sympathetic to this view, and have always assumed 
linearizability with partitions as the object.  I'm just really trying to 
morally justify providing some time to fix this without any negative 
repercussions.  

Either way, we should definitely clarify what we mean by SERIAL in some 
official project documentation somewhere though.  We probably need to do so in 
terms of strict serializability as opposed to linearizability, so that it can 
be consistent with a future in which we support multi-partition transactions 
(which as a project we really need to deliver in the not-too-distant future).

bq. non-applying CAS for that

FWIW, I think this particular case is a no-brainer; there's no real cost to 
strengthening the semantics of non-applying CAS IMO, since users should 
anticipate their CAS operations will ordinarily take this long.  Whatever the 
conclusion of our discussion, I think we should apply a fix at least for the 
non-applying case immediately, and I do not believe any flag to disable this 
part of the fix is necessary.

Reads are trickier, because the user will see a significant performance penalty 
on a patch version upgrade.  I'm sympathetic to the view that we should just fix 
the read part immediately, performance regressions be damned.  But we do have 
other serious consistency violations that should also be fixed.  I think it is 
worth _considering_ whether we should instead aggressively try to remedy all of 
the known issues, have a strong verification push, and then roll out all of the 
changes at once, including a fix for this that does not regress performance.  It 
might seem a lot for a patch version, but I'm not sure risk is a concern when we 
know there are several serious problems today, and have been for years.

I'm not going to advocate super strongly for either approach, as I don't think 
there's a clear answer; I just want to raise the alternative as an option to 
expressly consider.

bq. Awesome, thanks. I'll look at integrating those in the branch if you don't 
mind.

Absolutely, that was my intention.


[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-05-27 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117628#comment-17117628
 ] 

Sylvain Lebresne commented on CASSANDRA-12126:
--

bq. I'm amenable to such a flag

Actually, let me rephrase that a bit. I'd *really* prefer not adding such a 
flag. If someone is ok with serializability without linearizability, then they 
can use QUORUM reads, and given how things are implemented, that provides 
(non-strict) serializability. Granted, for someone who uses SERIAL today, is ok 
with the lack of linearizability and can't afford the performance penalty, 
it'll require a client-side change, which this flag would avoid, so such a flag 
is not without value. But I suspect the number of users fitting that category 
(knowingly ok with the lack of linearizability) is really, really small, and we 
always have to make trade-offs. So in this case I feel adding one more flag, 
one I consider dangerous, is not worth it. To clarify: if a consensus emerges 
for such a flag, so be it, I'll add it, but I'm personally not neutral 
either.

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Lightweight Transactions, Legacy/Coordination
> Reporter: Sankalp Kohli
> Assignee: Sylvain Lebresne
> Priority: Normal
> Labels: LWT, pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3:
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in the accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
> value written in step 1. This step is as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the value 
> written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again and was never seen before. 
> Section 2.3 of Lamport's “Paxos Made Simple” paper talks about this issue, 
> which is how learners can find out whether a majority of the acceptors have 
> accepted the proposal. 
> In step 3, it is correct that we propose the value again, since we don't know 
> if it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and at least one acceptor but not a majority has something in 
> flight, we have no way of knowing if it was accepted by a majority of 
> acceptors. So this behavior is correct. 
> However, we need to fix step 2, since it causes reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that a majority 
> of acceptors have no in-flight commit, which means we know that nothing was 
> accepted by a majority. I think we should run a propose step here with an 
> empty commit, and that will cause the write from step 1 to not be visible 
> ever after. 
> With this fix, we will either see data written in step 1 on next serial read 
> or will never see it which is what we want. 
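
The scenario in the description can be replayed with a toy model. The sketch below is an illustration only, not Cassandra's Paxos implementation: `Replica`, `serial_read`, `run_scenario`, the integer ballots and the `fix` switch are all hypothetical, and the promise phase, timeouts and real quorum handling are left out. Under those assumptions it shows two SERIAL reads disagreeing with no write in between, and how proposing an empty commit in step 2 keeps the step 1 value from surfacing later.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Replica:
    """Toy Paxos state for one partition on one replica (hypothetical model)."""
    name: str
    accepted_ballot: int = 0              # ballot of the latest accepted proposal
    accepted_value: Optional[str] = None  # None models "nothing" or an empty proposal
    committed_ballot: int = 0
    committed_value: Optional[str] = None


def serial_read(quorum: List[Replica], ballot: int, fix: bool) -> Optional[str]:
    """Toy SERIAL read that only looks at the contacted replicas."""
    latest = max(quorum, key=lambda r: r.accepted_ballot)
    newest_commit = max(r.committed_ballot for r in quorum)
    in_flight = (latest.accepted_value is not None
                 and latest.accepted_ballot > newest_commit)
    if in_flight:
        # Step 3 of the ticket: re-propose the in-flight value and commit it.
        value = latest.accepted_value
        for r in quorum:
            r.accepted_ballot, r.accepted_value = ballot, value
            r.committed_ballot, r.committed_value = ballot, value
    elif fix:
        # Proposed fix for step 2: propose an empty commit so that an older
        # in-flight value on an uncontacted replica is superseded.
        for r in quorum:
            r.accepted_ballot, r.accepted_value = ballot, None
    newest = max(quorum, key=lambda r: r.committed_ballot)
    return newest.committed_value


def run_scenario(fix: bool) -> Tuple[Optional[str], Optional[str]]:
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    # Step 1: the CAS write is accepted by A only and never committed.
    a.accepted_ballot, a.accepted_value = 1, "v1"
    first = serial_read([b, c], ballot=2, fix=fix)   # step 2: B and C only
    second = serial_read([a, b], ballot=3, fix=fix)  # step 3: A and B
    return first, second


print(run_scenario(fix=False))  # (None, 'v1')
print(run_scenario(fix=True))   # (None, None)
```

With `fix=False` the second read surfaces a value the first read could not see; with `fix=True` the empty proposal on B carries a newer ballot than A's stale accept, so the model never revives the step 1 value.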






[jira] [Commented] (CASSANDRA-15825) Fix flaky test incrementalSSTableSelection - org.apache.cassandra.db.streaming.CassandraStreamManagerTest

2020-05-27 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117624#comment-17117624
 ] 

Berenguer Blasi commented on CASSANDRA-15825:
-

I noticed that commenting out the other test in that class made this one fail 
consistently, so there seems to be cross-talk between test cases. It seems that 
on the 4th sstable a compaction is triggered, and that can remove the sstables 
from under our feet depending on who is faster. The fix I applied is to prevent 
the compaction from running. I hope it makes sense. Waiting on CI now.
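
As a rough illustration of the failure mode described above, the sketch below models a table that compacts once it holds four sstables. Everything here is hypothetical (`ToyTable`, `fourth_sstable_survives`, the file names, and the threshold of 4 taken from the comment); in the real test the compaction runs in the background, which is why the failure is intermittent, whereas this toy runs it synchronously so the effect is deterministic.

```python
from typing import List


class ToyTable:
    """Hypothetical stand-in for a table that auto-compacts at `threshold` sstables."""

    def __init__(self, threshold: int = 4, auto_compaction: bool = True) -> None:
        self.threshold = threshold
        self.auto_compaction = auto_compaction
        self.sstables: List[str] = []
        self._generation = 0

    def flush(self) -> str:
        """Write a new sstable and, if enabled, compact once the threshold is hit."""
        self._generation += 1
        name = f"na-{self._generation}-big"
        self.sstables.append(name)
        if self.auto_compaction and len(self.sstables) >= self.threshold:
            self._compact()
        return name

    def _compact(self) -> None:
        # Compaction replaces all input sstables with a single new one,
        # so the input files are no longer present afterwards.
        self._generation += 1
        self.sstables = [f"na-{self._generation}-big"]


def fourth_sstable_survives(auto_compaction: bool) -> bool:
    """Flush four sstables and report whether the fourth is still there."""
    table = ToyTable(auto_compaction=auto_compaction)
    names = [table.flush() for _ in range(4)]
    return names[3] in table.sstables


print(fourth_sstable_survives(auto_compaction=True))   # False
print(fourth_sstable_survives(auto_compaction=False))  # True
```

Disabling compaction for the duration of the test keeps the files the test is about to reopen in place, which is the effect the applied fix aims for.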

> Fix flaky test incrementalSSTableSelection - 
> org.apache.cassandra.db.streaming.CassandraStreamManagerTest
> -
>
> Key: CASSANDRA-15825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15825
> Project: Cassandra
> Issue Type: Bug
> Components: Test/unit
> Reporter: David Capwell
> Assignee: Berenguer Blasi
> Priority: Normal
> Fix For: 4.0-alpha
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Build link: 
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/287/workflows/06baf3db-7094-431f-920d-e8fcd1da9cce/jobs/1398
>  
> {code}
> java.lang.RuntimeException: java.nio.file.NoSuchFileException: /tmp/cassandra/build/test/cassandra/data:2/ks_1589913975959/tbl-051c0a709a0111eab5fb6f52366536f8/na-4-big-Statistics.db
>   at org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:55)
>   at org.apache.cassandra.io.util.ChannelProxy.<init>(ChannelProxy.java:66)
>   at org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:315)
>   at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:126)
>   at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:136)
>   at org.apache.cassandra.io.sstable.format.SSTableReader.reloadSSTableMetadata(SSTableReader.java:2047)
>   at org.apache.cassandra.db.streaming.CassandraStreamManagerTest.mutateRepaired(CassandraStreamManagerTest.java:128)
>   at org.apache.cassandra.db.streaming.CassandraStreamManagerTest.incrementalSSTableSelection(CassandraStreamManagerTest.java:175)
> Caused by: java.nio.file.NoSuchFileException: /tmp/cassandra/build/test/cassandra/data:2/ks_1589913975959/tbl-051c0a709a0111eab5fb6f52366536f8/na-4-big-Statistics.db
>   at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
>   at java.nio.channels.FileChannel.open(FileChannel.java:287)
>   at java.nio.channels.FileChannel.open(FileChannel.java:335)
>   at org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:51)
> {code}






[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-05-27 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117615#comment-17117615
 ] 

Sylvain Lebresne commented on CASSANDRA-12126:
--

{quote}I think we should include a flag to disable the fix
{quote}
The option of having a flag occurred to me, but I rejected it initially because 
I continue to believe the current behavior is wrong (a moral judgment, I guess) 
and in principle, having a "please, make my database broken" flag does not feel 
like a good idea.

But I reckon that there _may_ exist advanced users who did notice the lack of 
linearizability for reads and effectively built around it knowingly, for whom 
the performance impact may be considered a regression with no upside (but if 
you sense skepticism on my part when reading that sentence, your radar is not 
completely off).

And as we're talking about a minor upgrade here, I'm amenable to such a flag, 
though I'd prefer making it clear somehow that it is unsafe/risky and something 
we may remove in the future with no particular warning.
{quote}It would be good to have a test for that as well.
{quote}
Certainly, good point, I can add the 2 missing interleavings.
{quote}do we actually claim our consistency properties are for SERIAL?
{quote}
While our official doc on the matter is certainly lacking (not spelling out 
much of a guarantee at all afaict, and I'm happy to piggy-back on this ticket 
to correct that), we've always implied linearizability. I have, at least, and 
I'm sure I can dig up others doing it as well on the mailing list if necessary. 
We did this both by throwing the word "linearizable" around from time to time 
and by repeatedly recommending that when a write times out, one needs to issue 
a SERIAL read to 'observe' if that write went through or not (and as an aside, 
if you can't rely on either reads or non-applying CAS for that, I'm not even 
sure how to use LWTs, except maybe for excessively specific cases).
{quote}perhaps we should instead introduce a new STRICT_SERIAL consistency level
{quote}
I'm rather cold on that because, tbh, I think non-strict serializability is a 
theoretical notion that is useless in practice and that it is something we 
should not offer. And I'd rather avoid one more "feature" for which we spend 
our time saying "don't use it".
{quote}I've pushed various test cases
{quote}
Awesome, thanks. I'll look at integrating those in the branch if you don't mind.


[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-05-27 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117597#comment-17117597
 ] 

Benedict Elliott Smith commented on CASSANDRA-12126:


I've pushed various test cases 
[here|https://github.com/belliottsmith/cassandra/tree/12126-tests-3.0] - most 
of them marked {{@Ignore}} because they are known to fail, and won't be 
resolved immediately.







[jira] [Assigned] (CASSANDRA-15830) Invalid version value: 4.0~alpha4 during startup

2020-05-27 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever reassigned CASSANDRA-15830:
--

Assignee: Michael Semb Wever

> Invalid version value: 4.0~alpha4 during startup
> 
>
> Key: CASSANDRA-15830
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15830
> Project: Cassandra
> Issue Type: Bug
> Reporter: Eric Wong
> Assignee: Michael Semb Wever
> Priority: Normal
>
> Hi:
> We are testing the latest cassandra-4.0 on CentOS 7 using a clean database.  
> When we started Cassandra the first time, everything was fine.  However, when 
> we stopped and restarted Cassandra, we got the following error and the db 
> refused to start up:
> {code}
> ERROR [main] 2020-05-22 05:58:18,698 CassandraDaemon.java:789 - Exception encountered during startup
> java.lang.IllegalArgumentException: Invalid version value: 4.0~alpha4
>  at org.apache.cassandra.utils.CassandraVersion.<init>(CassandraVersion.java:64)
>  at org.apache.cassandra.io.sstable.SSTableHeaderFix.fixNonFrozenUDTIfUpgradeFrom30(SSTableHeaderFix.java:84)
>  at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:250)
>  at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:650)
>  at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:767)
> {code}
> The only way to get the node up and running again is by deleting all data 
> under /var/lib/cassandra.
>  
> Is that a known issue?
> Thanks, Eric
>  
>  
>  
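
To make the reported error concrete, the sketch below shows how a strict version parser that only allows `-` before a pre-release tag would reject the RPM-style `~` separator. The pattern and `parse_version` are hypothetical and are not the regex `CassandraVersion` actually uses; the only detail taken from the report is the error message and the two version spellings.

```python
import re
from typing import Optional, Tuple

# Hypothetical strict pattern: major.minor[.patch][-prerelease].
# This is an assumption for illustration, not Cassandra's real pattern.
VERSION_RE = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?(?:-([\w.]+))?")


def parse_version(text: str) -> Tuple[int, int, int, Optional[str]]:
    """Parse a version string, raising ValueError on anything unexpected."""
    match = VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"Invalid version value: {text}")
    major, minor, patch, pre = match.groups()
    return int(major), int(minor), int(patch or 0), pre


for candidate in ("4.0-alpha4", "4.0~alpha4"):
    try:
        print(candidate, "->", parse_version(candidate))
    except ValueError as error:
        print(candidate, "->", error)
```

Under this assumed pattern the `-` form parses and the `~` form fails with the same message as in the report, which is consistent with a packaging-style version string having ended up where a release-style one was expected.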


