[jira] [Updated] (CASSANDRA-15274) Multiple Corrupt datafiles across entire environment

2019-10-28 Thread Phil O Conduin (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phil O Conduin updated CASSANDRA-15274:
---
Resolution: Fixed
Status: Resolved  (was: Triage Needed)

Our datafile corruption issues were caused by the OS wrongly taking a block 
that belonged to a C* data file, treating it as no longer used, and handing it 
out as a free block to be reused later.

For example:
C* deletes a file after compaction. The OS collects all the blocks that are now 
free and sends a TRIM command to the SSD, but from time to time the SSD picks 
the wrong block, not the one reported by the OS, and trims it. This causes 
zeroed blocks to appear in the datafile, and the block is later reused for a 
different file.
So the symptom is that we suddenly see 4096 zeroes in the datafile, which means 
the SSD has just trimmed the block. After some time we see other data written 
to that block, which means it is now used by another file, and that gives us a 
corrupt file.

We turned off the scheduled TRIM function on the OS and we are no longer 
getting corruptions.

This was very difficult to pinpoint.

> Multiple Corrupt datafiles across entire environment 
> -
>
> Key: CASSANDRA-15274
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15274
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Phil O Conduin
>Priority: Normal
>  Labels: impact-high
>
> Cassandra Version: 2.2.13
> PRE-PROD environment.
>  * 2 datacenters.
>  * 9 physical servers in each datacenter - (_Cisco UCS C220 M4 SFF_)
>  * 4 Cassandra instances on each server (cass_a, cass_b, cass_c, cass_d)
>  * 72 Cassandra instances across the 2 data centres, 36 in site A, 36 in site 
> B.
> We also have 2 Reaper Nodes we use for repair.  One reaper node in each 
> datacenter each running with its own Cassandra back end in a cluster together.
> OS Details [Red Hat Linux]
> cass_a@x 0 10:53:01 ~ $ uname -a
> Linux x 3.10.0-957.5.1.el7.x86_64 #1 SMP Wed Dec 19 10:46:58 EST 2018 x86_64 
> x86_64 x86_64 GNU/Linux
> cass_a@x 0 10:57:31 ~ $ cat /etc/*release
> NAME="Red Hat Enterprise Linux Server"
> VERSION="7.6 (Maipo)"
> ID="rhel"
> Storage Layout 
> cass_a@xx 0 10:46:28 ~ $ df -h
> Filesystem                         Size  Used Avail Use% Mounted on
> /dev/mapper/vg01-lv_root            20G  2.2G   18G  11% /
> devtmpfs                            63G     0   63G   0% /dev
> tmpfs                               63G     0   63G   0% /dev/shm
> tmpfs                               63G  4.1G   59G   7% /run
> tmpfs                               63G     0   63G   0% /sys/fs/cgroup
> >> 4 cassandra instances
> /dev/sdd                           1.5T  802G  688G  54% /data/ssd4
> /dev/sda                           1.5T  798G  692G  54% /data/ssd1
> /dev/sdb                           1.5T  681G  810G  46% /data/ssd2
> /dev/sdc                           1.5T  558G  932G  38% /data/ssd3
> Cassandra load is about 200GB and the rest of the space is snapshots
> CPU
> cass_a@x 127 10:58:47 ~ $ lscpu | grep -E '^Thread|^Core|^Socket|^CPU\('
> CPU(s):                64
> Thread(s) per core:    2
> Core(s) per socket:    16
> Socket(s):             2
> *Description of problem:*
> During repair of the cluster, we are seeing multiple corruptions in the log 
> files on a lot of instances.  There seems to be no pattern to the corruption. 
> It seems that the repair job is finding all the corrupted files for us.  The 
> repair will hang on the node where the corrupted file is found.  To fix this 
> we remove/rename the datafile and bounce the Cassandra instance.  Our 
> hardware/OS team have stated there is no problem on their side.  I do not 
> believe it is the repair causing the corruption. 
>  
> So let me give you an example of a corrupted file and maybe someone might be 
> able to work through it with me?
> When this corrupted file was reported in the log it looks like it was the 
> repair that found it.
> $ journalctl -u cassmeta-cass_b.service --since "2019-08-07 22:25:00" --until 
> "2019-08-07 22:45:00"
> Aug 07 22:30:33 cassandra[34611]: INFO  21:30:33 Writing 
> Memtable-compactions_in_progress@830377457(0.008KiB serialized bytes, 1 ops, 
> 0%/0% of on/off-heap limit)
> Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Failed creating a merkle 
> tree for [repair #9587a200-b95a-11e9-8920-9f72868b8375 on KeyspaceMetadata/x, 
> (-1476350953672479093,-1474461
> Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Exception in thread 
> Thread[ValidationExecutor:825,1,main]
> Aug 07 22:30:33 cassandra[34611]: org.apache.cassandra.io.FSReadError: 
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /x/ssd2/data/KeyspaceMetadata/x-1e453cb0
> Aug 07 22:30:33 cassandra[34611]: at 
> 

[jira] [Commented] (CASSANDRA-15274) Multiple Corrupt datafiles across entire environment

2019-10-28 Thread Phil O Conduin (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961436#comment-16961436
 ] 

Phil O Conduin commented on CASSANDRA-15274:


Hi Benedict,

Sorry I forgot to come back and update this jira.

 

Our datafile corruption issues were caused by the OS wrongly taking a block 
that belonged to a C* data file, treating it as no longer used, and handing it 
out as a free block to be reused later.

For example:
C* deletes a file after compaction. The OS collects all the blocks that are now 
free and sends a TRIM command to the SSD, but from time to time the SSD picks 
the wrong block, not the one reported by the OS, and trims it. This causes 
zeroed blocks to appear in the datafile, and the block is later reused for a 
different file.
So the symptom is that we suddenly see 4096 zeroes in the datafile, which means 
the SSD has just trimmed the block. After some time we see other data written 
to that block, which means it is now used by another file, and that gives us a 
corrupt file.

We turned off the scheduled TRIM function on the OS and we are no longer 
getting corruptions.

This was very difficult to pinpoint.


[jira] [Updated] (CASSANDRA-15382) Fix flaky unit test - testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest

2019-10-28 Thread Vinay Chella (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Chella updated CASSANDRA-15382:
-
Resolution: Duplicate
Status: Resolved  (was: Triage Needed)

> Fix flaky unit test - 
> testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest
> ---
>
> Key: CASSANDRA-15382
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15382
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Vinay Chella
>Priority: Normal
>
> As part of Apache Cassandra 4.0-alpha2, 
> "testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest" was 
> found to be flaky.
>  
>  +Failing test:+
>  *testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest*
> {code:java}
> junit.framework.AssertionFailedError
>   at 
> org.apache.cassandra.transport.IdleDisconnectTest.testIdleDisconnect(IdleDisconnectTest.java:56)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  
> +Sample failed runs:+
>  * 
> [https://circleci.com/gh/vinaykumarchella/cassandra/527#tests/containers/37]
>  * 
> [https://circleci.com/gh/vinaykumarchella/cassandra/535#tests/containers/37]
>  * [https://circleci.com/gh/vinaykumarchella/cassandra/489#tests/containers/7]
>  * 
> [https://circleci.com/gh/vinaykumarchella/cassandra/512#tests/containers/37]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15382) Fix flaky unit test - testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest

2019-10-28 Thread Andrew Prudhomme (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961365#comment-16961365
 ] 

Andrew Prudhomme commented on CASSANDRA-15382:
--

I think this is a duplicate of 
https://issues.apache.org/jira/browse/CASSANDRA-15310




[jira] [Updated] (CASSANDRA-15381) Failing test - testDatabaseDescriptorRef::org.apache.cassandra.config.DatabaseDescriptorRefTest

2019-10-28 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15381:
-
Resolution: Fixed
Status: Resolved  (was: Triage Needed)

> Failing test - 
> testDatabaseDescriptorRef::org.apache.cassandra.config.DatabaseDescriptorRefTest
> ---
>
> Key: CASSANDRA-15381
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15381
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Vinay Chella
>Priority: Normal
>
> As part of Apache Cassandra 4.0-alpha2 voting, the following test is failing 
> across different test suites and runs. 
> CircleCI Run: 
> [https://circleci.com/gh/vinaykumarchella/cassandra/487#tests/containers/37]
> *testDatabaseDescriptorRef-compression - 
> org.apache.cassandra.config.DatabaseDescriptorRefTest*
> {code:java}
> junit.framework.AssertionFailedError
>   at 
> org.apache.cassandra.config.DatabaseDescriptorRefTest.checkViolations(DatabaseDescriptorRefTest.java:293)
>   at 
> org.apache.cassandra.config.DatabaseDescriptorRefTest.testDatabaseDescriptorRef(DatabaseDescriptorRefTest.java:277){code}






[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk

2019-10-28 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15371:
-
      Fix Version/s: 3.11.x
                     3.0.x
                     2.2.x
                     4.0
      Since Version: 2.2.x
Source Control Link: https://github.com/apache/cassandra/commit/c009a7b9d4e3c32e88238a7ba8f67b14d287c5d3
         Resolution: Fixed
             Status: Resolved  (was: Ready to Commit)

Thanks for the patch.

> Incorrect messaging service version breaks in-JVM upgrade tests on trunk
> 
>
> Key: CASSANDRA-15371
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15371
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
>
> The in-JVM upgrade tests on trunk currently fail because the messaging
>  version for internode messaging is selected as 
> {{MessagingService.current_version}},
>  a regression from the implementation in CASSANDRA-15078.






[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk

2019-10-28 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15371:
-
Status: Ready to Commit  (was: Review In Progress)

+1




[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk

2019-10-28 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15371:
-
Status: Patch Available  (was: In Progress)




[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk

2019-10-28 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15371:
-
Reviewers: Dinesh Joshi, Dinesh Joshi  (was: Dinesh Joshi)
   Status: Review In Progress  (was: Patch Available)




[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk

2019-10-28 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15371:
-
Status: In Progress  (was: Patch Available)




[cassandra] branch cassandra-3.11 updated (b697af8 -> 9c1925a)

2019-10-28 Thread djoshi
This is an automated email from the ASF dual-hosted git repository.

djoshi pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


    from b697af8  Merge branch 'cassandra-3.0' into cassandra-3.11
     add 02cc5df  In-JVM DTest: Set correct internode message version for upgrade test
     add bfa7998  Merge branch 'cassandra-2.2' into cassandra-3.0
     add 9c1925a  Merge branch 'cassandra-3.0' into cassandra-3.11

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt |  4 
 .../org/apache/cassandra/distributed/impl/Instance.java | 13 +++--
 2 files changed, 15 insertions(+), 2 deletions(-)





[cassandra] branch trunk updated (b97fc30 -> c009a7b)

2019-10-28 Thread djoshi

djoshi pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


    from b97fc30  Remove StageManager
     add 02cc5df  In-JVM DTest: Set correct internode message version for upgrade test
     add bfa7998  Merge branch 'cassandra-2.2' into cassandra-3.0
     add 9c1925a  Merge branch 'cassandra-3.0' into cassandra-3.11
     add c009a7b  Merge branch 'cassandra-3.11' into trunk

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt|  4 
 .../org/apache/cassandra/distributed/impl/Instance.java| 14 +-
 .../apache/cassandra/config/DatabaseDescriptorRefTest.java |  1 +
 3 files changed, 18 insertions(+), 1 deletion(-)





[cassandra] branch cassandra-2.2 updated (4ee4cee -> 02cc5df)

2019-10-28 Thread djoshi

djoshi pushed a change to branch cassandra-2.2
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


    from 4ee4cee  Fix 2.2 deb Build-Depends JDK version to match build.xml source.version
     add 02cc5df  In-JVM DTest: Set correct internode message version for upgrade test

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt  |  4 
 .../org/apache/cassandra/distributed/impl/Instance.java  | 12 +++-
 2 files changed, 15 insertions(+), 1 deletion(-)





[cassandra] branch cassandra-3.0 updated (a81bfd6 -> bfa7998)

2019-10-28 Thread djoshi

djoshi pushed a change to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


    from a81bfd6  Use `rm -f` in rpm spec to prevent failure on missing file
     add 02cc5df  In-JVM DTest: Set correct internode message version for upgrade test
     add bfa7998  Merge branch 'cassandra-2.2' into cassandra-3.0

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt  |  5 +
 .../org/apache/cassandra/distributed/impl/Instance.java  | 12 +++-
 2 files changed, 16 insertions(+), 1 deletion(-)





[jira] [Commented] (CASSANDRA-15373) validate value sizes in LegacyLayout

2019-10-28 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961319#comment-16961319
 ] 

Blake Eggleston commented on CASSANDRA-15373:
-

fixed, also made a small adjustment to unbreak a test for CASSANDRA-14912 
(https://circleci.com/gh/bdeggleston/cassandra/2537#tests/containers/94)

> validate value sizes in LegacyLayout
> 
>
> Key: CASSANDRA-15373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15373
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.19, 3.11.5, 4.0
>
>
> In 2.1, all values are serialized as variable length blobs, with a length 
> prefix followed by the actual value, even for fixed width types like int32. 
> The 3.0 storage engine, on the other hand, omits the length prefix for fixed 
> width types. Since the length of a fixed width value is not validated on the 
> 3.0 write path, writing data for a fixed width type from an incorrectly sized 
> byte buffer will overflow or underflow the space allocated for it, corrupting 
> the remainder of that partition or indexed region and preventing it from being 
> read. This is not discovered until we attempt to read the corrupted value. 
> This patch updates LegacyLayout to throw a marshal exception if it encounters 
> an unexpected value size for fixed size columns.
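
The kind of check described above can be illustrated with a short sketch. This 
is not the LegacyLayout code from the patch: FIXED_WIDTHS, validate_value_size 
and MarshalError are hypothetical names chosen for the example, Python is used 
only for brevity, and the widths listed are the usual byte sizes for these 
types.

```python
class MarshalError(ValueError):
    """Hypothetical stand-in for a marshal exception."""


# Assumed byte widths for a few fixed-width types; None means variable length.
FIXED_WIDTHS = {"int32": 4, "bigint": 8, "uuid": 16, "text": None}


def validate_value_size(type_name, value):
    """Raise MarshalError if a fixed-width value has an unexpected size."""
    expected = FIXED_WIDTHS[type_name]
    if expected is not None and len(value) != expected:
        raise MarshalError(
            f"{type_name} expects {expected} bytes, got {len(value)}"
        )
    return value
```

The sketch rejects every length mismatch at write time, so a bad buffer fails 
immediately instead of surfacing later as an unreadable partition. Whether 
empty values should be accepted for some types is not addressed here.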






[jira] [Commented] (CASSANDRA-15277) Make it possible to resize concurrent read / write thread pools at runtime

2019-10-28 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961249#comment-16961249
 ] 

Jon Meredith commented on CASSANDRA-15277:
--

And CASSANDRA-15227 landed, so reverted to the original version with review 
feedback applied and rebased against the current trunk.

[Branch|https://github.com/jonmeredith/cassandra/tree/CASSANDRA-15277-v4]

[GitHub PR|https://github.com/apache/cassandra/pull/371]

[CircleCI 
run|https://circleci.com/workflow-run/37c049c2-0a7b-4780-9d35-20493bf7a4e5]

> Make it possible to resize concurrent read / write thread pools at runtime
> --
>
> Key: CASSANDRA-15277
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15277
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> To better mitigate cluster overload the executor services for various stages 
> should be configurable at runtime (probably as a JMX hot property). 
> Related to CASSANDRA-5044, this would add the capability to resize to 
> multiThreadedLowSignalStage pools based on SEPExecutor.
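
The general idea of a concurrency limit that can be changed while the system is 
running can be sketched as follows. This is not SEPExecutor and not the code in 
the linked branch: ResizableLimiter and its methods are hypothetical names, and 
Python is used only to keep the example short.

```python
import threading


class ResizableLimiter:
    """Caps concurrent work at a limit that can be changed at runtime.

    Hypothetical illustration only; not how SEPExecutor is implemented.
    """

    def __init__(self, max_concurrency):
        self._cond = threading.Condition()
        self._max = max_concurrency
        self._active = 0

    def try_acquire(self):
        """Take a slot without blocking; return False if the pool is full."""
        with self._cond:
            if self._active >= self._max:
                return False
            self._active += 1
            return True

    def acquire(self):
        """Block until a slot is free under the current limit."""
        with self._cond:
            while self._active >= self._max:
                self._cond.wait()
            self._active += 1

    def release(self):
        """Give a slot back and wake any waiting callers."""
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def resize(self, max_concurrency):
        """Change the limit; waiters are woken so a larger limit takes effect."""
        if max_concurrency < 1:
            raise ValueError("max_concurrency must be at least 1")
        with self._cond:
            self._max = max_concurrency
            self._cond.notify_all()
```

In this sketch, shrinking the limit does not interrupt work that is already 
running; the active count simply drains below the new limit before further 
work is admitted.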






[jira] [Updated] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells

2019-10-28 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-15365:

Status: Ready to Commit  (was: Changes Suggested)

> Add primary key liveness info when skipping illegal cells
> -
>
> Key: CASSANDRA-15365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> In CASSANDRA-15086/CASSANDRA-15178 we started skipping the illegal legacy 
> cells. The problem is that if the row contains only illegal cells, we return 
> a totally empty row, which breaks stats collection: 
> https://github.com/apache/cassandra/blob/93815db9853cb592edf13d82e91dc2e9d172f01f/src/java/org/apache/cassandra/db/rows/Rows.java#L70
> If the row has only these invalid cells, we should add primary key liveness 
> info to it to match the 2.1 behaviour.
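
The behaviour proposed above can be sketched in a simplified form. This is not 
Cassandra's implementation: the Row class, strip_illegal_cells, the is_legal 
predicate and the fallback_ts parameter are all hypothetical, and where the 
liveness timestamp should come from is an assumption of the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Row:
    """Hypothetical, simplified row: cell values by column name plus an
    optional primary key liveness timestamp."""
    cells: Dict[str, bytes] = field(default_factory=dict)
    pk_liveness_ts: Optional[int] = None


def strip_illegal_cells(row: Row, is_legal: Callable[[str], bool],
                        fallback_ts: int) -> Row:
    """Drop illegal cells; if nothing is left, keep the row alive by
    attaching primary key liveness info instead of returning it empty."""
    kept = {name: value for name, value in row.cells.items() if is_legal(name)}
    liveness = row.pk_liveness_ts
    if not kept and liveness is None:
        liveness = fallback_ts  # assumed source of the timestamp
    return Row(cells=kept, pk_liveness_ts=liveness)
```

A row that keeps at least one legal cell is returned unchanged apart from the 
dropped cells; only a row that would otherwise be completely empty gains the 
liveness info.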






[jira] [Commented] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells

2019-10-28 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961245#comment-16961245
 ] 

Sam Tunnicliffe commented on CASSANDRA-15365:
-

+1 




[jira] [Commented] (CASSANDRA-15373) validate value sizes in LegacyLayout

2019-10-28 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961237#comment-16961237
 ] 

Sam Tunnicliffe commented on CASSANDRA-15373:
-

+1 from me too - you just need to fix the newest tests on the 3.11 branch (e.g. 
{{s/new Clustering/Clustering.make/}})

> validate value sizes in LegacyLayout
> 
>
> Key: CASSANDRA-15373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15373
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.19, 3.11.5, 4.0
>
>
> In 2.1, all values are serialized as variable-length blobs: a length 
> prefix followed by the actual value, even for fixed-width types like int32. 
> The 3.0 storage engine, on the other hand, omits the length prefix for 
> fixed-width types. Since the length of fixed-width values is not validated on 
> the 3.0 write path, writing data for a fixed-width type from an incorrectly 
> sized byte buffer will overflow or underflow the space allocated for it, 
> corrupting the remainder of that partition or indexed region and preventing 
> it from being read. This is not discovered until we attempt to read the 
> corrupted value. This patch updates LegacyLayout to throw a marshal exception 
> if it encounters an unexpected value size for fixed-size columns.
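
As a rough illustration of the check described above (this is not the actual LegacyLayout patch), the sketch below validates a value's byte length against a fixed-width type before it would be written without a length prefix. The names ValueSizeCheck, FixedWidthType and validate are hypothetical, and IllegalArgumentException stands in for Cassandra's marshal exception.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: reject wrongly sized buffers for fixed-width types. */
final class ValueSizeCheck
{
    /** Illustrative fixed-width types and their serialized sizes in bytes. */
    enum FixedWidthType
    {
        INT32(4), INT64(8);

        final int size;

        FixedWidthType(int size) { this.size = size; }
    }

    /**
     * Throws if the buffer does not hold exactly the number of bytes the type
     * expects. Without a length prefix a reader always consumes type.size
     * bytes, so any other length would shift every value that follows it.
     */
    static void validate(FixedWidthType type, ByteBuffer value)
    {
        int actual = value.remaining();
        if (actual != type.size)
            throw new IllegalArgumentException(
                "Expected " + type.size + " bytes for " + type + " but got " + actual);
    }

    public static void main(String[] args)
    {
        // A correctly sized int32 value is accepted.
        validate(FixedWidthType.INT32, ByteBuffer.allocate(4));
        try
        {
            // An 8-byte buffer for an int32 column is rejected up front.
            validate(FixedWidthType.INT32, ByteBuffer.allocate(8));
        }
        catch (IllegalArgumentException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at validation time is the point of the design: the error surfaces where the bad buffer is first seen, rather than later when a corrupted value is read.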






[jira] [Updated] (CASSANDRA-15373) validate value sizes in LegacyLayout

2019-10-28 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15373:
---
Status: Ready to Commit  (was: Review In Progress)

+1 from me

> validate value sizes in LegacyLayout
> 
>
> Key: CASSANDRA-15373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15373
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.19, 3.11.5, 4.0
>
>
> In 2.1, all values are serialized as variable-length blobs: a length 
> prefix followed by the actual value, even for fixed-width types like int32. 
> The 3.0 storage engine, on the other hand, omits the length prefix for 
> fixed-width types. Since the length of fixed-width values is not validated on 
> the 3.0 write path, writing data for a fixed-width type from an incorrectly 
> sized byte buffer will overflow or underflow the space allocated for it, 
> corrupting the remainder of that partition or indexed region and preventing 
> it from being read. This is not discovered until we attempt to read the 
> corrupted value. This patch updates LegacyLayout to throw a marshal exception 
> if it encounters an unexpected value size for fixed-size columns.






[jira] [Updated] (CASSANDRA-15373) validate value sizes in LegacyLayout

2019-10-28 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15373:
---
Reviewers: Benedict Elliott Smith, Sam Tunnicliffe, Benedict Elliott Smith  
(was: Benedict Elliott Smith, Sam Tunnicliffe)
   Status: Review In Progress  (was: Patch Available)

> validate value sizes in LegacyLayout
> 
>
> Key: CASSANDRA-15373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15373
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.19, 3.11.5, 4.0
>
>
> In 2.1, all values are serialized as variable-length blobs: a length 
> prefix followed by the actual value, even for fixed-width types like int32. 
> The 3.0 storage engine, on the other hand, omits the length prefix for 
> fixed-width types. Since the length of fixed-width values is not validated on 
> the 3.0 write path, writing data for a fixed-width type from an incorrectly 
> sized byte buffer will overflow or underflow the space allocated for it, 
> corrupting the remainder of that partition or indexed region and preventing 
> it from being read. This is not discovered until we attempt to read the 
> corrupted value. This patch updates LegacyLayout to throw a marshal exception 
> if it encounters an unexpected value size for fixed-size columns.






[jira] [Commented] (CASSANDRA-15373) validate value sizes in LegacyLayout

2019-10-28 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961190#comment-16961190
 ] 

Blake Eggleston commented on CASSANDRA-15373:
-

good catch, fixed

> validate value sizes in LegacyLayout
> 
>
> Key: CASSANDRA-15373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15373
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.19, 3.11.5, 4.0
>
>
> In 2.1, all values are serialized as variable-length blobs: a length 
> prefix followed by the actual value, even for fixed-width types like int32. 
> The 3.0 storage engine, on the other hand, omits the length prefix for 
> fixed-width types. Since the length of fixed-width values is not validated on 
> the 3.0 write path, writing data for a fixed-width type from an incorrectly 
> sized byte buffer will overflow or underflow the space allocated for it, 
> corrupting the remainder of that partition or indexed region and preventing 
> it from being read. This is not discovered until we attempt to read the 
> corrupted value. This patch updates LegacyLayout to throw a marshal exception 
> if it encounters an unexpected value size for fixed-size columns.






[jira] [Commented] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells

2019-10-28 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961189#comment-16961189
 ] 

Benedict Elliott Smith commented on CASSANDRA-15365:


WFM

> Add primary key liveness info when skipping illegal cells
> -
>
> Key: CASSANDRA-15365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> In CASSANDRA-15086/CASSANDRA-15178 we started skipping the illegal legacy 
> cells; the problem is that if the row contains only illegal cells, we return a 
> totally empty row, which breaks stats collection: 
> https://github.com/apache/cassandra/blob/93815db9853cb592edf13d82e91dc2e9d172f01f/src/java/org/apache/cassandra/db/rows/Rows.java#L70
> If the row has only these invalid cells, we should add a primary key liveness 
> info to it to match the 2.1 behaviour.






[jira] [Commented] (CASSANDRA-15364) Avoid over scanning data directories in LogFile.verify()

2019-10-28 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961188#comment-16961188
 ] 

David Capwell commented on CASSANDRA-15364:
---

Assert msg looks good to me; loop assumes the above asserts ran so can safely 
loop over the files.

+1

> Avoid over scanning data directories in LogFile.verify()
> 
>
> Key: CASSANDRA-15364
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15364
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We currently list the data directory for every {{REMOVE}} record in the file 
> in {{LogFile.verify()}} - this can get very expensive during startup when we 
> call {{LogTransaction.removeUnfinishedLeftovers()}}. In 
> {{LogRecord.getExistingFiles(Set<String> absoluteFilePaths)}} we also fully 
> parse the file names of the sstables found; here we only need a prefix match.
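
The idea above can be sketched roughly as follows: list the directory a single time and match each file name against the wanted prefixes with startsWith, instead of listing the directory once per record and fully parsing every file name. This is only an illustration, not the patch itself. PrefixScan and filesByPrefix are hypothetical names, and the sketch assumes that no prefix in the set is itself a prefix of another.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: one directory listing, grouped by file name prefix. */
final class PrefixScan
{
    /**
     * Lists dir once and groups its entries by the prefix their file name
     * starts with. Entries matching none of the prefixes are ignored.
     * Assumes no prefix in the set is a prefix of another one.
     */
    static Map<String, List<Path>> filesByPrefix(Path dir, Set<String> prefixes) throws IOException
    {
        Map<String, List<Path>> result = new HashMap<>();
        for (String prefix : prefixes)
            result.put(prefix, new ArrayList<>());

        // Single pass over the directory, however many prefixes we look for.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir))
        {
            for (Path file : stream)
            {
                String name = file.getFileName().toString();
                for (String prefix : prefixes)
                {
                    if (name.startsWith(prefix))
                    {
                        result.get(prefix).add(file);
                        break;
                    }
                }
            }
        }
        return result;
    }
}
```

With N records in one directory this performs one listing instead of N. The inner loop over prefixes could be replaced by a sorted structure if the prefix set were large.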






[jira] [Commented] (CASSANDRA-14629) Abstract Virtual Table for very large result sets

2019-10-28 Thread Aleksey Yeschenko (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961186#comment-16961186
 ] 

Aleksey Yeschenko commented on CASSANDRA-14629:
---

Looks ok to me overall, but I do have two questions (two variations of the same 
question, really):

1. {{select(DecoratedKey partitionKey, ClusteringIndexFilter clusteringFilter, 
ColumnFilter columnFilter)}} involves invoking {{hasKey(DecoratedKey 
partitionKey)}} and then {{getRows(DecoratedKey key, ClusteringIndexFilter 
clusteringFilter, ColumnFilter columnFilter)}}. Depending on the underlying 
implementation, this could mean a lot of extra work - up to doubling the amount 
of work needed. As an illustration, a common {{Map}} usage antipattern comes to 
mind: doing a {{contains()}} followed by {{get()}} instead of just doing get 
and checking for {{null}}. I know that one of the use cases you have in mind 
for this code is exposing the raw content of sstables, and I can see this 
overhead being relatively significant there potentially. I would suggest 
getting rid of {{hasKey()}} entirely and of the related check.
2. Similarly, {{select(DataRange dataRange, ColumnFilter columnFilter)}} and 
{{getPartitionKeys(DataRange dataRange)}} invocation could maybe also be 
remodelled to permit a single underlying iterator?
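
The {{Map}} antipattern mentioned in point 1 can be shown with a small standalone sketch. The class and method names below are made up for illustration and are unrelated to the virtual table code. The two methods are equivalent only when the map never stores {{null}} values.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the contains-then-get antipattern and its fix. */
final class LookupOnce
{
    /** Two lookups: containsKey() followed by get(). */
    static String twoLookups(Map<String, String> rows, String key)
    {
        if (!rows.containsKey(key))
            return null;
        return rows.get(key);
    }

    /**
     * One lookup: call get() and treat null as "absent".
     * Equivalent to twoLookups() as long as the map holds no null values.
     */
    static String oneLookup(Map<String, String> rows, String key)
    {
        return rows.get(key);
    }

    public static void main(String[] args)
    {
        Map<String, String> rows = new HashMap<>();
        rows.put("k1", "row1");
        System.out.println(twoLookups(rows, "k1").equals(oneLookup(rows, "k1")));
    }
}
```

For a hash map the second lookup is cheap, but when the lookup is backed by something expensive, such as a scan over sstable contents, the extra existence check can approach a doubling of the work.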

It's possible that I'm missing something here, so these aren't demands for 
changes - just a conversation starter.

P.S. You can suppress the redundant suppression warnings themselves like this: 
{code:java}
@SuppressWarnings({"resource", "RedundantSuppression"})
{code}

> Abstract Virtual Table for very large result sets
> -
>
> Key: CASSANDRA-14629
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14629
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/CQL, Legacy/Observability
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Low
>  Labels: pull-request-available, virtual-tables
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> For virtual tables that are very large we cannot use the existing 
> AbstractVirtualTable, since it could OOM the node. An example would 
> be a table to view the internal cache contents or to view contents of 
> sstables.






[jira] [Commented] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells

2019-10-28 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961184#comment-16961184
 ] 

Marcus Eriksson commented on CASSANDRA-15365:
-

So, in 2.1 these invalid cells act as a kind of permanent row marker, for 
example:

* {{create table %s (pk int, c1 text, c2 text, v1 text, primary key (pk, c1, 
c2))}}
* insert an invalid cell, select * now returns: {{3, a, aa, null}}
* do an insert with a ttl for the whole row: {{insert into %s (pk, c1, c2, v1) 
values (3, 'a', 'aa', 'vaaalue') using ttl 2}}
* select * before ttl: {{3, a, aa, vaaalue}}
* select * after ttl expires: {{3, a, aa, null}}

If the invalid cell didn't exist, the select would have returned nothing.

Translating this to 3.0 including this patch, we would translate the 
whole-row-insert row marker to the PKLI and ignore the invalid cell, and after 
ttl expires we would return nothing.

We can't really fully translate the 2.1 behaviour to 3.0 using only the PKLI - 
for example, if in the example above we set the PKLI to the timestamp of the 
invalid cell and later overwrote that row with a ttl row, we would fully purge 
it after the ttl expires, while in the 2.1 case we would still keep the invalid cell.

I vote we keep the current patch behaviour, wdyt [~samt] & [~benedict]?
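
The difference described in this comment can be mimicked with a deliberately simplified toy model. It is not how either storage engine is implemented: it ignores write timestamps, tombstones and reconciliation, and every name in it is hypothetical. It captures only the rule that a row is returned while something belonging to it is still live.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model (hypothetical) of row visibility with and without a permanent invalid cell. */
final class RowMarkerModel
{
    static final long NEVER = Long.MAX_VALUE;

    /** Something that keeps a row alive until expiresAt: a cell, a row marker or a PKLI. */
    static final class Liveness
    {
        final long expiresAt;

        Liveness(long expiresAt) { this.expiresAt = expiresAt; }

        boolean isLive(long now) { return now < expiresAt; }
    }

    static boolean anyLive(List<Liveness> items, long now)
    {
        for (Liveness item : items)
            if (item.isLive(now))
                return true;
        return false;
    }

    /** 2.1-style: every cell counts, including the invalid one and the row marker. */
    static boolean visibleIn21(List<Liveness> allCells, long now)
    {
        return anyLive(allCells, now);
    }

    /** 3.0-style with invalid cells skipped: only the PKLI and the valid cells count. */
    static boolean visibleIn30(Liveness pkli, List<Liveness> validCells, long now)
    {
        return (pkli != null && pkli.isLive(now)) || anyLive(validCells, now);
    }

    public static void main(String[] args)
    {
        Liveness invalidCell = new Liveness(NEVER); // never expires
        Liveness rowMarker = new Liveness(2);       // ttl 2
        Liveness v1 = new Liveness(2);              // ttl 2

        long afterTtl = 3;
        System.out.println(visibleIn21(Arrays.asList(invalidCell, rowMarker, v1), afterTtl));
        System.out.println(visibleIn30(rowMarker, Arrays.asList(v1), afterTtl));
    }
}
```

In this model the 2.1-style row is still returned after the ttl because the invalid cell never expires, while the 3.0-style row disappears once the PKLI and the valid cell have expired, matching the behaviour described above.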


> Add primary key liveness info when skipping illegal cells
> -
>
> Key: CASSANDRA-15365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> In CASSANDRA-15086/CASSANDRA-15178 we started skipping the illegal legacy 
> cells; the problem is that if the row contains only illegal cells, we return a 
> totally empty row, which breaks stats collection: 
> https://github.com/apache/cassandra/blob/93815db9853cb592edf13d82e91dc2e9d172f01f/src/java/org/apache/cassandra/db/rows/Rows.java#L70
> If the row has only these invalid cells, we should add a primary key liveness 
> info to it to match the 2.1 behaviour.






[jira] [Commented] (CASSANDRA-15381) Failing test - testDatabaseDescriptorRef::org.apache.cassandra.config.DatabaseDescriptorRefTest

2019-10-28 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961160#comment-16961160
 ] 

Jon Meredith commented on CASSANDRA-15381:
--

Fix included as part of CASSANDRA-15371

> Failing test - 
> testDatabaseDescriptorRef::org.apache.cassandra.config.DatabaseDescriptorRefTest
> ---
>
> Key: CASSANDRA-15381
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15381
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Vinay Chella
>Priority: Normal
>
> As part of Apache Cassandra 4.0-alpha2 voting, the following test is failing 
> across different test suites and runs. 
> CircleCI Run: 
> [https://circleci.com/gh/vinaykumarchella/cassandra/487#tests/containers/37]
> *testDatabaseDescriptorRef-compression - 
> org.apache.cassandra.config.DatabaseDescriptorRefTest*
> {code:java}
> junit.framework.AssertionFailedError
>   at 
> org.apache.cassandra.config.DatabaseDescriptorRefTest.checkViolations(DatabaseDescriptorRefTest.java:293)
>   at 
> org.apache.cassandra.config.DatabaseDescriptorRefTest.testDatabaseDescriptorRef(DatabaseDescriptorRefTest.java:277){code}






[jira] [Commented] (CASSANDRA-15373) validate value sizes in LegacyLayout

2019-10-28 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961057#comment-16961057
 ] 

Benedict Elliott Smith commented on CASSANDRA-15373:


I think there's one remaining missing check: in {{decodeCellName}} we need to 
verify that the {{collectionElement}} is valid, as it can be a fixed width type.

> validate value sizes in LegacyLayout
> 
>
> Key: CASSANDRA-15373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15373
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.19, 3.11.5, 4.0
>
>
> In 2.1, all values are serialized as variable-length blobs: a length 
> prefix followed by the actual value, even for fixed-width types like int32. 
> The 3.0 storage engine, on the other hand, omits the length prefix for 
> fixed-width types. Since the length of fixed-width values is not validated on 
> the 3.0 write path, writing data for a fixed-width type from an incorrectly 
> sized byte buffer will overflow or underflow the space allocated for it, 
> corrupting the remainder of that partition or indexed region and preventing 
> it from being read. This is not discovered until we attempt to read the 
> corrupted value. This patch updates LegacyLayout to throw a marshal exception 
> if it encounters an unexpected value size for fixed-size columns.






[jira] [Updated] (CASSANDRA-15227) Remove StageManager

2019-10-28 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15227:
---
Source Control Link: 
[b97fc302b10d0ec5303421b3b185675872672c46|https://github.com/apache/cassandra/commit/b97fc302b10d0ec5303421b3b185675872672c46]
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Sorry for letting this one atrophy - I've made some minor cosmetic changes and 
committed.

> Remove StageManager
> ---
>
> Key: CASSANDRA-15227
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15227
> Project: Cassandra
>  Issue Type: Task
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> This is a minor cleanup; this class should not exist, but should be embedded 
> in the Stage enum.
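
To illustrate the general shape of such a cleanup (this is not the committed change), the sketch below keeps each stage's executor inside the enum constant itself, so no separate manager class mapping stages to executors is needed. SketchStage, its constants and the thread counts are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: each enum constant owns its executor, no manager class. */
enum SketchStage
{
    READ(2),
    MUTATION(2),
    GOSSIP(1);

    private final ExecutorService executor;

    SketchStage(int threads)
    {
        // Illustrative only: a plain fixed-size pool per stage.
        this.executor = Executors.newFixedThreadPool(threads);
    }

    /** Callers go straight to the stage instead of asking a manager for it. */
    ExecutorService executor()
    {
        return executor;
    }

    /** Shuts down every stage's executor, for example at process exit. */
    static void shutdownAll()
    {
        for (SketchStage stage : values())
            stage.executor.shutdown();
    }
}
```

A caller would submit work with something like SketchStage.READ.executor().submit(task), and the enum becomes the single place where the set of stages and their pools is defined.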






[jira] [Updated] (CASSANDRA-15227) Remove StageManager

2019-10-28 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15227:
---
Status: Ready to Commit  (was: Review In Progress)

> Remove StageManager
> ---
>
> Key: CASSANDRA-15227
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15227
> Project: Cassandra
>  Issue Type: Task
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> This is a minor cleanup; this class should not exist, but should be embedded 
> in the Stage enum.






[jira] [Updated] (CASSANDRA-15227) Remove StageManager

2019-10-28 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15227:
---
Reviewers: Benedict Elliott Smith, Benedict Elliott Smith  (was: Benedict 
Elliott Smith)
   Status: Review In Progress  (was: Patch Available)

> Remove StageManager
> ---
>
> Key: CASSANDRA-15227
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15227
> Project: Cassandra
>  Issue Type: Task
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> This is a minor cleanup; this class should not exist, but should be embedded 
> in the Stage enum.






[cassandra] branch trunk updated (3522b54 -> b97fc30)

2019-10-28 Thread benedict
This is an automated email from the ASF dual-hosted git repository.

benedict pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 3522b54  Make ConnectionBurnTest a proper unit test (fixes `ant 
test-burn`)
 add b97fc30  Remove StageManager

No new revisions were added by this update.

Summary of changes:
 .../concurrent/JMXEnabledSingleThreadExecutor.java |  77 +
 .../concurrent/JMXEnabledThreadPoolExecutor.java   |   5 -
 .../org/apache/cassandra/concurrent/Stage.java | 178 -
 .../apache/cassandra/concurrent/StageManager.java  | 155 --
 .../org/apache/cassandra/db/ColumnFamilyStore.java |   8 +-
 src/java/org/apache/cassandra/db/Keyspace.java |   3 +-
 .../cassandra/db/SinglePartitionReadCommand.java   |   6 +-
 .../cassandra/db/commitlog/CommitLogReplayer.java  |   3 +-
 src/java/org/apache/cassandra/gms/Gossiper.java|  20 +--
 .../cassandra/index/SecondaryIndexManager.java |   4 +-
 .../cassandra/net/InboundMessageHandler.java   |   3 +-
 .../org/apache/cassandra/net/MessagingService.java |   3 +-
 .../org/apache/cassandra/net/RequestCallbacks.java |   3 +-
 .../org/apache/cassandra/repair/Validator.java |   5 +-
 .../apache/cassandra/schema/MigrationManager.java  |   9 +-
 .../cassandra/schema/SchemaPushVerbHandler.java|   3 +-
 .../org/apache/cassandra/service/CacheService.java |   5 +-
 .../org/apache/cassandra/service/StorageProxy.java |  17 +-
 .../apache/cassandra/service/StorageService.java   |  11 +-
 .../service/reads/AbstractReadExecutor.java|   4 +-
 .../reads/ShortReadPartitionsProtection.java   |   3 +-
 .../service/reads/repair/AbstractReadRepair.java   |   4 +-
 .../apache/cassandra/tracing/TraceStateImpl.java   |   4 +-
 .../org/apache/cassandra/tracing/TracingImpl.java  |   3 +-
 .../cassandra/distributed/impl/Instance.java   |   6 +-
 .../org/apache/cassandra/cql3/ViewLongTest.java|   5 +-
 .../org/apache/cassandra/cql3/ViewComplexTest.java |   5 +-
 .../apache/cassandra/cql3/ViewFilteringTest.java   |   5 +-
 .../org/apache/cassandra/cql3/ViewSchemaTest.java  |   5 +-
 test/unit/org/apache/cassandra/cql3/ViewTest.java  |   5 +-
 .../org/apache/cassandra/net/MatcherResponse.java  |   3 +-
 31 files changed, 275 insertions(+), 295 deletions(-)
 create mode 100644 
src/java/org/apache/cassandra/concurrent/JMXEnabledSingleThreadExecutor.java
 delete mode 100644 src/java/org/apache/cassandra/concurrent/StageManager.java





[jira] [Updated] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test

2019-10-28 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-11928:
---
Since Version: 4.0-alpha

> dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
> 
>
> Key: CASSANDRA-11928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11928
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Craig Kodman
>Priority: Normal
>  Labels: dtest, flaky
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.0_dtest/727/testReport/cql_tracing_test/TestCqlTracing/tracing_simple_test
> Failed on CassCI build cassandra-3.0_dtest #727
> Is it a problem that the tracing message with the query is missing?






[jira] [Commented] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test

2019-10-28 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960886#comment-16960886
 ] 

Michael Semb Wever commented on CASSANDRA-11928:



Seems to be a regression in trunk of CASSANDRA-11465: the tracing doesn't use 
the same consistency level as the request.
  trunk: 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/856/testReport/junit/cql_tracing_test/TestCqlTracing/
  3.11: 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-3.11-dtest/466/testReport/cql_tracing_test/TestCqlTracing/

My understanding is that tracing data was intended to be eventually consistent, 
and making it strictly consistent (via 
`-Dcassandra.wait_for_tracing_events_timeout_secs=xx`) was only for the purpose 
of testing. 

If that's true, a simple fix is just to reduce ccm nodes for that test, i.e. 
https://github.com/thelastpickle/cassandra-dtest/commit/f22f89fdb3080ac48f4310ee1a5aeb219ac2f093#diff-b866cba7cf982d53e6406cca014e659eR23

[~pauloricardomg], [~jkni], [~mambocab], thoughts? Is it worth bisecting where 
the regression came from? Or removing the 
`wait_for_tracing_events_timeout_secs` flag?

> dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
> 
>
> Key: CASSANDRA-11928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11928
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Craig Kodman
>Priority: Normal
>  Labels: dtest, flaky
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.0_dtest/727/testReport/cql_tracing_test/TestCqlTracing/tracing_simple_test
> Failed on CassCI build cassandra-3.0_dtest #727
> Is it a problem that the tracing message with the query is missing?






[jira] [Comment Edited] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test

2019-10-28 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960886#comment-16960886
 ] 

Michael Semb Wever edited comment on CASSANDRA-11928 at 10/28/19 9:33 AM:
--

Seems to be a regression in trunk of CASSANDRA-11465: the tracing doesn't use 
the same consistency level as the request.
  trunk: 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/856/testReport/junit/cql_tracing_test/TestCqlTracing/
  3.11: 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-3.11-dtest/466/testReport/cql_tracing_test/TestCqlTracing/

My understanding is that tracing data was intended to be eventually consistent, 
and making it strictly consistent (via 
{{`-Dcassandra.wait_for_tracing_events_timeout_secs=xx`}}) was only for the 
purpose of testing. 

If that's true, a simple fix is just to reduce ccm nodes for that test, i.e. 
https://github.com/thelastpickle/cassandra-dtest/commit/f22f89fdb3080ac48f4310ee1a5aeb219ac2f093#diff-b866cba7cf982d53e6406cca014e659eR23

[~pauloricardomg], [~jkni], [~mambocab], thoughts? Is it worth bisecting where 
the regression came from? Or removing the 
{{wait_for_tracing_events_timeout_secs}} flag?


was (Author: michaelsembwever):

Seems to be a regression in trunk of CASSANDRA-11465, the tracing doesn't use 
same consistency level as the request.
  trunk: 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/856/testReport/junit/cql_tracing_test/TestCqlTracing/
  3.11: 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-3.11-dtest/466/testReport/cql_tracing_test/TestCqlTracing/

My understanding is that tracing data was intended to be eventually consistent, 
and making it strictly consistency (via 
`-Dcassandra.wait_for_tracing_events_timeout_secs=xx`) was only for the purpose 
of testing. 

If that's true, a simple fix is just to reduce ccm nodes for that test, ie 
https://github.com/thelastpickle/cassandra-dtest/commit/f22f89fdb3080ac48f4310ee1a5aeb219ac2f093#diff-b866cba7cf982d53e6406cca014e659eR23

[~pauloricardomg], [~jkni], [~mambocab], thoughts? Is it worth bisecting where 
the regression came from? Or removing the 
`wait_for_tracing_events_timeout_secs` flag?

> dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
> 
>
> Key: CASSANDRA-11928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11928
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Craig Kodman
>Priority: Normal
>  Labels: dtest, flaky
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.0_dtest/727/testReport/cql_tracing_test/TestCqlTracing/tracing_simple_test
> Failed on CassCI build cassandra-3.0_dtest #727
> Is it a problem that the tracing message with the query is missing?






[jira] [Commented] (CASSANDRA-15364) Avoid over scanning data directories in LogFile.verify()

2019-10-28 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960882#comment-16960882
 ] 

Marcus Eriksson commented on CASSANDRA-15364:
-

pushed a commit which adds an explanation to the assert and changes the second 
loop to iterate over the values of the map instead

> Avoid over scanning data directories in LogFile.verify()
> 
>
> Key: CASSANDRA-15364
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15364
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We currently list the data directory for every {{REMOVE}} record in the file 
> in {{LogFile.verify()}} - this can get very expensive during startup when we 
> call {{LogTransaction.removeUnfinishedLeftovers()}}. In 
> {{LogRecord.getExistingFiles(Set<String> absoluteFilePaths)}} we also fully 
> parse the file names of the sstables found; here we only need a prefix match.






[cassandra] branch trunk updated: Make ConnectionBurnTest a proper unit test (fixes `ant test-burn`)

2019-10-28 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 3522b54  Make ConnectionBurnTest a proper unit test (fixes `ant 
test-burn`)
3522b54 is described below

commit 3522b54f2d7f34c3dc8234c8981a4629ebcf9a50
Author: Mick Semb Wever 
AuthorDate: Sat Oct 26 22:23:47 2019 +0200

Make ConnectionBurnTest a proper unit test (fixes `ant test-burn`)

patch by Mick Semb Wever; reviewed by Benedict Elliott Smith
 https://the-asf.slack.com/archives/CK23JSY2K/p1572206490030800
---
 test/burn/org/apache/cassandra/net/ConnectionBurnTest.java | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/test/burn/org/apache/cassandra/net/ConnectionBurnTest.java 
b/test/burn/org/apache/cassandra/net/ConnectionBurnTest.java
index 81b6402..57eb726 100644
--- a/test/burn/org/apache/cassandra/net/ConnectionBurnTest.java
+++ b/test/burn/org/apache/cassandra/net/ConnectionBurnTest.java
@@ -622,7 +622,7 @@ public class ConnectionBurnTest
 }
 }
 
-public static void test(GlobalInboundSettings inbound, 
OutboundConnectionSettings outbound) throws ExecutionException, 
InterruptedException, NoSuchFieldException, IllegalAccessException, 
TimeoutException
+private void test(GlobalInboundSettings inbound, 
OutboundConnectionSettings outbound) throws ExecutionException, 
InterruptedException, NoSuchFieldException, IllegalAccessException, 
TimeoutException
 {
 MessageGenerator small = new UniformPayloadGenerator(0, 1, (1 << 15));
 MessageGenerator large = new UniformPayloadGenerator(0, 1, (1 << 16) + 
(1 << 15));
@@ -635,12 +635,19 @@ public class ConnectionBurnTest
 .endpoints(4)
 .inbound(inbound)
 .outbound(outbound)
-.time(2L, TimeUnit.DAYS)
+// change the following for a longer burn
+.time(2L, TimeUnit.MINUTES)
 .build().run();
 }
 
 public static void main(String[] args) throws ExecutionException, 
InterruptedException, NoSuchFieldException, IllegalAccessException, 
TimeoutException
 {
+new ConnectionBurnTest().test();
+}
+
+@org.junit.Test
+public void test() throws ExecutionException, InterruptedException, 
NoSuchFieldException, IllegalAccessException, TimeoutException
+{
 GlobalInboundSettings inboundSettings = new GlobalInboundSettings()
 .withQueueCapacity(1 << 18)
 .withEndpointReserveLimit(1 << 
20)

