[jira] [Created] (CASSANDRA-11741) Don't return data to the client when skipping a page

2016-05-09 Thread Boying Lu (JIRA)
Boying Lu created CASSANDRA-11741:
-

 Summary: Don't return data to the client when skipping a page
 Key: CASSANDRA-11741
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11741
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Boying Lu


The DataStax Java driver supports 'paging', but it doesn't support skipping 
between pages. 

To go from page A to page B, the user has to go through each page between A and 
B. Because the user is only interested in the "PagingState" object before 
reaching page B, a great deal of bandwidth between server and client could be 
saved if the data of the pages between A and B were not returned to the client, 
especially when the page size is large. 
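
To make the cost concrete, here is a minimal sketch of the only workaround 
available today (assuming the DataStax Java driver 3.x API; the contact point, 
table name, fetch size, and page count are placeholders): every intermediate 
page has to be fetched in full just to obtain its PagingState.

{code:java}
import com.datastax.driver.core.*;

public class SkipPagesSketch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {

            Statement stmt = new SimpleStatement("SELECT * FROM ks.tbl");
            stmt.setFetchSize(5000);

            PagingState state = null;
            int pagesToSkip = 10;                  // placeholder
            for (int i = 0; i < pagesToSkip; i++) {
                if (state != null)
                    stmt.setPagingState(state);
                // The whole page still crosses the wire, even though we
                // discard its rows and keep only the paging state.
                ResultSet rs = session.execute(stmt);
                state = rs.getExecutionInfo().getPagingState();
                if (state == null)
                    break;                         // no more pages
            }
            // Only now start reading rows, positioned at the target page.
        }
    }
}
{code}

A server-side "skip to page" option would let the loop above advance without 
shipping the intermediate rows at all.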



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11645) (single) dtest failure in snapshot_test.TestArchiveCommitlog.test_archive_commitlog_with_active_commitlog

2016-05-09 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277369#comment-15277369
 ] 

Russ Hatch commented on CASSANDRA-11645:


20 runs locally do not repro.

> (single) dtest failure in 
> snapshot_test.TestArchiveCommitlog.test_archive_commitlog_with_active_commitlog
> -
>
> Key: CASSANDRA-11645
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11645
> Project: Cassandra
>  Issue Type: Test
>Reporter: Russ Hatch
>Assignee: Russ Hatch
>  Labels: dtest
>
> This was a singular but pretty recent failure, so I thought it might be worth 
> digging into to see if it repros.
> http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/211/testReport/snapshot_test/TestArchiveCommitlog/test_archive_commitlog_with_active_commitlog
> Failed on CassCI build cassandra-2.1_dtest_jdk8 #211



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-11645) (single) dtest failure in snapshot_test.TestArchiveCommitlog.test_archive_commitlog_with_active_commitlog

2016-05-09 Thread Russ Hatch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russ Hatch reassigned CASSANDRA-11645:
--

Assignee: Russ Hatch  (was: DS Test Eng)

> (single) dtest failure in 
> snapshot_test.TestArchiveCommitlog.test_archive_commitlog_with_active_commitlog
> -
>
> Key: CASSANDRA-11645
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11645
> Project: Cassandra
>  Issue Type: Test
>Reporter: Russ Hatch
>Assignee: Russ Hatch
>  Labels: dtest
>
> This was a singular but pretty recent failure, so I thought it might be worth 
> digging into to see if it repros.
> http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/211/testReport/snapshot_test/TestArchiveCommitlog/test_archive_commitlog_with_active_commitlog
> Failed on CassCI build cassandra-2.1_dtest_jdk8 #211



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11668) dtest failure in upgrade_tests.upgrade_through_versions_test.ProtoV4Upgrade_3_2_UpTo_3_3_HEAD.rolling_upgrade_with_internode_ssl_test

2016-05-09 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277279#comment-15277279
 ] 

Russ Hatch edited comment on CASSANDRA-11668 at 5/9/16 11:13 PM:
-

a few runs locally pass fine. trying a bulk run here: 
http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/95/


was (Author: rhatch):
a few runs locally pass fine. trying a bulk run here: 
http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/

> dtest failure in 
> upgrade_tests.upgrade_through_versions_test.ProtoV4Upgrade_3_2_UpTo_3_3_HEAD.rolling_upgrade_with_internode_ssl_test
> -
>
> Key: CASSANDRA-11668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11668
> Project: Cassandra
>  Issue Type: Test
>Reporter: Russ Hatch
>Assignee: Russ Hatch
>  Labels: dtest
>
> Since this was on upgrade to 3.3 HEAD, I doubt it's an actual problem 
> (assuming changes aren't actively happening there). Nevertheless, we should 
> take a quick look and see if there's anything going on.
> example failure:
> http://cassci.datastax.com/job/upgrade_tests-all/39/testReport/upgrade_tests.upgrade_through_versions_test/ProtoV4Upgrade_3_2_UpTo_3_3_HEAD/rolling_upgrade_with_internode_ssl_test
> Failed on CassCI build upgrade_tests-all #39



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11668) dtest failure in upgrade_tests.upgrade_through_versions_test.ProtoV4Upgrade_3_2_UpTo_3_3_HEAD.rolling_upgrade_with_internode_ssl_test

2016-05-09 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277279#comment-15277279
 ] 

Russ Hatch commented on CASSANDRA-11668:


a few runs locally pass fine. trying a bulk run here: 
http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/

> dtest failure in 
> upgrade_tests.upgrade_through_versions_test.ProtoV4Upgrade_3_2_UpTo_3_3_HEAD.rolling_upgrade_with_internode_ssl_test
> -
>
> Key: CASSANDRA-11668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11668
> Project: Cassandra
>  Issue Type: Test
>Reporter: Russ Hatch
>Assignee: Russ Hatch
>  Labels: dtest
>
> Since this was on upgrade to 3.3 HEAD, I doubt it's an actual problem 
> (assuming changes aren't actively happening there). Nevertheless, we should 
> take a quick look and see if there's anything going on.
> example failure:
> http://cassci.datastax.com/job/upgrade_tests-all/39/testReport/upgrade_tests.upgrade_through_versions_test/ProtoV4Upgrade_3_2_UpTo_3_3_HEAD/rolling_upgrade_with_internode_ssl_test
> Failed on CassCI build upgrade_tests-all #39



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11740) Nodes about wrong membership view of the cluster

2016-05-09 Thread Dikang Gu (JIRA)
Dikang Gu created CASSANDRA-11740:
-

 Summary: Nodes about wrong membership view of the cluster
 Key: CASSANDRA-11740
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11740
 Project: Cassandra
  Issue Type: Bug
Reporter: Dikang Gu
 Fix For: 2.2.x, 3.x


We have a few hundred nodes across 3 data centers, and we are doing a few 
million writes per second into the cluster.

The problem we found is that some nodes (>10) have a very wrong view of the 
cluster.

For example, we have 3 data centers A, B and C. On the problem nodes, the 
output of 'nodetool status' shows that ~100 nodes are not in data center A, B, 
or C. Instead, it shows those nodes in DC1 and rack r1, which is very wrong. 
As a result, the node will return wrong results to client requests.

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address                          Load       Tokens  Owns  Host ID                               Rack
UN  2401:db00:11:6134:face:0:1:0     509.52 GB  256     ?     e24656ac-c3b2-4117-b933-a5b06852c993  r1
UN  2401:db00:11:b218:face:0:5:0     510.01 GB  256     ?     53da2104-b1b5-4fa5-a3dd-52c7557149f9  r1
UN  2401:db00:2130:5133:face:0:4d:0  459.75 GB  256     ?     ef8311f0-f6b8-491c-904d-baa925cdd7c2  r1

We are using GossipingPropertyFileSnitch.

Thanks
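
One plausible lead (an assumption, not confirmed from this report): DC1/r1 is 
exactly the default entry in the stock cassandra-topology.properties, which 
GossipingPropertyFileSnitch falls back to for endpoints it has no 
gossip-provided DC/rack for:

{noformat}
# cassandra-topology.properties (stock default shipped with Cassandra)
# GossipingPropertyFileSnitch falls back to this file, when present,
# for endpoints whose DC/rack have not arrived via gossip.
default=DC1:r1
{noformat}

If that is what is happening here, the problem nodes are missing the gossip 
application states that carry DC/rack, rather than misreading the snitch 
configuration itself.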



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11740) Nodes have wrong membership view of the cluster

2016-05-09 Thread Dikang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-11740:
--
Summary: Nodes have wrong membership view of the cluster  (was: Nodes about 
wrong membership view of the cluster)

> Nodes have wrong membership view of the cluster
> ---
>
> Key: CASSANDRA-11740
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11740
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dikang Gu
> Fix For: 2.2.x, 3.x
>
>
> We have a few hundred nodes across 3 data centers, and we are doing a few 
> million writes per second into the cluster.
> The problem we found is that some nodes (>10) have a very wrong view of the 
> cluster.
> For example, we have 3 data centers A, B and C. On the problem nodes, the 
> output of 'nodetool status' shows that ~100 nodes are not in data center A, 
> B, or C. Instead, it shows those nodes in DC1 and rack r1, which is very 
> wrong. As a result, the node will return wrong results to client requests.
> Datacenter: DC1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address                          Load       Tokens  Owns  Host ID                               Rack
> UN  2401:db00:11:6134:face:0:1:0     509.52 GB  256     ?     e24656ac-c3b2-4117-b933-a5b06852c993  r1
> UN  2401:db00:11:b218:face:0:5:0     510.01 GB  256     ?     53da2104-b1b5-4fa5-a3dd-52c7557149f9  r1
> UN  2401:db00:2130:5133:face:0:4d:0  459.75 GB  256     ?     ef8311f0-f6b8-491c-904d-baa925cdd7c2  r1
> We are using GossipingPropertyFileSnitch.
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-05-09 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277113#comment-15277113
 ] 

Jeremiah Jordan commented on CASSANDRA-11452:
-

[~blambov] I see this is resolved as fixed, but trunk is still using 
caffeine-2.2.6.jar, while the caffeine commit that addresses this discussion 
is in 2.2.7. Should we upgrade to 2.2.7?

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863: to make best use of caching, and to avoid 
> having to explicitly mark compaction accesses as non-cacheable, we need a 
> cache implementation whose eviction algorithm can better handle 
> non-recurring accesses.
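
For reference, a minimal sketch of what building such a cache with Caffeine 
looks like (the key type, weights, and readPageFromDisk are made-up 
placeholders; Caffeine's W-TinyLFU admission policy is what keeps one-shot 
scan traffic from evicting hot entries):

{code:java}
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class PageCacheSketch {
    private final Cache<Long, byte[]> pages = Caffeine.newBuilder()
            .maximumWeight(64L << 20)                          // ~64 MiB budget (made up)
            .weigher((Long position, byte[] page) -> page.length)
            .build();

    byte[] readPage(long position) {
        // Entries touched once by a compaction scan are rarely admitted,
        // so they don't displace the read-path working set.
        return pages.get(position, this::readPageFromDisk);
    }

    private byte[] readPageFromDisk(long position) {
        return new byte[4096];                                 // placeholder read path
    }
}
{code}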



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11724) False Failure Detection in Big Cassandra Cluster

2016-05-09 Thread Jeffrey F. Lukman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277100#comment-15277100
 ] 

Jeffrey F. Lukman commented on CASSANDRA-11724:
---

[~jeromatron]: okay, I will try this again and report later whether this 
config causes a different result or not.

For now, can you help me by confirming whether you also see the Workload-4 
bug?
Workload-4: run a 512-node cluster with some data, then decommission a node.
On our side, we see a high number of false failure detections.

> False Failure Detection in Big Cassandra Cluster
> 
>
> Key: CASSANDRA-11724
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11724
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeffrey F. Lukman
>  Labels: gossip, node-failure
> Attachments: Workload1.jpg, Workload2.jpg, Workload3.jpg, 
> Workload4.jpg, experiment-result.txt
>
>
> We are running some testing on Cassandra v2.2.5 stable in a big cluster. The 
> setup in our testing is that each machine has 16 cores and runs 8 Cassandra 
> instances, and we test with 32, 64, 128, 256, and 512 instances of 
> Cassandra. We use the default number of vnodes for each instance, which is 
> 256. The data and log directories are on an in-memory tmpfs file system.
> We run several types of workloads on this Cassandra cluster:
> Workload1: Just start the cluster
> Workload2: Start half of the cluster, wait until it gets into a stable 
> condition, and run the other half of the cluster
> Workload3: Start half of the cluster, wait until it gets into a stable 
> condition, load some data, and run the other half of the cluster
> Workload4: Start the cluster, wait until it gets into a stable condition, 
> load some data, and decommission one node
> For this testing, we measure the total number of false failure detections 
> inside the cluster. By false failure detection we mean that, for example, 
> instance-1 marks instance-2 down, but instance-2 is not down. We dug deeper 
> into the root cause and found that instance-1 had not received any heartbeat 
> from instance-2 for some time, because instance-2 was running a long 
> computation.
> Here I attach the graphs of each workload result.
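
For context on the mechanism behind such false positives, a rough sketch of 
phi-accrual failure detection as Cassandra applies it (simplified; the 
constant and default threshold below are from memory, not from this report): 
phi grows with the time since the last heartbeat relative to the historical 
mean inter-arrival, so a long computation or GC pause on the peer, or a gossip 
backlog on the observer, pushes phi over the conviction threshold.

{code:java}
// A rough sketch of phi-accrual failure detection (not Cassandra's exact code).
public class PhiSketch {
    static final double PHI_FACTOR = 1.0 / Math.log(10.0);    // base-10 scaling
    static final double PHI_CONVICT_THRESHOLD = 8.0;          // cassandra.yaml default

    static double phi(long nowMillis, long lastHeartbeatMillis, double meanIntervalMillis) {
        return PHI_FACTOR * (nowMillis - lastHeartbeatMillis) / meanIntervalMillis;
    }

    static boolean convict(long nowMillis, long lastHeartbeatMillis, double meanIntervalMillis) {
        // A pause on the peer stretches (now - last) past many mean
        // intervals, inflating phi well past the threshold.
        return phi(nowMillis, lastHeartbeatMillis, meanIntervalMillis) > PHI_CONVICT_THRESHOLD;
    }
}
{code}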



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time

2016-05-09 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277099#comment-15277099
 ] 

Joel Knighton commented on CASSANDRA-11709:
---

That sounds like a different issue to me - I'd recommend opening another issue 
with as much info about your setup (snitch, etc.) as possible.

> Lock contention when large number of dead nodes come back within short time
> ---
>
> Key: CASSANDRA-11709
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11709
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Joel Knighton
> Fix For: 2.2.x, 3.x
>
>
> We have a few hundred nodes across 3 data centers, and we are doing a few 
> million writes per second into the cluster. 
> We were trying to simulate a data center failure by disabling gossip on all 
> the nodes in one data center. After ~20 mins, I re-enabled gossip on those 
> nodes, doing 5 nodes in each batch and sleeping 5 seconds between batches.
> After that, I saw the latency of read/write requests increase a lot, and 
> client requests started to time out.
> On the node, I can see there is a huge number of pending tasks in GossipStage. 
> =
> 2016-05-02_23:55:08.99515 WARN  23:55:08 Gossip stage has 36337 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:09.36009 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:41:0 state jump to normal
> 2016-05-02_23:55:09.99057 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:43:0 state jump to normal
> 2016-05-02_23:55:10.09742 WARN  23:55:10 Gossip stage has 36421 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:10.91860 INFO  23:55:10 Node 
> /2401:db00:2020:717a:face:0:45:0 state jump to normal
> 2016-05-02_23:55:11.20100 WARN  23:55:11 Gossip stage has 36558 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:11.57893 INFO  23:55:11 Node 
> /2401:db00:2030:612a:face:0:49:0 state jump to normal
> 2016-05-02_23:55:12.23405 INFO  23:55:12 Node /2401:db00:2020:7189:face:0:7:0 
> state jump to normal
> 
> I took a jstack of the node and found the read/write threads blocked by a 
> lock:
>  read thread ==
> "Thrift:7994" daemon prio=10 tid=0x7fde91080800 nid=0x5255 waiting for 
> monitor entry [0x7fde6f8a1000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.cassandra.locator.TokenMetadata.cachedOnlyTokenMap(TokenMetadata.java:546)
> - waiting to lock <0x7fe4faef4398> (a 
> org.apache.cassandra.locator.TokenMetadata)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:111)
> at 
> org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:3155)
> at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1526)
> at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1521)
> at 
> org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:155)
> at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1328)
> at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1270)
> at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1195)
> at 
> org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:118)
> at 
> org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:275)
> at 
> org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:457)
> at 
> org.apache.cassandra.thrift.CassandraServer.getSliceInternal(CassandraServer.java:346)
> at 
> org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:325)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3659)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3643)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> =  writer ===
> 

[jira] [Commented] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time

2016-05-09 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277090#comment-15277090
 ] 

Dikang Gu commented on CASSANDRA-11709:
---

[~jkni], cool, thanks!

One more question: I also found a serious problem recently, that in "nodetool 
status" a bunch of nodes are not shown in the correct region. For example, we 
have three regions A, B and C, and I found that almost 100 nodes are not shown 
in any of those regions; they are shown as in DC1 and Rack r1, which I think 
completely breaks the replication and returns incorrect data to client 
requests.

Do you think they are the same issue, or should I open another jira?

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address                          Load       Tokens  Owns  Host ID                               Rack
UN  2401:db00:11:6134:face:0:1:0     509.52 GB  256     ?     e24656ac-c3b2-4117-b933-a5b06852c993  r1
UN  2401:db00:11:b218:face:0:5:0     510.01 GB  256     ?     53da2104-b1b5-4fa5-a3dd-52c7557149f9  r1
UN  2401:db00:2130:5133:face:0:4d:0  459.75 GB  256     ?     ef8311f0-f6b8-491c-904d-baa925cdd7c2  r1

> Lock contention when large number of dead nodes come back within short time
> ---
>
> Key: CASSANDRA-11709
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11709
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Joel Knighton
> Fix For: 2.2.x, 3.x
>
>
> We have a few hundred nodes across 3 data centers, and we are doing a few 
> million writes per second into the cluster. 
> We were trying to simulate a data center failure by disabling gossip on all 
> the nodes in one data center. After ~20 mins, I re-enabled gossip on those 
> nodes, doing 5 nodes in each batch and sleeping 5 seconds between batches.
> After that, I saw the latency of read/write requests increase a lot, and 
> client requests started to time out.
> On the node, I can see there is a huge number of pending tasks in GossipStage. 
> =
> 2016-05-02_23:55:08.99515 WARN  23:55:08 Gossip stage has 36337 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:09.36009 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:41:0 state jump to normal
> 2016-05-02_23:55:09.99057 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:43:0 state jump to normal
> 2016-05-02_23:55:10.09742 WARN  23:55:10 Gossip stage has 36421 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:10.91860 INFO  23:55:10 Node 
> /2401:db00:2020:717a:face:0:45:0 state jump to normal
> 2016-05-02_23:55:11.20100 WARN  23:55:11 Gossip stage has 36558 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:11.57893 INFO  23:55:11 Node 
> /2401:db00:2030:612a:face:0:49:0 state jump to normal
> 2016-05-02_23:55:12.23405 INFO  23:55:12 Node /2401:db00:2020:7189:face:0:7:0 
> state jump to normal
> 
> I took a jstack of the node and found the read/write threads blocked by a 
> lock:
>  read thread ==
> "Thrift:7994" daemon prio=10 tid=0x7fde91080800 nid=0x5255 waiting for 
> monitor entry [0x7fde6f8a1000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.cassandra.locator.TokenMetadata.cachedOnlyTokenMap(TokenMetadata.java:546)
> - waiting to lock <0x7fe4faef4398> (a 
> org.apache.cassandra.locator.TokenMetadata)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:111)
> at 
> org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:3155)
> at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1526)
> at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1521)
> at 
> org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:155)
> at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1328)
> at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1270)
> at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1195)
> at 
> org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:118)
> at 
> org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:275)
> at 
> org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:457)
> at 
> org.apache.cassandra.thrift.CassandraServer.getSliceInternal(CassandraServer.java:346)
>

[jira] [Comment Edited] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time

2016-05-09 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277090#comment-15277090
 ] 

Dikang Gu edited comment on CASSANDRA-11709 at 5/9/16 9:31 PM:
---

[~jkni], cool, thanks!

One more question: I also found a serious problem recently, that in "nodetool 
status" a bunch of nodes are not shown in the correct region. For example, we 
have three regions A, B and C, and I found that almost 100 nodes are not shown 
in any of those regions; they are shown as in DC1 and Rack r1, which I think 
completely breaks the replication and returns incorrect data to client 
requests.

Do you think they are the same issue, or should I open another jira?

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address                          Load       Tokens  Owns  Host ID                               Rack
UN  2401:db00:11:6134:face:0:1:0     509.52 GB  256     ?     e24656ac-c3b2-4117-b933-a5b06852c993  r1
UN  2401:db00:11:b218:face:0:5:0     510.01 GB  256     ?     53da2104-b1b5-4fa5-a3dd-52c7557149f9  r1
UN  2401:db00:2130:5133:face:0:4d:0  459.75 GB  256     ?     ef8311f0-f6b8-491c-904d-baa925cdd7c2  r1


was (Author: dikanggu):
[~jkni], cool, thanks!

One more question, I also found a serious problem recently, that in "nodetool 
status", bunch of nodes are not shown in correct region. For example, we have 
three regions A, B and C, I found that there are almost 100 hundreds nodes are 
not shown in any of those regions, they are shown as in DC1, and Rack r1, which 
I think complete broke the replication, and return incorrect data to client 
requests.

Do you think they are the same issue, or I'd better open another jira?

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address                          Load       Tokens  Owns  Host ID                               Rack
UN  2401:db00:11:6134:face:0:1:0     509.52 GB  256     ?     e24656ac-c3b2-4117-b933-a5b06852c993  r1
UN  2401:db00:11:b218:face:0:5:0     510.01 GB  256     ?     53da2104-b1b5-4fa5-a3dd-52c7557149f9  r1
UN  2401:db00:2130:5133:face:0:4d:0  459.75 GB  256     ?     ef8311f0-f6b8-491c-904d-baa925cdd7c2  r1

> Lock contention when large number of dead nodes come back within short time
> ---
>
> Key: CASSANDRA-11709
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11709
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Joel Knighton
> Fix For: 2.2.x, 3.x
>
>
> We have a few hundred nodes across 3 data centers, and we are doing a few 
> million writes per second into the cluster. 
> We were trying to simulate a data center failure by disabling gossip on all 
> the nodes in one data center. After ~20 mins, I re-enabled gossip on those 
> nodes, doing 5 nodes in each batch and sleeping 5 seconds between batches.
> After that, I saw the latency of read/write requests increase a lot, and 
> client requests started to time out.
> On the node, I can see there is a huge number of pending tasks in GossipStage. 
> =
> 2016-05-02_23:55:08.99515 WARN  23:55:08 Gossip stage has 36337 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:09.36009 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:41:0 state jump to normal
> 2016-05-02_23:55:09.99057 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:43:0 state jump to normal
> 2016-05-02_23:55:10.09742 WARN  23:55:10 Gossip stage has 36421 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:10.91860 INFO  23:55:10 Node 
> /2401:db00:2020:717a:face:0:45:0 state jump to normal
> 2016-05-02_23:55:11.20100 WARN  23:55:11 Gossip stage has 36558 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:11.57893 INFO  23:55:11 Node 
> /2401:db00:2030:612a:face:0:49:0 state jump to normal
> 2016-05-02_23:55:12.23405 INFO  23:55:12 Node /2401:db00:2020:7189:face:0:7:0 
> state jump to normal
> 
> I took a jstack of the node and found the read/write threads blocked by a 
> lock:
>  read thread ==
> "Thrift:7994" daemon prio=10 tid=0x7fde91080800 nid=0x5255 waiting for 
> monitor entry [0x7fde6f8a1000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.cassandra.locator.TokenMetadata.cachedOnlyTokenMap(TokenMetadata.java:546)
> - waiting to lock <0x7fe4faef4398> (a 
> org.apache.cassandra.locator.TokenMetadata)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:111)
> at 
> 

[jira] [Commented] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time

2016-05-09 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277075#comment-15277075
 ] 

Joel Knighton commented on CASSANDRA-11709:
---

I've just started really looking at this now - I'll post an update regarding 
the strategy to fix this once I have one.

> Lock contention when large number of dead nodes come back within short time
> ---
>
> Key: CASSANDRA-11709
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11709
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Joel Knighton
> Fix For: 2.2.x, 3.x
>
>
> We have a few hundred nodes across 3 data centers, and we are doing a few 
> million writes per second into the cluster. 
> We were trying to simulate a data center failure by disabling gossip on all 
> the nodes in one data center. After ~20 mins, I re-enabled gossip on those 
> nodes, doing 5 nodes in each batch and sleeping 5 seconds between batches.
> After that, I saw the latency of read/write requests increase a lot, and 
> client requests started to time out.
> On the node, I can see there is a huge number of pending tasks in GossipStage. 
> =
> 2016-05-02_23:55:08.99515 WARN  23:55:08 Gossip stage has 36337 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:09.36009 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:41:0 state jump to normal
> 2016-05-02_23:55:09.99057 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:43:0 state jump to normal
> 2016-05-02_23:55:10.09742 WARN  23:55:10 Gossip stage has 36421 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:10.91860 INFO  23:55:10 Node 
> /2401:db00:2020:717a:face:0:45:0 state jump to normal
> 2016-05-02_23:55:11.20100 WARN  23:55:11 Gossip stage has 36558 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:11.57893 INFO  23:55:11 Node 
> /2401:db00:2030:612a:face:0:49:0 state jump to normal
> 2016-05-02_23:55:12.23405 INFO  23:55:12 Node /2401:db00:2020:7189:face:0:7:0 
> state jump to normal
> 
> I took a jstack of the node and found the read/write threads blocked by a 
> lock:
>  read thread ==
> "Thrift:7994" daemon prio=10 tid=0x7fde91080800 nid=0x5255 waiting for 
> monitor entry [0x7fde6f8a1000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.cassandra.locator.TokenMetadata.cachedOnlyTokenMap(TokenMetadata.java:546)
> - waiting to lock <0x7fe4faef4398> (a 
> org.apache.cassandra.locator.TokenMetadata)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:111)
> at 
> org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:3155)
> at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1526)
> at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1521)
> at 
> org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:155)
> at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1328)
> at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1270)
> at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1195)
> at 
> org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:118)
> at 
> org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:275)
> at 
> org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:457)
> at 
> org.apache.cassandra.thrift.CassandraServer.getSliceInternal(CassandraServer.java:346)
> at 
> org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:325)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3659)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3643)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> =  writer ===
> "Thrift:7668" daemon prio=10 

[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-05-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277068#comment-15277068
 ] 

Brandon Williams commented on CASSANDRA-8523:
-

[~pauloricardomg] assigning to you, please do continue working on this.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many, many hours' worth of writes). It also leaves us more exposed 
> to consistency violations.
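
For context, a replacement of the kind described is started with the 
replace-address JVM option before the first boot of the new node (the address 
below is a placeholder):

{noformat}
# cassandra-env.sh (or the command line)
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.5"
{noformat}

While the node streams in data it sits in the hibernate state, which is 
exactly the window during which the writes described above are missed.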



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time

2016-05-09 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277066#comment-15277066
 ] 

Dikang Gu commented on CASSANDRA-11709:
---

[~jkni], can you please share a bit about how you are going to fix this? Thanks!

> Lock contention when large number of dead nodes come back within short time
> ---
>
> Key: CASSANDRA-11709
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11709
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Joel Knighton
> Fix For: 2.2.x, 3.x
>
>
> We have a few hundred nodes across 3 data centers, and we are doing a few 
> million writes per second into the cluster. 
> We were trying to simulate a data center failure by disabling gossip on all 
> the nodes in one data center. After ~20 mins, I re-enabled gossip on those 
> nodes, doing 5 nodes in each batch and sleeping 5 seconds between batches.
> After that, I saw the latency of read/write requests increase a lot, and 
> client requests started to time out.
> On the node, I can see there is a huge number of pending tasks in GossipStage. 
> =
> 2016-05-02_23:55:08.99515 WARN  23:55:08 Gossip stage has 36337 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:09.36009 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:41:0 state jump to normal
> 2016-05-02_23:55:09.99057 INFO  23:55:09 Node 
> /2401:db00:2020:717a:face:0:43:0 state jump to normal
> 2016-05-02_23:55:10.09742 WARN  23:55:10 Gossip stage has 36421 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:10.91860 INFO  23:55:10 Node 
> /2401:db00:2020:717a:face:0:45:0 state jump to normal
> 2016-05-02_23:55:11.20100 WARN  23:55:11 Gossip stage has 36558 pending 
> tasks; skipping status check (no nodes will be marked down)
> 2016-05-02_23:55:11.57893 INFO  23:55:11 Node 
> /2401:db00:2030:612a:face:0:49:0 state jump to normal
> 2016-05-02_23:55:12.23405 INFO  23:55:12 Node /2401:db00:2020:7189:face:0:7:0 
> state jump to normal
> 
> I took a jstack of the node and found the read/write threads blocked by a 
> lock:
>  read thread ==
> "Thrift:7994" daemon prio=10 tid=0x7fde91080800 nid=0x5255 waiting for 
> monitor entry [0x7fde6f8a1000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.cassandra.locator.TokenMetadata.cachedOnlyTokenMap(TokenMetadata.java:546)
> - waiting to lock <0x7fe4faef4398> (a 
> org.apache.cassandra.locator.TokenMetadata)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:111)
> at 
> org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:3155)
> at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1526)
> at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1521)
> at 
> org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:155)
> at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1328)
> at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1270)
> at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1195)
> at 
> org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:118)
> at 
> org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:275)
> at 
> org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:457)
> at 
> org.apache.cassandra.thrift.CassandraServer.getSliceInternal(CassandraServer.java:346)
> at 
> org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:325)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3659)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3643)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> =  writer ===
> "Thrift:7668" daemon prio=10 tid=0x7fde90d91000 nid=0x50e9 waiting for 
> 

[jira] [Updated] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-05-09 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-8523:

Assignee: Paulo Motta  (was: Brandon Williams)

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many, many hours' worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11606) Upgrade from 2.1.9 to 3.0.5 Fails with AssertionError

2016-05-09 Thread Anthony Verslues (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276996#comment-15276996
 ] 

Anthony Verslues commented on CASSANDRA-11606:
--

Sorry for the late response; I have attached the schema of the table.

> Upgrade from 2.1.9 to 3.0.5 Fails with AssertionError
> -
>
> Key: CASSANDRA-11606
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11606
> Project: Cassandra
>  Issue Type: Bug
> Environment: Fedora 20, Oracle Java 8, Apache Cassandra 2.1.9 -> 3.0.5
>Reporter: Anthony Verslues
> Fix For: 3.0.x
>
> Attachments: sample.txt
>
>
> I get this error while upgrading sstables. I got the same error when 
> upgrading to 3.0.2 and 3.0.4.
> error: null
> -- StackTrace --
> java.lang.AssertionError
> at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1167)
> at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1142)
> at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:444)
> at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:423)
> at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:289)
> at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.readStaticRow(SSTableSimpleIterator.java:133)
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:57)
> at 
> org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator$1.initializeIterator(BigTableScanner.java:334)
> at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48)
> at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.isReverseOrder(LazilyInitializedUnfilteredRowIterator.java:65)
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:109)
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:100)
> at 
> org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:442)
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.hasNext(UnfilteredPartitionIterators.java:150)
> at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72)
> at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:177)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:416)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:313)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11734) Enable partition component index for SASI

2016-05-09 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11734:

Reviewer: Pavel Yaskevich

> Enable partition component index for SASI
> -
>
> Key: CASSANDRA-11734
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11734
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: DOAN DuyHai
>Assignee: DOAN DuyHai
>  Labels: doc-impacting, sasi, secondaryIndex
> Fix For: 3.8
>
> Attachments: patch.txt
>
>
> Enable partition component index for SASI
> For the given schema:
> {code:sql}
> CREATE TABLE test.comp (
> pk1 int,
> pk2 text,
> val text,
> PRIMARY KEY ((pk1, pk2))
> );
> CREATE CUSTOM INDEX comp_val_idx ON test.comp (val) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> CREATE CUSTOM INDEX comp_pk2_idx ON test.comp (pk2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'mode': 'PREFIX', 
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
> 'case_sensitive': 'false'};
> CREATE CUSTOM INDEX comp_pk1_idx ON test.comp (pk1) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> {code}
> The following queries are possible:
> {code:sql}
> SELECT * FROM test.comp WHERE pk1=1;
> SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5;
> SELECT * FROM test.comp WHERE pk1=1 AND val='xxx' ALLOW FILTERING;
> SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND val='xxx' ALLOW FILTERING;
> SELECT * FROM test.comp WHERE pk2='some text';
> SELECT * FROM test.comp WHERE pk2 LIKE 'prefix%';
> SELECT * FROM test.comp WHERE pk2='some text' AND val='xxx' ALLOW FILTERING;
> SELECT * FROM test.comp WHERE pk2 LIKE 'prefix%' AND val='xxx' ALLOW 
> FILTERING;
> //Without using SASI
> SELECT * FROM test.comp WHERE pk1 = 1 AND pk2='some text';
> SELECT * FROM test.comp WHERE pk1 IN(1,2,3) AND pk2='some text';
> SELECT * FROM test.comp WHERE pk1 = 1 AND pk2 IN ('text1','text2');
> SELECT * FROM test.comp WHERE pk1 IN(1,2,3) AND pk2 IN ('text1','text2');
> {code}
> However, the following queries *are not possible*
> {code:sql}
> SELECT * FROM test.comp WHERE pk1=1 AND pk2 LIKE 'prefix%';
> SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND pk2 = 'some text';
> SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND pk2 LIKE 'prefix%';
> {code}
> All of them are throwing the following exception
> {noformat}
> java.lang.UnsupportedOperationException: null
>   at 
> org.apache.cassandra.cql3.restrictions.SingleColumnRestriction$LikeRestriction.appendTo(SingleColumnRestriction.java:715)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.restrictions.PartitionKeySingleRestrictionSet.values(PartitionKeySingleRestrictionSet.java:86)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.restrictions.StatementRestrictions.getPartitionKeys(StatementRestrictions.java:585)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:473)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.getQuery(SelectStatement.java:265)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:230)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:79)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:239) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:224) 
> ~[main/:na]
>   at 
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
>  [main/:na]
>   at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
>  [main/:na]
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  [netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:283)
>  [netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_45]
>   at 
> 

[jira] [Updated] (CASSANDRA-11604) select on table fails after changing user defined type in map

2016-05-09 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11604:

Reviewer: Joel Knighton

> select on table fails after changing user defined type in map
> -
>
> Key: CASSANDRA-11604
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11604
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andreas Jaekle
>Assignee: Alex Petrov
> Fix For: 3.x
>
>
> in cassandra 3.5 i get the following exception when i run this cqls:
> {code}
> --DROP KEYSPACE bugtest ;
> CREATE KEYSPACE bugtest
>  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
> use bugtest;
> CREATE TYPE tt (
>   a boolean
> );
> create table t1 (
>   k text,
>   v map<text, frozen<tt>>,
>   PRIMARY KEY(k)
> );
> insert into t1 (k,v) values ('k2',{'mk':{a:false}});
> ALTER TYPE tt ADD b boolean;
> UPDATE t1 SET v['mk'] = { b:true } WHERE k = 'k2';
> select * from t1;  
> {code}
> the last select fails.
> {code}
> WARN  [SharedPool-Worker-5] 2016-04-19 14:18:49,885 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-5,5,main]: {}
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.db.rows.ComplexColumnData$Builder.addCell(ComplexColumnData.java:254)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:623)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:279)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:112) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:289)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> 

[jira] [Updated] (CASSANDRA-11606) Upgrade from 2.1.9 to 3.0.5 Fails with AssertionError

2016-05-09 Thread Anthony Verslues (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Verslues updated CASSANDRA-11606:
-
Attachment: sample.txt

> Upgrade from 2.1.9 to 3.0.5 Fails with AssertionError
> -
>
> Key: CASSANDRA-11606
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11606
> Project: Cassandra
>  Issue Type: Bug
> Environment: Fedora 20, Oracle Java 8, Apache Cassandra 2.1.9 -> 3.0.5
>Reporter: Anthony Verslues
> Fix For: 3.0.x
>
> Attachments: sample.txt
>
>
> I get this error while upgrading sstables. I got the same error when 
> upgrading to 3.0.2 and 3.0.4.
> error: null
> -- StackTrace --
> java.lang.AssertionError
> at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1167)
> at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1142)
> at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:444)
> at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:423)
> at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:289)
> at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.readStaticRow(SSTableSimpleIterator.java:133)
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:57)
> at 
> org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator$1.initializeIterator(BigTableScanner.java:334)
> at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48)
> at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.isReverseOrder(LazilyInitializedUnfilteredRowIterator.java:65)
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:109)
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:100)
> at 
> org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:442)
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.hasNext(UnfilteredPartitionIterators.java:150)
> at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72)
> at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:177)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:416)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:313)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9613) Omit (de)serialization of state variable in UDAs

2016-05-09 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-9613:
---
Reviewer: Tyler Hobbs

> Omit (de)serialization of state variable in UDAs
> 
>
> Key: CASSANDRA-9613
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9613
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> Currently the result of each UDA's state-function call is serialized and then 
> deserialized for the next state-function invocation, and optionally for the 
> final-function invocation.
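
A minimal standalone model of that round trip (simplified for illustration; 
this is not Cassandra's actual UDA machinery):

{code:java}
import java.nio.ByteBuffer;

// Models the overhead described above: the aggregation state is re-serialized
// after every state-function call and deserialized again before the next one.
public class UdaStateSketch
{
    static ByteBuffer serialize(long state) { return ByteBuffer.allocate(8).putLong(0, state); }
    static long deserialize(ByteBuffer buf) { return buf.getLong(0); }

    public static void main(String[] args)
    {
        ByteBuffer state = serialize(0L);          // initial state
        for (long row : new long[]{ 1, 2, 3, 4, 5 })
        {
            long s = deserialize(state);           // per-row deserialization...
            state = serialize(s + row);            // ...and re-serialization that
                                                   // this ticket proposes to omit
        }
        System.out.println(deserialize(state));    // 15
    }
}
{code}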



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11739) Cache key references might cause OOM on incremental repair

2016-05-09 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276896#comment-15276896
 ] 

Marcus Eriksson commented on CASSANDRA-11739:
-

sounds good

we should probably bite the bullet and do CASSANDRA-8858 at some point as well

> Cache key references might cause OOM on incremental repair
> --
>
> Key: CASSANDRA-11739
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11739
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
> Attachments: heapdump.png
>
>
> We keep {{SSTableReader}} references for the duration of the repair to 
> anti-compact later, and their tidiers keep references to cache keys to be 
> invalidated, which are only cleaned up by GC after the repair is finished. 
> These cache keys can accumulate while the repair is executing, leading to 
> OOM for large tables/keyspaces.
> Heap dump attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11734) Enable partition component index for SASI

2016-05-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276894#comment-15276894
 ] 

Pavel Yaskevich commented on CASSANDRA-11734:
-

Thanks for taking a stab at this, [~doanduyhai]! By the nature of the changes it 
looks like we will have to postpone this until I'm done with the QueryPlan port 
(CASSANDRA-10765), which is going to make it more sane to have indexed 
restrictions on partitions with(-out) ranges. From the patch I see a couple of 
things right away: the CFMetaData.getLiveIndices() you mentioned goes against 
the fact that some of the queries don't even allow usage of the indexes, and 
there is (currently) no way to check that from inside the 
SingleColumnRestrictions; checking 
{{QueryController#hasIndexFor(ColumnDefinition)}} on every run of the 
result-checking logic is very inefficient; and instead of using DecoratedKey 
separately, we might be better off providing the {{Operation.satisfiedBy}} 
methods with an {{UnfilteredRowIterator}} and letting them iterate it if needed, 
rather than involving {{QueryPlan}}. So I would rather have this after 
CASSANDRA-10765; it looks like that would make everybody's life a bit easier :)
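
To illustrate the {{Operation.satisfiedBy}} suggestion, a rough sketch with 
simplified stand-in types (these are not the real SASI interfaces):

{code:java}
import java.util.Iterator;

// Simplified stand-in for a row; illustration only, not the actual SASI types.
interface RowLike
{
    boolean matches(String column, String value);
}

interface OperationSketch
{
    // Receive the partition's row iterator and pull rows lazily as needed,
    // rather than doing a per-result hasIndexFor(column) lookup.
    boolean satisfiedBy(Iterator<? extends RowLike> partitionRows);
}
{code}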

> Enable partition component index for SASI
> -
>
> Key: CASSANDRA-11734
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11734
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: DOAN DuyHai
>Assignee: DOAN DuyHai
>  Labels: doc-impacting, sasi, secondaryIndex
> Fix For: 3.8
>
> Attachments: patch.txt
>
>
> Enable partition component index for SASI
> For the given schema:
> {code:sql}
> CREATE TABLE test.comp (
> pk1 int,
> pk2 text,
> val text,
> PRIMARY KEY ((pk1, pk2))
> );
> CREATE CUSTOM INDEX comp_val_idx ON test.comp (val) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> CREATE CUSTOM INDEX comp_pk2_idx ON test.comp (pk2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'mode': 'PREFIX', 
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
> 'case_sensitive': 'false'};
> CREATE CUSTOM INDEX comp_pk1_idx ON test.comp (pk1) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> {code}
> The following queries are possible:
> {code:sql}
> SELECT * FROM test.comp WHERE pk1=1;
> SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5;
> SELECT * FROM test.comp WHERE pk1=1 AND val='xxx' ALLOW FILTERING;
> SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND val='xxx' ALLOW FILTERING;
> SELECT * FROM test.comp WHERE pk2='some text';
> SELECT * FROM test.comp WHERE pk2 LIKE 'prefix%';
> SELECT * FROM test.comp WHERE pk2='some text' AND val='xxx' ALLOW FILTERING;
> SELECT * FROM test.comp WHERE pk2 LIKE 'prefix%' AND val='xxx' ALLOW 
> FILTERING;
> //Without using SASI
> SELECT * FROM test.comp WHERE pk1 = 1 AND pk2='some text';
> SELECT * FROM test.comp WHERE pk1 IN(1,2,3) AND pk2='some text';
> SELECT * FROM test.comp WHERE pk1 = 1 AND pk2 IN ('text1','text2');
> SELECT * FROM test.comp WHERE pk1 IN(1,2,3) AND pk2 IN ('text1','text2');
> {code}
> However, the following queries *are not possible*
> {code:sql}
> SELECT * FROM test.comp WHERE pk1=1 AND pk2 LIKE 'prefix%';
> SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND pk2 = 'some text';
> SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND pk2 LIKE 'prefix%';
> {code}
> All of them are throwing the following exception
> {noformat}
> java.lang.UnsupportedOperationException: null
>   at 
> org.apache.cassandra.cql3.restrictions.SingleColumnRestriction$LikeRestriction.appendTo(SingleColumnRestriction.java:715)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.restrictions.PartitionKeySingleRestrictionSet.values(PartitionKeySingleRestrictionSet.java:86)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.restrictions.StatementRestrictions.getPartitionKeys(StatementRestrictions.java:585)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:473)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.getQuery(SelectStatement.java:265)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:230)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:79)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208)
>  ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:239) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:224) 
> ~[main/:na]
>   at 
> 

[jira] [Commented] (CASSANDRA-11739) Cache key references might cause OOM on incremental repair

2016-05-09 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276886#comment-15276886
 ] 

Paulo Motta commented on CASSANDRA-11739:
-

The idea is to track only the sstable data component filenames on the parent 
repair session, rather than {{SSTableReader}} references. When doing 
anti-compaction, get references to the existing sstables from the CFS. WDYT 
[~krummas]?
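
A minimal standalone sketch of that idea, with illustrative names only (this is 
not the actual patch):

{code:java}
import java.util.*;
import java.util.concurrent.*;

// Track only sstable data component filenames per parent repair session,
// instead of pinning SSTableReader references (and their cache keys) for the
// whole repair. Names here are illustrative, not the real repair code.
public class RepairSessionSketch
{
    // parent repair session id -> data component filenames to anti-compact later
    private final Map<UUID, Set<String>> sstablesBySession = new ConcurrentHashMap<>();

    public void register(UUID parentSessionId, String dataComponentFilename)
    {
        sstablesBySession
            .computeIfAbsent(parentSessionId, id -> ConcurrentHashMap.newKeySet())
            .add(dataComponentFilename);
    }

    // At anti-compaction time, references to the live readers would be
    // re-acquired from the ColumnFamilyStore by filename.
    public Set<String> filenamesFor(UUID parentSessionId)
    {
        return sstablesBySession.getOrDefault(parentSessionId, Collections.<String>emptySet());
    }
}
{code}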

> Cache key references might cause OOM on incremental repair
> --
>
> Key: CASSANDRA-11739
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11739
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
> Attachments: heapdump.png
>
>
> We keep {{SSTableReader}} references for the duration of the repair to 
> anti-compact later, and their tidiers keep references to cache keys to be 
> invalidated, which are only cleaned up by GC after the repair is finished. 
> These cache keys can accumulate while the repair is executing, leading to 
> OOM for large tables/keyspaces.
> Heap dump attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11739) Cache key references might cause OOM on incremental repair

2016-05-09 Thread Paulo Motta (JIRA)
Paulo Motta created CASSANDRA-11739:
---

 Summary: Cache key references might cause OOM on incremental repair
 Key: CASSANDRA-11739
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11739
 Project: Cassandra
  Issue Type: Bug
Reporter: Paulo Motta
Assignee: Paulo Motta
 Attachments: heapdump.png

We keep {{SSTableReader}} references for the duration of the repair to 
anti-compact later, and their tidiers keep references to cache keys to be 
invalidated, which are only cleaned up by GC after the repair is finished. 
These cache keys can accumulate while the repair is executing, leading to OOM 
for large tables/keyspaces.

Heap dump attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11724) False Failure Detection in Big Cassandra Cluster

2016-05-09 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276804#comment-15276804
 ] 

Jeremy Hanna commented on CASSANDRA-11724:
--

I suppose I should just say: you should set auto_bootstrap=false in your 
cassandra.yaml, and then you wouldn't need the two-minute intervals, since this 
is a fresh cluster.
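
For reference, the setting mentioned above as it would appear in cassandra.yaml 
(assuming a fresh cluster with no existing data to stream; the option defaults 
to true when absent):

{noformat}
auto_bootstrap: false
{noformat}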

> False Failure Detection in Big Cassandra Cluster
> 
>
> Key: CASSANDRA-11724
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11724
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeffrey F. Lukman
>  Labels: gossip, node-failure
> Attachments: Workload1.jpg, Workload2.jpg, Workload3.jpg, 
> Workload4.jpg, experiment-result.txt
>
>
> We are running some tests on Cassandra v2.2.5 stable in a big cluster. The 
> setup in our testing is that each machine has 16 cores and runs 8 Cassandra 
> instances, and we test with 32, 64, 128, 256, and 512 instances of 
> Cassandra. We use the default number of vnodes for each instance, which is 
> 256. The data and log directories are on an in-memory tmpfs file system.
> We run several types of workloads on this Cassandra cluster:
> Workload1: Just start the cluster
> Workload2: Start half of the cluster, wait until it gets into a stable 
> condition, and start the other half of the cluster
> Workload3: Start half of the cluster, wait until it gets into a stable 
> condition, load some data, and start the other half of the cluster
> Workload4: Start the cluster, wait until it gets into a stable condition, 
> load some data, and decommission one node
> For this testing, we measure the total number of false failure detections 
> inside the cluster. By false failure detection we mean that, for example, 
> instance-1 marks instance-2 down, but instance-2 is not down. We dug deeper 
> into the root cause and found that instance-1 had not received any heartbeat 
> from instance-2 for some time because instance-2 was running a long 
> computation process.
> Here I attach the graphs of each workload result.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11725) Check for unnecessary JMX port setting in env vars at startup

2016-05-09 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-11725:

Assignee: Sam Tunnicliffe
Reviewer: T Jake Luciani
  Status: Patch Available  (was: Open)

I've pushed a branch which adds a {{StartupCheck}} that warns if 
{{com.sun.management.jmxremote.port}} is set. 

As this issue also exposed the fact that a number of clients were relying on 
passing the JMX config directly via system properties, rather than using 
{{cassandra-env.sh}}, I've made startup more permissive to emulate the previous 
behaviour. So, if the property is present, when C* comes to init the JMX 
server it will log an additional warning and skip the setup. 

The additional warning is there because at some point we should remove this 
compatibility mode and go back to failing startup if {{jmxremote.port}} is 
set directly, though the {{StartupCheck}} should remain.

||branch||testall||dtest||
|[11725-3.6|https://github.com/beobal/cassandra/tree/11725-3.6]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11725-3.6-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11725-3.6-dtest]|
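
A rough standalone sketch of the shape of such a check (simplified for 
illustration; the real {{StartupCheck}} lives in the branch above):

{code:java}
// Simplified illustration only; not the committed implementation.
public class JmxPortCheckSketch
{
    private static final String JMX_PORT = "com.sun.management.jmxremote.port";

    public static void check()
    {
        String port = System.getProperty(JMX_PORT);
        if (port != null)
            System.err.printf("%s is set to %s: the JVM bootstrap agent has already " +
                              "initialized the JMX connector server, so C* will warn " +
                              "and skip its own JMX setup.%n", JMX_PORT, port);
    }

    public static void main(String[] args)
    {
        check();
    }
}
{code}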


> Check for unnecessary JMX port setting in env vars at startup
> -
>
> Key: CASSANDRA-11725
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11725
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Lifecycle
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Minor
>  Labels: lhf
> Fix For: 3.x
>
>
> Since CASSANDRA-10091, C* expects to always be in control of initializing its 
> JMX connector server. However, if  {{com.sun.management.jmxremote.port}} is 
> set when the JVM is started, the bootstrap agent takes over and sets up the 
> server before any C* code runs. Because C* is then unable to bind the server 
> it creates to the specified port, startup is halted and the root cause is 
> somewhat unclear. 
> We should add a check at startup so a more informative message can be 
> provided. This would test for the presence of the system property which would 
> differentiate from the case where some other process is already bound to the 
> port. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11710) Cassandra dies with OOM when running stress

2016-05-09 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-11710:
---
Resolution: Fixed
  Reviewer: T Jake Luciani  (was: Marcus Eriksson)
Status: Resolved  (was: Patch Available)

+1 committed in {{31cab36b1800f2042623633445d8be944217d5a2}}

> Cassandra dies with OOM when running stress
> ---
>
> Key: CASSANDRA-11710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11710
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Branimir Lambov
> Fix For: 3.6
>
>
> Running stress on trunk dies with OOM after about 3.5M ops:
> {code}
> ERROR [CompactionExecutor:1] 2016-05-04 15:01:31,231 
> JVMStabilityInspector.java:137 - JVM state determined to be unstable.  
> Exiting forcefully due to:
> java.lang.OutOfMemoryError: Direct buffer memory
> at java.nio.Bits.reserveMemory(Bits.java:693) ~[na:1.8.0_91]
> at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.8.0_91]
> at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) 
> ~[na:1.8.0_91]
> at 
> org.apache.cassandra.utils.memory.BufferPool.allocateDirectAligned(BufferPool.java:519)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.memory.BufferPool.access$600(BufferPool.java:46) 
> ~[main/:na]
> at 
> org.apache.cassandra.utils.memory.BufferPool$GlobalPool.allocateMoreChunks(BufferPool.java:276)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.memory.BufferPool$GlobalPool.get(BufferPool.java:249)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.addChunkFromGlobalPool(BufferPool.java:338)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.get(BufferPool.java:381)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.memory.BufferPool.maybeTakeFromPool(BufferPool.java:142)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.memory.BufferPool.takeFromPool(BufferPool.java:114)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.memory.BufferPool.get(BufferPool.java:84) 
> ~[main/:na]
> at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:135) 
> ~[main/:na]
> at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:19) 
> ~[main/:na]
> at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache.lambda$new$0(BoundedLocalCache.java:2949)
>  ~[caffeine-2.2.6.jar:na]
> at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$15(BoundedLocalCache.java:1807)
>  ~[caffeine-2.2.6.jar:na]
> at 
> java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853) 
> ~[na:1.8.0_91]
> at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:1805)
>  ~[caffeine-2.2.6.jar:na]
> at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1788)
>  ~[caffeine-2.2.6.jar:na]
> at 
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
>  ~[caffeine-2.2.6.jar:na]
> at 
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
>  ~[caffeine-2.2.6.jar:na]
> at 
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:215)
>  ~[main/:na]
> at 
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:193)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.util.LimitingRebufferer.rebuffer(LimitingRebufferer.java:34)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:78)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:72)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:400) 
> ~[main/:na]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readWithVIntLength(ByteBufferUtil.java:338)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:414) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:243) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:473)
>  

[5/5] cassandra git commit: Merge branch 'cassandra-3.7' into trunk

2016-05-09 Thread jake
Merge branch 'cassandra-3.7' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/653d0bff
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/653d0bff
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/653d0bff

Branch: refs/heads/trunk
Commit: 653d0bffcf02e698ca727bb5c151dfea19202eb5
Parents: 640072b a093e8c
Author: T Jake Luciani 
Authored: Mon May 9 13:51:52 2016 -0400
Committer: T Jake Luciani 
Committed: Mon May 9 13:51:52 2016 -0400

--
 CHANGES.txt|  1 +
 src/java/org/apache/cassandra/config/Config.java   |  2 +-
 .../cassandra/config/DatabaseDescriptor.java   |  7 +++
 .../apache/cassandra/utils/memory/BufferPool.java  | 17 -
 4 files changed, 25 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/653d0bff/CHANGES.txt
--
diff --cc CHANGES.txt
index d9f1688,6e60bba..731bfc8
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -12,8 -6,8 +12,9 @@@ Merged from 2.2
   * Prohibit Reversed Counter type as part of the PK (CASSANDRA-9395)
   * cqlsh: correctly handle non-ascii chars in error messages (CASSANDRA-11626)
  
 +
  3.6
+  * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710)
   * Enhanced Compaction Logging (CASSANDRA-10805)
   * Make prepared statement cache size configurable (CASSANDRA-11555)
   * Integrated JMX authentication and authorization (CASSANDRA-10091)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/653d0bff/src/java/org/apache/cassandra/utils/memory/BufferPool.java
--



[1/5] cassandra git commit: Prevent direct memory OOM on buffer pool allocations

2016-05-09 Thread jake
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.7 a8a3a7338 -> a093e8cae
  refs/heads/trunk 640072b09 -> 653d0bffc


Prevent direct memory OOM on buffer pool allocations

Patch by Branimir Lambov; reviewed by tjake for
(CASSANDRA-11710)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/31cab36b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/31cab36b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/31cab36b

Branch: refs/heads/cassandra-3.7
Commit: 31cab36b1800f2042623633445d8be944217d5a2
Parents: 5634cea
Author: Branimir Lambov 
Authored: Thu May 5 11:30:00 2016 +0300
Committer: T Jake Luciani 
Committed: Mon May 9 13:48:30 2016 -0400

--
 CHANGES.txt|  1 +
 src/java/org/apache/cassandra/config/Config.java   |  2 +-
 .../cassandra/config/DatabaseDescriptor.java   |  7 +++
 .../apache/cassandra/utils/memory/BufferPool.java  | 17 -
 4 files changed, 25 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 4ff5b1a..b7715ba 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.6
+ * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710)
  * Enhanced Compaction Logging (CASSANDRA-10805)
  * Make prepared statement cache size configurable (CASSANDRA-11555)
  * Integrated JMX authentication and authorization (CASSANDRA-10091)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/Config.java
--
diff --git a/src/java/org/apache/cassandra/config/Config.java 
b/src/java/org/apache/cassandra/config/Config.java
index 02635bf..466b791 100644
--- a/src/java/org/apache/cassandra/config/Config.java
+++ b/src/java/org/apache/cassandra/config/Config.java
@@ -242,7 +242,7 @@ public class Config
 
 private static boolean isClientMode = false;
 
-public Integer file_cache_size_in_mb = 512;
+public Integer file_cache_size_in_mb;
 
 public boolean buffer_pool_use_heap_if_exhausted = true;
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index d8acdb8..3d38646 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -1776,6 +1776,13 @@ public class DatabaseDescriptor
 
 public static int getFileCacheSizeInMB()
 {
+if (conf.file_cache_size_in_mb == null)
+{
+// In client mode the value is not set.
+assert Config.isClientMode();
+return 0;
+}
+
 return conf.file_cache_size_in_mb;
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/utils/memory/BufferPool.java
--
diff --git a/src/java/org/apache/cassandra/utils/memory/BufferPool.java 
b/src/java/org/apache/cassandra/utils/memory/BufferPool.java
index ad2404f..5cd0051 100644
--- a/src/java/org/apache/cassandra/utils/memory/BufferPool.java
+++ b/src/java/org/apache/cassandra/utils/memory/BufferPool.java
@@ -273,7 +273,22 @@ public class BufferPool
 }
 
 // allocate a large chunk
-        Chunk chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE));
+        Chunk chunk;
+        try
+        {
+            chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE));
+        }
+        catch (OutOfMemoryError oom)
+        {
+            noSpamLogger.error("Buffer pool failed to allocate chunk of {}, current size {} ({}). " +
+                               "Attempting to continue; buffers will be allocated in on-heap memory which can degrade performance. " +
+                               "Make sure direct memory size (-XX:MaxDirectMemorySize) is large enough to accommodate off-heap memtables and caches.",
+                               FBUtilities.prettyPrintMemory(MACRO_CHUNK_SIZE),
+                               FBUtilities.prettyPrintMemory(sizeInBytes()),
+                               oom.toString());
+            return false;
+        }
+
         chunk.acquire(null);
         macroChunks.add(chunk);
         for (int i = 0 ; i < MACRO_CHUNK_SIZE ; i += CHUNK_SIZE)

[2/5] cassandra git commit: Prevent direct memory OOM on buffer pool allocations

2016-05-09 Thread jake
Prevent direct memory OOM on buffer pool allocations

Patch by Branimir Lambov; reviewed by tjake for
(CASSANDRA-11710)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/31cab36b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/31cab36b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/31cab36b

Branch: refs/heads/trunk
Commit: 31cab36b1800f2042623633445d8be944217d5a2
Parents: 5634cea
Author: Branimir Lambov 
Authored: Thu May 5 11:30:00 2016 +0300
Committer: T Jake Luciani 
Committed: Mon May 9 13:48:30 2016 -0400

--
 CHANGES.txt|  1 +
 src/java/org/apache/cassandra/config/Config.java   |  2 +-
 .../cassandra/config/DatabaseDescriptor.java   |  7 +++
 .../apache/cassandra/utils/memory/BufferPool.java  | 17 -
 4 files changed, 25 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 4ff5b1a..b7715ba 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.6
+ * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710)
  * Enhanced Compaction Logging (CASSANDRA-10805)
  * Make prepared statement cache size configurable (CASSANDRA-11555)
  * Integrated JMX authentication and authorization (CASSANDRA-10091)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/Config.java
--
diff --git a/src/java/org/apache/cassandra/config/Config.java 
b/src/java/org/apache/cassandra/config/Config.java
index 02635bf..466b791 100644
--- a/src/java/org/apache/cassandra/config/Config.java
+++ b/src/java/org/apache/cassandra/config/Config.java
@@ -242,7 +242,7 @@ public class Config
 
 private static boolean isClientMode = false;
 
-public Integer file_cache_size_in_mb = 512;
+public Integer file_cache_size_in_mb;
 
 public boolean buffer_pool_use_heap_if_exhausted = true;
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index d8acdb8..3d38646 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -1776,6 +1776,13 @@ public class DatabaseDescriptor
 
 public static int getFileCacheSizeInMB()
 {
+if (conf.file_cache_size_in_mb == null)
+{
+// In client mode the value is not set.
+assert Config.isClientMode();
+return 0;
+}
+
 return conf.file_cache_size_in_mb;
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/utils/memory/BufferPool.java
--
diff --git a/src/java/org/apache/cassandra/utils/memory/BufferPool.java 
b/src/java/org/apache/cassandra/utils/memory/BufferPool.java
index ad2404f..5cd0051 100644
--- a/src/java/org/apache/cassandra/utils/memory/BufferPool.java
+++ b/src/java/org/apache/cassandra/utils/memory/BufferPool.java
@@ -273,7 +273,22 @@ public class BufferPool
 }
 
 // allocate a large chunk
-        Chunk chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE));
+        Chunk chunk;
+        try
+        {
+            chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE));
+        }
+        catch (OutOfMemoryError oom)
+        {
+            noSpamLogger.error("Buffer pool failed to allocate chunk of {}, current size {} ({}). " +
+                               "Attempting to continue; buffers will be allocated in on-heap memory which can degrade performance. " +
+                               "Make sure direct memory size (-XX:MaxDirectMemorySize) is large enough to accommodate off-heap memtables and caches.",
+                               FBUtilities.prettyPrintMemory(MACRO_CHUNK_SIZE),
+                               FBUtilities.prettyPrintMemory(sizeInBytes()),
+                               oom.toString());
+            return false;
+        }
+
         chunk.acquire(null);
         macroChunks.add(chunk);
         for (int i = 0 ; i < MACRO_CHUNK_SIZE ; i += CHUNK_SIZE)



[4/5] cassandra git commit: Merge branch '3.6-retag' into cassandra-3.7

2016-05-09 Thread jake
Merge branch '3.6-retag' into cassandra-3.7


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a093e8ca
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a093e8ca
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a093e8ca

Branch: refs/heads/cassandra-3.7
Commit: a093e8caeec431bdc8ec31efa8b3c1aeb9067285
Parents: a8a3a73 31cab36
Author: T Jake Luciani 
Authored: Mon May 9 13:50:24 2016 -0400
Committer: T Jake Luciani 
Committed: Mon May 9 13:50:24 2016 -0400

--
 CHANGES.txt|  1 +
 src/java/org/apache/cassandra/config/Config.java   |  2 +-
 .../cassandra/config/DatabaseDescriptor.java   |  7 +++
 .../apache/cassandra/utils/memory/BufferPool.java  | 17 -
 4 files changed, 25 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a093e8ca/CHANGES.txt
--
diff --cc CHANGES.txt
index 3cee7ae,b7715ba..6e60bba
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,12 -1,5 +1,13 @@@
 +3.7
 +Merged from 3.0:
 + * Refactor Materialized View code (CASSANDRA-11475)
 + * Update Java Driver (CASSANDRA-11615)
 +Merged from 2.2:
 + * Prohibit Reversed Counter type as part of the PK (CASSANDRA-9395)
 + * cqlsh: correctly handle non-ascii chars in error messages (CASSANDRA-11626)
 +
  3.6
+  * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710)
   * Enhanced Compaction Logging (CASSANDRA-10805)
   * Make prepared statement cache size configurable (CASSANDRA-11555)
   * Integrated JMX authentication and authorization (CASSANDRA-10091)



[3/5] cassandra git commit: Merge branch '3.6-retag' into cassandra-3.7

2016-05-09 Thread jake
Merge branch '3.6-retag' into cassandra-3.7


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a093e8ca
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a093e8ca
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a093e8ca

Branch: refs/heads/trunk
Commit: a093e8caeec431bdc8ec31efa8b3c1aeb9067285
Parents: a8a3a73 31cab36
Author: T Jake Luciani 
Authored: Mon May 9 13:50:24 2016 -0400
Committer: T Jake Luciani 
Committed: Mon May 9 13:50:24 2016 -0400

--
 CHANGES.txt|  1 +
 src/java/org/apache/cassandra/config/Config.java   |  2 +-
 .../cassandra/config/DatabaseDescriptor.java   |  7 +++
 .../apache/cassandra/utils/memory/BufferPool.java  | 17 -
 4 files changed, 25 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a093e8ca/CHANGES.txt
--
diff --cc CHANGES.txt
index 3cee7ae,b7715ba..6e60bba
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,12 -1,5 +1,13 @@@
 +3.7
 +Merged from 3.0:
 + * Refactor Materialized View code (CASSANDRA-11475)
 + * Update Java Driver (CASSANDRA-11615)
 +Merged from 2.2:
 + * Prohibit Reversed Counter type as part of the PK (CASSANDRA-9395)
 + * cqlsh: correctly handle non-ascii chars in error messages (CASSANDRA-11626)
 +
  3.6
+  * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710)
   * Enhanced Compaction Logging (CASSANDRA-10805)
   * Make prepared statement cache size configurable (CASSANDRA-11555)
   * Integrated JMX authentication and authorization (CASSANDRA-10091)



cassandra git commit: Prevent direct memory OOM on buffer pool allocations

2016-05-09 Thread jake
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.6 5634cea37 -> 31cab36b1


Prevent direct memory OOM on buffer pool allocations

Patch by Branimir Lambov; reviewed by tjake for
(CASSANDRA-11710)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/31cab36b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/31cab36b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/31cab36b

Branch: refs/heads/cassandra-3.6
Commit: 31cab36b1800f2042623633445d8be944217d5a2
Parents: 5634cea
Author: Branimir Lambov 
Authored: Thu May 5 11:30:00 2016 +0300
Committer: T Jake Luciani 
Committed: Mon May 9 13:48:30 2016 -0400

--
 CHANGES.txt|  1 +
 src/java/org/apache/cassandra/config/Config.java   |  2 +-
 .../cassandra/config/DatabaseDescriptor.java   |  7 +++
 .../apache/cassandra/utils/memory/BufferPool.java  | 17 -
 4 files changed, 25 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 4ff5b1a..b7715ba 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.6
+ * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710)
  * Enhanced Compaction Logging (CASSANDRA-10805)
  * Make prepared statement cache size configurable (CASSANDRA-11555)
  * Integrated JMX authentication and authorization (CASSANDRA-10091)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/Config.java
--
diff --git a/src/java/org/apache/cassandra/config/Config.java 
b/src/java/org/apache/cassandra/config/Config.java
index 02635bf..466b791 100644
--- a/src/java/org/apache/cassandra/config/Config.java
+++ b/src/java/org/apache/cassandra/config/Config.java
@@ -242,7 +242,7 @@ public class Config
 
 private static boolean isClientMode = false;
 
-public Integer file_cache_size_in_mb = 512;
+public Integer file_cache_size_in_mb;
 
 public boolean buffer_pool_use_heap_if_exhausted = true;
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index d8acdb8..3d38646 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -1776,6 +1776,13 @@ public class DatabaseDescriptor
 
 public static int getFileCacheSizeInMB()
 {
+if (conf.file_cache_size_in_mb == null)
+{
+// In client mode the value is not set.
+assert Config.isClientMode();
+return 0;
+}
+
 return conf.file_cache_size_in_mb;
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/utils/memory/BufferPool.java
--
diff --git a/src/java/org/apache/cassandra/utils/memory/BufferPool.java 
b/src/java/org/apache/cassandra/utils/memory/BufferPool.java
index ad2404f..5cd0051 100644
--- a/src/java/org/apache/cassandra/utils/memory/BufferPool.java
+++ b/src/java/org/apache/cassandra/utils/memory/BufferPool.java
@@ -273,7 +273,22 @@ public class BufferPool
 }
 
 // allocate a large chunk
-        Chunk chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE));
+        Chunk chunk;
+        try
+        {
+            chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE));
+        }
+        catch (OutOfMemoryError oom)
+        {
+            noSpamLogger.error("Buffer pool failed to allocate chunk of {}, current size {} ({}). " +
+                               "Attempting to continue; buffers will be allocated in on-heap memory which can degrade performance. " +
+                               "Make sure direct memory size (-XX:MaxDirectMemorySize) is large enough to accommodate off-heap memtables and caches.",
+                               FBUtilities.prettyPrintMemory(MACRO_CHUNK_SIZE),
+                               FBUtilities.prettyPrintMemory(sizeInBytes()),
+                               oom.toString());
+            return false;
+        }
+
         chunk.acquire(null);
         macroChunks.add(chunk);
         for (int i = 0 ; i < MACRO_CHUNK_SIZE ; i += CHUNK_SIZE)



[cassandra] Git Push Summary

2016-05-09 Thread jake
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.6 [created] 5634cea37


cassandra git commit: Use re-initialised headers for ColumnIndex for pre-3.0 sstables

2016-05-09 Thread jake
Repository: cassandra
Updated Branches:
  refs/heads/trunk 0f0b2dfce -> 640072b09


Use re-initialised headers for ColumnIndex for pre-3.0 sstables

Patch by Alex Petrov; reviewed by tjake for CASSANDRA-11736


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/640072b0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/640072b0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/640072b0

Branch: refs/heads/trunk
Commit: 640072b093ac7040a28ca932034e905935357ead
Parents: 0f0b2df
Author: Alex Petrov 
Authored: Mon May 9 15:27:50 2016 +0200
Committer: T Jake Luciani 
Committed: Mon May 9 12:52:48 2016 -0400

--
 .../org/apache/cassandra/io/sstable/format/big/BigTableWriter.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/640072b0/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java
--
diff --git 
a/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java 
b/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java
index 44b1c3a..39dc889 100644
--- a/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java
@@ -87,7 +87,7 @@ public class BigTableWriter extends SSTableWriter
 }
 iwriter = new IndexWriter(keyCount, dataFile);
 
-        columnIndexWriter = new ColumnIndex(header, dataFile, descriptor.version, observers, getRowIndexEntrySerializer().indexInfoSerializer());
+        columnIndexWriter = new ColumnIndex(this.header, dataFile, descriptor.version, this.observers, getRowIndexEntrySerializer().indexInfoSerializer());
 }
 
 public void mark()



[jira] [Updated] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails

2016-05-09 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-11736:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed {{640072b093ac7040a28ca932034e905935357ead}}

> LegacySSTableTest::testStreamLegacyCqlTables fails
> --
>
> Key: CASSANDRA-11736
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11736
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Minor
> Fix For: 3.7
>
>
> [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/]
>  
> Error Message
> {code}
> org.apache.cassandra.streaming.StreamException: Stream failed
> {code}
> Stacktrace
> {code}
> java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>   at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145)
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>   at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
>   at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>   at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>   at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>   at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191)
>   at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429)
>   at 
> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639)
>   at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489)
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> I've run {{bisect}} against the last commits and (given that it fails 
> consistently) it started failing after [this 
> commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch

2016-05-09 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-11737:
--
Reviewer: Aleksey Yeschenko

> Add a way to disable severity in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-11737
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11737
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Jeremiah Jordan
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>
> I have now seen a few clusters where severity can outweigh latency in 
> DynamicEndpointSnitch, causing issues (a node that is completely overloaded 
> CPU-wise and has super-high latency will get selected for queries, even 
> though nodes with much lower latency exist, because those nodes have a higher 
> severity score).  There should be a way to disable the use of severity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch

2016-05-09 Thread Jeremiah Jordan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan updated CASSANDRA-11737:

Fix Version/s: 3.x
   3.0.x
   2.2.x
   2.1.x

> Add a way to disable severity in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-11737
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11737
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Jeremiah Jordan
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>
> I have now seen a few clusters where severity can outweigh latency in 
> DynamicEndpointSnitch, causing issues (a node that is completely overloaded 
> CPU-wise and has super-high latency will get selected for queries, even 
> though nodes with much lower latency exist, because those nodes have a higher 
> severity score).  There should be a way to disable the use of severity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11738) Re-think the use of Severity in the DynamicEndpointSnitch calculation

2016-05-09 Thread Jeremiah Jordan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan updated CASSANDRA-11738:

Fix Version/s: 3.x

> Re-think the use of Severity in the DynamicEndpointSnitch calculation
> -
>
> Key: CASSANDRA-11738
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11738
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
> Fix For: 3.x
>
>
> CASSANDRA-11737 was opened to allow completely disabling the use of severity 
> in the DynamicEndpointSnitch calculation, but that is a pretty big hammer.  
> There is probably something we can do to make better use of the score.
> The issue seems to be that severity is given equal weight with latency in the 
> current code, and that severity is based only on disk I/O.  If you have a 
> node that is CPU-bound on something (say, catching up on LCS compactions 
> because of bootstrap/repair/replace), the I/O wait can be low but the latency 
> to the node high.
> Some ideas I had are:
> 1. Allow a yaml parameter to tune how much impact the severity score has 
> in the calculation.
> 2. Take CPU load into account as well as I/O wait (this would probably help 
> in the cases where I have seen things go sideways).
> 3. Move the -D from CASSANDRA-11737 to being a yaml-level setting.
> 4. Go back to relying on latency alone and get rid of severity altogether.  
> Now that we have rapid read protection, maybe just using latency is enough, 
> as it can help where the predictive nature of I/O wait would have been useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11738) Re-think the use of Severity in the DynamicEndpointSnitch calculation

2016-05-09 Thread Jeremiah Jordan (JIRA)
Jeremiah Jordan created CASSANDRA-11738:
---

 Summary: Re-think the use of Severity in the DynamicEndpointSnitch 
calculation
 Key: CASSANDRA-11738
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11738
 Project: Cassandra
  Issue Type: Bug
Reporter: Jeremiah Jordan


CASSANDRA-11737 was opened to allow completely disabling the use of severity in 
the DynamicEndpointSnitch calculation, but that is a pretty big hammer.  There 
is probably something we can do to make better use of the score.

The issue seems to be that severity is given equal weight with latency in the 
current code, and that severity is based only on disk I/O.  If you have a node 
that is CPU-bound on something (say, catching up on LCS compactions because of 
bootstrap/repair/replace), the I/O wait can be low but the latency to the node 
high.

Some ideas I had are:
1. Allow a yaml parameter to tune how much impact the severity score has in 
the calculation (see the sketch after this list).
2. Take CPU load into account as well as I/O wait (this would probably help in 
the cases where I have seen things go sideways).
3. Move the -D from CASSANDRA-11737 to being a yaml-level setting.
4. Go back to relying on latency alone and get rid of severity altogether.  
Now that we have rapid read protection, maybe just using latency is enough, as 
it can help where the predictive nature of I/O wait would have been useful.
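
A sketch of idea 1, illustrative only (the real DynamicEndpointSnitch scoring 
differs, and the weight parameter is hypothetical):

{code:java}
// Blend latency and severity with a configurable weight instead of giving
// them equal influence in the snitch score. Illustration only.
public class ScoreSketch
{
    private final double severityWeight; // 0.0 ignores severity; 1.0 ~ equal weight

    public ScoreSketch(double severityWeight)
    {
        this.severityWeight = severityWeight;
    }

    public double score(double normalizedLatency, double severity)
    {
        return normalizedLatency + severityWeight * severity;
    }
}
{code}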



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch

2016-05-09 Thread Jeremiah Jordan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan updated CASSANDRA-11737:

Status: Patch Available  (was: Open)

> Add a way to disable severity in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-11737
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11737
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Jeremiah Jordan
>
> I have now seen a few clusters where severity can outweigh latency in 
> DynamicEndpointSnitch, causing issues (a node that is completely overloaded 
> CPU-wise and has super-high latency will get selected for queries, even 
> though nodes with much lower latency exist, because those nodes have a higher 
> severity score).  There should be a way to disable the use of severity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch

2016-05-09 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276554#comment-15276554
 ] 

Jeremiah Jordan edited comment on CASSANDRA-11737 at 5/9/16 4:06 PM:
-

https://github.com/JeremiahDJordan/cassandra/commits/CASSANDRA-11737-21

That merges up cleanly all the way to trunk.  Adds a -D option to disable using 
severity in DynamicEndpointSnitch calculations.

For the clusters where we were seeing massive latency degradation when a single 
node was under load, applying this patch and setting the -D to disable severity 
brought things back to normal.


was (Author: jjordan):
https://github.com/JeremiahDJordan/cassandra/commits/CASSANDRA-11737-21

That merges up cleanly all the way to trunk.  Adds a -D option to disable using 
severity in DynamicEndpointSnitch calculations.

> Add a way to disable severity in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-11737
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11737
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Jeremiah Jordan
>
> I have now seen a few clusters where severity can outweigh latency in 
> DynamicEndpointSnitch, causing issues (a node that is completely overloaded 
> CPU-wise and has super-high latency will get selected for queries, even 
> though nodes with much lower latency exist, because those nodes have a higher 
> severity score).  There should be a way to disable the use of severity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch

2016-05-09 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276554#comment-15276554
 ] 

Jeremiah Jordan edited comment on CASSANDRA-11737 at 5/9/16 4:05 PM:
-

https://github.com/JeremiahDJordan/cassandra/commits/CASSANDRA-11737-21

That merges up cleanly all the way to trunk.  Adds a -D option to disable using 
severity in DynamicEndpointSnitch calculations.


was (Author: jjordan):
https://github.com/JeremiahDJordan/cassandra/commit/47768377afe9aff93a7e3de8190bc3124c5cefe6

> Add a way to disable severity in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-11737
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11737
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Jeremiah Jordan
>
> I have now seen a few clusters where severity can outweigh latency in 
> DynamicEndpointSnitch, causing issues (a node that is completely overloaded 
> CPU-wise and has super-high latency will get selected for queries, even 
> though nodes with much lower latency exist, because those nodes have a higher 
> severity score).  There should be a way to disable the use of severity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch

2016-05-09 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276554#comment-15276554
 ] 

Jeremiah Jordan commented on CASSANDRA-11737:
-

https://github.com/JeremiahDJordan/cassandra/commit/47768377afe9aff93a7e3de8190bc3124c5cefe6

> Add a way to disable severity in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-11737
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11737
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Jeremiah Jordan
>
> I have now seen a few clusters where severity can outweigh latency in 
> DynamicEndpointSnitch, causing issues (a node that is completely overloaded 
> CPU-wise and has super-high latency will get selected for queries, even 
> though nodes with much lower latency exist, because those nodes have a higher 
> severity score).  There should be a way to disable the use of severity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch

2016-05-09 Thread Jeremiah Jordan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan reassigned CASSANDRA-11737:
---

Assignee: Jeremiah Jordan

> Add a way to disable severity in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-11737
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11737
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Jeremiah Jordan
>
> I have now seen a few clusters where severity can outweigh latency in 
> DynamicEndpointSnitch, causing issues (a node that is completely overloaded 
> CPU-wise and has super-high latency will get selected for queries, even 
> though nodes with much lower latency exist, because those nodes have a higher 
> severity score).  There should be a way to disable the use of severity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch

2016-05-09 Thread Jeremiah Jordan (JIRA)
Jeremiah Jordan created CASSANDRA-11737:
---

 Summary: Add a way to disable severity in DynamicEndpointSnitch
 Key: CASSANDRA-11737
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11737
 Project: Cassandra
  Issue Type: Bug
Reporter: Jeremiah Jordan


I have now seen a few clusters where severity can outweigh latency in 
DynamicEndpointSnitch, causing issues (a node that is completely overloaded 
CPU-wise and has super-high latency will get selected for queries, even though 
nodes with much lower latency exist, because those nodes have a higher severity 
score).  There should be a way to disable the use of severity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11491) Split repair job into tasks per table

2016-05-09 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276539#comment-15276539
 ] 

Paulo Motta commented on CASSANDRA-11491:
-

Increasing the priority of this, since it will allow running anti-compactions 
after each table is finished and releasing sstable references earlier, to avoid 
OOMing when running incremental repair on multiple tables (or at the keyspace 
level).

> Split repair job into tasks per table
> -
>
> Key: CASSANDRA-11491
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11491
> Project: Cassandra
>  Issue Type: Task
>  Components: Streaming and Messaging
>Reporter: Paulo Motta
>Priority: Minor
>
> We currently split a parent repair session into multiple repair sessions, one 
> per range. Each repair session is further split into multiple repair jobs, 
> one per table.
> As we move into an auto-repair world with CASSANDRA-11190, with repair 
> settings per table, it will probably simplify things if we reason about 
> repair sessions on a per-table basis.
> Besides simplifying the current code, this will make it easier to add more 
> advanced scheduling of repair tasks per table and other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11491) Split repair job into tasks per table

2016-05-09 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-11491:

Assignee: Paulo Motta
Priority: Major  (was: Minor)

> Split repair job into tasks per table
> -
>
> Key: CASSANDRA-11491
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11491
> Project: Cassandra
>  Issue Type: Task
>  Components: Streaming and Messaging
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>
> We currently split a parent repair session into multiple repair sessions, one 
> per range. Each repair session is further split into multiple repair jobs, 
> one per table.
> As we move into an auto-repair world with CASSANDRA-11190, with repair 
> settings per table, it will probably simplify things if we reason about 
> repair sessions on a per-table basis.
> Besides simplifying the current code, this will make it easier to add more 
> advanced scheduling of repair tasks per table and other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails

2016-05-09 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-11736:
---
Reviewer: T Jake Luciani

> LegacySSTableTest::testStreamLegacyCqlTables fails
> --
>
> Key: CASSANDRA-11736
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11736
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Minor
> Fix For: 3.7
>
>
> [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/]
>  
> Error Message
> {code}
> org.apache.cassandra.streaming.StreamException: Stream failed
> {code}
> Stacktrace
> {code}
> java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>   at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145)
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>   at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
>   at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>   at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>   at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>   at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191)
>   at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429)
>   at 
> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639)
>   at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489)
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> I've run {{bisect}} against the last commits and (given it fails constantly) it 
> started failing after [this 
> commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails

2016-05-09 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276484#comment-15276484
 ] 

T Jake Luciani commented on CASSANDRA-11736:


+1 will commit once CI passes. Thx

> LegacySSTableTest::testStreamLegacyCqlTables fails
> --
>
> Key: CASSANDRA-11736
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11736
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Minor
> Fix For: 3.7
>
>
> [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/]
>  
> Error Message
> {code}
> org.apache.cassandra.streaming.StreamException: Stream failed
> {code}
> Stacktrace
> {code}
> java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>   at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145)
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>   at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
>   at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>   at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>   at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>   at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191)
>   at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429)
>   at 
> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639)
>   at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489)
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> I've run {{bisect}} against the last commits and (given it fails constantly) it 
> started failing after [this 
> commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9613) Omit (de)serialization of state variable in UDAs

2016-05-09 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-9613:

Status: Patch Available  (was: Open)

> Omit (de)serialization of state variable in UDAs
> 
>
> Key: CASSANDRA-9613
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9613
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> Currently the result of each UDA's state function call is serialized and then 
> deserialized for the next state-function invocation and optionally final 
> function invocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11031) MultiTenant : support “ALLOW FILTERING" for First Partition Key

2016-05-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276448#comment-15276448
 ] 

ZhaoYang commented on CASSANDRA-11031:
--

I have updated the patch and dtest. Thank you.

> MultiTenant : support “ALLOW FILTERING" for First Partition Key
> ---
>
> Key: CASSANDRA-11031
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11031
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Minor
> Fix For: 3.x
>
> Attachments: CASSANDRA-11031-3.7.patch
>
>
> Currently, ALLOW FILTERING only works for secondary index columns or 
> clustering columns, and it's slow because Cassandra will read all the data from 
> SSTables on disk into memory to filter.
> But we can support ALLOW FILTERING on the partition key: as far as I know, 
> partition keys are kept in memory, so we can easily filter them and then read 
> only the required data from SSTables.
> This is similar to "SELECT * FROM table", which scans through the entire cluster.
> CREATE TABLE multi_tenant_table (
>   tenant_id text,
>   pk2 text,
>   c1 text,
>   c2 text,
>   v1 text,
>   v2 text,
>   PRIMARY KEY ((tenant_id,pk2),c1,c2)
> ) ;
> Select * from multi_tenant_table where tenant_id = 'datastax' allow filtering;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11542) Create a benchmark to compare HDFS and Cassandra bulk read times

2016-05-09 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11542:

Reviewer: T Jake Luciani

> Create a benchmark to compare HDFS and Cassandra bulk read times
> 
>
> Key: CASSANDRA-11542
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11542
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 3.x
>
> Attachments: jfr_recordings.zip, spark-load-perf-results-001.zip, 
> spark-load-perf-results-002.zip, spark-load-perf-results-003.zip
>
>
> I propose creating a benchmark for comparing Cassandra and HDFS bulk reading 
> performance. Simple Spark queries will be performed on data stored in HDFS or 
> Cassandra, and the entire duration will be measured. An example query would 
> be the max or min of a column or a count\(*\).
> This benchmark should allow determining the impact of:
> * partition size
> * number of clustering columns
> * number of value columns (cells)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11728) Incremental repair fails with vnodes+lcs+multi-dc

2016-05-09 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276376#comment-15276376
 ] 

Marcus Eriksson commented on CASSANDRA-11728:
-

Then you need to provide more details on how it fails.

> Incremental repair fails with vnodes+lcs+multi-dc
> -
>
> Key: CASSANDRA-11728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11728
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nick Bailey
>
> Produced on 2.1.12
> We are seeing incremental repair fail with an error regarding creating 
> multiple repair sessions on overlapping sstables. This is happening in the 
> following setup
> * 6 nodes
> * 2 Datacenters
> * Vnodes enabled
> * Leveled compaction on the relevant tables
> When STCS is used instead, we don't hit an issue. This is slightly related to 
> https://issues.apache.org/jira/browse/CASSANDRA-11461, except in this case 
> OpsCenter repair service is running all repairs sequentially. Let me know 
> what other information we can provide. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (CASSANDRA-11728) Incremental repair fails with vnodes+lcs+multi-dc

2016-05-09 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson reopened CASSANDRA-11728:
-

> Incremental repair fails with vnodes+lcs+multi-dc
> -
>
> Key: CASSANDRA-11728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11728
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nick Bailey
>
> Produced on 2.1.12
> We are seeing incremental repair fail with an error regarding creating 
> multiple repair sessions on overlapping sstables. This is happening in the 
> following setup
> * 6 nodes
> * 2 Datacenters
> * Vnodes enabled
> * Leveled compaction on the relevant tables
> When STCS is used instead, we don't hit an issue. This is slightly related to 
> https://issues.apache.org/jira/browse/CASSANDRA-11461, except in this case 
> OpsCenter repair service is running all repairs sequentially. Let me know 
> what other information we can provide. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails

2016-05-09 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-11736:

Status: Patch Available  (was: Open)

The constructor for {{ColumnIndex}} was moved from 
[here|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a#diff-59e5dd00b6986242a4d247b405808b0bL158]
 to 
[here|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a#diff-59e5dd00b6986242a4d247b405808b0bR90],
 and the re-initialised {{header}} field set in the {{SSTableWriter}} constructor 
was never used; the passed constructor argument was used instead. The same was 
happening with the {{observers}} field.
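
In other words, this is the classic shadowed-parameter pattern. A minimal, 
self-contained illustration (hypothetical names, not the actual {{SSTableWriter}} 
code):

{code}
// Hypothetical illustration of the bug pattern: the constructor stores a
// re-initialised value in the field, but the code then keeps using the raw
// constructor parameter, so the re-initialisation is silently lost.
import java.util.ArrayList;
import java.util.List;

public class ShadowedFieldBug
{
    private final List<String> observers;

    ShadowedFieldBug(List<String> observers)
    {
        this.observers = new ArrayList<>(observers); // field gets the re-initialised copy...
        register(observers);                         // ...bug: the parameter is used, not the field
    }

    private static void register(List<String> observers)
    {
        System.out.println("registered " + observers);
    }

    public static void main(String[] args)
    {
        new ShadowedFieldBug(List.of("a", "b"));
    }
}
{code}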

||[trunk|https://github.com/ifesdjeen/cassandra/tree/11736-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11736-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11736-trunk-dtest/]|


> LegacySSTableTest::testStreamLegacyCqlTables fails
> --
>
> Key: CASSANDRA-11736
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11736
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Minor
> Fix For: 3.7
>
>
> [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/]
>  
> Error Message
> {code}
> org.apache.cassandra.streaming.StreamException: Stream failed
> {code}
> Stacktrace
> {code}
> java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>   at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145)
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>   at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
>   at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>   at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>   at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>   at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191)
>   at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429)
>   at 
> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639)
>   at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489)
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> I've run {{bisect}} against the last commits and (given it fails constantly) it 
> started failing after [this 
> commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11728) Incremental repair fails with vnodes+lcs+multi-dc

2016-05-09 Thread Adam Hattrell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276368#comment-15276368
 ] 

Adam Hattrell commented on CASSANDRA-11728:
---

We still see it - even with the fix.

> Incremental repair fails with vnodes+lcs+multi-dc
> -
>
> Key: CASSANDRA-11728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11728
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nick Bailey
>
> Produced on 2.1.12
> We are seeing incremental repair fail with an error regarding creating 
> multiple repair sessions on overlapping sstables. This is happening in the 
> following setup
> * 6 nodes
> * 2 Datacenters
> * Vnodes enabled
> * Leveled compaction on the relevant tables
> When STCS is used instead, we don't hit an issue. This is slightly related to 
> https://issues.apache.org/jira/browse/CASSANDRA-11461, except in this case 
> OpsCenter repair service is running all repairs sequentially. Let me know 
> what other information we can provide. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9669) If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records

2016-05-09 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276365#comment-15276365
 ] 

Branimir Lambov commented on CASSANDRA-9669:


The test failures appear to be flakes -- mainly timeouts, and the failing tests 
pass when I run them locally. Patch is thus ready to commit.

> If sstable flushes complete out of order, on restart we can fail to replay 
> necessary commit log records
> ---
>
> Key: CASSANDRA-9669
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9669
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Benedict
>Priority: Critical
>  Labels: correctness
> Fix For: 2.2.x, 3.0.x, 3.x
>
>
> While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, 
> on restart we simply take the maximum replay position of any sstable on disk, 
> and ignore anything prior. 
> It is quite possible for there to be two flushes triggered for a given table, 
> and for the second to finish first by virtue of containing a much smaller 
> quantity of live data (or perhaps the disk is just under less pressure). If 
> we crash before the first sstable has been written, then on restart the data 
> it would have represented will disappear, since we will not replay the CL 
> records.
> This looks to be a bug present since time immemorial, and also seems pretty 
> serious.
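
To make the hazard concrete, a tiny worked example (positions are made up for 
illustration; this is not the actual replay-position code):

{code}
// Illustrative only. Flush F1 covers commit log positions [0, 100); flush F2
// covers [100, 150) and finishes first because it holds far less live data.
public class OutOfOrderFlushHazard
{
    public static void main(String[] args)
    {
        long f1UpperBound = 100;  // F1's sstable never reaches disk (crash)
        long f2UpperBound = 150;  // F2's sstable is on disk with this position

        long replayFrom = f2UpperBound; // "max position of any sstable on disk"
        // After the crash, records in [0, 100) exist only in the commit log,
        // yet replay starts at 150 and silently skips them.
        System.out.println("replay starts at " + replayFrom
                           + ", losing records in [0, " + f1UpperBound + ")");
    }
}
{code}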



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9669) If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records

2016-05-09 Thread Branimir Lambov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-9669:
---
Status: Ready to Commit  (was: Patch Available)

> If sstable flushes complete out of order, on restart we can fail to replay 
> necessary commit log records
> ---
>
> Key: CASSANDRA-9669
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9669
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Benedict
>Priority: Critical
>  Labels: correctness
> Fix For: 2.2.x, 3.0.x, 3.x
>
>
> While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, 
> on restart we simply take the maximum replay position of any sstable on disk, 
> and ignore anything prior. 
> It is quite possible for there to be two flushes triggered for a given table, 
> and for the second to finish first by virtue of containing a much smaller 
> quantity of live data (or perhaps the disk is just under less pressure). If 
> we crash before the first sstable has been written, then on restart the data 
> it would have represented will disappear, since we will not replay the CL 
> records.
> This looks to be a bug present since time immemorial, and also seems pretty 
> serious.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10292) java.lang.AssertionError: attempted to delete non-existing file CommitLog...

2016-05-09 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276357#comment-15276357
 ] 

Joshua McKenzie commented on CASSANDRA-10292:
-

The 2.2 fix of the bug is Windows-specific / nio-related, so it shouldn't be the 
issue here, as the environment is CentOS.

> java.lang.AssertionError: attempted to delete non-existing file CommitLog...
> 
>
> Key: CASSANDRA-10292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10292
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS Linux 7.1.1503, Cassandra 2.1.8 stable version, 6 
> nodes cluster
>Reporter: Dawid Szejnfeld
>Priority: Critical
>
> From time to time some nodes stop working due to an error in the logs like 
> this:
> INFO  [CompactionExecutor:2475] 2015-09-09 12:36:50,363 
> CompactionTask.java:274 - Compacted 4 sstables to 
> [/mnt/cassandra--storage-machine/data/system/compactions_in_progress-55080ab05d9c38
> 8690a4acb25fe1f77b/system-compactions_in_progress-ka-126,].  419 bytes to 42 
> (~10% of original) in 33ms = 0.001214MB/s.  4 total partitions merged to 1.  
> Partition merge counts were {2:2, }
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,166 
> ColumnFamilyStore.java:912 - Enqueuing flush of settings: 78364 (0%) on-heap, 
> 0 (0%) off-heap
> INFO  [MemtableFlushWriter:301] 2015-09-09 12:52:34,172 Memtable.java:347 - 
> Writing Memtable-settings@1126939979(0.113KiB serialized bytes, 1850 ops, 
> 0%/0% of on/off-heap limit)
> INFO  [MemtableFlushWriter:301] 2015-09-09 12:52:34,174 Memtable.java:382 - 
> Completed flushing 
> /mnt/cassandra--storage-machine/data/OpsCenter/settings-464866c04b1311e590698d1a9fd4ba8b/OpsCe
> nter-settings-tmp-ka-12-Data.db (0.000KiB) for commitlog position 
> ReplayPosition(segmentId=1441362636571, position=33554415)
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,194 StorageService.java:453 
> - Stopping gossiper
> WARN  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 StorageService.java:359 
> - Stopping gossip by operator request
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 Gossiper.java:1410 - 
> Announcing shutdown
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,195 StorageService.java:458 
> - Stopping RPC server
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,196 ThriftServer.java:142 - 
> Stop listening to thrift clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,204 StorageService.java:463 
> - Stopping native transport
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,422 Server.java:213 - Stop 
> listening for CQL clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,423 CommitLog.java:397 - 
> Failed managing commit log segments. Commit disk failure policy is stop; 
> terminating thread
> java.lang.AssertionError: attempted to delete non-existing file 
> CommitLog-4-1441362636316.log
> at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:126) 
> ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.delete(CommitLogSegment.java:343)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:418)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:413)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:152)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> [apache-cassandra-2.1.8.jar:2.1.8]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
> After I create the missing commit log file and restart the Cassandra service, 
> everything is OK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails

2016-05-09 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov reassigned CASSANDRA-11736:
---

Assignee: Alex Petrov

> LegacySSTableTest::testStreamLegacyCqlTables fails
> --
>
> Key: CASSANDRA-11736
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11736
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Minor
> Fix For: 3.7
>
>
> [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/]
>  
> Error Message
> {code}
> org.apache.cassandra.streaming.StreamException: Stream failed
> {code}
> Stacktrace
> {code}
> java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>   at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>   at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155)
>   at 
> org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145)
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>   at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>   at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
>   at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>   at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>   at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>   at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215)
>   at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191)
>   at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429)
>   at 
> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639)
>   at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489)
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> I've run {{bisect}} against the last commits and (given it fails constantly) it 
> started failing after [this 
> commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11604) select on table fails after changing user defined type in map

2016-05-09 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-11604:

Status: Patch Available  (was: Open)

This assertion was removed in 
[trunk|https://github.com/apache/cassandra/commit/677230df694752c7ecf6d5459eee60ad7cf45ecf#diff-bc19f192ef82fbca9abd27526054bb0fL254]
 (applying it in 3.5 has a similar effect).

From what I can tell, scrub doesn't fix that issue; a node restart alone, or a 
flush, has the same effect: during the node restart, the commit log will replay 
the mutation with the same schema as the table itself, and during the flush and 
subsequent reads all Cells will get the correct ColumnDefinition. That said, this 
assert doesn't change the behaviour, since ALTER statements only allow 
"backward-compatible" changes (after the schema change, it'll still be possible 
to work with the old version, too).

I've added a test for this particular edge case (updating a UDT within an 
inserted non-frozen map) on {{trunk}}, and removed the assert in {{3.0.x}} (along 
with adding the test):

||[3.0|https://github.com/ifesdjeen/cassandra/tree/11604-3.0]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-3.0-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-3.0-dtest/]|
||[trunk|https://github.com/ifesdjeen/cassandra/tree/11604-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-dtest/]|

{{LegacySSTableTest}} is failing locally on trunk, too, although it was failing 
before this commit (also, there were no code changes for trunk). The rest of the 
tests are passing locally.
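
For reference, a simplified sketch of the kind of check involved (strings stand 
in for column definitions; per the stack trace in the description, the real 
assert lived in {{ComplexColumnData.Builder#addCell}}):

{code}
// Simplified sketch: cells written before the ALTER TYPE carry the old UDT
// version in their column definition, so a strict equality assert against
// the post-ALTER definition trips during the merge, even though the old
// version is still readable ("backward-compatible"). Run with -ea.
public class StaleDefinitionAssert
{
    static void addCell(String builderColumn, String cellColumn)
    {
        assert cellColumn.equals(builderColumn) : cellColumn + " != " + builderColumn;
    }

    public static void main(String[] args)
    {
        addCell("v: map<text, tt{a, b}>", "v: map<text, tt{a}>"); // old cell vs new schema
    }
}
{code}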

> select on table fails after changing user defined type in map
> -
>
> Key: CASSANDRA-11604
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11604
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andreas Jaekle
>Assignee: Alex Petrov
> Fix For: 3.x
>
>
> in cassandra 3.5 i get the following exception when i run this cqls:
> {code}
> --DROP KEYSPACE bugtest ;
> CREATE KEYSPACE bugtest
>  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
> use bugtest;
> CREATE TYPE tt (
>   a boolean
> );
> create table t1 (
>   k text,
>   v map<text, frozen<tt>>,
>   PRIMARY KEY(k)
> );
> insert into t1 (k,v) values ('k2',{'mk':{a:false}});
> ALTER TYPE tt ADD b boolean;
> UPDATE t1 SET v['mk'] = { b:true } WHERE k = 'k2';
> select * from t1;  
> {code}
> the last select fails.
> {code}
> WARN  [SharedPool-Worker-5] 2016-04-19 14:18:49,885 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-5,5,main]: {}
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.db.rows.ComplexColumnData$Builder.addCell(ComplexColumnData.java:254)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:623)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> 

[jira] [Created] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails

2016-05-09 Thread Alex Petrov (JIRA)
Alex Petrov created CASSANDRA-11736:
---

 Summary: LegacySSTableTest::testStreamLegacyCqlTables fails
 Key: CASSANDRA-11736
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11736
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Alex Petrov
Priority: Minor
 Fix For: 3.7


[example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/]
 

Error Message
{code}
org.apache.cassandra.streaming.StreamException: Stream failed
{code}

Stacktrace
{code}
java.util.concurrent.ExecutionException: 
org.apache.cassandra.streaming.StreamException: Stream failed
at 
com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
at 
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
at 
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
at 
org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175)
at 
org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155)
at 
org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145)
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
at 
org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
at 
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
at 
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
at 
com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
at 
org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215)
at 
org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191)
at 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429)
at 
org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639)
at 
org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489)
at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276)
at java.lang.Thread.run(Thread.java:745)
{code}

I've run {{bisect}} against the last commits and (given it fails constantly) it 
started failing after [this 
commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11735) cassandra-env.sh doesn't test the correct java version

2016-05-09 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe resolved CASSANDRA-11735.
-
Resolution: Duplicate

> cassandra-env.sh doesn't test the correct java version
> --
>
> Key: CASSANDRA-11735
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11735
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu 14.04
> openjdk 7 patch >=100
>Reporter: Maxime Bugeia
>Priority: Minor
>
> With the latest patch of openjdk, all nodetool actions fail and display 
> "Cassandra 2.0 and later require Java 7u25 or later." because the 
> cassandra-env.sh java version test is broken.
> Line 102:
> if [ "$JVM_VERSION" \< "1.7" ] && [ "$JVM_PATCH_VERSION"  \< "25" ] ; then
> echo "Cassandra 2.0 and later require Java 7u25 or later."
> exit 1;
> fi
> The second test causes all Java patch versions >= 100 to be considered 
> inferior, because \< is a lexicographic string comparison ("100" sorts before 
> "25"). One correct syntax is "-lt" instead of "\<".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11670) Error while waiting on bootstrap to complete. Bootstrap will have to be restarted. Stream failed

2016-05-09 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276234#comment-15276234
 ] 

Alexander Heiß commented on CASSANDRA-11670:


We raised the *commitlog_segment_size_in_mb* to 128.
Now we get another error:
{quote}
ERROR 10:30:29 [Stream #9e733a20-15be-11e6-9bb1-31c0715c4db0] Streaming error 
occurred
java.io.IOException: CF 64aecb30-11f7-11e6-89d2-9d1dd801d7e2 was dropped during 
streaming
at 
org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:76)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:50)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:39)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
at 
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:59)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_11]
INFO  10:30:29 [Stream #9e733a20-15be-11e6-9bb1-31c0715c4db0] Session with 
/176.9.99.140 is complete
WARN  10:30:29 [Stream #9e733a20-15be-11e6-9bb1-31c0715c4db0] Stream failed
ERROR 10:30:29 Error while waiting on bootstrap to complete. Bootstrap will 
have to be restarted.
{quote}

This happens approximately 2 hours after the bootstrap starts (2 hours is the 
*streaming_socket_timeout_in_ms*; could that have something to do with the 
problem?)
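
For context, a small sketch of the arithmetic behind the earlier "too large" 
error (assuming, per the 3.0 defaults, that a single commit log mutation is 
capped at half a segment; the byte counts below are taken from that error 
message):

{code}
// Illustrative arithmetic only, not Cassandra code.
public class MaxMutationSize
{
    public static void main(String[] args)
    {
        long segmentBytes   = 64L * 1024 * 1024; // 64 MB segments
        long maxMutation    = segmentBytes / 2;  // 33554432, as in the log
        long actualMutation = 34974901L;         // the failing MV mutation

        System.out.println(actualMutation > maxMutation); // true -> rejected
        // with commitlog_segment_size_in_mb: 128 the cap doubles to 64 MB,
        // so the same mutation fits
    }
}
{code}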

> Error while waiting on bootstrap to complete. Bootstrap will have to be 
> restarted. Stream failed
> 
>
> Key: CASSANDRA-11670
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11670
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration, Streaming and Messaging
>Reporter: Anastasia Osintseva
>Assignee: Paulo Motta
> Fix For: 3.0.5
>
>
> I have a cluster with 2 DCs and 2 nodes in each DC. I wanted to add 1 node to 
> each DC. One node was added successfully after I had run scrub. 
> Now I'm trying to add a node to the other DC, but I get the error: 
> org.apache.cassandra.streaming.StreamException: Stream failed. 
> After scrubbing and repair I get the same error.  
> {noformat}
> ERROR [StreamReceiveTask:5] 2016-04-27 00:33:21,082 Keyspace.java:492 - 
> Unknown exception caught while attempting to update MaterializedView! 
> messages_dump.messages
> java.lang.IllegalArgumentException: Mutation of 34974901 bytes is too large 
> for the maxiumum size of 33554432
>   at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:264) 
> ~[apache-cassandra-3.0.5.jar:3.0.5]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:469) 
> [apache-cassandra-3.0.5.jar:3.0.5]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384) 
> [apache-cassandra-3.0.5.jar:3.0.5]
>   at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205) 
> [apache-cassandra-3.0.5.jar:3.0.5]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:217) 
> [apache-cassandra-3.0.5.jar:3.0.5]
>   at 
> org.apache.cassandra.batchlog.BatchlogManager.store(BatchlogManager.java:146) 
> ~[apache-cassandra-3.0.5.jar:3.0.5]
>   at 
> org.apache.cassandra.service.StorageProxy.mutateMV(StorageProxy.java:724) 
> ~[apache-cassandra-3.0.5.jar:3.0.5]
>   at 
> org.apache.cassandra.db.view.ViewManager.pushViewReplicaUpdates(ViewManager.java:149)
>  ~[apache-cassandra-3.0.5.jar:3.0.5]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:487) 
> [apache-cassandra-3.0.5.jar:3.0.5]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384) 
> [apache-cassandra-3.0.5.jar:3.0.5]
>   at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205) 
> [apache-cassandra-3.0.5.jar:3.0.5]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:217) 
> [apache-cassandra-3.0.5.jar:3.0.5]
>   at org.apache.cassandra.db.Mutation.applyUnsafe(Mutation.java:236) 
> [apache-cassandra-3.0.5.jar:3.0.5]
>   at 
> org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:169)
>  [apache-cassandra-3.0.5.jar:3.0.5]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_11]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_11]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_11]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  

[jira] [Comment Edited] (CASSANDRA-7826) support non-frozen, nested collections

2016-05-09 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276186#comment-15276186
 ] 

Alex Petrov edited comment on CASSANDRA-7826 at 5/9/16 11:23 AM:
-

After reviewing [https://issues.apache.org/jira/browse/CASSANDRA-7396] in more 
detail and finding common ground with that ticket, we need to allow 
{{SELECT}}, {{UPDATE}} and {{DELETE}} for individual fields of the nested 
collections (up to the deepest nesting level for maps, none for lists and 
sets), with syntax similar to 7396. Otherwise there's no big difference between 
frozen and non-frozen collections.

At this particular moment I'm not entirely certain about the slices of keys for 
maps. It seems that there are many more cases for fetching a particular key 
from the map by path than for slices.


was (Author: ifesdjeen):
After reviewing [https://issues.apache.org/jira/browse/CASSANDRA-7396] in more 
details and finding common grounds with that ticket, we need to allow 
{{SELECT}}ing, {{UPDATE}}ing and {{DELETE}}ing individual fields of the nested 
collections (up to the deepest nesting level for maps, none for lists and 
sets), with syntax similar to 7396. Otherwise there's no big difference between 
frozen and non-frozen collections.

At this particular moment I'm not entirely certain about the slices of keys for 
maps. It seems that there are many more cases for fetching a particular key 
from the map by path than for slices.

> support non-frozen, nested collections
> --
>
> Key: CASSANDRA-7826
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7826
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Tupshin Harper
>Assignee: Alex Petrov
>  Labels: ponies
> Fix For: 3.x
>
>
> The inability to nest collections is one of the bigger data modelling 
> limitations we have right now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11735) cassandra-env.sh doesn't test the correct java version

2016-05-09 Thread Maxime Bugeia (JIRA)
Maxime Bugeia created CASSANDRA-11735:
-

 Summary: cassandra-env.sh doesn't test the correct java version
 Key: CASSANDRA-11735
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11735
 Project: Cassandra
  Issue Type: Bug
 Environment: Ubuntu 14.04
openjdk 7 patch >=100
Reporter: Maxime Bugeia
Priority: Minor


With the latest patch of openjdk, all nodetool actions fail and display 
"Cassandra 2.0 and later require Java 7u25 or later." because the 
cassandra-env.sh java version test is broken.
Line 102:
if [ "$JVM_VERSION" \< "1.7" ] && [ "$JVM_PATCH_VERSION"  \< "25" ] ; then
echo "Cassandra 2.0 and later require Java 7u25 or later."
exit 1;
fi

The second test causes all Java patch versions >= 100 to be considered inferior, 
because \< is a lexicographic string comparison ("100" sorts before "25"). One 
correct syntax is "-lt" instead of "\<".
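
The comparison mistake is easy to demonstrate outside the shell; for example, in 
Java (illustrative only -- the shell's \< on strings is a lexicographic 
comparison, much like {{compareTo}}):

{code}
public class VersionCompare
{
    public static void main(String[] args)
    {
        // string (lexicographic) order: '1' < '2', so "100" sorts before "25"
        System.out.println("100".compareTo("25") < 0);                        // true
        // numeric order, which is what the shell's -lt operator performs
        System.out.println(Integer.parseInt("100") < Integer.parseInt("25")); // false
    }
}
{code}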



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7826) support non-frozen, nested collections

2016-05-09 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276186#comment-15276186
 ] 

Alex Petrov commented on CASSANDRA-7826:


After reviewing [https://issues.apache.org/jira/browse/CASSANDRA-7396] in more 
detail and finding common ground with that ticket, we need to allow 
{{SELECT}}ing, {{UPDATE}}ing and {{DELETE}}ing individual fields of the nested 
collections (up to the deepest nesting level for maps, none for lists and 
sets), with syntax similar to 7396. Otherwise there's no big difference between 
frozen and non-frozen collections.

At this particular moment I'm not entirely certain about the slices of keys for 
maps. It seems that there are many more cases for fetching a particular key 
from the map by path than for slices.

> support non-frozen, nested collections
> --
>
> Key: CASSANDRA-7826
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7826
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Tupshin Harper
>Assignee: Alex Petrov
>  Labels: ponies
> Fix For: 3.x
>
>
> The inability to nest collections is one of the bigger data modelling 
> limitations we have right now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7396) Allow selecting Map key, List index

2016-05-09 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276175#comment-15276175
 ] 

Alex Petrov commented on CASSANDRA-7396:


After talking with [~snazy], there's an idea to postpone the slice deletes until 
we have range tombstones supported for the Cells (which, as I understood, has to 
wait until the next version of the storage format). This will avoid 
read-before-write for the delete operations.

> Allow selecting Map key, List index
> ---
>
> Key: CASSANDRA-7396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7396
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
>  Labels: cql, docs-impacting
> Fix For: 3.x
>
> Attachments: 7396_unit_tests.txt
>
>
> Allow "SELECT map['key]" and "SELECT list[index]."  (Selecting a UDT subfield 
> is already supported.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10292) java.lang.AssertionError: attempted to delete non-existing file CommitLog...

2016-05-09 Thread Youcef HILEM (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276098#comment-15276098
 ] 

Youcef HILEM commented on CASSANDRA-10292:
--

Thank you.
The fix in version 2.1.12 is 
https://issues.apache.org/jira/browse/CASSANDRA-10377


> java.lang.AssertionError: attempted to delete non-existing file CommitLog...
> 
>
> Key: CASSANDRA-10292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10292
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS Linux 7.1.1503, Cassandra 2.1.8 stable version, 6 
> nodes cluster
>Reporter: Dawid Szejnfeld
>Priority: Critical
>
> From time to time some nodes stop working due to an error in the logs like 
> this:
> INFO  [CompactionExecutor:2475] 2015-09-09 12:36:50,363 
> CompactionTask.java:274 - Compacted 4 sstables to 
> [/mnt/cassandra--storage-machine/data/system/compactions_in_progress-55080ab05d9c38
> 8690a4acb25fe1f77b/system-compactions_in_progress-ka-126,].  419 bytes to 42 
> (~10% of original) in 33ms = 0.001214MB/s.  4 total partitions merged to 1.  
> Partition merge counts were {2:2, }
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,166 
> ColumnFamilyStore.java:912 - Enqueuing flush of settings: 78364 (0%) on-heap, 
> 0 (0%) off-heap
> INFO  [MemtableFlushWriter:301] 2015-09-09 12:52:34,172 Memtable.java:347 - 
> Writing Memtable-settings@1126939979(0.113KiB serialized bytes, 1850 ops, 
> 0%/0% of on/off-heap limit)
> INFO  [MemtableFlushWriter:301] 2015-09-09 12:52:34,174 Memtable.java:382 - 
> Completed flushing 
> /mnt/cassandra--storage-machine/data/OpsCenter/settings-464866c04b1311e590698d1a9fd4ba8b/OpsCe
> nter-settings-tmp-ka-12-Data.db (0.000KiB) for commitlog position 
> ReplayPosition(segmentId=1441362636571, position=33554415)
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,194 StorageService.java:453 
> - Stopping gossiper
> WARN  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 StorageService.java:359 
> - Stopping gossip by operator request
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 Gossiper.java:1410 - 
> Announcing shutdown
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,195 StorageService.java:458 
> - Stopping RPC server
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,196 ThriftServer.java:142 - 
> Stop listening to thrift clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,204 StorageService.java:463 
> - Stopping native transport
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,422 Server.java:213 - Stop 
> listening for CQL clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,423 CommitLog.java:397 - 
> Failed managing commit log segments. Commit disk failure policy is stop; 
> terminating thread
> java.lang.AssertionError: attempted to delete non-existing file 
> CommitLog-4-1441362636316.log
> at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:126) 
> ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.delete(CommitLogSegment.java:343)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:418)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:413)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:152)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> [apache-cassandra-2.1.8.jar:2.1.8]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
> After I create the missing commit log file and restart the Cassandra service, 
> everything is OK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11726) IndexOutOfBoundsException when selecting (distinct) row ids from counter table.

2016-05-09 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-11726:

Description: 
I have a simple table containing counters:

{code}
CREATE TABLE tablename (
object_id ascii,
counter_id ascii,
count counter,
PRIMARY KEY (object_id, counter_id)
) WITH CLUSTERING ORDER BY (counter_id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'enabled': 'false'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
{code}

Counters are often incremented/decremented; whole rows are queried, and sometimes deleted.

After some time I tried to query all object_ids, but it failed with:

{code}
cqlsh:woc> consistency quorum;
cqlsh:woc> select object_id from tablename;
ServerError: <... message="java.lang.IndexOutOfBoundsException">
{code}

select * from ..., select ... where ..., and updates work well.

With consistency ONE it works sometimes, so it seems something is broken on one 
server, but I tried to repair the table there and it did not help. 

Whole exception from server log:

{code}
java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_73]
at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) 
~[na:1.8.0_73]
at 
org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.context.CounterContext$ContextState.<init>(CounterContext.java:758)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.context.CounterContext$ContextState.wrap(CounterContext.java:765)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.context.CounterContext.merge(CounterContext.java:271) 
~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.Conflicts.mergeCounterValues(Conflicts.java:76) 
~[apache-cassandra-3.5.jar:3.5]
at org.apache.cassandra.db.rows.Cells.reconcile(Cells.java:143) 
~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:591)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.5.jar:3.5]
at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) 
~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:279)
 ~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:112) 
~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.transform.FilteredRows.isEmpty(FilteredRows.java:30) 
~[apache-cassandra-3.5.jar:3.5]
at 
org.apache.cassandra.db.transform.Filter.closeIfEmpty(Filter.java:49) 
~[apache-cassandra-3.5.jar:3.5]
at 

[jira] [Commented] (CASSANDRA-11726) IndexOutOfBoundsException when selecting (distinct) row ids from counter table.

2016-05-09 Thread Jaroslav Kamenik (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276003#comment-15276003
 ] 

Jaroslav Kamenik commented on CASSANDRA-11726:
--

Hi,
thank you for the advice; unfortunately it did not help :( 

> IndexOutOfBoundsException when selecting (distinct) row ids from counter 
> table.
> ---
>
> Key: CASSANDRA-11726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11726
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 3.5, cluster of 4 nodes.
>Reporter: Jaroslav Kamenik
>
> I have a simple table containing counters:
> CREATE TABLE tablename (
> object_id ascii,
> counter_id ascii,
> count counter,
> PRIMARY KEY (object_id, counter_id)
> ) WITH CLUSTERING ORDER BY (counter_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'enabled': 'false'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> Counters are often incremented/decremented; whole rows are queried, and sometimes deleted.
> After some time I tried to query all object_ids, but it failed with:
> cqlsh:woc> consistency quorum;
> cqlsh:woc> select object_id from tablename;
> ServerError: <... message="java.lang.IndexOutOfBoundsException">
> select * from ..., select ... where ..., and updates work well.
> With consistency ONE it works sometimes, so it seems something is broken on 
> one server, but I tried to repair the table there and it did not help. 
> Whole exception from server log:
> java.lang.IndexOutOfBoundsException: null
> at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_73]
> at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) 
> ~[na:1.8.0_73]
> at 
> org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext$ContextState.<init>(CounterContext.java:758)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext$ContextState.wrap(CounterContext.java:765)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext.merge(CounterContext.java:271) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.Conflicts.mergeCounterValues(Conflicts.java:76) 
> ~[apache-cassandra-3.5.jar:3.5]
> at org.apache.cassandra.db.rows.Cells.reconcile(Cells.java:143) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:591)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> 

[jira] [Commented] (CASSANDRA-10292) java.lang.AssertionError: attempted to delete non-existing file CommitLog...

2016-05-09 Thread Dawid Szejnfeld (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276000#comment-15276000
 ] 

Dawid Szejnfeld commented on CASSANDRA-10292:
-

I think this was also fixed in version 2.1.12, as you can see here 
(https://github.com/apache/cassandra/blob/cassandra-2.1.12/CHANGES.txt). So a 
simple update to version 2.1.14 should solve the problem, since you are 
currently on 2.1.9 and a major-version upgrade is not needed.
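
In the meantime, the workaround the reporter describes below (recreating the
missing segment and restarting) can be done by hand; a rough sketch, assuming
a package install with the default commitlog_directory on CentOS 7 - adjust
the path and the segment name to match your own log:

$ touch /var/lib/cassandra/commitlog/CommitLog-4-1441362636316.log
$ sudo systemctl restart cassandra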

> java.lang.AssertionError: attempted to delete non-existing file CommitLog...
> 
>
> Key: CASSANDRA-10292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10292
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS Linux 7.1.1503, Cassandra 2.1.8 stable version, 6 
> nodes cluster
>Reporter: Dawid Szejnfeld
>Priority: Critical
>
> From time to time some nodes stop working due to an error like this in the 
> logs:
> INFO  [CompactionExecutor:2475] 2015-09-09 12:36:50,363 
> CompactionTask.java:274 - Compacted 4 sstables to 
> [/mnt/cassandra--storage-machine/data/system/compactions_in_progress-55080ab05d9c38
> 8690a4acb25fe1f77b/system-compactions_in_progress-ka-126,].  419 bytes to 42 
> (~10% of original) in 33ms = 0.001214MB/s.  4 total partitions merged to 1.  
> Partition merge counts were {2:2, }
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,166 
> ColumnFamilyStore.java:912 - Enqueuing flush of settings: 78364 (0%) on-heap, 
> 0 (0%) off-heap
> INFO  [MemtableFlushWriter:301] 2015-09-09 12:52:34,172 Memtable.java:347 - 
> Writing Memtable-settings@1126939979(0.113KiB serialized bytes, 1850 ops, 
> 0%/0% of on/off-heap limit)
> INFO  [MemtableFlushWriter:301] 2015-09-09 12:52:34,174 Memtable.java:382 - 
> Completed flushing 
> /mnt/cassandra--storage-machine/data/OpsCenter/settings-464866c04b1311e590698d1a9fd4ba8b/OpsCe
> nter-settings-tmp-ka-12-Data.db (0.000KiB) for commitlog position 
> ReplayPosition(segmentId=1441362636571, position=33554415)
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,194 StorageService.java:453 
> - Stopping gossiper
> WARN  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 StorageService.java:359 
> - Stopping gossip by operator request
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 Gossiper.java:1410 - 
> Announcing shutdown
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,195 StorageService.java:458 
> - Stopping RPC server
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,196 ThriftServer.java:142 - 
> Stop listening to thrift clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,204 StorageService.java:463 
> - Stopping native transport
> INFO  [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,422 Server.java:213 - Stop 
> listening for CQL clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,423 CommitLog.java:397 - 
> Failed managing commit log segments. Commit disk failure policy is stop; 
> terminating thread
> java.lang.AssertionError: attempted to delete non-existing file 
> CommitLog-4-1441362636316.log
> at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:126) 
> ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.delete(CommitLogSegment.java:343)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:418)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:413)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:152)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> [apache-cassandra-2.1.8.jar:2.1.8]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
> After I create the missing commit log file and restart the cassandra 
> service, everything is OK again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11728) Incremental repair fails with vnodes+lcs+multi-dc

2016-05-09 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson resolved CASSANDRA-11728.
-
Resolution: Fixed

pretty sure this is a duplicate of CASSANDRA-10831 - could you reopen if it 
reproduces in 2.1.13+?
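
For anyone retesting on 2.1.13+: an incremental repair is kicked off per node
with something like the following (the keyspace name is an example; on 2.1,
-inc has to be combined with parallel repair):

$ nodetool repair -par -inc my_keyspace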

> Incremental repair fails with vnodes+lcs+multi-dc
> -
>
> Key: CASSANDRA-11728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11728
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nick Bailey
>
> Produced on 2.1.12
> We are seeing incremental repair fail with an error about multiple repair 
> sessions being created on overlapping sstables. This is happening in the 
> following setup:
> * 6 nodes
> * 2 Datacenters
> * Vnodes enabled
> * Leveled compaction on the relevant tables
> When STCS is used instead, we don't hit the issue. This is slightly related to 
> https://issues.apache.org/jira/browse/CASSANDRA-11461, except in this case the 
> OpsCenter repair service is running all repairs sequentially. Let me know 
> what other information we can provide. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)