[jira] [Created] (CASSANDRA-11741) Don't return data to the client when skipping a page
Boying Lu created CASSANDRA-11741: - Summary: Don't return data to the client when skipping a page Key: CASSANDRA-11741 URL: https://issues.apache.org/jira/browse/CASSANDRA-11741 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Boying Lu The DataStax Java driver supports 'paging', but it does not support skipping between pages: to go from page A to page B, the user has to step through every page in between. Since the user only needs the "PagingState" object to reach page B, significant bandwidth between server and client could be saved if the data of the pages between A and B were not returned to the client, especially when the page size is large. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
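The bandwidth argument in the request above can be sketched with a toy model. Everything here is hypothetical (`fetch_page`, `paging_state_for_page`, the 100-row pages) and only models the idea, not the driver's real API:

```python
ROWS = [f"row-{i}" for i in range(1000)]
PAGE_SIZE = 100

def fetch_page(state):
    """Model of a paged query: 'state' plays the role of the offset the
    server encodes in its PagingState; each call ships a full page of rows."""
    page = ROWS[state:state + PAGE_SIZE]
    next_state = state + PAGE_SIZE if state + PAGE_SIZE < len(ROWS) else None
    return page, next_state

def paging_state_for_page(n):
    """The proposed server-side skip: advance the paging state n pages
    without materialising or shipping the intermediate rows."""
    return n * PAGE_SIZE

# Today: reaching page 8 means downloading pages 0..7 first.
state, shipped = 0, 0
for _ in range(8):
    page, state = fetch_page(state)
    shipped += len(page)
assert shipped == 800          # 8 full pages crossed the wire

# Proposed: ask the server for the PagingState of page 8 directly.
page, _ = fetch_page(paging_state_for_page(8))
assert page[0] == "row-800"    # same page reached, only one page shipped
```

The larger the page size and the farther apart A and B are, the bigger the gap between the two `shipped` counts, which is the saving the ticket asks for.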
[jira] [Commented] (CASSANDRA-11645) (single) dtest failure in snapshot_test.TestArchiveCommitlog.test_archive_commitlog_with_active_commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277369#comment-15277369 ] Russ Hatch commented on CASSANDRA-11645: 20 runs locally do not repro. > (single) dtest failure in > snapshot_test.TestArchiveCommitlog.test_archive_commitlog_with_active_commitlog > - > > Key: CASSANDRA-11645 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11645 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Russ Hatch > Labels: dtest > > This was a singular but pretty recent failure, so thought it might be worth > digging into to see if it repros. > http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/211/testReport/snapshot_test/TestArchiveCommitlog/test_archive_commitlog_with_active_commitlog > Failed on CassCI build cassandra-2.1_dtest_jdk8 #211 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-11645) (single) dtest failure in snapshot_test.TestArchiveCommitlog.test_archive_commitlog_with_active_commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch reassigned CASSANDRA-11645: -- Assignee: Russ Hatch (was: DS Test Eng) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11668) dtest failure in upgrade_tests.upgrade_through_versions_test.ProtoV4Upgrade_3_2_UpTo_3_3_HEAD.rolling_upgrade_with_internode_ssl_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277279#comment-15277279 ] Russ Hatch edited comment on CASSANDRA-11668 at 5/9/16 11:13 PM: - a few runs locally pass fine. trying a bulk run here: http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/95/ was (Author: rhatch): a few runs locally pass fine. trying a bulk run here: http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/ > dtest failure in > upgrade_tests.upgrade_through_versions_test.ProtoV4Upgrade_3_2_UpTo_3_3_HEAD.rolling_upgrade_with_internode_ssl_test > - > > Key: CASSANDRA-11668 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11668 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Russ Hatch > Labels: dtest > > since this was on upgrade to 3.3 head, I doubt it's an actual problem > (assuming changes aren't actively happening there). Nevertheless, should take > a quick look and see if there's anything going on. > example failure: > http://cassci.datastax.com/job/upgrade_tests-all/39/testReport/upgrade_tests.upgrade_through_versions_test/ProtoV4Upgrade_3_2_UpTo_3_3_HEAD/rolling_upgrade_with_internode_ssl_test > Failed on CassCI build upgrade_tests-all #39 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11668) dtest failure in upgrade_tests.upgrade_through_versions_test.ProtoV4Upgrade_3_2_UpTo_3_3_HEAD.rolling_upgrade_with_internode_ssl_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277279#comment-15277279 ] Russ Hatch commented on CASSANDRA-11668: a few runs locally pass fine. trying a bulk run here: http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11740) Nodes about wrong membership view of the cluster
Dikang Gu created CASSANDRA-11740: - Summary: Nodes about wrong membership view of the cluster Key: CASSANDRA-11740 URL: https://issues.apache.org/jira/browse/CASSANDRA-11740 Project: Cassandra Issue Type: Bug Reporter: Dikang Gu Fix For: 2.2.x, 3.x We have a few hundred nodes across 3 data centers, and we are doing a few million writes per second into the cluster. The problem we found is that some nodes (>10) have a very wrong view of the cluster. For example, we have 3 data centers A, B and C. On the problem nodes, the output of 'nodetool status' shows that ~100 nodes are not in data center A, B, or C. Instead, it shows those nodes in DC1, rack r1, which is very wrong. As a result, the node will return wrong results to client requests. Datacenter: DC1 === Status=Up/Down / State=Normal/Leaving/Joining/Moving – Address Load Tokens Owns Host ID Rack UN 2401:db00:11:6134:face:0:1:0 509.52 GB 256 ? e24656ac-c3b2-4117-b933-a5b06852c993 r1 UN 2401:db00:11:b218:face:0:5:0 510.01 GB 256 ? 53da2104-b1b5-4fa5-a3dd-52c7557149f9 r1 UN 2401:db00:2130:5133:face:0:4d:0 459.75 GB 256 ? ef8311f0-f6b8-491c-904d-baa925cdd7c2 r1 We are using GossipingPropertyFileSnitch. Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11740) Nodes have wrong membership view of the cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-11740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dikang Gu updated CASSANDRA-11740: -- Summary: Nodes have wrong membership view of the cluster (was: Nodes about wrong membership view of the cluster) > Nodes have wrong membership view of the cluster > --- > > Key: CASSANDRA-11740 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11740 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > Fix For: 2.2.x, 3.x > > > We have a few hundreds nodes across 3 data centers, and we are doing a few > millions writes per second into the cluster. > The problem we found is that there are some nodes (>10) have very wrong view > of the cluster. > For example, we have 3 data centers A, B and C. On the problem nodes, in the > output of the 'nodetool status', it shows that ~100 nodes are not in data > center A, B, or C. Instead, it shows nodes are in DC1, and rack r1, which is > very wrong. And as a result, the node will return wrong results to client > requests. > Datacenter: DC1 > === > Status=Up/Down > / State=Normal/Leaving/Joining/Moving > – Address Load Tokens Owns Host ID Rack > UN 2401:db00:11:6134:face:0:1:0 509.52 GB 256 ? > e24656ac-c3b2-4117-b933-a5b06852c993 r1 > UN 2401:db00:11:b218:face:0:5:0 510.01 GB 256 ? > 53da2104-b1b5-4fa5-a3dd-52c7557149f9 r1 > UN 2401:db00:2130:5133:face:0:4d:0 459.75 GB 256 ? > ef8311f0-f6b8-491c-904d-baa925cdd7c2 r1 > We are using GossipingPropertyFileSnitch. > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
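A possibly relevant detail: DC1/r1 are the values in the `default=` line of the stock cassandra-topology.properties sample, and GossipingPropertyFileSnitch can fall back to that legacy file (for PropertyFileSnitch migration compatibility) for endpoints it has no gossip state for. The excerpt below is a sketch of the two files involved, not the reporter's actual configuration; the `dc=A` value is assumed from the data center names in the report:

```properties
# conf/cassandra-rackdc.properties -- what GossipingPropertyFileSnitch
# reads for the *local* node; every node declares its real DC and rack.
dc=A
rack=rack1

# conf/cassandra-topology.properties -- legacy PropertyFileSnitch file.
# If left in place, GossipingPropertyFileSnitch may fall back to it for
# endpoints with no gossip state; the stock sample ends with:
default=DC1:r1
```

If the fallback explanation holds, the ~100 misplaced nodes would be exactly the endpoints whose DC/rack gossip state the problem nodes never received.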
[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache
[ https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277113#comment-15277113 ] Jeremiah Jordan commented on CASSANDRA-11452: - [~blambov] I see this is resolved as fixed, but trunk is still using caffeine-2.2.6.jar where the commit in caffeine that talks about this discussion is in 2.2.7? Should we upgrade to 2.2.7? > Cache implementation using LIRS eviction for in-process page cache > -- > > Key: CASSANDRA-11452 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11452 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths >Reporter: Branimir Lambov >Assignee: Branimir Lambov > > Following up from CASSANDRA-5863, to make best use of caching and to avoid > having to explicitly marking compaction accesses as non-cacheable, we need a > cache implementation that uses an eviction algorithm that can better handle > non-recurring accesses. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11724) False Failure Detection in Big Cassandra Cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277100#comment-15277100 ] Jeffrey F. Lukman commented on CASSANDRA-11724: --- [~jeromatron] : okay, I will try this again and report the result later whether this config will cause a different result or not. For now, can you help me by confirming whether you also see the Workload-4 bug or not? The Workload-4 : running 512-nodes cluster with some data, then we decommissioned a node. In our place, we see a high numbers of wrong false failure detection. > False Failure Detection in Big Cassandra Cluster > > > Key: CASSANDRA-11724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jeffrey F. Lukman > Labels: gossip, node-failure > Attachments: Workload1.jpg, Workload2.jpg, Workload3.jpg, > Workload4.jpg, experiment-result.txt > > > We are running some testing on Cassandra v2.2.5 stable in a big cluster. The > setting in our testing is that each machine has 16-cores and runs 8 cassandra > instances, and our testing is 32, 64, 128, 256, and 512 instances of > Cassandra. We use the default number of vnodes for each instance which is > 256. The data and log directories are on in-memory tmpfs file system. > We run several types of workloads on this Cassandra cluster: > Workload1: Just start the cluster > Workload2: Start half of the cluster, wait until it gets into a stable > condition, and run another half of the cluster > Workload3: Start half of the cluster, wait until it gets into a stable > condition, load some data, and run another half of the cluster > Workload4: Start the cluster, wait until it gets into a stable condition, > load some data and decommission one node > For this testing, we measure the total numbers of false failure detection > inside the cluster. 
By false failure detection, we mean that, for example, > instance-1 marks instance-2 down, but instance-2 is not down. We dug > deeper into the root cause and found that instance-1 had not received any > heartbeat from instance-2 for some time because instance-2 was running a long > computation process. > Here I attach the graphs of each workload result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
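The "no heartbeat because of a long computation" root cause is consistent with how Cassandra's accrual failure detector behaves: suspicion (phi) grows with the time since the last heartbeat, and a peer is convicted once phi exceeds phi_convict_threshold (default 8). A minimal sketch, assuming the simplified exponential-interarrival form; the `phi` helper below is illustrative, not Cassandra's actual class:

```python
import math

PHI_CONVICT_THRESHOLD = 8.0  # Cassandra's default phi_convict_threshold

def phi(time_since_last_heartbeat: float, mean_interval: float) -> float:
    """Simplified accrual-failure-detector suspicion level.

    Assumes exponentially distributed heartbeat inter-arrival times,
    giving phi = (t / mean) * log10(e). A long pause on the peer
    (GC, heavy computation) lets t grow until phi crosses the
    conviction threshold even though the peer is alive.
    """
    return (time_since_last_heartbeat / mean_interval) * math.log10(math.e)

mean = 1.0  # heartbeats normally arrive about once a second
# Shortly after a heartbeat the peer looks healthy...
assert phi(1.0, mean) < PHI_CONVICT_THRESHOLD
# ...but a ~20s computation pause convicts it: a false failure detection.
assert phi(20.0, mean) > PHI_CONVICT_THRESHOLD
```

This also suggests why the effect scales with cluster size in the workloads above: more co-located instances per machine means more scheduling pauses, and each pause inflates t on every observer.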
[jira] [Commented] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time
[ https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277099#comment-15277099 ] Joel Knighton commented on CASSANDRA-11709: --- That sounds like a different issue for me - I'd recommend opening another issue with as much info about your set up (snitch, etc) as possible. > Lock contention when large number of dead nodes come back within short time > --- > > Key: CASSANDRA-11709 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11709 > Project: Cassandra > Issue Type: Improvement >Reporter: Dikang Gu >Assignee: Joel Knighton > Fix For: 2.2.x, 3.x > > > We have a few hundreds nodes across 3 data centers, and we are doing a few > millions writes per second into the cluster. > We were trying to simulate a data center failure, by disabling the gossip on > all the nodes in one data center. After ~20mins, I re-enabled the gossip on > those nodes, was doing 5 nodes in each batch, and sleep 5 seconds between the > batch. > After that, I saw the latency of read/write requests increased a lot, and > client requests started to timeout. > On the node, I can see there are huge number of pending tasks in GossipStage. 
> = > 2016-05-02_23:55:08.99515 WARN 23:55:08 Gossip stage has 36337 pending > tasks; skipping status check (no nodes will be marked down) > 2016-05-02_23:55:09.36009 INFO 23:55:09 Node > /2401:db00:2020:717a:face:0:41:0 state jump to normal > 2016-05-02_23:55:09.99057 INFO 23:55:09 Node > /2401:db00:2020:717a:face:0:43:0 state jump to normal > 2016-05-02_23:55:10.09742 WARN 23:55:10 Gossip stage has 36421 pending > tasks; skipping status check (no nodes will be marked down) > 2016-05-02_23:55:10.91860 INFO 23:55:10 Node > /2401:db00:2020:717a:face:0:45:0 state jump to normal > 2016-05-02_23:55:11.20100 WARN 23:55:11 Gossip stage has 36558 pending > tasks; skipping status check (no nodes will be marked down) > 2016-05-02_23:55:11.57893 INFO 23:55:11 Node > /2401:db00:2030:612a:face:0:49:0 state jump to normal > 2016-05-02_23:55:12.23405 INFO 23:55:12 Node /2401:db00:2020:7189:face:0:7:0 > state jump to normal > > And I took jstack of the node, I found the read/write threads are blocked by > a lock, > read thread == > "Thrift:7994" daemon prio=10 tid=0x7fde91080800 nid=0x5255 waiting for > monitor entry [0x7fde6f8a1000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.cassandra.locator.TokenMetadata.cachedOnlyTokenMap(TokenMetadata.java:546) > - waiting to lock <0x7fe4faef4398> (a > org.apache.cassandra.locator.TokenMetadata) > at > org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:111) > at > org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:3155) > at > org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1526) > at > org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1521) > at > org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:155) > at > org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1328) > at > 
org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1270) > at > org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1195) > at > org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:118) > at > org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:275) > at > org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:457) > at > org.apache.cassandra.thrift.CassandraServer.getSliceInternal(CassandraServer.java:346) > at > org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:325) > at > org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3659) > at > org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3643) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > = writer === >
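The blocked frames above all wait on the TokenMetadata monitor inside cachedOnlyTokenMap. The contention pattern can be sketched as a cached snapshot that every topology change invalidates; the class and method names below are illustrative, not Cassandra's code. When gossip floods in (tens of thousands of pending GossipStage tasks, each a state change), the cache is invalidated faster than readers can reuse it, so nearly every read rebuilds the map under the lock:

```python
import threading
from copy import deepcopy

class TokenMetadataLike:
    """Sketch of a cached-snapshot read path: readers share an immutable
    cached copy of the token map; every topology update invalidates it,
    forcing the next reader to rebuild the copy while holding the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tokens = {}        # mutable, authoritative state
        self._cached = None      # immutable snapshot handed to readers

    def update_token(self, token, endpoint):
        with self._lock:
            self._tokens[token] = endpoint
            self._cached = None  # invalidate: next read must rebuild

    def cached_token_map(self):
        with self._lock:         # read/write threads contend here
            if self._cached is None:
                self._cached = deepcopy(self._tokens)  # expensive rebuild
            return self._cached

tm = TokenMetadataLike()
tm.update_token(42, "10.0.0.1")
snap = tm.cached_token_map()
assert tm.cached_token_map() is snap      # cache hit: snapshot reused
tm.update_token(43, "10.0.0.2")           # gossip update invalidates...
assert tm.cached_token_map() is not snap  # ...so the next read rebuilds
```

Under steady state the rebuild is rare and reads are cheap cache hits; under a gossip storm the invalidation rate dominates and the monitor becomes the bottleneck the jstack shows.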
[jira] [Commented] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time
[ https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277090#comment-15277090 ] Dikang Gu commented on CASSANDRA-11709: --- [~jkni], cool, thanks! One more question, I also found a serious problem recently, that in "nodetool status", bunch of nodes are not shown in correct region. For example, we have three regions A, B and C, I found that there are almost 100 hundreds nodes are not shown in any of those regions, they are shown as in DC1, and Rack r1, which I think complete broke the replication, and return incorrect data to client requests. Do you think they are the same issue, or I'd better open another jira? Datacenter: DC1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens OwnsHost ID Rack UN 2401:db00:11:6134:face:0:1:0 509.52 GB 256 ? e24656ac-c3b2-4117-b933-a5b06852c993 r1 UN 2401:db00:11:b218:face:0:5:0 510.01 GB 256 ? 53da2104-b1b5-4fa5-a3dd-52c7557149f9 r1 UN 2401:db00:2130:5133:face:0:4d:0 459.75 GB 256 ? ef8311f0-f6b8-491c-904d-baa925cdd7c2 r1
[jira] [Comment Edited] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time
[ https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277090#comment-15277090 ] Dikang Gu edited comment on CASSANDRA-11709 at 5/9/16 9:31 PM: --- [~jkni], cool, thanks! One more question, I also found a serious problem recently, that in "nodetool status", bunch of nodes are not shown in correct region. For example, we have three regions A, B and C, I found that there are almost 100 nodes are not shown in any of those regions, they are shown as in DC1, and Rack r1, which I think complete broke the replication, and return incorrect data to client requests. Do you think they are the same issue, or I'd better open another jira? Datacenter: DC1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens OwnsHost ID Rack UN 2401:db00:11:6134:face:0:1:0 509.52 GB 256 ? e24656ac-c3b2-4117-b933-a5b06852c993 r1 UN 2401:db00:11:b218:face:0:5:0 510.01 GB 256 ? 53da2104-b1b5-4fa5-a3dd-52c7557149f9 r1 UN 2401:db00:2130:5133:face:0:4d:0 459.75 GB 256 ? ef8311f0-f6b8-491c-904d-baa925cdd7c2 r1 was (Author: dikanggu): [~jkni], cool, thanks! One more question, I also found a serious problem recently, that in "nodetool status", bunch of nodes are not shown in correct region. For example, we have three regions A, B and C, I found that there are almost 100 hundreds nodes are not shown in any of those regions, they are shown as in DC1, and Rack r1, which I think complete broke the replication, and return incorrect data to client requests. Do you think they are the same issue, or I'd better open another jira? Datacenter: DC1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens OwnsHost ID Rack UN 2401:db00:11:6134:face:0:1:0 509.52 GB 256 ? e24656ac-c3b2-4117-b933-a5b06852c993 r1 UN 2401:db00:11:b218:face:0:5:0 510.01 GB 256 ? 53da2104-b1b5-4fa5-a3dd-52c7557149f9 r1 UN 2401:db00:2130:5133:face:0:4d:0 459.75 GB 256 ? 
ef8311f0-f6b8-491c-904d-baa925cdd7c2 r1
[jira] [Commented] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time
[ https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277075#comment-15277075 ] Joel Knighton commented on CASSANDRA-11709: --- I've just started really looking at this now - I'll post an update regarding the strategy to fix this once I have one.
[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data
[ https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277068#comment-15277068 ] Brandon Williams commented on CASSANDRA-8523: - [~pauloricardomg] assigning to you, please do continue working on this. > Writes should be sent to a replacement node while it is streaming in data > - > > Key: CASSANDRA-8523 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8523 > Project: Cassandra > Issue Type: Improvement >Reporter: Richard Wagner >Assignee: Paulo Motta > Fix For: 2.1.x > > > In our operations, we make heavy use of replace_address (or > replace_address_first_boot) in order to replace broken nodes. We now realize > that writes are not sent to the replacement nodes while they are in hibernate > state and streaming in data. This runs counter to what our expectations were, > especially since we know that writes ARE sent to nodes when they are > bootstrapped into the ring. > It seems like cassandra should arrange to send writes to a node that is in > the process of replacing another node, just like it does for a nodes that are > bootstraping. I hesitate to phrase this as "we should send writes to a node > in hibernate" because the concept of hibernate may be useful in other > contexts, as per CASSANDRA-8336. Maybe a new state is needed here? > Among other things, the fact that we don't get writes during this period > makes subsequent repairs more expensive, proportional to the number of writes > that we miss (and depending on the amount of data that needs to be streamed > during replacement and the time it may take to rebuild secondary indexes, we > could miss many many hours worth of writes). It also leaves us more exposed > to consistency violations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11709) Lock contention when large number of dead nodes come back within short time
[ https://issues.apache.org/jira/browse/CASSANDRA-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277066#comment-15277066 ] Dikang Gu commented on CASSANDRA-11709: --- [~jkni], can you please share a bit about how are you going to fix this? Thanks! > Lock contention when large number of dead nodes come back within short time > --- > > Key: CASSANDRA-11709 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11709 > Project: Cassandra > Issue Type: Improvement >Reporter: Dikang Gu >Assignee: Joel Knighton > Fix For: 2.2.x, 3.x > > > We have a few hundreds nodes across 3 data centers, and we are doing a few > millions writes per second into the cluster. > We were trying to simulate a data center failure, by disabling the gossip on > all the nodes in one data center. After ~20mins, I re-enabled the gossip on > those nodes, was doing 5 nodes in each batch, and sleep 5 seconds between the > batch. > After that, I saw the latency of read/write requests increased a lot, and > client requests started to timeout. > On the node, I can see there are huge number of pending tasks in GossipStage. 
> = > 2016-05-02_23:55:08.99515 WARN 23:55:08 Gossip stage has 36337 pending > tasks; skipping status check (no nodes will be marked down) > 2016-05-02_23:55:09.36009 INFO 23:55:09 Node > /2401:db00:2020:717a:face:0:41:0 state jump to normal > 2016-05-02_23:55:09.99057 INFO 23:55:09 Node > /2401:db00:2020:717a:face:0:43:0 state jump to normal > 2016-05-02_23:55:10.09742 WARN 23:55:10 Gossip stage has 36421 pending > tasks; skipping status check (no nodes will be marked down) > 2016-05-02_23:55:10.91860 INFO 23:55:10 Node > /2401:db00:2020:717a:face:0:45:0 state jump to normal > 2016-05-02_23:55:11.20100 WARN 23:55:11 Gossip stage has 36558 pending > tasks; skipping status check (no nodes will be marked down) > 2016-05-02_23:55:11.57893 INFO 23:55:11 Node > /2401:db00:2030:612a:face:0:49:0 state jump to normal > 2016-05-02_23:55:12.23405 INFO 23:55:12 Node /2401:db00:2020:7189:face:0:7:0 > state jump to normal > > And I took jstack of the node, I found the read/write threads are blocked by > a lock, > read thread == > "Thrift:7994" daemon prio=10 tid=0x7fde91080800 nid=0x5255 waiting for > monitor entry [0x7fde6f8a1000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.cassandra.locator.TokenMetadata.cachedOnlyTokenMap(TokenMetadata.java:546) > - waiting to lock <0x7fe4faef4398> (a > org.apache.cassandra.locator.TokenMetadata) > at > org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:111) > at > org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:3155) > at > org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1526) > at > org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1521) > at > org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:155) > at > org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1328) > at > 
org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1270) > at > org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1195) > at > org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:118) > at > org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:275) > at > org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:457) > at > org.apache.cassandra.thrift.CassandraServer.getSliceInternal(CassandraServer.java:346) > at > org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:325) > at > org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3659) > at > org.apache.cassandra.thrift.Cassandra$Processor$get_slice.getResult(Cassandra.java:3643) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > = writer === > "Thrift:7668" daemon prio=10 tid=0x7fde90d91000 nid=0x50e9 waiting for >
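For illustration only (this is not Cassandra's actual implementation; class and method names here are hypothetical), the contention pattern in the jstack above can be sketched as a cached-snapshot map: readers normally take a lock-free fast path, but every gossip state change invalidates the cache, so a flood of "state jump to normal" events sends all reader threads onto the synchronized rebuild path at once.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of a cached token map. Writers invalidate the snapshot;
// the next reader rebuilds it under the monitor. When invalidations arrive
// faster than rebuilds complete, readers pile up blocked on the monitor,
// which matches the BLOCKED Thrift threads in the jstack above.
class CachedTokenMap
{
    private final Map<Long, String> tokenToEndpoint = new HashMap<>();
    private final AtomicReference<Map<Long, String>> cached = new AtomicReference<>();

    public synchronized void update(long token, String endpoint)
    {
        tokenToEndpoint.put(token, endpoint);
        cached.set(null); // invalidate; the next reader rebuilds the snapshot
    }

    public Map<Long, String> cachedOnlyTokenMap()
    {
        Map<Long, String> snapshot = cached.get();
        if (snapshot != null)
            return snapshot; // fast path: no lock taken

        synchronized (this) // slow path: this is where readers contend
        {
            snapshot = cached.get();
            if (snapshot == null)
            {
                snapshot = Collections.unmodifiableMap(new HashMap<>(tokenToEndpoint));
                cached.set(snapshot);
            }
            return snapshot;
        }
    }
}
```

Under a gossip storm, reducing how often the cache is invalidated (or rebuilding it off the hot path) is the kind of change that relieves this contention.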
[jira] [Updated] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data
[ https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-8523: Assignee: Paulo Motta (was: Brandon Williams) > Writes should be sent to a replacement node while it is streaming in data > - > > Key: CASSANDRA-8523 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8523 > Project: Cassandra > Issue Type: Improvement >Reporter: Richard Wagner >Assignee: Paulo Motta > Fix For: 2.1.x > > > In our operations, we make heavy use of replace_address (or > replace_address_first_boot) in order to replace broken nodes. We now realize > that writes are not sent to the replacement nodes while they are in hibernate > state and streaming in data. This runs counter to what our expectations were, > especially since we know that writes ARE sent to nodes when they are > bootstrapped into the ring. > It seems like cassandra should arrange to send writes to a node that is in > the process of replacing another node, just like it does for a nodes that are > bootstraping. I hesitate to phrase this as "we should send writes to a node > in hibernate" because the concept of hibernate may be useful in other > contexts, as per CASSANDRA-8336. Maybe a new state is needed here? > Among other things, the fact that we don't get writes during this period > makes subsequent repairs more expensive, proportional to the number of writes > that we miss (and depending on the amount of data that needs to be streamed > during replacement and the time it may take to rebuild secondary indexes, we > could miss many many hours worth of writes). It also leaves us more exposed > to consistency violations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11606) Upgrade from 2.1.9 to 3.0.5 Fails with AssertionError
[ https://issues.apache.org/jira/browse/CASSANDRA-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276996#comment-15276996 ] Anthony Verslues commented on CASSANDRA-11606: -- Sorry for the late response, I have attached the schema of the table. > Upgrade from 2.1.9 to 3.0.5 Fails with AssertionError > - > > Key: CASSANDRA-11606 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11606 > Project: Cassandra > Issue Type: Bug > Environment: Fedora 20, Oracle Java 8, Apache Cassandra 2.1.9 -> 3.0.5 >Reporter: Anthony Verslues > Fix For: 3.0.x > > Attachments: sample.txt > > > I get this error while upgrading sstables. I got the same error when > upgrading to 3.0.2 and 3.0.4. > error: null > -- StackTrace -- > java.lang.AssertionError > at > org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1167) > at > org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1142) > at > org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:444) > at > org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:423) > at > org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:289) > at > org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.readStaticRow(SSTableSimpleIterator.java:133) > at > org.apache.cassandra.io.sstable.SSTableIdentityIterator.(SSTableIdentityIterator.java:57) > at > org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator$1.initializeIterator(BigTableScanner.java:334) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.isReverseOrder(LazilyInitializedUnfilteredRowIterator.java:65) > at > 
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:109) > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:100) > at > org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:442) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.hasNext(UnfilteredPartitionIterators.java:150) > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72) > at > org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226) > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:177) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) > at > org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:416) > at > org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:313) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11734) Enable partition component index for SASI
[ https://issues.apache.org/jira/browse/CASSANDRA-11734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-11734: Reviewer: Pavel Yaskevich > Enable partition component index for SASI > - > > Key: CASSANDRA-11734 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11734 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: DOAN DuyHai >Assignee: DOAN DuyHai > Labels: doc-impacting, sasi, secondaryIndex > Fix For: 3.8 > > Attachments: patch.txt > > > Enable partition component index for SASI > For the given schema: > {code:sql} > CREATE TABLE test.comp ( > pk1 int, > pk2 text, > val text, > PRIMARY KEY ((pk1, pk2)) > ); > CREATE CUSTOM INDEX comp_val_idx ON test.comp (val) USING > 'org.apache.cassandra.index.sasi.SASIIndex'; > CREATE CUSTOM INDEX comp_pk2_idx ON test.comp (pk2) USING > 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'mode': 'PREFIX', > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', > 'case_sensitive': 'false'}; > CREATE CUSTOM INDEX comp_pk1_idx ON test.comp (pk1) USING > 'org.apache.cassandra.index.sasi.SASIIndex'; > {code} > The following queries are possible: > {code:sql} > SELECT * FROM test.comp WHERE pk1=1; > SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5; > SELECT * FROM test.comp WHERE pk1=1 AND val='xxx' ALLOW FILTERING; > SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND val='xxx' ALLOW FILTERING; > SELECT * FROM test.comp WHERE pk2='some text'; > SELECT * FROM test.comp WHERE pk2 LIKE 'prefix%'; > SELECT * FROM test.comp WHERE pk2='some text' AND val='xxx' ALLOW FILTERING; > SELECT * FROM test.comp WHERE pk2 LIKE 'prefix%' AND val='xxx' ALLOW > FILTERING; > //Without using SASI > SELECT * FROM test.comp WHERE pk1 = 1 AND pk2='some text'; > SELECT * FROM test.comp WHERE pk1 IN(1,2,3) AND pk2='some text'; > SELECT * FROM test.comp WHERE pk1 = 1 AND pk2 IN ('text1','text2'); > SELECT * FROM test.comp WHERE pk1 
IN(1,2,3) AND pk2 IN ('text1','text2'); > {code} > However, the following queries *are not possible* > {code:sql} > SELECT * FROM test.comp WHERE pk1=1 AND pk2 LIKE 'prefix%'; > SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND pk2 = 'some text'; > SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND pk2 LIKE 'prefix%'; > {code} > All of them are throwing the following exception > {noformat} > ava.lang.UnsupportedOperationException: null > at > org.apache.cassandra.cql3.restrictions.SingleColumnRestriction$LikeRestriction.appendTo(SingleColumnRestriction.java:715) > ~[main/:na] > at > org.apache.cassandra.cql3.restrictions.PartitionKeySingleRestrictionSet.values(PartitionKeySingleRestrictionSet.java:86) > ~[main/:na] > at > org.apache.cassandra.cql3.restrictions.StatementRestrictions.getPartitionKeys(StatementRestrictions.java:585) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:473) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.getQuery(SelectStatement.java:265) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:230) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:79) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:239) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:224) > ~[main/:na] > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115) > ~[main/:na] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507) > [main/:na] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401) > [main/:na] > at > 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.36.Final.jar:4.0.36.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292) > [netty-all-4.0.36.Final.jar:4.0.36.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:32) > [netty-all-4.0.36.Final.jar:4.0.36.Final] > at > io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:283) > [netty-all-4.0.36.Final.jar:4.0.36.Final] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_45] > at >
[jira] [Updated] (CASSANDRA-11604) select on table fails after changing user defined type in map
[ https://issues.apache.org/jira/browse/CASSANDRA-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-11604: Reviewer: Joel Knighton > select on table fails after changing user defined type in map > - > > Key: CASSANDRA-11604 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11604 > Project: Cassandra > Issue Type: Bug >Reporter: Andreas Jaekle >Assignee: Alex Petrov > Fix For: 3.x > > > in Cassandra 3.5 I get the following exception when I run this CQL: > {code} > --DROP KEYSPACE bugtest ; > CREATE KEYSPACE bugtest > WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }; > use bugtest; > CREATE TYPE tt ( > a boolean > ); > create table t1 ( > k text, > v map<text, frozen<tt>>, > PRIMARY KEY(k) > ); > insert into t1 (k,v) values ('k2',{'mk':{a:false}}); > ALTER TYPE tt ADD b boolean; > UPDATE t1 SET v['mk'] = { b:true } WHERE k = 'k2'; > select * from t1; > {code} > the last select fails. > {code} > WARN [SharedPool-Worker-5] 2016-04-19 14:18:49,885 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-5,5,main]: {} > java.lang.AssertionError: null > at > org.apache.cassandra.db.rows.ComplexColumnData$Builder.addCell(ComplexColumnData.java:254) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:623) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.5.jar:3.5] > at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) > 
~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:279) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:112) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64) > 
~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:289) > ~[apache-cassandra-3.5.jar:3.5] > at >
[jira] [Updated] (CASSANDRA-11606) Upgrade from 2.1.9 to 3.0.5 Fails with AssertionError
[ https://issues.apache.org/jira/browse/CASSANDRA-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Verslues updated CASSANDRA-11606: - Attachment: sample.txt > Upgrade from 2.1.9 to 3.0.5 Fails with AssertionError > - > > Key: CASSANDRA-11606 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11606 > Project: Cassandra > Issue Type: Bug > Environment: Fedora 20, Oracle Java 8, Apache Cassandra 2.1.9 -> 3.0.5 >Reporter: Anthony Verslues > Fix For: 3.0.x > > Attachments: sample.txt > > > I get this error while upgrading sstables. I got the same error when > upgrading to 3.0.2 and 3.0.4. > error: null > -- StackTrace -- > java.lang.AssertionError > at > org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1167) > at > org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1142) > at > org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:444) > at > org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:423) > at > org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:289) > at > org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.readStaticRow(SSTableSimpleIterator.java:133) > at > org.apache.cassandra.io.sstable.SSTableIdentityIterator.(SSTableIdentityIterator.java:57) > at > org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator$1.initializeIterator(BigTableScanner.java:334) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.isReverseOrder(LazilyInitializedUnfilteredRowIterator.java:65) > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:109) 
> at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$1.reduce(UnfilteredPartitionIterators.java:100) > at > org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:442) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.hasNext(UnfilteredPartitionIterators.java:150) > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72) > at > org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226) > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:177) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) > at > org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:416) > at > org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:313) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9613) Omit (de)serialization of state variable in UDAs
[ https://issues.apache.org/jira/browse/CASSANDRA-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-9613: --- Reviewer: Tyler Hobbs > Omit (de)serialization of state variable in UDAs > > > Key: CASSANDRA-9613 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9613 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Minor > Fix For: 3.x > > > Currently the result of each UDA's state function call is serialized and then > deserialized for the next state-function invocation and optionally final > function invocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11739) Cache key references might cause OOM on incremental repair
[ https://issues.apache.org/jira/browse/CASSANDRA-11739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276896#comment-15276896 ] Marcus Eriksson commented on CASSANDRA-11739: - sounds good we should probably bite the bullet and do CASSANDRA-8858 at some point as well > Cache key references might cause OOM on incremental repair > -- > > Key: CASSANDRA-11739 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11739 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Paulo Motta > Attachments: heapdump.png > > > We keep {{SSTableReader}} references for the duration of the repair to > anti-compact later, and their tidier keep references to cache keys to be > invalidated which are only cleaned up by GC after repair is finished. These > cache keys can accumulate while repair is being executed leading to OOM for > large tables/keyspaces. > Heap dump attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11734) Enable partition component index for SASI
[ https://issues.apache.org/jira/browse/CASSANDRA-11734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276894#comment-15276894 ] Pavel Yaskevich commented on CASSANDRA-11734: - Thanks for taking a stab at this, [~doanduyhai]! Given the nature of the changes, it looks like we will have to postpone this until I'm done with the QueryPlan porting (CASSANDRA-10765), which will make it more sane to have indexed restrictions on partitions with(out) ranges. From the patch I see a couple of things right away: the CFMetaData.getLiveIndices() approach you mentioned runs up against the fact that some of the queries don't allow index usage at all, and there is (currently) no way to check for that from inside SingleColumnRestrictions; checking {{QueryController#hasIndexFor(ColumnDefinition)}} on every pass of the result-checking logic is very inefficient; and instead of using DecoratedKey separately, we might be better off providing the {{Operation.satisfiedBy}} methods with an {{UnfilteredRowIterator}} and letting them iterate it if needed, instead of involving {{QueryPlan}}. 
So I would rather have this after CASSANDRA-10765, looks like it would make everybody's life a bit easier :) > Enable partition component index for SASI > - > > Key: CASSANDRA-11734 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11734 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: DOAN DuyHai >Assignee: DOAN DuyHai > Labels: doc-impacting, sasi, secondaryIndex > Fix For: 3.8 > > Attachments: patch.txt > > > Enable partition component index for SASI > For the given schema: > {code:sql} > CREATE TABLE test.comp ( > pk1 int, > pk2 text, > val text, > PRIMARY KEY ((pk1, pk2)) > ); > CREATE CUSTOM INDEX comp_val_idx ON test.comp (val) USING > 'org.apache.cassandra.index.sasi.SASIIndex'; > CREATE CUSTOM INDEX comp_pk2_idx ON test.comp (pk2) USING > 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'mode': 'PREFIX', > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', > 'case_sensitive': 'false'}; > CREATE CUSTOM INDEX comp_pk1_idx ON test.comp (pk1) USING > 'org.apache.cassandra.index.sasi.SASIIndex'; > {code} > The following queries are possible: > {code:sql} > SELECT * FROM test.comp WHERE pk1=1; > SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5; > SELECT * FROM test.comp WHERE pk1=1 AND val='xxx' ALLOW FILTERING; > SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND val='xxx' ALLOW FILTERING; > SELECT * FROM test.comp WHERE pk2='some text'; > SELECT * FROM test.comp WHERE pk2 LIKE 'prefix%'; > SELECT * FROM test.comp WHERE pk2='some text' AND val='xxx' ALLOW FILTERING; > SELECT * FROM test.comp WHERE pk2 LIKE 'prefix%' AND val='xxx' ALLOW > FILTERING; > //Without using SASI > SELECT * FROM test.comp WHERE pk1 = 1 AND pk2='some text'; > SELECT * FROM test.comp WHERE pk1 IN(1,2,3) AND pk2='some text'; > SELECT * FROM test.comp WHERE pk1 = 1 AND pk2 IN ('text1','text2'); > SELECT * FROM test.comp WHERE pk1 IN(1,2,3) AND pk2 IN ('text1','text2'); > {code} > However, the following queries 
*are not possible* > {code:sql} > SELECT * FROM test.comp WHERE pk1=1 AND pk2 LIKE 'prefix%'; > SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND pk2 = 'some text'; > SELECT * FROM test.comp WHERE pk1>=1 AND pk1<=5 AND pk2 LIKE 'prefix%'; > {code} > All of them are throwing the following exception > {noformat} > ava.lang.UnsupportedOperationException: null > at > org.apache.cassandra.cql3.restrictions.SingleColumnRestriction$LikeRestriction.appendTo(SingleColumnRestriction.java:715) > ~[main/:na] > at > org.apache.cassandra.cql3.restrictions.PartitionKeySingleRestrictionSet.values(PartitionKeySingleRestrictionSet.java:86) > ~[main/:na] > at > org.apache.cassandra.cql3.restrictions.StatementRestrictions.getPartitionKeys(StatementRestrictions.java:585) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:473) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.getQuery(SelectStatement.java:265) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:230) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:79) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:239) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:224) > ~[main/:na] > at >
[jira] [Commented] (CASSANDRA-11739) Cache key references might cause OOM on incremental repair
[ https://issues.apache.org/jira/browse/CASSANDRA-11739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276886#comment-15276886 ] Paulo Motta commented on CASSANDRA-11739: - The idea is to track only the SSTable data component filenames in the parent repair session, rather than {{SSTableReader}} references. When doing anti-compaction, get references to the existing sstables from the CFS. WDYT [~krummas]? > Cache key references might cause OOM on incremental repair > -- > > Key: CASSANDRA-11739 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11739 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Paulo Motta > Attachments: heapdump.png > > > We keep {{SSTableReader}} references for the duration of the repair to > anti-compact later, and their tidier keep references to cache keys to be > invalidated which are only cleaned up by GC after repair is finished. These > cache keys can accumulate while repair is being executed leading to OOM for > large tables/keyspaces. > Heap dump attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
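The proposal in the comment above can be sketched in miniature (all names here are hypothetical and stand in for Cassandra internals; this is not the actual patch): the repair session records plain filenames, so no {{SSTableReader}} (and hence no cache-key tidier) is pinned for the duration of the repair, and the readers are re-resolved against the live set only when anti-compaction starts.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: track data-component filenames instead of reader
// references. Filenames here stand in for SSTableReader lookups via the
// column family store's live set at anti-compaction time.
class RepairSessionSketch
{
    private final Set<String> sstableFilenames = new HashSet<>();

    public void trackForAnticompaction(String dataComponentFilename)
    {
        // Only a string is retained; the reader (and its cache keys)
        // stays collectible while the repair runs.
        sstableFilenames.add(dataComponentFilename);
    }

    // At anti-compaction time, keep whichever tracked sstables still
    // exist in the live set (some may have been compacted away).
    public List<String> resolveLive(Set<String> liveSSTables)
    {
        return sstableFilenames.stream()
                               .filter(liveSSTables::contains)
                               .sorted()
                               .collect(Collectors.toList());
    }
}
```

The trade-off this illustrates: sstables compacted away during the repair simply drop out of the anti-compaction set instead of being held alive by the session.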
[jira] [Created] (CASSANDRA-11739) Cache key references might cause OOM on incremental repair
Paulo Motta created CASSANDRA-11739: --- Summary: Cache key references might cause OOM on incremental repair Key: CASSANDRA-11739 URL: https://issues.apache.org/jira/browse/CASSANDRA-11739 Project: Cassandra Issue Type: Bug Reporter: Paulo Motta Assignee: Paulo Motta Attachments: heapdump.png We keep {{SSTableReader}} references for the duration of the repair to anti-compact later, and their tidier keep references to cache keys to be invalidated which are only cleaned up by GC after repair is finished. These cache keys can accumulate while repair is being executed leading to OOM for large tables/keyspaces. Heap dump attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11724) False Failure Detection in Big Cassandra Cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276804#comment-15276804 ] Jeremy Hanna commented on CASSANDRA-11724: -- I suppose I should just say, you should set auto_bootstrap=false in your cassandra.yaml and you wouldn't need to do the two minute intervals since this is a fresh cluster. > False Failure Detection in Big Cassandra Cluster > > > Key: CASSANDRA-11724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jeffrey F. Lukman > Labels: gossip, node-failure > Attachments: Workload1.jpg, Workload2.jpg, Workload3.jpg, > Workload4.jpg, experiment-result.txt > > > We are running some testing on Cassandra v2.2.5 stable in a big cluster. The > setting in our testing is that each machine has 16-cores and runs 8 cassandra > instances, and our testing is 32, 64, 128, 256, and 512 instances of > Cassandra. We use the default number of vnodes for each instance which is > 256. The data and log directories are on in-memory tmpfs file system. > We run several types of workloads on this Cassandra cluster: > Workload1: Just start the cluster > Workload2: Start half of the cluster, wait until it gets into a stable > condition, and run another half of the cluster > Workload3: Start half of the cluster, wait until it gets into a stable > condition, load some data, and run another half of the cluster > Workload4: Start the cluster, wait until it gets into a stable condition, > load some data and decommission one node > For this testing, we measure the total numbers of false failure detection > inside the cluster. By false failure detection, we mean that, for example, > instance-1 marks the instance-2 down, but the instance-2 is not down. We dig > deeper into the root cause and find out that instance-1 has not received any > heartbeat after some time from instance-2 because the instance-2 run a long > computation process. 
> Here I attach the graphs of each workload result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
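The false detections described above match the phi-accrual idea used by gossip failure detectors: suspicion of a peer grows with the time since its last heartbeat, so a peer that goes silent during a long computation eventually gets convicted even though its process is alive. Below is a minimal, hypothetical sketch of that idea only; it is not Cassandra's actual {{FailureDetector}} code, and the 8.0 threshold and scaling constant are illustrative assumptions.

```java
// Hypothetical sketch of phi-accrual-style failure detection: suspicion (phi)
// grows with the time since the last heartbeat relative to the mean heartbeat
// interval. Not Cassandra's actual implementation; constants are illustrative.
public class PhiAccrualSketch
{
    // Scale the silent period by the mean interval; the 1/ln(10) factor
    // expresses phi in decimal orders of improbability, as in the phi-accrual paper.
    public static double phi(double msSinceLastHeartbeat, double meanIntervalMs)
    {
        return (msSinceLastHeartbeat / meanIntervalMs) * (1.0 / Math.log(10.0));
    }

    public static boolean convict(double phi, double threshold)
    {
        return phi > threshold;
    }

    public static void main(String[] args)
    {
        double mean = 1000.0; // heartbeats roughly once per second
        // A short pause stays below an assumed threshold of 8...
        System.out.println(convict(phi(8000, mean), 8.0));
        // ...but a long computation with no heartbeats gets the peer convicted,
        // even though its process never died -- a "false" detection.
        System.out.println(convict(phi(20000, mean), 8.0));
    }
}
```

This also shows why heavily loaded instances sharing a 16-core machine produce false detections: anything that delays heartbeat processing inflates the silent period without the peer being down.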
[jira] [Updated] (CASSANDRA-11725) Check for unnecessary JMX port setting in env vars at startup
[ https://issues.apache.org/jira/browse/CASSANDRA-11725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-11725: Assignee: Sam Tunnicliffe Reviewer: T Jake Luciani Status: Patch Available (was: Open) I've pushed a branch which adds a {{StartupCheck}} that warns if {{com.sun.management.jmxremote.port}} is set. As this issue also exposed the fact that a number of clients were relying on passing the JMX config directly via system properties, rather than using {{cassandra-env.sh}}, I've made startup more permissive to emulate previous behaviour. So, if the property is present, when C* comes to init the JMX server it will log an additional warning and skip the setup. The additional warning is because at some point we should remove this compatibility mode and go back to failing startup if {{jmxremote.port}} is set directly, but the {{StartupCheck}} should remain. ||branch||testall||dtest|| |[11725-3.6|https://github.com/beobal/cassandra/tree/11725-3.6]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11725-3.6-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11725-3.6-dtest]| > Check for unnecessary JMX port setting in env vars at startup > - > > Key: CASSANDRA-11725 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11725 > Project: Cassandra > Issue Type: Improvement > Components: Lifecycle >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Minor > Labels: lhf > Fix For: 3.x > > > Since CASSANDRA-10091, C* expects to always be in control of initializing its > JMX connector server. However, if {{com.sun.management.jmxremote.port}} is > set when the JVM is started, the bootstrap agent takes over and sets up the > server before any C* code runs. Because C* is then unable to bind the server > it creates to the specified port, startup is halted and the root cause is > somewhat unclear. 
> We should add a check at startup so a more informative message can be > provided. This would test for the presence of the system property, which would > differentiate this case from one where some other process is already bound to the > port.
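The shape of such a startup check can be sketched as follows. This is a hypothetical stand-in, not the actual {{StartupCheck}} from the patch; the class name and warning text are invented.

```java
import java.util.Properties;

// Hypothetical sketch of a startup check that detects the JMX port being set
// directly on the JVM (which hands JMX setup to the bootstrap agent before any
// C* code runs). Not the actual StartupCheck from the patch.
public class JmxPortCheckSketch
{
    static final String JMX_PORT_PROPERTY = "com.sun.management.jmxremote.port";

    // True when the property is present, i.e. the JVM's bootstrap agent will
    // already have bound the JMX connector server to that port.
    public static boolean jmxPortSetDirectly(Properties systemProperties)
    {
        return systemProperties.getProperty(JMX_PORT_PROPERTY) != null;
    }

    public static void execute(Properties systemProperties)
    {
        if (jmxPortSetDirectly(systemProperties))
            System.err.println("WARN: " + JMX_PORT_PROPERTY + " is set directly; "
                               + "JMX should be configured via cassandra-env.sh instead");
    }

    public static void main(String[] args)
    {
        Properties props = new Properties();
        props.setProperty(JMX_PORT_PROPERTY, "7199");
        execute(props); // in compatibility mode: warn and continue rather than fail
    }
}
```

Taking a {{Properties}} argument (instead of reading {{System.getProperties()}} directly) keeps the check trivially testable; the real check would naturally consult the live system properties.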
[jira] [Updated] (CASSANDRA-11710) Cassandra dies with OOM when running stress
[ https://issues.apache.org/jira/browse/CASSANDRA-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-11710: --- Resolution: Fixed Reviewer: T Jake Luciani (was: Marcus Eriksson) Status: Resolved (was: Patch Available) +1 committed in {{31cab36b1800f2042623633445d8be944217d5a2}} > Cassandra dies with OOM when running stress > --- > > Key: CASSANDRA-11710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11710 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Branimir Lambov > Fix For: 3.6 > > > Running stress on trunk dies with OOM after about 3.5M ops: > {code} > ERROR [CompactionExecutor:1] 2016-05-04 15:01:31,231 > JVMStabilityInspector.java:137 - JVM state determined to be unstable. > Exiting forcefully due to: > java.lang.OutOfMemoryError: Direct buffer memory > at java.nio.Bits.reserveMemory(Bits.java:693) ~[na:1.8.0_91] > at java.nio.DirectByteBuffer.(DirectByteBuffer.java:123) > ~[na:1.8.0_91] > at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) > ~[na:1.8.0_91] > at > org.apache.cassandra.utils.memory.BufferPool.allocateDirectAligned(BufferPool.java:519) > ~[main/:na] > at > org.apache.cassandra.utils.memory.BufferPool.access$600(BufferPool.java:46) > ~[main/:na] > at > org.apache.cassandra.utils.memory.BufferPool$GlobalPool.allocateMoreChunks(BufferPool.java:276) > ~[main/:na] > at > org.apache.cassandra.utils.memory.BufferPool$GlobalPool.get(BufferPool.java:249) > ~[main/:na] > at > org.apache.cassandra.utils.memory.BufferPool$LocalPool.addChunkFromGlobalPool(BufferPool.java:338) > ~[main/:na] > at > org.apache.cassandra.utils.memory.BufferPool$LocalPool.get(BufferPool.java:381) > ~[main/:na] > at > org.apache.cassandra.utils.memory.BufferPool.maybeTakeFromPool(BufferPool.java:142) > ~[main/:na] > at > org.apache.cassandra.utils.memory.BufferPool.takeFromPool(BufferPool.java:114) > ~[main/:na] > at > 
org.apache.cassandra.utils.memory.BufferPool.get(BufferPool.java:84) > ~[main/:na] > at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:135) > ~[main/:na] > at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:19) > ~[main/:na] > at > com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache.lambda$new$0(BoundedLocalCache.java:2949) > ~[caffeine-2.2.6.jar:na] > at > com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$15(BoundedLocalCache.java:1807) > ~[caffeine-2.2.6.jar:na] > at > java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853) > ~[na:1.8.0_91] > at > com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:1805) > ~[caffeine-2.2.6.jar:na] > at > com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1788) > ~[caffeine-2.2.6.jar:na] > at > com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97) > ~[caffeine-2.2.6.jar:na] > at > com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66) > ~[caffeine-2.2.6.jar:na] > at > org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:215) > ~[main/:na] > at > org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:193) > ~[main/:na] > at > org.apache.cassandra.io.util.LimitingRebufferer.rebuffer(LimitingRebufferer.java:34) > ~[main/:na] > at > org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:78) > ~[main/:na] > at > org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:72) > ~[main/:na] > at > org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88) > ~[main/:na] > at > org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66) > ~[main/:na] > at > 
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) > ~[main/:na] > at > org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:400) > ~[main/:na] > at > org.apache.cassandra.utils.ByteBufferUtil.readWithVIntLength(ByteBufferUtil.java:338) > ~[main/:na] > at > org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:414) > ~[main/:na] > at > org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:243) > ~[main/:na] > at > org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:473) >
[5/5] cassandra git commit: Merge branch 'cassandra-3.7' into trunk
Merge branch 'cassandra-3.7' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/653d0bff Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/653d0bff Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/653d0bff Branch: refs/heads/trunk Commit: 653d0bffcf02e698ca727bb5c151dfea19202eb5 Parents: 640072b a093e8c Author: T Jake LucianiAuthored: Mon May 9 13:51:52 2016 -0400 Committer: T Jake Luciani Committed: Mon May 9 13:51:52 2016 -0400 -- CHANGES.txt| 1 + src/java/org/apache/cassandra/config/Config.java | 2 +- .../cassandra/config/DatabaseDescriptor.java | 7 +++ .../apache/cassandra/utils/memory/BufferPool.java | 17 - 4 files changed, 25 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/653d0bff/CHANGES.txt -- diff --cc CHANGES.txt index d9f1688,6e60bba..731bfc8 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -12,8 -6,8 +12,9 @@@ Merged from 2.2 * Prohibit Reversed Counter type as part of the PK (CASSANDRA-9395) * cqlsh: correctly handle non-ascii chars in error messages (CASSANDRA-11626) + 3.6 + * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710) * Enhanced Compaction Logging (CASSANDRA-10805) * Make prepared statement cache size configurable (CASSANDRA-11555) * Integrated JMX authentication and authorization (CASSANDRA-10091) http://git-wip-us.apache.org/repos/asf/cassandra/blob/653d0bff/src/java/org/apache/cassandra/utils/memory/BufferPool.java --
[1/5] cassandra git commit: Prevent direct memory OOM on buffer pool allocations
Repository: cassandra Updated Branches: refs/heads/cassandra-3.7 a8a3a7338 -> a093e8cae refs/heads/trunk 640072b09 -> 653d0bffc Prevent direct memory OOM on buffer pool allocations Patch by Branimir Lambov; reviewed by tjake for (CASSANDRA-11710) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/31cab36b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/31cab36b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/31cab36b Branch: refs/heads/cassandra-3.7 Commit: 31cab36b1800f2042623633445d8be944217d5a2 Parents: 5634cea Author: Branimir LambovAuthored: Thu May 5 11:30:00 2016 +0300 Committer: T Jake Luciani Committed: Mon May 9 13:48:30 2016 -0400 -- CHANGES.txt| 1 + src/java/org/apache/cassandra/config/Config.java | 2 +- .../cassandra/config/DatabaseDescriptor.java | 7 +++ .../apache/cassandra/utils/memory/BufferPool.java | 17 - 4 files changed, 25 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 4ff5b1a..b7715ba 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.6 + * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710) * Enhanced Compaction Logging (CASSANDRA-10805) * Make prepared statement cache size configurable (CASSANDRA-11555) * Integrated JMX authentication and authorization (CASSANDRA-10091) http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/Config.java -- diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java index 02635bf..466b791 100644 --- a/src/java/org/apache/cassandra/config/Config.java +++ b/src/java/org/apache/cassandra/config/Config.java @@ -242,7 +242,7 @@ public class Config private static boolean isClientMode = false; -public Integer file_cache_size_in_mb = 512; +public Integer file_cache_size_in_mb; public 
boolean buffer_pool_use_heap_if_exhausted = true; http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index d8acdb8..3d38646 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -1776,6 +1776,13 @@ public class DatabaseDescriptor public static int getFileCacheSizeInMB() { +if (conf.file_cache_size_in_mb == null) +{ +// In client mode the value is not set. +assert Config.isClientMode(); +return 0; +} + return conf.file_cache_size_in_mb; } http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/utils/memory/BufferPool.java -- diff --git a/src/java/org/apache/cassandra/utils/memory/BufferPool.java b/src/java/org/apache/cassandra/utils/memory/BufferPool.java index ad2404f..5cd0051 100644 --- a/src/java/org/apache/cassandra/utils/memory/BufferPool.java +++ b/src/java/org/apache/cassandra/utils/memory/BufferPool.java @@ -273,7 +273,22 @@ public class BufferPool } // allocate a large chunk -Chunk chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE)); +Chunk chunk; +try +{ +chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE)); +} +catch (OutOfMemoryError oom) +{ +noSpamLogger.error("Buffer pool failed to allocate chunk of {}, current size {} ({}). " + + "Attempting to continue; buffers will be allocated in on-heap memory which can degrade performance. " + + "Make sure direct memory size (-XX:MaxDirectMemorySize) is large enough to accommodate off-heap memtables and caches.", + FBUtilities.prettyPrintMemory(MACRO_CHUNK_SIZE), + FBUtilities.prettyPrintMemory(sizeInBytes()), + oom.toString()); +return false; +} + chunk.acquire(null); macroChunks.add(chunk); for (int i = 0 ;
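The essence of the fix above is to treat a failed direct allocation as recoverable rather than fatal: the {{OutOfMemoryError}} thrown when direct memory is exhausted is caught and logged, and callers fall back to on-heap buffers. A simplified, self-contained sketch of that pattern (the class and method names here are illustrative, not the {{BufferPool}} API):

```java
import java.nio.ByteBuffer;

// Simplified sketch of the fallback pattern in the patch: attempt a direct
// (off-heap) allocation, and on OutOfMemoryError fall back to an on-heap
// buffer instead of crashing the process. Names are illustrative only.
public class AllocFallbackSketch
{
    public static ByteBuffer allocate(int size, boolean preferDirect)
    {
        if (preferDirect)
        {
            try
            {
                // Throws OutOfMemoryError when -XX:MaxDirectMemorySize is exhausted
                return ByteBuffer.allocateDirect(size);
            }
            catch (OutOfMemoryError oom)
            {
                // Degraded but survivable: log and continue with heap memory
                System.err.println("Direct allocation of " + size + " bytes failed: " + oom);
            }
        }
        return ByteBuffer.allocate(size);
    }

    public static void main(String[] args)
    {
        ByteBuffer buf = allocate(4096, true);
        System.out.println(buf.capacity());
    }
}
```

Catching {{OutOfMemoryError}} is normally discouraged, but it is reasonable here because a direct-memory OOM is isolated to the one allocation and leaves the heap intact, which is exactly why the patch can "attempt to continue" with on-heap buffers.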
[2/5] cassandra git commit: Prevent direct memory OOM on buffer pool allocations
Prevent direct memory OOM on buffer pool allocations Patch by Branimir Lambov; reviewed by tjake for (CASSANDRA-11710) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/31cab36b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/31cab36b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/31cab36b Branch: refs/heads/trunk Commit: 31cab36b1800f2042623633445d8be944217d5a2 Parents: 5634cea Author: Branimir LambovAuthored: Thu May 5 11:30:00 2016 +0300 Committer: T Jake Luciani Committed: Mon May 9 13:48:30 2016 -0400 -- CHANGES.txt| 1 + src/java/org/apache/cassandra/config/Config.java | 2 +- .../cassandra/config/DatabaseDescriptor.java | 7 +++ .../apache/cassandra/utils/memory/BufferPool.java | 17 - 4 files changed, 25 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 4ff5b1a..b7715ba 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.6 + * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710) * Enhanced Compaction Logging (CASSANDRA-10805) * Make prepared statement cache size configurable (CASSANDRA-11555) * Integrated JMX authentication and authorization (CASSANDRA-10091) http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/Config.java -- diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java index 02635bf..466b791 100644 --- a/src/java/org/apache/cassandra/config/Config.java +++ b/src/java/org/apache/cassandra/config/Config.java @@ -242,7 +242,7 @@ public class Config private static boolean isClientMode = false; -public Integer file_cache_size_in_mb = 512; +public Integer file_cache_size_in_mb; public boolean buffer_pool_use_heap_if_exhausted = true; 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index d8acdb8..3d38646 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -1776,6 +1776,13 @@ public class DatabaseDescriptor public static int getFileCacheSizeInMB() { +if (conf.file_cache_size_in_mb == null) +{ +// In client mode the value is not set. +assert Config.isClientMode(); +return 0; +} + return conf.file_cache_size_in_mb; } http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/utils/memory/BufferPool.java -- diff --git a/src/java/org/apache/cassandra/utils/memory/BufferPool.java b/src/java/org/apache/cassandra/utils/memory/BufferPool.java index ad2404f..5cd0051 100644 --- a/src/java/org/apache/cassandra/utils/memory/BufferPool.java +++ b/src/java/org/apache/cassandra/utils/memory/BufferPool.java @@ -273,7 +273,22 @@ public class BufferPool } // allocate a large chunk -Chunk chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE)); +Chunk chunk; +try +{ +chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE)); +} +catch (OutOfMemoryError oom) +{ +noSpamLogger.error("Buffer pool failed to allocate chunk of {}, current size {} ({}). " + + "Attempting to continue; buffers will be allocated in on-heap memory which can degrade performance. " + + "Make sure direct memory size (-XX:MaxDirectMemorySize) is large enough to accommodate off-heap memtables and caches.", + FBUtilities.prettyPrintMemory(MACRO_CHUNK_SIZE), + FBUtilities.prettyPrintMemory(sizeInBytes()), + oom.toString()); +return false; +} + chunk.acquire(null); macroChunks.add(chunk); for (int i = 0 ; i < MACRO_CHUNK_SIZE ; i += CHUNK_SIZE)
[4/5] cassandra git commit: Merge branch '3.6-retag' into cassandra-3.7
Merge branch '3.6-retag' into cassandra-3.7 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a093e8ca Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a093e8ca Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a093e8ca Branch: refs/heads/cassandra-3.7 Commit: a093e8caeec431bdc8ec31efa8b3c1aeb9067285 Parents: a8a3a73 31cab36 Author: T Jake LucianiAuthored: Mon May 9 13:50:24 2016 -0400 Committer: T Jake Luciani Committed: Mon May 9 13:50:24 2016 -0400 -- CHANGES.txt| 1 + src/java/org/apache/cassandra/config/Config.java | 2 +- .../cassandra/config/DatabaseDescriptor.java | 7 +++ .../apache/cassandra/utils/memory/BufferPool.java | 17 - 4 files changed, 25 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a093e8ca/CHANGES.txt -- diff --cc CHANGES.txt index 3cee7ae,b7715ba..6e60bba --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,12 -1,5 +1,13 @@@ +3.7 +Merged from 3.0: + * Refactor Materialized View code (CASSANDRA-11475) + * Update Java Driver (CASSANDRA-11615) +Merged from 2.2: + * Prohibit Reversed Counter type as part of the PK (CASSANDRA-9395) + * cqlsh: correctly handle non-ascii chars in error messages (CASSANDRA-11626) + 3.6 + * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710) * Enhanced Compaction Logging (CASSANDRA-10805) * Make prepared statement cache size configurable (CASSANDRA-11555) * Integrated JMX authentication and authorization (CASSANDRA-10091)
[3/5] cassandra git commit: Merge branch '3.6-retag' into cassandra-3.7
Merge branch '3.6-retag' into cassandra-3.7 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a093e8ca Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a093e8ca Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a093e8ca Branch: refs/heads/trunk Commit: a093e8caeec431bdc8ec31efa8b3c1aeb9067285 Parents: a8a3a73 31cab36 Author: T Jake LucianiAuthored: Mon May 9 13:50:24 2016 -0400 Committer: T Jake Luciani Committed: Mon May 9 13:50:24 2016 -0400 -- CHANGES.txt| 1 + src/java/org/apache/cassandra/config/Config.java | 2 +- .../cassandra/config/DatabaseDescriptor.java | 7 +++ .../apache/cassandra/utils/memory/BufferPool.java | 17 - 4 files changed, 25 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a093e8ca/CHANGES.txt -- diff --cc CHANGES.txt index 3cee7ae,b7715ba..6e60bba --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,12 -1,5 +1,13 @@@ +3.7 +Merged from 3.0: + * Refactor Materialized View code (CASSANDRA-11475) + * Update Java Driver (CASSANDRA-11615) +Merged from 2.2: + * Prohibit Reversed Counter type as part of the PK (CASSANDRA-9395) + * cqlsh: correctly handle non-ascii chars in error messages (CASSANDRA-11626) + 3.6 + * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710) * Enhanced Compaction Logging (CASSANDRA-10805) * Make prepared statement cache size configurable (CASSANDRA-11555) * Integrated JMX authentication and authorization (CASSANDRA-10091)
cassandra git commit: Prevent direct memory OOM on buffer pool allocations
Repository: cassandra Updated Branches: refs/heads/cassandra-3.6 5634cea37 -> 31cab36b1 Prevent direct memory OOM on buffer pool allocations Patch by Branimir Lambov; reviewed by tjake for (CASSANDRA-11710) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/31cab36b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/31cab36b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/31cab36b Branch: refs/heads/cassandra-3.6 Commit: 31cab36b1800f2042623633445d8be944217d5a2 Parents: 5634cea Author: Branimir LambovAuthored: Thu May 5 11:30:00 2016 +0300 Committer: T Jake Luciani Committed: Mon May 9 13:48:30 2016 -0400 -- CHANGES.txt| 1 + src/java/org/apache/cassandra/config/Config.java | 2 +- .../cassandra/config/DatabaseDescriptor.java | 7 +++ .../apache/cassandra/utils/memory/BufferPool.java | 17 - 4 files changed, 25 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 4ff5b1a..b7715ba 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.6 + * Prevent direct memory OOM on buffer pool allocations (CASSANDRA-11710) * Enhanced Compaction Logging (CASSANDRA-10805) * Make prepared statement cache size configurable (CASSANDRA-11555) * Integrated JMX authentication and authorization (CASSANDRA-10091) http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/Config.java -- diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java index 02635bf..466b791 100644 --- a/src/java/org/apache/cassandra/config/Config.java +++ b/src/java/org/apache/cassandra/config/Config.java @@ -242,7 +242,7 @@ public class Config private static boolean isClientMode = false; -public Integer file_cache_size_in_mb = 512; +public Integer file_cache_size_in_mb; public boolean buffer_pool_use_heap_if_exhausted = 
true; http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index d8acdb8..3d38646 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -1776,6 +1776,13 @@ public class DatabaseDescriptor public static int getFileCacheSizeInMB() { +if (conf.file_cache_size_in_mb == null) +{ +// In client mode the value is not set. +assert Config.isClientMode(); +return 0; +} + return conf.file_cache_size_in_mb; } http://git-wip-us.apache.org/repos/asf/cassandra/blob/31cab36b/src/java/org/apache/cassandra/utils/memory/BufferPool.java -- diff --git a/src/java/org/apache/cassandra/utils/memory/BufferPool.java b/src/java/org/apache/cassandra/utils/memory/BufferPool.java index ad2404f..5cd0051 100644 --- a/src/java/org/apache/cassandra/utils/memory/BufferPool.java +++ b/src/java/org/apache/cassandra/utils/memory/BufferPool.java @@ -273,7 +273,22 @@ public class BufferPool } // allocate a large chunk -Chunk chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE)); +Chunk chunk; +try +{ +chunk = new Chunk(allocateDirectAligned(MACRO_CHUNK_SIZE)); +} +catch (OutOfMemoryError oom) +{ +noSpamLogger.error("Buffer pool failed to allocate chunk of {}, current size {} ({}). " + + "Attempting to continue; buffers will be allocated in on-heap memory which can degrade performance. " + + "Make sure direct memory size (-XX:MaxDirectMemorySize) is large enough to accommodate off-heap memtables and caches.", + FBUtilities.prettyPrintMemory(MACRO_CHUNK_SIZE), + FBUtilities.prettyPrintMemory(sizeInBytes()), + oom.toString()); +return false; +} + chunk.acquire(null); macroChunks.add(chunk); for (int i = 0 ; i < MACRO_CHUNK_SIZE ; i += CHUNK_SIZE)
[cassandra] Git Push Summary
Repository: cassandra Updated Branches: refs/heads/cassandra-3.6 [created] 5634cea37
cassandra git commit: Use re-initialised headers for ColumnIndex for pre-3.0 sstables
Repository: cassandra Updated Branches: refs/heads/trunk 0f0b2dfce -> 640072b09 Use re-initialised headers for ColumnIndex for pre-3.0 sstables Patch by Alex Petrov; reviewed by tjake for CASSANDRA-11736 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/640072b0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/640072b0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/640072b0 Branch: refs/heads/trunk Commit: 640072b093ac7040a28ca932034e905935357ead Parents: 0f0b2df Author: Alex PetrovAuthored: Mon May 9 15:27:50 2016 +0200 Committer: T Jake Luciani Committed: Mon May 9 12:52:48 2016 -0400 -- .../org/apache/cassandra/io/sstable/format/big/BigTableWriter.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/640072b0/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java -- diff --git a/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java b/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java index 44b1c3a..39dc889 100644 --- a/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java +++ b/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java @@ -87,7 +87,7 @@ public class BigTableWriter extends SSTableWriter } iwriter = new IndexWriter(keyCount, dataFile); -columnIndexWriter = new ColumnIndex(header, dataFile, descriptor.version, observers, getRowIndexEntrySerializer().indexInfoSerializer()); +columnIndexWriter = new ColumnIndex(this.header, dataFile, descriptor.version, this.observers, getRowIndexEntrySerializer().indexInfoSerializer()); } public void mark()
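The one-line fix above works because of Java's shadowing rule: a constructor or method parameter named {{header}} hides the field of the same name, so an unqualified reference picks up the original argument while {{this.header}} refers to the (re-initialised) field. A tiny, hypothetical illustration of that rule, unrelated to the actual {{BigTableWriter}} code:

```java
// Hypothetical illustration of the shadowing rule behind the fix: a parameter
// named the same as a field hides the field inside the constructor body, so an
// unqualified reference yields the raw argument, while this.header yields the
// re-initialised field value.
public class ShadowingSketch
{
    private final String header;
    public final String unqualified;
    public final String qualified;

    public ShadowingSketch(String header)
    {
        this.header = header.toUpperCase(); // field "re-initialised" from the argument
        unqualified = header;               // parameter: still the raw argument
        qualified = this.header;            // field: the adjusted value
    }

    public static void main(String[] args)
    {
        ShadowingSketch s = new ShadowingSketch("ka");
        System.out.println(s.unqualified + " vs " + s.qualified); // prints "ka vs KA"
    }
}
```

In the patched constructor the {{header}} field had been re-initialised for pre-3.0 sstables, so passing the unqualified parameter handed {{ColumnIndex}} the stale value; qualifying with {{this.}} makes the intent unambiguous.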
[jira] [Updated] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails
[ https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-11736: --- Resolution: Fixed Status: Resolved (was: Patch Available) committed {{640072b093ac7040a28ca932034e905935357ead}} > LegacySSTableTest::testStreamLegacyCqlTables fails > -- > > Key: CASSANDRA-11736 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11736 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Minor > Fix For: 3.7 > > > [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/] > > Error Message > {code} > org.apache.cassandra.streaming.StreamException: Stream failed > {code} > Stacktrace > {code} > java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191) > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429) > at > org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276) > at java.lang.Thread.run(Thread.java:745) > {code} > I've run {{bisect}} against the last commits and (given it fails constantly) it > started failing after [this > commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a].
[jira] [Updated] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-11737: -- Reviewer: Aleksey Yeschenko > Add a way to disable severity in DynamicEndpointSnitch > -- > > Key: CASSANDRA-11737 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11737 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan >Assignee: Jeremiah Jordan > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > > I have seen a few clusters now where severity can outweigh latency in > DynamicEndpointSnitch, causing issues (a node that is completely overloaded > CPU-wise and has super-high latency will get selected for queries, even > though nodes with much lower latency exist, because those nodes have a higher severity > score). There should be a way to disable the use of severity.
[jira] [Updated] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-11737: Fix Version/s: 3.x 3.0.x 2.2.x 2.1.x > Add a way to disable severity in DynamicEndpointSnitch > -- > > Key: CASSANDRA-11737 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11737 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan >Assignee: Jeremiah Jordan > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > > I have seen a few clusters now where severity can outweigh latency in > DynamicEndpointSnitch, causing issues (a node that is completely overloaded > CPU-wise and has super-high latency will get selected for queries, even > though nodes with much lower latency exist, because those nodes have a higher severity > score). There should be a way to disable the use of severity.
[jira] [Updated] (CASSANDRA-11738) Re-think the use of Severity in the DynamicEndpointSnitch calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-11738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-11738: Fix Version/s: 3.x > Re-think the use of Severity in the DynamicEndpointSnitch calculation > - > > Key: CASSANDRA-11738 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11738 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan > Fix For: 3.x > > > CASSANDRA-11737 was opened to allow completely disabling the use of severity > in the DynamicEndpointSnitch calculation, but that is a pretty big hammer. > There is probably something we can do to better use the score. > The issue seems to be that severity is given equal weight with latency in the > current code, and that severity is based only on disk I/O. If you have a > node that is CPU-bound on something (say, catching up on LCS compactions > because of bootstrap/repair/replace) the I/O wait can be low, but the latency > to the node is high. > Some ideas I had are: > 1. Allow a yaml parameter to tune how much impact the severity score has > in the calculation. > 2. Take CPU load into account as well as I/O wait (this would probably help > in the cases where I have seen things go sideways) > 3. Move the -D from CASSANDRA-11737 to a yaml-level setting > 4. Go back to relying on just latency and get rid of severity altogether. > Now that we have rapid read protection, maybe just using latency is enough, > as it can help where the predictive nature of I/O wait would have been useful.
[jira] [Created] (CASSANDRA-11738) Re-think the use of Severity in the DynamicEndpointSnitch calculation
Jeremiah Jordan created CASSANDRA-11738: --- Summary: Re-think the use of Severity in the DynamicEndpointSnitch calculation Key: CASSANDRA-11738 URL: https://issues.apache.org/jira/browse/CASSANDRA-11738 Project: Cassandra Issue Type: Bug Reporter: Jeremiah Jordan CASSANDRA-11737 was opened to allow completely disabling the use of severity in the DynamicEndpointSnitch calculation, but that is a pretty big hammer. There is probably something we can do to make better use of the score. The issue seems to be that severity is given equal weight with latency in the current code, and that severity is based only on disk I/O. If you have a node that is CPU-bound on something (say catching up on LCS compactions because of bootstrap/repair/replace), the IO wait can be low, but the latency to the node is high. Some ideas I had are: 1. Allowing a yaml parameter to tune how much impact the severity score has in the calculation. 2. Taking CPU load into account as well as IO wait (this would probably help in the cases where I have seen things go sideways). 3. Move the -D from CASSANDRA-11737 to being a yaml-level setting. 4. Go back to just relying on latency and get rid of severity altogether. Now that we have rapid read protection, maybe just using latency is enough, as it can help where the predictive nature of IO wait would have been useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
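[Editor's sketch] The weighting ideas above can be modeled with a toy score function. This is an illustrative Python sketch only; `score` and `severity_weight` are hypothetical names, not Cassandra's actual DynamicEndpointSnitch API. It corresponds to idea 1 (a tunable severity weight), where a weight of 0 reduces to idea 4.

```python
# Toy model of idea 1: a tunable weight for severity in the snitch
# score. All names here are illustrative, not Cassandra internals.

def score(latency_ms, severity, severity_weight=1.0):
    """Combined badness score; lower is better.

    severity_weight=1.0 mimics the current equal weighting;
    severity_weight=0.0 is equivalent to disabling severity entirely.
    """
    return latency_ms + severity_weight * severity

# A CPU-bound node: low IO wait (severity) but high latency.
cpu_bound = score(latency_ms=200.0, severity=0.1)
# A healthier node whose disk is transiently busy.
healthy = score(latency_ms=5.0, severity=300.0)

# With equal weighting the overloaded node looks "better" and keeps
# getting selected for queries:
assert cpu_bound < healthy
# Down-weighting severity restores a sane ordering:
assert score(5.0, 300.0, severity_weight=0.01) < score(200.0, 0.1, severity_weight=0.01)
```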
[jira] [Updated] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-11737: Status: Patch Available (was: Open) > Add a way to disable severity in DynamicEndpointSnitch > -- > > Key: CASSANDRA-11737 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11737 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan >Assignee: Jeremiah Jordan > > I have seen in a few clusters now where severity can out weigh latency in > DynamicEndpointSnitch causing issues (a node that is completely overloaded > CPU wise and has super high latency will get selected for queries, even > though nodes with much lower latency exist, but they have a higher severity > score). There should be a way to disable the use of severity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276554#comment-15276554 ] Jeremiah Jordan edited comment on CASSANDRA-11737 at 5/9/16 4:06 PM: - https://github.com/JeremiahDJordan/cassandra/commits/CASSANDRA-11737-21 That merges up cleanly all the way to trunk. Adds a -D option to disable using severity in DynamicEndpointSnitch calculations. For the clusters where we were seeing massive latency degradation when a single node was under load, applying this patch and setting the -D to disable severity brought things back to normal. was (Author: jjordan): https://github.com/JeremiahDJordan/cassandra/commits/CASSANDRA-11737-21 That merges up cleanly all the way to trunk. Adds a -D option to disable using severity in DynamicEndpointSnitch calculations. > Add a way to disable severity in DynamicEndpointSnitch > -- > > Key: CASSANDRA-11737 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11737 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan >Assignee: Jeremiah Jordan > > I have seen in a few clusters now that severity can outweigh latency in > DynamicEndpointSnitch, causing issues (a node that is completely overloaded > CPU-wise and has super-high latency will get selected for queries, even > though nodes with much lower latency exist, because they have a higher severity > score). There should be a way to disable the use of severity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276554#comment-15276554 ] Jeremiah Jordan edited comment on CASSANDRA-11737 at 5/9/16 4:05 PM: - https://github.com/JeremiahDJordan/cassandra/commits/CASSANDRA-11737-21 That merges up cleanly all the way to trunk. Adds a -D option to disable using severity in DynamicEndpointSnitch calculations. was (Author: jjordan): https://github.com/JeremiahDJordan/cassandra/commit/47768377afe9aff93a7e3de8190bc3124c5cefe6 > Add a way to disable severity in DynamicEndpointSnitch > -- > > Key: CASSANDRA-11737 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11737 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan >Assignee: Jeremiah Jordan > > I have seen in a few clusters now where severity can out weigh latency in > DynamicEndpointSnitch causing issues (a node that is completely overloaded > CPU wise and has super high latency will get selected for queries, even > though nodes with much lower latency exist, but they have a higher severity > score). There should be a way to disable the use of severity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276554#comment-15276554 ] Jeremiah Jordan commented on CASSANDRA-11737: - https://github.com/JeremiahDJordan/cassandra/commit/47768377afe9aff93a7e3de8190bc3124c5cefe6 > Add a way to disable severity in DynamicEndpointSnitch > -- > > Key: CASSANDRA-11737 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11737 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan >Assignee: Jeremiah Jordan > > I have seen in a few clusters now where severity can out weigh latency in > DynamicEndpointSnitch causing issues (a node that is completely overloaded > CPU wise and has super high latency will get selected for queries, even > though nodes with much lower latency exist, but they have a higher severity > score). There should be a way to disable the use of severity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan reassigned CASSANDRA-11737: --- Assignee: Jeremiah Jordan > Add a way to disable severity in DynamicEndpointSnitch > -- > > Key: CASSANDRA-11737 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11737 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan >Assignee: Jeremiah Jordan > > I have seen in a few clusters now where severity can out weigh latency in > DynamicEndpointSnitch causing issues (a node that is completely overloaded > CPU wise and has super high latency will get selected for queries, even > though nodes with much lower latency exist, but they have a higher severity > score). There should be a way to disable the use of severity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11737) Add a way to disable severity in DynamicEndpointSnitch
Jeremiah Jordan created CASSANDRA-11737: --- Summary: Add a way to disable severity in DynamicEndpointSnitch Key: CASSANDRA-11737 URL: https://issues.apache.org/jira/browse/CASSANDRA-11737 Project: Cassandra Issue Type: Bug Reporter: Jeremiah Jordan I have seen in a few clusters now that severity can outweigh latency in DynamicEndpointSnitch, causing issues (a node that is completely overloaded CPU-wise and has super-high latency will get selected for queries, even though nodes with much lower latency exist, because they have a higher severity score). There should be a way to disable the use of severity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11491) Split repair job into tasks per table
[ https://issues.apache.org/jira/browse/CASSANDRA-11491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276539#comment-15276539 ] Paulo Motta commented on CASSANDRA-11491: - Increasing priority of this since this will allow running anti-compactions after each table is finished, and release sstable references earlier to avoid OOMing when running incremental repair in multiple tables (or keyspace-level). > Split repair job into tasks per table > - > > Key: CASSANDRA-11491 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11491 > Project: Cassandra > Issue Type: Task > Components: Streaming and Messaging >Reporter: Paulo Motta >Priority: Minor > > We currently split a parent repair session into multiple repair sessions, one > per range. Each repair session is further split into multiple repair jobs, > one per table. > As we move into an auto-repair world with CASSANDRA-11190, with repair > settings per table, it will probably simplify things if we reason about > repair sessions on a per-table basis. > Besides simplifying current code, this will simplify adding more advanced > scheduling of repair tasks per table and other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11491) Split repair job into tasks per table
[ https://issues.apache.org/jira/browse/CASSANDRA-11491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-11491: Assignee: Paulo Motta Priority: Major (was: Minor) > Split repair job into tasks per table > - > > Key: CASSANDRA-11491 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11491 > Project: Cassandra > Issue Type: Task > Components: Streaming and Messaging >Reporter: Paulo Motta >Assignee: Paulo Motta > > We currently split a parent repair session into multiple repair sessions, one > per range. Each repair session is further split into multiple repair jobs, > one per table. > As we move into an auto-repair world with CASSANDRA-11190, with repair > settings per table, it will probably simplify things if we reason about > repair sessions on a per-table basis. > Besides simplifying current code, this will simplify adding more advanced > scheduling of repair tasks per table and other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails
[ https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-11736: --- Reviewer: T Jake Luciani > LegacySSTableTest::testStreamLegacyCqlTables fails > -- > > Key: CASSANDRA-11736 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11736 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Minor > Fix For: 3.7 > > > [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/] > > Error Message > {code} > org.apache.cassandra.streaming.StreamException: Stream failed > {code} > Stacktrace > {code} > java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > 
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191) > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429) > at > org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276) > at java.lang.Thread.run(Thread.java:745) > {code} > I've ran {{bisect}} against last commits and (given it fails constantly) it > started failing after [this > commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails
[ https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276484#comment-15276484 ] T Jake Luciani commented on CASSANDRA-11736: +1 will commit once CI passes. Thx > LegacySSTableTest::testStreamLegacyCqlTables fails > -- > > Key: CASSANDRA-11736 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11736 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Minor > Fix For: 3.7 > > > [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/] > > Error Message > {code} > org.apache.cassandra.streaming.StreamException: Stream failed > {code} > Stacktrace > {code} > java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191) > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429) > at > org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276) > at java.lang.Thread.run(Thread.java:745) > {code} > I've ran {{bisect}} against last commits and (given it fails constantly) it > started failing after [this > commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9613) Omit (de)serialization of state variable in UDAs
[ https://issues.apache.org/jira/browse/CASSANDRA-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9613: Status: Patch Available (was: Open) > Omit (de)serialization of state variable in UDAs > > > Key: CASSANDRA-9613 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9613 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Minor > Fix For: 3.x > > > Currently the result of each UDA's state function call is serialized and then > deserialized for the next state-function invocation and optionally final > function invocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11031) MultiTenant : support “ALLOW FILTERING" for First Partition Key
[ https://issues.apache.org/jira/browse/CASSANDRA-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276448#comment-15276448 ] ZhaoYang commented on CASSANDRA-11031: -- I have updated the patch and dtest. Thank you. > MultiTenant : support “ALLOW FILTERING" for First Partition Key > --- > > Key: CASSANDRA-11031 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11031 > Project: Cassandra > Issue Type: New Feature > Components: CQL >Reporter: ZhaoYang >Assignee: ZhaoYang >Priority: Minor > Fix For: 3.x > > Attachments: CASSANDRA-11031-3.7.patch > > > Currently, ALLOW FILTERING only works for secondary index columns or > clustering columns, and it's slow, because Cassandra will read all data from > SSTables on disk into memory to filter. > But we can support allow filtering on the partition key: as far as I know, > partition keys are in memory, so we can easily filter them, and then read only > the required data from SSTables. > This will be similar to "Select * from table", which scans through the entire cluster. > CREATE TABLE multi_tenant_table ( > tenant_id text, > pk2 text, > c1 text, > c2 text, > v1 text, > v2 text, > PRIMARY KEY ((tenant_id,pk2),c1,c2) > ) ; > Select * from multi_tenant_table where tenant_id = "datastax" allow filtering; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
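[Editor's sketch] The proposal above (filter the in-memory partition keys first, then read only the matching partitions from disk) can be modeled with a toy Python sketch. The data structures below are stand-ins for illustration, not Cassandra internals.

```python
# Toy model of the ticket's idea: because partition keys are held in
# memory, a tenant_id filter can be applied there cheaply, and only
# the matching partitions need to be read from SSTables afterwards.

partition_index = [
    ("datastax", "pk2-a"),
    ("datastax", "pk2-b"),
    ("other", "pk2-c"),
]

def filter_partitions(index, tenant_id):
    # Cheap: scans keys already in memory, no disk reads yet.
    return [key for key in index if key[0] == tenant_id]

matching = filter_partitions(partition_index, "datastax")
assert matching == [("datastax", "pk2-a"), ("datastax", "pk2-b")]
# Only these two partitions would then be fetched from SSTables,
# instead of scanning every row as a full-table ALLOW FILTERING does.
```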
[jira] [Updated] (CASSANDRA-11542) Create a benchmark to compare HDFS and Cassandra bulk read times
[ https://issues.apache.org/jira/browse/CASSANDRA-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-11542: Reviewer: T Jake Luciani > Create a benchmark to compare HDFS and Cassandra bulk read times > > > Key: CASSANDRA-11542 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11542 > Project: Cassandra > Issue Type: Sub-task > Components: Testing >Reporter: Stefania >Assignee: Stefania > Fix For: 3.x > > Attachments: jfr_recordings.zip, spark-load-perf-results-001.zip, > spark-load-perf-results-002.zip, spark-load-perf-results-003.zip > > > I propose creating a benchmark for comparing Cassandra and HDFS bulk reading > performance. Simple Spark queries will be performed on data stored in HDFS or > Cassandra, and the entire duration will be measured. An example query would > be the max or min of a column or a count\(*\). > This benchmark should allow determining the impact of: > * partition size > * number of clustering columns > * number of value columns (cells) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
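[Editor's sketch] The measurement pattern implied by the benchmark proposal above can be sketched as a toy harness. The callables below merely stand in for the Spark queries; nothing here actually touches HDFS or Cassandra.

```python
import time

# Toy timing harness for the proposed benchmark: run the same
# aggregate against each backend and record wall-clock duration.
# max/min/len stand in for the ticket's example queries (max or min
# of a column, count(*)).

def timed(query, data):
    start = time.perf_counter()
    result = query(data)
    elapsed = time.perf_counter() - start
    return result, elapsed

rows = list(range(100_000))  # stand-in for the loaded dataset

for query in (max, min, len):
    result, elapsed = timed(query, rows)
    assert elapsed >= 0.0  # durations would be compared across backends
```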
[jira] [Commented] (CASSANDRA-11728) Incremental repair fails with vnodes+lcs+multi-dc
[ https://issues.apache.org/jira/browse/CASSANDRA-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276376#comment-15276376 ] Marcus Eriksson commented on CASSANDRA-11728: - then you need to provide more details on how it fails > Incremental repair fails with vnodes+lcs+multi-dc > - > > Key: CASSANDRA-11728 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11728 > Project: Cassandra > Issue Type: Bug >Reporter: Nick Bailey > > Produced on 2.1.12 > We are seeing incremental repair fail with an error regarding creating > multiple repair sessions on overlapping sstables. This is happening in the > following setup > * 6 nodes > * 2 Datacenters > * Vnodes enabled > * Leveled compaction on the relevant tables > When STCS is used instead, we don't hit an issue. This is slightly related to > https://issues.apache.org/jira/browse/CASSANDRA-11461, except in this case > OpsCenter repair service is running all repairs sequentially. Let me know > what other information we can provide. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (CASSANDRA-11728) Incremental repair fails with vnodes+lcs+multi-dc
[ https://issues.apache.org/jira/browse/CASSANDRA-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson reopened CASSANDRA-11728: - > Incremental repair fails with vnodes+lcs+multi-dc > - > > Key: CASSANDRA-11728 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11728 > Project: Cassandra > Issue Type: Bug >Reporter: Nick Bailey > > Produced on 2.1.12 > We are seeing incremental repair fail with an error regarding creating > multiple repair sessions on overlapping sstables. This is happening in the > following setup > * 6 nodes > * 2 Datacenters > * Vnodes enabled > * Leveled compaction on the relevant tables > When STCS is used instead, we don't hit an issue. This is slightly related to > https://issues.apache.org/jira/browse/CASSANDRA-11461, except in this case > OpsCenter repair service is running all repairs sequentially. Let me know > what other information we can provide. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails
[ https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-11736: Status: Patch Available (was: Open) The constructor for {{ColumnIndex}} was moved from [here|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a#diff-59e5dd00b6986242a4d247b405808b0bL158] to [here|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a#diff-59e5dd00b6986242a4d247b405808b0bR90], and the re-initialised {{header}} field from the {{SStableWriter}} constructor was never used; the passed constructor argument was used instead. The same was happening with the {{observers}} field. ||[trunk|https://github.com/ifesdjeen/cassandra/tree/11736-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11736-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11736-trunk-dtest/]| > LegacySSTableTest::testStreamLegacyCqlTables fails > -- > > Key: CASSANDRA-11736 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11736 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Minor > Fix For: 3.7 > > > [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/] > > Error Message > {code} > org.apache.cassandra.streaming.StreamException: Stream failed > {code} > Stacktrace > {code} > java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > at > 
org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191) > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429) > at > org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276) > at java.lang.Thread.run(Thread.java:745) > {code} > I've ran {{bisect}} against last commits and (given it fails constantly) it > started failing after [this > commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11728) Incremental repair fails with vnodes+lcs+multi-dc
[ https://issues.apache.org/jira/browse/CASSANDRA-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276368#comment-15276368 ] Adam Hattrell commented on CASSANDRA-11728: --- We still see it - even with the fix. > Incremental repair fails with vnodes+lcs+multi-dc > - > > Key: CASSANDRA-11728 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11728 > Project: Cassandra > Issue Type: Bug >Reporter: Nick Bailey > > Produced on 2.1.12 > We are seeing incremental repair fail with an error regarding creating > multiple repair sessions on overlapping sstables. This is happening in the > following setup > * 6 nodes > * 2 Datacenters > * Vnodes enabled > * Leveled compaction on the relevant tables > When STCS is used instead, we don't hit an issue. This is slightly related to > https://issues.apache.org/jira/browse/CASSANDRA-11461, except in this case > OpsCenter repair service is running all repairs sequentially. Let me know > what other information we can provide. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9669) If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records
[ https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276365#comment-15276365 ] Branimir Lambov commented on CASSANDRA-9669: The test failures appear to be flakes -- mainly timeouts, and the failing tests pass when I run them locally. Patch is thus ready to commit. > If sstable flushes complete out of order, on restart we can fail to replay > necessary commit log records > --- > > Key: CASSANDRA-9669 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9669 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Benedict >Priority: Critical > Labels: correctness > Fix For: 2.2.x, 3.0.x, 3.x > > > While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, > on restart we simply take the maximum replay position of any sstable on disk, > and ignore anything prior. > It is quite possible for there to be two flushes triggered for a given table, > and for the second to finish first by virtue of containing a much smaller > quantity of live data (or perhaps the disk is just under less pressure). If > we crash before the first sstable has been written, then on restart the data > it would have represented will disappear, since we will not replay the CL > records. > This looks to be a bug present since time immemorial, and also seems pretty > serious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
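[Editor's sketch] The failure mode described in CASSANDRA-9669 can be illustrated with a toy model. The positions and flush bookkeeping below are made up for illustration and are not Cassandra's actual ReplayPosition machinery: a smaller second flush completes first, the node crashes before the first flush finishes, and restart logic that replays only records past the maximum on-disk replay position silently drops data.

```python
# Toy model of the out-of-order flush bug: (position, mutation) pairs
# in the commit log, two flushes covering disjoint position ranges.

commit_log = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]

flush1_covers = [1, 2]  # large flush: triggered first, never completed (crash)
flush2_covers = [3, 4]  # small flush: triggered second, completed first

on_disk_max_position = max(flush2_covers)  # = 4, the max of any sstable on disk

# Buggy restart logic: replay only commit log entries past the max
# replay position found on disk.
replayed = [m for pos, m in commit_log if pos > on_disk_max_position]
assert replayed == []  # nothing gets replayed...

# ...so the only durable data is what flush 2 wrote; "a" and "b",
# which flush 1 would have covered, have disappeared.
durable = {"c", "d"} | set(replayed)
assert "a" not in durable and "b" not in durable
```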
[jira] [Updated] (CASSANDRA-9669) If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records
[ https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-9669: --- Status: Ready to Commit (was: Patch Available) > If sstable flushes complete out of order, on restart we can fail to replay > necessary commit log records > --- > > Key: CASSANDRA-9669 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9669 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Benedict >Priority: Critical > Labels: correctness > Fix For: 2.2.x, 3.0.x, 3.x > > > While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, > on restart we simply take the maximum replay position of any sstable on disk, > and ignore anything prior. > It is quite possible for there to be two flushes triggered for a given table, > and for the second to finish first by virtue of containing a much smaller > quantity of live data (or perhaps the disk is just under less pressure). If > we crash before the first sstable has been written, then on restart the data > it would have represented will disappear, since we will not replay the CL > records. > This looks to be a bug present since time immemorial, and also seems pretty > serious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10292) java.lang.AssertionError: attempted to delete non-existing file CommitLog...
[ https://issues.apache.org/jira/browse/CASSANDRA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276357#comment-15276357 ] Joshua McKenzie commented on CASSANDRA-10292: - The 2.2 fix of the bug is Windows-specific / nio related. Shouldn't be the issue here as env. is CentOS. > java.lang.AssertionError: attempted to delete non-existing file CommitLog... > > > Key: CASSANDRA-10292 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10292 > Project: Cassandra > Issue Type: Bug > Environment: CentOS Linux 7.1.1503, Cassandra 2.1.8 stable version, 6 > nodes cluster >Reporter: Dawid Szejnfeld >Priority: Critical > > From time to time some nodes are stopping to work due to error in logs like > this: > INFO [CompactionExecutor:2475] 2015-09-09 12:36:50,363 > CompactionTask.java:274 - Compacted 4 sstables to > [/mnt/cassandra--storage-machine/data/system/compactions_in_progress-55080ab05d9c38 > 8690a4acb25fe1f77b/system-compactions_in_progress-ka-126,]. 419 bytes to 42 > (~10% of original) in 33ms = 0.001214MB/s. 4 total partitions merged to 1. 
> Partition merge counts were {2:2, } > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,166 > ColumnFamilyStore.java:912 - Enqueuing flush of settings: 78364 (0%) on-heap, > 0 (0%) off-heap > INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,172 Memtable.java:347 - > Writing Memtable-settings@1126939979(0.113KiB serialized bytes, 1850 ops, > 0%/0% of on/off-heap limit) > INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,174 Memtable.java:382 - > Completed flushing > /mnt/cassandra--storage-machine/data/OpsCenter/settings-464866c04b1311e590698d1a9fd4ba8b/OpsCe > nter-settings-tmp-ka-12-Data.db (0.000KiB) for commitlog position > ReplayPosition(segmentId=1441362636571, position=33554415) > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,194 StorageService.java:453 > - Stopping gossiper > WARN [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 StorageService.java:359 > - Stopping gossip by operator request > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 Gossiper.java:1410 - > Announcing shutdown > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,195 StorageService.java:458 > - Stopping RPC server > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,196 ThriftServer.java:142 - > Stop listening to thrift clients > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,204 StorageService.java:463 > - Stopping native transport > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,422 Server.java:213 - Stop > listening for CQL clients > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,423 CommitLog.java:397 - > Failed managing commit log segments. 
Commit disk failure policy is stop; > terminating thread > java.lang.AssertionError: attempted to delete non-existing file > CommitLog-4-1441362636316.log > at > org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:126) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegment.delete(CommitLogSegment.java:343) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:418) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:413) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:152) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > [apache-cassandra-2.1.8.jar:2.1.8] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85] > After I create missing commit log file and restart cassandra service > everything is OK then. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails
[ https://issues.apache.org/jira/browse/CASSANDRA-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov reassigned CASSANDRA-11736: --- Assignee: Alex Petrov > LegacySSTableTest::testStreamLegacyCqlTables fails > -- > > Key: CASSANDRA-11736 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11736 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Minor > Fix For: 3.7 > > > [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/] > > Error Message > {code} > org.apache.cassandra.streaming.StreamException: Stream failed > {code} > Stacktrace > {code} > java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155) > at > org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > 
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191) > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429) > at > org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276) > at java.lang.Thread.run(Thread.java:745) > {code} > I've ran {{bisect}} against last commits and (given it fails constantly) it > started failing after [this > commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11604) select on table fails after changing user defined type in map
[ https://issues.apache.org/jira/browse/CASSANDRA-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-11604: Status: Patch Available (was: Open) This assertion was removed in [trunk|https://github.com/apache/cassandra/commit/677230df694752c7ecf6d5459eee60ad7cf45ecf#diff-bc19f192ef82fbca9abd27526054bb0fL254] (applying it in 3.5 has a similar effect). From what I can tell, scrub doesn't fix the issue; a node restart alone, or a flush, has the same effect: during a node restart, the commit log replays the mutation with the same schema as the table itself, and during a flush and subsequent reads, all Cells get the correct Column Definition. Removing this assert doesn't change the behaviour, since ALTER statements only allow "backward-compatible" changes (after the schema change, it's still possible to work with the old version, too). I've added a test for this particular edge case (updating a UDT within an inserted non-frozen map) for {{trunk}} and removed the assert in {{3.0.x}} (along with adding the test): ||[3.0|https://github.com/ifesdjeen/cassandra/tree/11604-3.0]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-3.0-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-3.0-dtest/]| ||[trunk|https://github.com/ifesdjeen/cassandra/tree/11604-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-dtest/]| {{LegacySSTableTest}} is failing locally on trunk, too, although that started before this commit (also, there were no code changes for trunk). The rest of the tests pass locally, too. 
> select on table fails after changing user defined type in map > - > > Key: CASSANDRA-11604 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11604 > Project: Cassandra > Issue Type: Bug >Reporter: Andreas Jaekle >Assignee: Alex Petrov > Fix For: 3.x > > > in cassandra 3.5 i get the following exception when i run this cqls: > {code} > --DROP KEYSPACE bugtest ; > CREATE KEYSPACE bugtest > WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }; > use bugtest; > CREATE TYPE tt ( > a boolean > ); > create table t1 ( > k text, > v map, > PRIMARY KEY(k) > ); > insert into t1 (k,v) values ('k2',{'mk':{a:false}}); > ALTER TYPE tt ADD b boolean; > UPDATE t1 SET v['mk'] = { b:true } WHERE k = 'k2'; > select * from t1; > {code} > the last select fails. > {code} > WARN [SharedPool-Worker-5] 2016-04-19 14:18:49,885 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-5,5,main]: {} > java.lang.AssertionError: null > at > org.apache.cassandra.db.rows.ComplexColumnData$Builder.addCell(ComplexColumnData.java:254) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:623) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.5.jar:3.5] > at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473) > 
~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419) > ~[apache-cassandra-3.5.jar:3.5] > at >
[jira] [Created] (CASSANDRA-11736) LegacySSTableTest::testStreamLegacyCqlTables fails
Alex Petrov created CASSANDRA-11736: --- Summary: LegacySSTableTest::testStreamLegacyCqlTables fails Key: CASSANDRA-11736 URL: https://issues.apache.org/jira/browse/CASSANDRA-11736 Project: Cassandra Issue Type: Bug Components: Testing Reporter: Alex Petrov Priority: Minor Fix For: 3.7 [example|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/org.apache.cassandra.io.sstable/LegacySSTableTest/testStreamLegacyCqlTables_compression/] Error Message {code} org.apache.cassandra.streaming.StreamException: Stream failed {code} Stacktrace {code} java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) at org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTable(LegacySSTableTest.java:175) at org.apache.cassandra.io.sstable.LegacySSTableTest.streamLegacyTables(LegacySSTableTest.java:155) at org.apache.cassandra.io.sstable.LegacySSTableTest.testStreamLegacyCqlTables(LegacySSTableTest.java:145) Caused by: org.apache.cassandra.streaming.StreamException: Stream failed at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215) at 
org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191) at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429) at org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:639) at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:489) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276) at java.lang.Thread.run(Thread.java:745) {code} I've run {{bisect}} against the last commits and (given it fails consistently) it started failing after [this commit|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-11735) cassandra-env.sh doesn't test the correct java version
[ https://issues.apache.org/jira/browse/CASSANDRA-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe resolved CASSANDRA-11735. - Resolution: Duplicate > cassandra-env.sh doesn't test the correct java version > -- > > Key: CASSANDRA-11735 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11735 > Project: Cassandra > Issue Type: Bug > Environment: Ubuntu 14.04 > openjdk 7 patch >=100 >Reporter: Maxime Bugeia >Priority: Minor > > With the latest patch of openjdk, all nodetool actions fail and display > "Cassandra 2.0 and later require Java 7u25 or later." because > cassandra-env.sh test of java version is broken. > Line 102: > if [ "$JVM_VERSION" \< "1.7" ] && [ "$JVM_PATCH_VERSION" \< "25" ] ; then > echo "Cassandra 2.0 and later require Java 7u25 or later." > exit 1; > fi > The second test cause all java patch >100 to be considered as inferior. One > correct syntax is "-lt" instead of "\<". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11670) Error while waiting on bootstrap to complete. Bootstrap will have to be restarted. Stream failed
[ https://issues.apache.org/jira/browse/CASSANDRA-11670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276234#comment-15276234 ] Alexander Heiß commented on CASSANDRA-11670: We raised the *commitlog_segment_size_in_mb* to 128. Now we get another error: {quote} ERROR 10:30:29 [Stream #9e733a20-15be-11e6-9bb1-31c0715c4db0] Streaming error occurred java.io.IOException: CF 64aecb30-11f7-11e6-89d2-9d1dd801d7e2 was dropped during streaming at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:76) ~[apache-cassandra-3.0.5.jar:3.0.5] at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:50) ~[apache-cassandra-3.0.5.jar:3.0.5] at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:39) ~[apache-cassandra-3.0.5.jar:3.0.5] at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:59) ~[apache-cassandra-3.0.5.jar:3.0.5] at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268) ~[apache-cassandra-3.0.5.jar:3.0.5] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_11] INFO 10:30:29 [Stream #9e733a20-15be-11e6-9bb1-31c0715c4db0] Session with /176.9.99.140 is complete WARN 10:30:29 [Stream #9e733a20-15be-11e6-9bb1-31c0715c4db0] Stream failed ERROR 10:30:29 Error while waiting on bootstrap to complete. Bootstrap will have to be restarted. {quote} The failure happens approximately 2 hours after the bootstrap starts (2 hours is the *streaming_socket_timeout_in_ms*; could that have something to do with the problem?) > Error while waiting on bootstrap to complete. Bootstrap will have to be > restarted. 
Stream failed > > > Key: CASSANDRA-11670 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11670 > Project: Cassandra > Issue Type: Bug > Components: Configuration, Streaming and Messaging >Reporter: Anastasia Osintseva >Assignee: Paulo Motta > Fix For: 3.0.5 > > > I have in cluster 2 DC, in each DC - 2 Nodes. I wanted to add 1 node to each > DC. One node has been added successfully after I had made scrubing. > Now I'm trying to add node to another DC, but get error: > org.apache.cassandra.streaming.StreamException: Stream failed. > After scrubing and repair I get the same error. > {noformat} > ERROR [StreamReceiveTask:5] 2016-04-27 00:33:21,082 Keyspace.java:492 - > Unknown exception caught while attempting to update MaterializedView! > messages_dump.messages > java.lang.IllegalArgumentException: Mutation of 34974901 bytes is too large > for the maxiumum size of 33554432 > at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:264) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:469) > [apache-cassandra-3.0.5.jar:3.0.5] > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384) > [apache-cassandra-3.0.5.jar:3.0.5] > at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205) > [apache-cassandra-3.0.5.jar:3.0.5] > at org.apache.cassandra.db.Mutation.apply(Mutation.java:217) > [apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.batchlog.BatchlogManager.store(BatchlogManager.java:146) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.service.StorageProxy.mutateMV(StorageProxy.java:724) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.db.view.ViewManager.pushViewReplicaUpdates(ViewManager.java:149) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:487) > [apache-cassandra-3.0.5.jar:3.0.5] > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384) > [apache-cassandra-3.0.5.jar:3.0.5] > at 
org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205) > [apache-cassandra-3.0.5.jar:3.0.5] > at org.apache.cassandra.db.Mutation.apply(Mutation.java:217) > [apache-cassandra-3.0.5.jar:3.0.5] > at org.apache.cassandra.db.Mutation.applyUnsafe(Mutation.java:236) > [apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:169) > [apache-cassandra-3.0.5.jar:3.0.5] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_11] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [na:1.8.0_11] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_11] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >
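For context on the "Mutation of 34974901 bytes is too large" error above: by default Cassandra caps a single commit-log mutation at half of *commitlog_segment_size_in_mb*, so the 33554432-byte limit in the log is half of the default 64 MB segment. Raising the segment size to 128 MB, as done earlier in this thread, lifts the cap to 64 MB, which fits this mutation. A quick sketch of the arithmetic (the halving rule is the documented default for {{max_mutation_size_in_kb}}):

```shell
# Max mutation size defaults to half the commitlog segment size.
default_cap=$(( 64 * 1024 * 1024 / 2 ))    # 33554432 bytes, the limit from the log
raised_cap=$(( 128 * 1024 * 1024 / 2 ))    # 67108864 bytes with commitlog_segment_size_in_mb: 128
mutation=34974901                          # size reported in the error
[ "$mutation" -gt "$default_cap" ] && echo "mutation exceeds the default cap"
[ "$mutation" -lt "$raised_cap" ] && echo "mutation fits after raising the segment size"
```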
[jira] [Comment Edited] (CASSANDRA-7826) support non-frozen, nested collections
[ https://issues.apache.org/jira/browse/CASSANDRA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276186#comment-15276186 ] Alex Petrov edited comment on CASSANDRA-7826 at 5/9/16 11:23 AM: - After reviewing [https://issues.apache.org/jira/browse/CASSANDRA-7396] in more detail and finding common ground with that ticket, we need to allow {{SELECT}}, {{UPDATE}} and {{DELETE}} for individual fields of nested collections (up to the deepest nesting level for maps, none for lists and sets), with syntax similar to 7396. Otherwise there's no big difference between frozen and non-frozen collections. At this particular moment I'm not entirely certain about slices of keys for maps. It seems that there are many more cases for fetching a particular key from a map by path than for slices. was (Author: ifesdjeen): After reviewing [https://issues.apache.org/jira/browse/CASSANDRA-7396] in more details and finding common grounds with that ticket, we need to allow {{SELECT}}ing, {{UPDATE}}ing and {{DELETE}}ing individual fields of the nested collections (up to the deepest nesting level for maps, none for lists and sets), with syntax similar to 7396. Otherwise there's no big difference between frozen and non-frozen collections. At this particular moment I'm not entirely certain about the slices of keys for maps. It seems that there are many more cases for fetching a particular key from the map by path than for slices. > support non-frozen, nested collections > -- > > Key: CASSANDRA-7826 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7826 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Tupshin Harper >Assignee: Alex Petrov > Labels: ponies > Fix For: 3.x > > > The inability to nest collections is one of the bigger data modelling > limitations we have right now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11735) cassandra-env.sh doesn't test the correct java version
Maxime Bugeia created CASSANDRA-11735: - Summary: cassandra-env.sh doesn't test the correct java version Key: CASSANDRA-11735 URL: https://issues.apache.org/jira/browse/CASSANDRA-11735 Project: Cassandra Issue Type: Bug Environment: Ubuntu 14.04 openjdk 7 patch >=100 Reporter: Maxime Bugeia Priority: Minor With the latest openjdk patch releases, all nodetool actions fail and display "Cassandra 2.0 and later require Java 7u25 or later." because the java version test in cassandra-env.sh is broken. Line 102: if [ "$JVM_VERSION" \< "1.7" ] && [ "$JVM_PATCH_VERSION" \< "25" ] ; then echo "Cassandra 2.0 and later require Java 7u25 or later." exit 1; fi The second test causes any java patch version >100 to be considered inferior, because "\<" compares strings lexicographically rather than numerically. One correct syntax is "-lt" instead of "\<". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
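The root cause is that the shell's {{\<}} operator inside {{[ ]}} compares strings lexicographically, so "101" sorts before "25". A minimal sketch reproducing the broken check and the numeric fix (assuming a bash-compatible shell, as cassandra-env.sh uses):

```shell
JVM_PATCH_VERSION=101   # e.g. openjdk 7u101

# Broken: \< is a lexicographic string comparison, and "101" sorts before "25".
if [ "$JVM_PATCH_VERSION" \< "25" ]; then
    echo "string compare: patch $JVM_PATCH_VERSION wrongly treated as older than 25"
fi

# Fixed: -lt compares numerically.
if [ "$JVM_PATCH_VERSION" -lt 25 ]; then
    echo "Cassandra 2.0 and later require Java 7u25 or later."
else
    echo "numeric compare: patch $JVM_PATCH_VERSION passes the check"
fi
```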
[jira] [Commented] (CASSANDRA-7826) support non-frozen, nested collections
[ https://issues.apache.org/jira/browse/CASSANDRA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276186#comment-15276186 ] Alex Petrov commented on CASSANDRA-7826: After reviewing [https://issues.apache.org/jira/browse/CASSANDRA-7396] in more detail and finding common ground with that ticket, we need to allow {{SELECT}}ing, {{UPDATE}}ing and {{DELETE}}ing individual fields of nested collections (up to the deepest nesting level for maps, none for lists and sets), with syntax similar to 7396. Otherwise there's no big difference between frozen and non-frozen collections. At this particular moment I'm not entirely certain about slices of keys for maps. It seems that there are many more cases for fetching a particular key from a map by path than for slices. > support non-frozen, nested collections > -- > > Key: CASSANDRA-7826 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7826 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Tupshin Harper >Assignee: Alex Petrov > Labels: ponies > Fix For: 3.x > > > The inability to nest collections is one of the bigger data modelling > limitations we have right now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
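To make the proposal concrete, path-based access into a non-frozen nested map might look like the sketch below, by analogy with 7396's {{map['key']}} and {{list[index]}} selectors. This syntax is purely illustrative (an assumption for discussion, not implemented or committed):

```cql
-- Hypothetical sketch only: non-frozen nested maps and path-based
-- access (by analogy with CASSANDRA-7396) do not exist yet.
CREATE TABLE t1 (k text PRIMARY KEY, v map<text, map<text, int>>);

-- Read one leaf value by key path, down to the deepest map level.
SELECT v['outer']['inner'] FROM t1 WHERE k = 'k1';

-- Update or delete a single nested entry without rewriting the whole map.
UPDATE t1 SET v['outer']['inner'] = 42 WHERE k = 'k1';
DELETE v['outer']['inner'] FROM t1 WHERE k = 'k1';
```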
[jira] [Commented] (CASSANDRA-7396) Allow selecting Map key, List index
[ https://issues.apache.org/jira/browse/CASSANDRA-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276175#comment-15276175 ] Alex Petrov commented on CASSANDRA-7396: After talking with [~snazy], there's an idea to postpone the slice deletes until we have range tombstones supported for the Cells (which, as I understand it, has to wait until the next version of the storage format). This would avoid read-before-write for the delete operations. > Allow selecting Map key, List index > --- > > Key: CASSANDRA-7396 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7396 > Project: Cassandra > Issue Type: New Feature > Components: CQL >Reporter: Jonathan Ellis >Assignee: Robert Stupp > Labels: cql, docs-impacting > Fix For: 3.x > > Attachments: 7396_unit_tests.txt > > > Allow "SELECT map['key']" and "SELECT list[index]". (Selecting a UDT subfield > is already supported.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10292) java.lang.AssertionError: attempted to delete non-existing file CommitLog...
[ https://issues.apache.org/jira/browse/CASSANDRA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276098#comment-15276098 ] Youcef HILEM commented on CASSANDRA-10292: -- Thank you. The fix in version 2.1.12 is https://issues.apache.org/jira/browse/CASSANDRA-10377 > java.lang.AssertionError: attempted to delete non-existing file CommitLog... > > > Key: CASSANDRA-10292 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10292 > Project: Cassandra > Issue Type: Bug > Environment: CentOS Linux 7.1.1503, Cassandra 2.1.8 stable version, 6 > nodes cluster >Reporter: Dawid Szejnfeld >Priority: Critical > > From time to time some nodes are stopping to work due to error in logs like > this: > INFO [CompactionExecutor:2475] 2015-09-09 12:36:50,363 > CompactionTask.java:274 - Compacted 4 sstables to > [/mnt/cassandra--storage-machine/data/system/compactions_in_progress-55080ab05d9c38 > 8690a4acb25fe1f77b/system-compactions_in_progress-ka-126,]. 419 bytes to 42 > (~10% of original) in 33ms = 0.001214MB/s. 4 total partitions merged to 1. 
> Partition merge counts were {2:2, } > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,166 > ColumnFamilyStore.java:912 - Enqueuing flush of settings: 78364 (0%) on-heap, > 0 (0%) off-heap > INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,172 Memtable.java:347 - > Writing Memtable-settings@1126939979(0.113KiB serialized bytes, 1850 ops, > 0%/0% of on/off-heap limit) > INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,174 Memtable.java:382 - > Completed flushing > /mnt/cassandra--storage-machine/data/OpsCenter/settings-464866c04b1311e590698d1a9fd4ba8b/OpsCe > nter-settings-tmp-ka-12-Data.db (0.000KiB) for commitlog position > ReplayPosition(segmentId=1441362636571, position=33554415) > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,194 StorageService.java:453 > - Stopping gossiper > WARN [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 StorageService.java:359 > - Stopping gossip by operator request > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 Gossiper.java:1410 - > Announcing shutdown > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,195 StorageService.java:458 > - Stopping RPC server > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,196 ThriftServer.java:142 - > Stop listening to thrift clients > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,204 StorageService.java:463 > - Stopping native transport > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,422 Server.java:213 - Stop > listening for CQL clients > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,423 CommitLog.java:397 - > Failed managing commit log segments. 
Commit disk failure policy is stop; > terminating thread > java.lang.AssertionError: attempted to delete non-existing file > CommitLog-4-1441362636316.log > at > org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:126) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegment.delete(CommitLogSegment.java:343) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:418) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:413) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:152) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > [apache-cassandra-2.1.8.jar:2.1.8] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85] > After I create missing commit log file and restart cassandra service > everything is OK then. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11726) IndexOutOfBoundsException when selecting (distinct) row ids from counter table.
[ https://issues.apache.org/jira/browse/CASSANDRA-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-11726: Description: I have simple table containing counters: {code} CREATE TABLE tablename ( object_id ascii, counter_id ascii, count counter, PRIMARY KEY (object_id, counter_id) ) WITH CLUSTERING ORDER BY (counter_id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} AND comment = '' AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'} AND compression = {'enabled': 'false'} AND crc_check_chance = 1.0 AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99PERCENTILE'; {code} Counters are often inc/decreased, whole rows are queried, deleted sometimes. After some time I tried to query all object_ids, but it failed with: {code} cqlsh:woc> consistency quorum; cqlsh:woc> select object_id from tablename; ServerError: {code} select * from ..., select where .., updates works well.. With consistency one it works sometimes, so it seems something is broken at one server, but I tried to repair table there and it did not help. 
Whole exception from server log: {code} java.lang.IndexOutOfBoundsException: null at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_73] at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) ~[na:1.8.0_73] at org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.context.CounterContext$ContextState.(CounterContext.java:758) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.context.CounterContext$ContextState.wrap(CounterContext.java:765) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.context.CounterContext.merge(CounterContext.java:271) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.Conflicts.mergeCounterValues(Conflicts.java:76) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.rows.Cells.reconcile(Cells.java:143) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:591) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473) ~[apache-cassandra-3.5.jar:3.5] at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:279) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:112) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.transform.FilteredRows.isEmpty(FilteredRows.java:30) ~[apache-cassandra-3.5.jar:3.5] at org.apache.cassandra.db.transform.Filter.closeIfEmpty(Filter.java:49) ~[apache-cassandra-3.5.jar:3.5] at
[jira] [Commented] (CASSANDRA-11726) IndexOutOfBoundsException when selecting (distinct) row ids from counter table.
[ https://issues.apache.org/jira/browse/CASSANDRA-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276003#comment-15276003 ] Jaroslav Kamenik commented on CASSANDRA-11726: -- Hi, thank you for the advice; unfortunately, it did not help :(

> IndexOutOfBoundsException when selecting (distinct) row ids from counter table.
> ---
> Key: CASSANDRA-11726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11726
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: C* 3.5, cluster of 4 nodes.
> Reporter: Jaroslav Kamenik
>
> I have a simple table containing counters:
>
> CREATE TABLE tablename (
>     object_id ascii,
>     counter_id ascii,
>     count counter,
>     PRIMARY KEY (object_id, counter_id)
> ) WITH CLUSTERING ORDER BY (counter_id ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'enabled': 'false'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
>
> The counters are frequently incremented/decremented, whole rows are queried, and rows are sometimes deleted. After some time I tried to query all object_ids, but it failed with:
>
> cqlsh:woc> consistency quorum;
> cqlsh:woc> select object_id from tablename;
> ServerError: message="java.lang.IndexOutOfBoundsException">
>
> select * from ..., select where ..., and updates work well. With consistency ONE it sometimes works, so it seems something is broken on one server, but I tried to repair the table there and it did not help.
>
> Whole exception from the server log:
>
> java.lang.IndexOutOfBoundsException: null
>     at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_73]
>     at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) ~[na:1.8.0_73]
>     at org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.context.CounterContext$ContextState.<init>(CounterContext.java:758) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.context.CounterContext$ContextState.wrap(CounterContext.java:765) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.context.CounterContext.merge(CounterContext.java:271) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.Conflicts.mergeCounterValues(Conflicts.java:76) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.rows.Cells.reconcile(Cells.java:143) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:591) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) ~[apache-cassandra-3.5.jar:3.5]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5.jar:3.5]
>     at
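The stack trace bottoms out in CounterContext.headerLength, which reads a 2-byte element count from the start of the serialized counter context. The following is a deliberately simplified sketch, not Cassandra's actual code (class name and the two length constants are illustrative); it only shows how an empty or truncated cell value would produce exactly these Buffer.checkIndex / HeapByteBuffer.getShort frames:

```java
import java.nio.ByteBuffer;

// Hypothetical simplification of the headerLength logic in
// org.apache.cassandra.db.context.CounterContext: a serialized counter
// context begins with a signed 2-byte count of header elements.
public class CounterHeaderSketch {

    static final int HEADER_SIZE_LENGTH = 2; // illustrative constant
    static final int HEADER_ELT_LENGTH = 6;  // illustrative constant

    static int headerLength(ByteBuffer context) {
        // getShort(index) calls Buffer.checkIndex, which throws
        // IndexOutOfBoundsException when fewer than 2 bytes are available,
        // i.e. when the stored cell value is empty or truncated.
        return HEADER_SIZE_LENGTH
                + Math.abs(context.getShort(context.position())) * HEADER_ELT_LENGTH;
    }

    public static void main(String[] args) {
        ByteBuffer ok = ByteBuffer.allocate(2);
        ok.putShort(0, (short) 1);
        System.out.println(headerLength(ok)); // prints 8

        try {
            headerLength(ByteBuffer.allocate(0)); // an empty counter cell value
        } catch (IndexOutOfBoundsException e) {
            System.out.println("IndexOutOfBoundsException");
        }
    }
}
```

One plausible reading, under these assumptions, is that a counter cell on one replica holds a zero-length or corrupt value; that would also fit the report that CONSISTENCY ONE sometimes succeeds (when the read avoids that replica) while QUORUM reads, which must merge contexts from several replicas, always fail.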
[jira] [Commented] (CASSANDRA-10292) java.lang.AssertionError: attempted to delete non-existing file CommitLog...
[ https://issues.apache.org/jira/browse/CASSANDRA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276000#comment-15276000 ] Dawid Szejnfeld commented on CASSANDRA-10292: - I think this was also fixed in version 2.1.12, as you can see here (https://github.com/apache/cassandra/blob/cassandra-2.1.12/CHANGES.txt), so a simple update to version 2.1.14 should solve the problem, since you are currently on 2.1.9; an upgrade to a newer major version is not needed.

> java.lang.AssertionError: attempted to delete non-existing file CommitLog...
> ---
> Key: CASSANDRA-10292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10292
> Project: Cassandra
> Issue Type: Bug
> Environment: CentOS Linux 7.1.1503, Cassandra 2.1.8 stable version, 6 nodes cluster
> Reporter: Dawid Szejnfeld
> Priority: Critical
>
> From time to time some nodes stop working due to an error in the logs like this:
>
> INFO [CompactionExecutor:2475] 2015-09-09 12:36:50,363 CompactionTask.java:274 - Compacted 4 sstables to [/mnt/cassandra--storage-machine/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-126,]. 419 bytes to 42 (~10% of original) in 33ms = 0.001214MB/s. 4 total partitions merged to 1. Partition merge counts were {2:2, }
> INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,166 ColumnFamilyStore.java:912 - Enqueuing flush of settings: 78364 (0%) on-heap, 0 (0%) off-heap
> INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,172 Memtable.java:347 - Writing Memtable-settings@1126939979(0.113KiB serialized bytes, 1850 ops, 0%/0% of on/off-heap limit)
> INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,174 Memtable.java:382 - Completed flushing /mnt/cassandra--storage-machine/data/OpsCenter/settings-464866c04b1311e590698d1a9fd4ba8b/OpsCenter-settings-tmp-ka-12-Data.db (0.000KiB) for commitlog position ReplayPosition(segmentId=1441362636571, position=33554415)
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,194 StorageService.java:453 - Stopping gossiper
> WARN [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 StorageService.java:359 - Stopping gossip by operator request
> INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 Gossiper.java:1410 - Announcing shutdown
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,195 StorageService.java:458 - Stopping RPC server
> INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,196 ThriftServer.java:142 - Stop listening to thrift clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,204 StorageService.java:463 - Stopping native transport
> INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,422 Server.java:213 - Stop listening for CQL clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,423 CommitLog.java:397 - Failed managing commit log segments. Commit disk failure policy is stop; terminating thread
> java.lang.AssertionError: attempted to delete non-existing file CommitLog-4-1441362636316.log
>     at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:126) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.db.commitlog.CommitLogSegment.delete(CommitLogSegment.java:343) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:418) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:413) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:152) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-2.1.8.jar:2.1.8]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
>
> After I create the missing commit log file and restart the Cassandra service, everything is OK.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
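The AssertionError above comes from FileUtils.deleteWithConfirm, which refuses to delete a file that is already gone. The following is a hypothetical sketch of that pattern, not Cassandra's exact code (the real method uses a Java `assert`, which only fires with `-ea` enabled, as Cassandra does by default):

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of the delete-with-confirm pattern behind
// org.apache.cassandra.io.util.FileUtils.deleteWithConfirm: fail loudly
// if the file has already disappeared instead of silently ignoring it.
public class DeleteWithConfirmSketch {

    static void deleteWithConfirm(File file) {
        if (!file.exists())
            // In the real code this is an `assert`, producing the exact
            // message seen in the log above.
            throw new AssertionError("attempted to delete non-existing file " + file.getName());
        if (!file.delete())
            throw new RuntimeException("failed to delete " + file);
    }

    public static void main(String[] args) throws IOException {
        File segment = File.createTempFile("CommitLog-4-", ".log");
        deleteWithConfirm(segment); // succeeds: the file exists

        try {
            deleteWithConfirm(segment); // the file is already gone
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this reading, recreating the missing CommitLog-*.log by hand, as the reporter did, simply satisfies the existence check so the pending delete can complete on restart; the open question is which code path removed the segment first.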
[jira] [Resolved] (CASSANDRA-11728) Incremental repair fails with vnodes+lcs+multi-dc
[ https://issues.apache.org/jira/browse/CASSANDRA-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-11728. - Resolution: Fixed

Pretty sure this is a duplicate of CASSANDRA-10831 - could you reopen if it reproduces in 2.1.13+?

> Incremental repair fails with vnodes+lcs+multi-dc
> -
> Key: CASSANDRA-11728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11728
> Project: Cassandra
> Issue Type: Bug
> Reporter: Nick Bailey
>
> Produced on 2.1.12
>
> We are seeing incremental repair fail with an error about creating multiple repair sessions on overlapping sstables. This happens with the following setup:
> * 6 nodes
> * 2 datacenters
> * Vnodes enabled
> * Leveled compaction on the relevant tables
>
> When STCS is used instead, we don't hit the issue. This is slightly related to https://issues.apache.org/jira/browse/CASSANDRA-11461, except in this case the OpsCenter repair service is running all repairs sequentially. Let me know what other information we can provide.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)