[jira] [Commented] (CASSANDRA-9683) Get much higher load and latencies after upgrading from 2.1.6 to Cassandra 2.1.7
[ https://issues.apache.org/jira/browse/CASSANDRA-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631408#comment-14631408 ]

Loic Lambiel commented on CASSANDRA-9683:
-----------------------------------------

Those stats come from OpsCenter. We're using durable_writes = false in the blobstore keyspace, where most data is written. This may explain the low write latency.

I'm going to try to reproduce this on a new single-node setup, as I don't want to kill this cluster. I'll do it in the coming days, as soon as I have the bandwidth.

We're using Cassandra as a backend for our object storage service, based on our Pithos (http://pithos.io) API frontend. Data can then be uploaded using any S3-compatible tool, such as s3cmd. I don't know if you want to go into such a setup (we could help or do it remotely if needed), and I don't know how to reproduce the data pattern and usage without it.

> Get much higher load and latencies after upgrading from 2.1.6 to Cassandra 2.1.7
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9683
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9683
>             Project: Cassandra
>          Issue Type: Bug
>        Environment: Ubuntu 12.04 (3.13 kernel) x 3
>                     JDK: Oracle JDK 7
>                     RAM: 32 GB
>                     Cores: 4 (+4 HT)
>           Reporter: Loic Lambiel
>           Assignee: Ariel Weisberg
>            Fix For: 2.1.x
>        Attachments: cassandra-env.sh, cassandra.yaml, cfstats.txt, os_load.png, pending_compactions.png, read_latency.png, schema.txt, system.log, write_latency.png
>
> After upgrading our Cassandra staging cluster from 2.1.6 to 2.1.7, the average load grew from 0.1-0.3 to 1.8. Latencies increased as well, and we see an increase in pending compactions, probably due to CASSANDRA-9592. This cluster has almost no workload (staging environment).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631410#comment-14631410 ]

Jonathan Ellis commented on CASSANDRA-6477:
-------------------------------------------

bq. From a user's perspective, I agree with Sylvain that the MV should respect the CL. I wouldn't expect to do a write at ALL, then do a read and get an old record back.

But the other side of that coin is, we're effectively promoting all operations to at least QUORUM regardless of what the user asked for...

> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
>                 Key: CASSANDRA-6477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
>             Project: Cassandra
>          Issue Type: New Feature
>         Components: API, Core
>           Reporter: Jonathan Ellis
>           Assignee: Carl Yeksigian
>             Labels: cql
>            Fix For: 3.0 beta 1
>        Attachments: test-view-data.sh, users.yaml
>
> Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned.
[jira] [Commented] (CASSANDRA-9838) Unable to update an element in a static list
[ https://issues.apache.org/jira/browse/CASSANDRA-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631417#comment-14631417 ]

Philip Thompson commented on CASSANDRA-9838:
--------------------------------------------

I'm getting {{InvalidRequest: code=2200 [Invalid query] message=Attempted to set an element on a list which is null}} instead, on the same operations. [~thobbs], are these operations valid?

> Unable to update an element in a static list
> --------------------------------------------
>
>                 Key: CASSANDRA-9838
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9838
>             Project: Cassandra
>          Issue Type: Bug
>         Components: Core
>        Environment: Cassandra 2.1.5 on Linux
>           Reporter: Mahesh Datt
>            Fix For: 2.1.x
>
> I created a table in Cassandra (my_table) which has a static list column sizes_list. I created a new row and initialized sizes_list with one element:
> {{UPDATE my_table SET sizes_list = sizes_list + [0] WHERE view_id = 0x01}}
> Now I'm trying to update the element at index 0 with statements like this:
> {code}
> insert into my_table (my_id, is_deleted, col_id1, col_id2) values (0x01, False, 0x00, 0x00);
> UPDATE my_table SET sizes_list[0] = 100 WHERE my_id = 0x01;
> {code}
> and I see this error:
> {{InvalidRequest: code=2200 [Invalid query] message=List index 0 out of bound, list has size 0}}
> If I change my list to a non-static list, it works fine!
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631420#comment-14631420 ]

Sylvain Lebresne commented on CASSANDRA-6477:
---------------------------------------------

bq. But the other side of that coin is, we're effectively promoting all operations to at least QUORUM regardless of what the user asked for...

We're not. In the description I made above, we need to wait on QUORUM responses to remove the entry from the batchlog, but we don't need to wait on QUORUM to respond to the user. Unless my reasoning is broken, we respect the CL levels exactly as we should.
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631422#comment-14631422 ]

Jonathan Ellis commented on CASSANDRA-6477:
-------------------------------------------

1. Paired replica? What?
2. Under what conditions does the replica BL save you from replaying the coordinator BL?
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631426#comment-14631426 ]

Jonathan Ellis commented on CASSANDRA-6477:
-------------------------------------------

Pedantically you are correct. Which is why I said "effectively" and not "literally". :)
[jira] [Commented] (CASSANDRA-9669) Commit Log Replay is Broken
[ https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631236#comment-14631236 ]

Benedict commented on CASSANDRA-9669:
-------------------------------------

Ick. Thinking about it from a 2.0 perspective, this is even more of a problem for counters, since CL replay of a counter that has already been persisted causes a double count. The question is: do we care? If we do, we should probably stick with the solution I already posted for 2.0. For 2.1+ I think a ledger is a better route.

> Commit Log Replay is Broken
> ---------------------------
>
>                 Key: CASSANDRA-9669
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9669
>             Project: Cassandra
>          Issue Type: Bug
>         Components: Core
>           Reporter: Benedict
>           Assignee: Benedict
>           Priority: Critical
>             Labels: correctness
>            Fix For: 3.x, 2.1.x, 2.2.x, 3.0.x
>
> While {{postFlushExecutor}} ensures it never expires CL entries out of order, on restart we simply take the maximum replay position of any sstable on disk and ignore anything prior. It is quite possible for two flushes to be triggered for a given table, and for the second to finish first by virtue of containing a much smaller quantity of live data (or perhaps its disk is just under less pressure). If we crash before the first sstable has been written, then on restart the data it would have represented will disappear, since we will not replay its CL records. This looks to be a bug present since time immemorial, and also seems pretty serious.
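The failure mode described in the ticket can be sketched as a toy model (all names below are illustrative, not Cassandra code): replaying from the maximum persisted commit-log position silently drops records belonging to an earlier flush that never hit disk, while a safe bound only skips the contiguous prefix of positions actually covered by persisted sstables.

```java
import java.util.List;

public class ReplayBound {
    // Unsafe rule the ticket describes: start replay after the highest
    // position persisted by any sstable on disk. Each long[] is an
    // interval of commit-log positions [start, end) covered by one sstable.
    static long unsafeReplayFrom(List<long[]> persistedRanges) {
        long max = 0;
        for (long[] r : persistedRanges)
            max = Math.max(max, r[1]);
        return max;
    }

    // A safe lower bound: only discard the contiguous prefix of positions
    // that is actually covered by persisted sstables.
    static long safeReplayFrom(List<long[]> persistedRanges) {
        long covered = 0;
        boolean progress = true;
        while (progress) {
            progress = false;
            for (long[] r : persistedRanges) {
                if (r[0] <= covered && r[1] > covered) {
                    covered = r[1];
                    progress = true;
                }
            }
        }
        return covered;
    }

    public static void main(String[] args) {
        // Two flushes were in flight; only the second (positions 100..200)
        // made it to disk before the crash.
        List<long[]> onDisk = List.of(new long[]{100, 200});
        System.out.println(unsafeReplayFrom(onDisk)); // 200: records in 0..100 are lost
        System.out.println(safeReplayFrom(onDisk));   // 0: records in 0..100 get replayed
    }
}
```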
[jira] [Commented] (CASSANDRA-9723) UDF / UDA execution time in trace
[ https://issues.apache.org/jira/browse/CASSANDRA-9723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631224#comment-14631224 ]

Christopher Batey commented on CASSANDRA-9723:
----------------------------------------------

Ready for review: https://github.com/chbatey/cassandra-1/tree/udf-trace

> UDF / UDA execution time in trace
> ---------------------------------
>
>                 Key: CASSANDRA-9723
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9723
>             Project: Cassandra
>          Issue Type: Improvement
>         Components: Core
>           Reporter: Christopher Batey
>           Assignee: Christopher Batey
>           Priority: Minor
>            Fix For: 2.2.x
>
> I'd like to see how long my UDF/UDAs take in the trace. I checked in 2.2-rc1 and it doesn't appear to be mentioned.
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631226#comment-14631226 ]

Jon Haddad commented on CASSANDRA-6477:
---------------------------------------

From a user's perspective, I agree with Sylvain that the MV should respect the CL. I wouldn't expect to do a write at ALL, then do a read and get an old record back.
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631267#comment-14631267 ]

Brian Hess commented on CASSANDRA-6477:
---------------------------------------

+1. I think that is the promise of the MV.
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631274#comment-14631274 ]

Ryan McGuire commented on CASSANDRA-8894:
-----------------------------------------

Yep, I can add that to the GUI. I'll probably just add a section for extra cstar_perf settings and document what can be put in there, rather than calling out blockdevice readahead explicitly in the interface (with a help link next to it to make it easy).

> Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8894
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894
>             Project: Cassandra
>          Issue Type: Improvement
>         Components: Core
>           Reporter: Benedict
>           Assignee: Stefania
>             Labels: benedict-to-commit
>            Fix For: 3.x
>        Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml
>
> A large contributor to buffered reads being slower than mmapped ones is likely that we read a full 64KB at once, when average record sizes may be as low as 140 bytes in our stress tests. The TLB has only 128 entries on a modern core, and each read touches 32 of them, meaning we are almost never hitting the TLB and are incurring at least 30 unnecessary misses each time (as well as the other costs of larger-than-necessary accesses). When working with an SSD there is little to no benefit in reading more than 4KB at once, and in either case reading more data than we need is wasteful. So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4KB), so that we expect to fetch each record in one read. I also propose that we create a pool of these buffers up front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4KB of expected record size.
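The sizing rule proposed in the ticket (next power of two at or above the average record size, floored at 4KB) is easy to sketch; the class and method names here are mine, not Cassandra's:

```java
public class BufferSize {
    static final int MIN_BUFFER = 4096; // 4KB floor: little benefit reading less, per the ticket

    // Next power of two >= avgRecordSize, but never below the 4KB minimum.
    static int bufferSizeFor(int avgRecordSize) {
        if (avgRecordSize <= MIN_BUFFER)
            return MIN_BUFFER;
        return Integer.highestOneBit(avgRecordSize - 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(bufferSizeFor(140));  // 4096: tiny records still read one 4KB page
        System.out.println(bufferSizeFor(4096)); // 4096
        System.out.println(bufferSizeFor(5000)); // 8192
    }
}
```

With 140-byte average records this reads one 4KB page per record instead of a 64KB span, touching 1 TLB entry rather than 32.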
[jira] [Updated] (CASSANDRA-9838) Unable to update an element in a static list
[ https://issues.apache.org/jira/browse/CASSANDRA-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-9838:
---------------------------------------
    Description:
I created a table in Cassandra (my_table) which has a static list column sizes_list. I created a new row and initialized sizes_list with one element:
{{UPDATE my_table SET sizes_list = sizes_list + [0] WHERE view_id = 0x01}}
Now I'm trying to update the element at index 0 with statements like this:
{code}
insert into my_table (my_id, is_deleted, col_id1, col_id2) values (0x01, False, 0x00, 0x00);
UPDATE my_table SET sizes_list[0] = 100 WHERE my_id = 0x01;
{code}
and I see this error:
{{InvalidRequest: code=2200 [Invalid query] message=List index 0 out of bound, list has size 0}}
If I change my list to a non-static list, it works fine!

  was:
I created a table in cassandra (my_table) which has a static list column sizes_list. I created a new row and initialized the list sizes_list as having one element.
UPDATE my_table SET sizes_list = sizes_list + [0] WHERE view_id = 0x01
Now I m trying to update the element at index '0' with a statement like this
insert into my_table (my_id, is_deleted , col_id1, col_id2) values (0x01, False, 0x00, 0x00);
UPDATE my_table SET sizes_list[0] = 100 WHERE my_id = 0x01 ;
Now I see an error like this:
InvalidRequest: code=2200 [Invalid query] message=List index 0 out of bound, list has size 0
If I change my list to a non-static list, it works fine!
[jira] [Updated] (CASSANDRA-9838) Unable to update an element in a static list
[ https://issues.apache.org/jira/browse/CASSANDRA-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-9838:
---------------------------------------
    Reproduced In: 2.1.5
    Fix Version/s: 2.1.x
[jira] [Updated] (CASSANDRA-9519) CASSANDRA-8448 Doesn't seem to be fixed
[ https://issues.apache.org/jira/browse/CASSANDRA-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-9519:
----------------------------------------
    Fix Version/s: 2.0.17

> CASSANDRA-8448 doesn't seem to be fixed
> ---------------------------------------
>
>                 Key: CASSANDRA-9519
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9519
>             Project: Cassandra
>          Issue Type: Bug
>         Components: Core
>           Reporter: Jeremiah Jordan
>           Assignee: Sylvain Lebresne
>            Fix For: 2.1.9, 2.0.17, 2.2.0
>        Attachments: 9519.txt
>
> Still seeing "Comparison method violates its general contract!" in 2.1.5:
> {code}
> java.lang.IllegalArgumentException: Comparison method violates its general contract!
> 	at java.util.TimSort.mergeHi(TimSort.java:895) ~[na:1.8.0_45]
> 	at java.util.TimSort.mergeAt(TimSort.java:512) ~[na:1.8.0_45]
> 	at java.util.TimSort.mergeCollapse(TimSort.java:437) ~[na:1.8.0_45]
> 	at java.util.TimSort.sort(TimSort.java:241) ~[na:1.8.0_45]
> 	at java.util.Arrays.sort(Arrays.java:1512) ~[na:1.8.0_45]
> 	at java.util.ArrayList.sort(ArrayList.java:1454) ~[na:1.8.0_45]
> 	at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_45]
> 	at org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:158) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:187) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1530) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1688) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:63) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:260) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:272) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> {code}
[3/5] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
	CHANGES.txt

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2d462c04
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2d462c04
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2d462c04
Branch: refs/heads/cassandra-2.2
Commit: 2d462c04973a15e84ca550ce3913d08d7c5ee8c8
Parents: 1eda7cb 0ef1888
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Fri Jul 17 15:36:24 2015 +0200
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Fri Jul 17 15:36:24 2015 +0200

----------------------------------------------------------------------
 CHANGES.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2d462c04/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index 49cc850,f20fad8..c6774c2
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,32 -1,7 +1,32 @@@
-2.0.17
+2.1.9
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
- * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
+ * Handle corrupt files on startup (CASSANDRA-9686)
+ * Fix clientutil jar and tests (CASSANDRA-9760)
+ * (cqlsh) Allow the SSL protocol version to be specified through the
+   config file or environment variables (CASSANDRA-9544)
+Merged from 2.0:
+ * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Don't include auth credentials in debug log (CASSANDRA-9682)
  * Can't transition from write survey to normal mode (CASSANDRA-9740)
+ * Scrub (recover) sstables even when -Index.db is missing (CASSANDRA-9591)
+ * Fix growing pending background compaction (CASSANDRA-9662)
+
+
+2.1.8
+ * (cqlsh) Fix bad check for CQL compatibility when DESCRIBE'ing
+   COMPACT STORAGE tables with no clustering columns
+ * Warn when an extra-large partition is compacted (CASSANDRA-9643)
+ * Eliminate strong self-reference chains in sstable ref tidiers (CASSANDRA-9656)
+ * Ensure StreamSession uses canonical sstable reader instances (CASSANDRA-9700)
+ * Ensure memtable book keeping is not corrupted in the event we shrink usage (CASSANDRA-9681)
+ * Update internal python driver for cqlsh (CASSANDRA-9064)
+ * Fix IndexOutOfBoundsException when inserting tuple with too many
+   elements using the string literal notation (CASSANDRA-9559)
+ * Allow JMX over SSL directly from nodetool (CASSANDRA-9090)
+ * Fix incorrect result for IN queries where column not found (CASSANDRA-9540)
+ * Enable describe on indices (CASSANDRA-7814)
+ * ColumnFamilyStore.selectAndReference may block during compaction (CASSANDRA-9637)
+Merged from 2.0:
  * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727)
  * Add listen_address to system.local (CASSANDRA-9603)
  * Bug fixes to resultset metadata construction (CASSANDRA-9636)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631354#comment-14631354 ]

T Jake Luciani commented on CASSANDRA-6477:
-------------------------------------------

bq. I've actually never understood why we do a batchlog update on the base table replicas (and so I think we should remove it, even though that's likely not the most costly one). Why do we need it? If my reasoning above is correct, the coordinator batchlog is enough to guarantee durability and eventual consistency, because we will replay the whole mutation until a QUORUM of replicas acknowledges success.

Yes, if we error out when the base is unable to replicate to the view, then the second BL is redundant. However, there are a few reasons why we did what we did:

1. Your availability is cut in half when you use an MV with these guarantees. Say I have a 5-node cluster with RF=3 and I want to write at CL.ONE. If I have an MV, I can no longer handle two down nodes, since the paired view replica for the one live base node might be down.

2. The cost of replaying the coordinator BL is much higher than replaying the base-to-replica BL, since the latter is 1:1.

I do agree there is a disconnect in terms of consistency level when using an MV, but the batchlog feature was written to handle this. We could support both approaches with a new flag? Or are we willing to take a hit on availability?
[1/5] cassandra git commit: Fix comparison contract violation in the dynamic snitch sorting
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 f74419cd2 -> f60e4ad42

Fix comparison contract violation in the dynamic snitch sorting

patch by slebresne; reviewed by benedict for CASSANDRA-9519

Conflicts:
	CHANGES.txt

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9b9e627
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9b9e627
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9b9e627
Branch: refs/heads/cassandra-2.2
Commit: a9b9e627b0256a7b55dbfefa6960e1e5b8379e64
Parents: 1d54fc3
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Thu Jul 9 13:28:38 2015 +0200
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Fri Jul 17 15:35:07 2015 +0200

----------------------------------------------------------------------
 CHANGES.txt                                 |  1 +
 .../locator/DynamicEndpointSnitch.java      | 34 ++--
 .../locator/DynamicEndpointSnitchTest.java  | 69 +++-
 3 files changed, 95 insertions(+), 9 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index a755cb9..f20fad8 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.17
+ * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Don't include auth credentials in debug log (CASSANDRA-9682)
  * Can't transition from write survey to normal mode (CASSANDRA-9740)
  * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
index 3469847..f226989 100644
--- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
+++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
@@ -42,9 +42,9 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber
     private static final double ALPHA = 0.75; // set to 0.75 to make EDS more biased to towards the newer values
     private static final int WINDOW_SIZE = 100;
 
-    private int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval();
-    private int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval();
-    private double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold();
+    private final int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval();
+    private final int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval();
+    private final double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold();
 
     // the score for a merged set of endpoints must be this much worse than the score for separate endpoints to
     // warrant not merging two ranges into a single range
@@ -154,7 +154,18 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber
     private void sortByProximityWithScore(final InetAddress address, List<InetAddress> addresses)
     {
-        super.sortByProximity(address, addresses);
+        // Scores can change concurrently from a call to this method. But Collections.sort() expects
+        // its comparator to be stable, that is 2 endpoints should compare the same way for the duration
+        // of the sort() call. As we copy the scores map on write, it is thus enough to alias the current
+        // version of it during this call.
+        final HashMap<InetAddress, Double> scores = this.scores;
+        Collections.sort(addresses, new Comparator<InetAddress>()
+        {
+            public int compare(InetAddress a1, InetAddress a2)
+            {
+                return compareEndpoints(address, a1, a2, scores);
+            }
+        });
     }
 
     private void sortByProximityWithBadness(final InetAddress address, List<InetAddress> addresses)
@@ -163,6 +174,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber
             return;
 
         subsnitch.sortByProximity(address, addresses);
+        HashMap<InetAddress, Double> scores = this.scores; // Make sure the score don't change in the middle of the loop below
+                                                           // (which wouldn't really matter here but its cleaner that way).
         ArrayList<Double> subsnitchOrderedScores = new ArrayList<>(addresses.size());
         for (InetAddress inet : addresses)
         {
@@ -189,7 +202,8 @@
-    public int
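The fix above works by snapshotting a copy-on-write map so the comparator sees one consistent version for the whole sort, keeping it transitive and avoiding TimSort's "Comparison method violates its general contract!" error. A generic, self-contained sketch of that technique (endpoint names and scores here are made up; this is not the snitch's actual API):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SnapshotSort {
    // In the real snitch the score map is replaced wholesale on each update
    // (copy-on-write), so aliasing the field once yields an immutable view.
    static volatile HashMap<String, Double> scores = new HashMap<>();

    static void sortByScore(List<String> endpoints) {
        // Capture the current version once; even if `scores` is swapped
        // mid-sort, the comparator keeps seeing a consistent map.
        final Map<String, Double> snapshot = scores;
        endpoints.sort(Comparator.comparingDouble(e -> snapshot.getOrDefault(e, 0.0)));
    }

    public static void main(String[] args) {
        HashMap<String, Double> m = new HashMap<>();
        m.put("10.0.0.1", 0.9);
        m.put("10.0.0.2", 0.1);
        m.put("10.0.0.3", 0.5);
        scores = m;

        List<String> nodes = new ArrayList<>(List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        sortByScore(nodes);
        System.out.println(nodes); // [10.0.0.2, 10.0.0.3, 10.0.0.1]
    }
}
```

Sorting directly against the mutable field would let a concurrent score update make `compare(a, b)` disagree with an earlier `compare(b, a)` during the same sort, which is exactly the contract violation CASSANDRA-9519 reports.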
[5/5] cassandra git commit: Don't wrap byte arrays in SequentialWriter
Don't wrap byte arrays in SequentialWriter patch by slebresne; reviewed by snazy benedict for CASSANDRA-9797 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f60e4ad4 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f60e4ad4 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f60e4ad4 Branch: refs/heads/cassandra-2.2 Commit: f60e4ad4298725dac57c36da8427d992be19eb8a Parents: 22c97bc Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:39:32 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:39:32 2015 +0200 -- CHANGES.txt | 1 + .../cassandra/io/util/SequentialWriter.java | 22 ++-- 2 files changed, 21 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/f60e4ad4/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9a262dc..47d1db5 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.2.0-rc3 + * Don't wrap byte arrays in SequentialWriter (CASSANDRA-9797) * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) Merged from 2.1: http://git-wip-us.apache.org/repos/asf/cassandra/blob/f60e4ad4/src/java/org/apache/cassandra/io/util/SequentialWriter.java -- diff --git a/src/java/org/apache/cassandra/io/util/SequentialWriter.java b/src/java/org/apache/cassandra/io/util/SequentialWriter.java index f3268a2..915133f 100644 --- a/src/java/org/apache/cassandra/io/util/SequentialWriter.java +++ b/src/java/org/apache/cassandra/io/util/SequentialWriter.java @@ -185,12 +185,30 @@ public class SequentialWriter extends OutputStream implements WritableByteChanne public void write(byte[] buffer) throws IOException { -write(ByteBuffer.wrap(buffer, 0, buffer.length)); +write(buffer, 0, buffer.length); } public void write(byte[] data, int offset, int length) throws IOException { 
-write(ByteBuffer.wrap(data, offset, length)); +if (buffer == null) +throw new ClosedChannelException(); + +int position = offset; +int remaining = length; +while (remaining > 0) +{ +if (!buffer.hasRemaining()) +reBuffer(); + +int toCopy = Math.min(remaining, buffer.remaining()); +buffer.put(data, position, toCopy); + +remaining -= toCopy; +position += toCopy; + +isDirty = true; +syncNeeded = true; +} } public int write(ByteBuffer src) throws IOException
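The fix replaces the per-call `ByteBuffer.wrap` allocation with a loop that copies the input array directly into the writer's existing internal buffer, refilling when it runs out of room. The same pattern in a self-contained form (class and method names here are illustrative stand-ins, not Cassandra's actual `SequentialWriter`/`reBuffer` internals):

```java
import java.nio.ByteBuffer;

// Illustrative buffered writer: copies byte[] input into one fixed internal
// buffer chunk by chunk, "flushing" (here: just counting bytes) whenever the
// buffer fills, instead of allocating a wrapping ByteBuffer on every write().
public class ChunkedWriter {
    private final ByteBuffer buffer;
    private long flushed; // bytes already handed off downstream

    public ChunkedWriter(int bufferSize) {
        this.buffer = ByteBuffer.allocate(bufferSize);
    }

    public void write(byte[] data, int offset, int length) {
        int position = offset;
        int remaining = length;
        while (remaining > 0) {
            if (!buffer.hasRemaining())
                flush(); // stands in for reBuffer() in the real code
            int toCopy = Math.min(remaining, buffer.remaining());
            buffer.put(data, position, toCopy);
            remaining -= toCopy;
            position += toCopy;
        }
    }

    private void flush() {
        flushed += buffer.position();
        buffer.clear();
    }

    public long bytesSeen() {
        return flushed + buffer.position();
    }

    public static void main(String[] args) {
        ChunkedWriter w = new ChunkedWriter(8);
        w.write(new byte[20], 0, 20); // forces two internal flushes
        System.out.println(w.bytesSeen()); // prints 20
    }
}
```

The point of the change is that `write(byte[])` sits on a hot path, and wrapping the array created a short-lived `ByteBuffer` per call; copying into the long-lived buffer produces no garbage.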
[2/5] cassandra git commit: Move CASSANDRA-9519 test in long tests (and reduce the size of the list used)
Move CASSANDRA-9519 test in long tests (and reduce the size of the list used) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ef18886 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ef18886 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ef18886 Branch: refs/heads/cassandra-2.2 Commit: 0ef188869049ec6233d115f7a46c25f492e8fa42 Parents: a9b9e62 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:14:54 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:24 2015 +0200 -- .../locator/DynamicEndpointSnitchLongTest.java | 104 +++ .../locator/DynamicEndpointSnitchTest.java | 64 2 files changed, 104 insertions(+), 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ef18886/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java -- diff --git a/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java new file mode 100644 index 000..1c628fa --- /dev/null +++ b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java @@ -0,0 +1,104 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one +* or more contributor license agreements. See the NOTICE file +* distributed with this work for additional information +* regarding copyright ownership. The ASF licenses this file +* to you under the Apache License, Version 2.0 (the +* License); you may not use this file except in compliance +* with the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, +* software distributed under the License is distributed on an +* AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +* KIND, either express or implied. 
See the License for the +* specific language governing permissions and limitations +* under the License. +*/ + +package org.apache.cassandra.locator; + +import java.io.IOException; +import java.net.InetAddress; +import java.util.*; + +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.exceptions.ConfigurationException; +import org.apache.cassandra.service.StorageService; +import org.junit.Test; + +import org.apache.cassandra.utils.FBUtilities; + +import static org.junit.Assert.assertEquals; + +public class DynamicEndpointSnitchLongTest +{ +@Test +public void testConcurrency() throws InterruptedException, IOException, ConfigurationException +{ +// The goal of this test is to check for CASSANDRA-8448/CASSANDRA-9519 +double badness = DatabaseDescriptor.getDynamicBadnessThreshold(); +DatabaseDescriptor.setDynamicBadnessThreshold(0.0); + +try +{ +final int ITERATIONS = 1; + +// do this because SS needs to be initialized before DES can work properly. +StorageService.instance.initClient(0); +SimpleSnitch ss = new SimpleSnitch(); +DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode())); +InetAddress self = FBUtilities.getBroadcastAddress(); + +List<InetAddress> hosts = new ArrayList<>(); +// We want a big list of hosts so sorting takes time, making it much more likely to reproduce the +// problem we're looking for. 
+for (int i = 0; i < 100; i++) +for (int j = 0; j < 256; j++) +hosts.add(InetAddress.getByAddress(new byte[]{127, 0, (byte)i, (byte)j})); + +ScoreUpdater updater = new ScoreUpdater(dsnitch, hosts); +updater.start(); + +List<InetAddress> result = null; +for (int i = 0; i < ITERATIONS; i++) +result = dsnitch.getSortedListByProximity(self, hosts); + +updater.stopped = true; +updater.join(); +} +finally +{ +DatabaseDescriptor.setDynamicBadnessThreshold(badness); +} +} + +public static class ScoreUpdater extends Thread +{ +private static final int SCORE_RANGE = 100; + +public volatile boolean stopped; + +private final DynamicEndpointSnitch dsnitch; +private final List<InetAddress> hosts; +private final Random random = new Random(); + +public ScoreUpdater(DynamicEndpointSnitch dsnitch, List<InetAddress> hosts) +{ +this.dsnitch = dsnitch; +this.hosts = hosts; +} + +public void run() +{ +while (!stopped) +{ +
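The test's shape — one `ScoreUpdater` thread continuously mutating scores while the main thread sorts a large host list — can be reduced to a small runnable harness. This sketch (illustrative names, not the actual Cassandra test) shows the copy-on-write/snapshot discipline that lets repeated sorts complete without TimSort's "Comparison method violates its general contract!" error:

```java
import java.util.*;
import java.util.concurrent.ThreadLocalRandom;

// Minimal harness in the spirit of the long test: a background thread keeps
// replacing a copy-on-write scores map while the main thread sorts. Each sort
// aliases a single snapshot of the map, so the comparator stays stable for
// the duration of that sort and the comparison contract is never violated.
public class SortUnderMutation {
    static volatile Map<Integer, Double> scores = new HashMap<>();
    static volatile boolean stopped;

    static List<Integer> sortRepeatedly(int hostCount, int iterations) throws InterruptedException {
        List<Integer> hosts = new ArrayList<>();
        for (int i = 0; i < hostCount; i++)
            hosts.add(i);

        Thread updater = new Thread(() -> {
            while (!stopped) {
                // Copy-on-write: mutate a private copy, then publish it.
                Map<Integer, Double> copy = new HashMap<>(scores);
                copy.put(ThreadLocalRandom.current().nextInt(hostCount),
                         ThreadLocalRandom.current().nextDouble());
                scores = copy;
            }
        });
        updater.start();

        for (int i = 0; i < iterations; i++) {
            final Map<Integer, Double> snapshot = scores; // one stable alias per sort
            hosts.sort(Comparator.comparingDouble(h -> snapshot.getOrDefault(h, 0.0)));
        }

        stopped = true;
        updater.join();
        return hosts;
    }

    public static void main(String[] args) throws InterruptedException {
        // Completes without an IllegalArgumentException from TimSort.
        System.out.println(sortRepeatedly(1000, 200).size()); // prints 1000
    }
}
```

Without the snapshot (i.e. reading the live, changing scores inside the comparator), the same loop can intermittently throw, which is exactly what the long test is designed to catch.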
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631353#comment-14631353 ] Benedict commented on CASSANDRA-6477: - bq. I've actually never understood why we do a batchlog update on the base table replicas (and so I think we should remove it, even though that's likely not the most costly one). Why do we need it? The thing is, the coordinator-level batchlog write is quite expensive. It seems we've paired each node with one MV node, but here's an idea: why not also pair it with RF-2 (or 1, and only support RF=3 for now) partners, to whom it requires the first write to be propagated, without which it does not acknowledge? This could be done with a specialised batchlog write, that goes to the local node _and_ the paired MV node. That way, most importantly, we do not have to wait synchronously for the batchlog records to be written: if they're lost, then the corruption caused by their loss is also lost. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[4/6] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22c97bc5 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22c97bc5 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22c97bc5 Branch: refs/heads/trunk Commit: 22c97bc5ef5017663a40d25bd5b7283c09e25dd5 Parents: f74419c 2d462c0 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:36:49 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:36:49 2015 +0200 -- CHANGES.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/22c97bc5/CHANGES.txt -- diff --cc CHANGES.txt index e6c093d,c6774c2..9a262dc --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,17 -1,14 +1,17 @@@ -2.1.9 +2.2.0-rc3 + * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) + * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) +Merged from 2.1: * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) - * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Handle corrupt files on startup (CASSANDRA-9686) * Fix clientutil jar and tests (CASSANDRA-9760) * (cqlsh) Allow the SSL protocol version to be specified through the config file or environment variables (CASSANDRA-9544) Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) - * Scrub (recover) sstables even when -Index.db is missing, (CASSANDRA-9591) + * Scrub (recover) sstables even when -Index.db is missing (CASSANDRA-9591) * Fix growing pending background compaction (CASSANDRA-9662)
[1/6] cassandra git commit: Fix comparison contract violation in the dynamic snitch sorting
Repository: cassandra Updated Branches: refs/heads/trunk 412e8743d -> 05a5fb4f8 Fix comparison contract violation in the dynamic snitch sorting patch by slebresne; reviewed by benedict for CASSANDRA-9519 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9b9e627 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9b9e627 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9b9e627 Branch: refs/heads/trunk Commit: a9b9e627b0256a7b55dbfefa6960e1e5b8379e64 Parents: 1d54fc3 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 9 13:28:38 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:07 2015 +0200 -- CHANGES.txt | 1 + .../locator/DynamicEndpointSnitch.java | 34 -- .../locator/DynamicEndpointSnitchTest.java | 69 +++- 3 files changed, 95 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a755cb9..f20fad8 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.17 + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java -- diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java index 3469847..f226989 100644 --- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java +++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java @@ -42,9 +42,9 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber private static final double ALPHA = 0.75; // set to 0.75 to make EDS 
more biased towards the newer values private static final int WINDOW_SIZE = 100; -private int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); -private int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); -private double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); +private final int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); +private final int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); +private final double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); // the score for a merged set of endpoints must be this much worse than the score for separate endpoints to // warrant not merging two ranges into a single range @@ -154,7 +154,18 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber private void sortByProximityWithScore(final InetAddress address, List<InetAddress> addresses) { -super.sortByProximity(address, addresses); +// Scores can change concurrently from a call to this method. But Collections.sort() expects +// its comparator to be stable, that is, 2 endpoints should compare the same way for the duration +// of the sort() call. As we copy the scores map on write, it is thus enough to alias the current +// version of it during this call. +final HashMap<InetAddress, Double> scores = this.scores; +Collections.sort(addresses, new Comparator<InetAddress>() +{ +public int compare(InetAddress a1, InetAddress a2) +{ +return compareEndpoints(address, a1, a2, scores); +} +}); } private void sortByProximityWithBadness(final InetAddress address, List<InetAddress> addresses) @@ -163,6 +174,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber return; subsnitch.sortByProximity(address, addresses); +HashMap<InetAddress, Double> scores = this.scores; // Make sure the scores don't change in the middle of the loop below + // (which wouldn't really matter here but it's cleaner that way). 
ArrayList<Double> subsnitchOrderedScores = new ArrayList<>(addresses.size()); for (InetAddress inet : addresses) { @@ -189,7 +202,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber } } -public int compareEndpoints(InetAddress target,
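The heart of the fix is aliasing the copy-on-write scores map once, so every comparison within a single `Collections.sort()` call reads the same values. A simplified, self-contained sketch of that pattern (class and method names are illustrative, not Cassandra's actual snitch API):

```java
import java.util.*;

// Copy-on-write scores: writers replace the whole map; readers alias the
// current reference once and use that stable snapshot for the entire sort.
public class SnapshotSorter {
    private volatile Map<String, Double> scores = new HashMap<>();

    public void updateScore(String host, double score) {
        Map<String, Double> copy = new HashMap<>(scores); // copy-on-write
        copy.put(host, score);
        scores = copy; // atomic reference swap publishes the new map
    }

    public void sortByScore(List<String> hosts) {
        // Alias once: even if updateScore() swaps `scores` mid-sort, this
        // comparator keeps reading the same never-again-mutated snapshot,
        // so two hosts always compare the same way during one sort() call.
        final Map<String, Double> snapshot = scores;
        hosts.sort(Comparator.comparingDouble(h -> snapshot.getOrDefault(h, 0.0)));
    }

    public static void main(String[] args) {
        SnapshotSorter s = new SnapshotSorter();
        s.updateScore("a", 3.0);
        s.updateScore("b", 1.0);
        s.updateScore("c", 2.0);
        List<String> hosts = new ArrayList<>(List.of("a", "b", "c"));
        s.sortByScore(hosts);
        System.out.println(hosts); // prints [b, c, a]
    }
}
```

Re-reading the live field inside the comparator instead would let scores change between comparisons, which is precisely the contract violation CASSANDRA-9519 fixes.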
[2/6] cassandra git commit: Move CASSANDRA-9519 test in long tests (and reduce the size of the list used)
Move CASSANDRA-9519 test in long tests (and reduce the size of the list used) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ef18886 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ef18886 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ef18886 Branch: refs/heads/trunk Commit: 0ef188869049ec6233d115f7a46c25f492e8fa42 Parents: a9b9e62 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:14:54 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:24 2015 +0200 -- .../locator/DynamicEndpointSnitchLongTest.java | 104 +++ .../locator/DynamicEndpointSnitchTest.java | 64 2 files changed, 104 insertions(+), 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ef18886/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java -- diff --git a/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java new file mode 100644 index 000..1c628fa --- /dev/null +++ b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java @@ -0,0 +1,104 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one +* or more contributor license agreements. See the NOTICE file +* distributed with this work for additional information +* regarding copyright ownership. The ASF licenses this file +* to you under the Apache License, Version 2.0 (the +* License); you may not use this file except in compliance +* with the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, +* software distributed under the License is distributed on an +* AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +* KIND, either express or implied. 
See the License for the +* specific language governing permissions and limitations +* under the License. +*/ + +package org.apache.cassandra.locator; + +import java.io.IOException; +import java.net.InetAddress; +import java.util.*; + +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.exceptions.ConfigurationException; +import org.apache.cassandra.service.StorageService; +import org.junit.Test; + +import org.apache.cassandra.utils.FBUtilities; + +import static org.junit.Assert.assertEquals; + +public class DynamicEndpointSnitchLongTest +{ +@Test +public void testConcurrency() throws InterruptedException, IOException, ConfigurationException +{ +// The goal of this test is to check for CASSANDRA-8448/CASSANDRA-9519 +double badness = DatabaseDescriptor.getDynamicBadnessThreshold(); +DatabaseDescriptor.setDynamicBadnessThreshold(0.0); + +try +{ +final int ITERATIONS = 1; + +// do this because SS needs to be initialized before DES can work properly. +StorageService.instance.initClient(0); +SimpleSnitch ss = new SimpleSnitch(); +DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode())); +InetAddress self = FBUtilities.getBroadcastAddress(); + +List<InetAddress> hosts = new ArrayList<>(); +// We want a big list of hosts so sorting takes time, making it much more likely to reproduce the +// problem we're looking for. 
+for (int i = 0; i < 100; i++) +for (int j = 0; j < 256; j++) +hosts.add(InetAddress.getByAddress(new byte[]{127, 0, (byte)i, (byte)j})); + +ScoreUpdater updater = new ScoreUpdater(dsnitch, hosts); +updater.start(); + +List<InetAddress> result = null; +for (int i = 0; i < ITERATIONS; i++) +result = dsnitch.getSortedListByProximity(self, hosts); + +updater.stopped = true; +updater.join(); +} +finally +{ +DatabaseDescriptor.setDynamicBadnessThreshold(badness); +} +} + +public static class ScoreUpdater extends Thread +{ +private static final int SCORE_RANGE = 100; + +public volatile boolean stopped; + +private final DynamicEndpointSnitch dsnitch; +private final List<InetAddress> hosts; +private final Random random = new Random(); + +public ScoreUpdater(DynamicEndpointSnitch dsnitch, List<InetAddress> hosts) +{ +this.dsnitch = dsnitch; +this.hosts = hosts; +} + +public void run() +{ +while (!stopped) +{ +
[6/6] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/05a5fb4f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/05a5fb4f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/05a5fb4f Branch: refs/heads/trunk Commit: 05a5fb4f8b7bf76be0d95196a3231c5be61ee978 Parents: 412e874 f60e4ad Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:40:40 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:40:40 2015 +0200 -- CHANGES.txt | 4 +++- .../cassandra/io/util/SequentialWriter.java | 22 ++-- 2 files changed, 23 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/05a5fb4f/CHANGES.txt -- diff --cc CHANGES.txt index db306ea,47d1db5..b2abd10 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,34 -1,15 +1,36 @@@ +3.0 + * Metrics should use up to date nomenclature (CASSANDRA-9448) + * Change CREATE/ALTER TABLE syntax for compression (CASSANDRA-8384) + * Cleanup crc and adler code for java 8 (CASSANDRA-9650) + * Storage engine refactor (CASSANDRA-8099, 9743, 9746, 9759, 9781, 9808, 9825) + * Update Guava to 18.0 (CASSANDRA-9653) + * Bloom filter false positive ratio is not honoured (CASSANDRA-8413) + * New option for cassandra-stress to leave a ratio of columns null (CASSANDRA-9522) + * Change hinted_handoff_enabled yaml setting, JMX (CASSANDRA-9035) + * Add algorithmic token allocation (CASSANDRA-7032) + * Add nodetool command to replay batchlog (CASSANDRA-9547) + * Make file buffer cache independent of paths being read (CASSANDRA-8897) + * Remove deprecated legacy Hadoop code (CASSANDRA-9353) + * Decommissioned nodes will not rejoin the cluster (CASSANDRA-8801) + * Change gossip stabilization to use endpoit size (CASSANDRA-9401) + * Change default garbage collector to G1 (CASSANDRA-7486) + * Populate TokenMetadata early during startup (CASSANDRA-9317) + 
* undeprecate cache recentHitRate (CASSANDRA-6591) + * Add support for selectively varint encoding fields (CASSANDRA-9499) + + 2.2.0-rc3 + * Don't wrap byte arrays in SequentialWriter (CASSANDRA-9797) + * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) Merged from 2.1: * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) * Handle corrupt files on startup (CASSANDRA-9686) * Fix clientutil jar and tests (CASSANDRA-9760) * (cqlsh) Allow the SSL protocol version to be specified through the - config file or environment variables (CASSANDRA-9544) +config file or environment variables (CASSANDRA-9544) Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) * Scrub (recover) sstables even when -Index.db is missing (CASSANDRA-9591) http://git-wip-us.apache.org/repos/asf/cassandra/blob/05a5fb4f/src/java/org/apache/cassandra/io/util/SequentialWriter.java --
[3/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2d462c04 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2d462c04 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2d462c04 Branch: refs/heads/trunk Commit: 2d462c04973a15e84ca550ce3913d08d7c5ee8c8 Parents: 1eda7cb 0ef1888 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:36:24 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:36:24 2015 +0200 -- CHANGES.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/2d462c04/CHANGES.txt -- diff --cc CHANGES.txt index 49cc850,f20fad8..c6774c2 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,32 -1,7 +1,32 @@@ -2.0.17 +2.1.9 + * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) - * Complete CASSANDRA-8448 fix (CASSANDRA-9519) + * Handle corrupt files on startup (CASSANDRA-9686) + * Fix clientutil jar and tests (CASSANDRA-9760) + * (cqlsh) Allow the SSL protocol version to be specified through the + config file or environment variables (CASSANDRA-9544) +Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) + * Scrub (recover) sstables even when -Index.db is missing, (CASSANDRA-9591) + * Fix growing pending background compaction (CASSANDRA-9662) + + +2.1.8 + * (cqlsh) Fix bad check for CQL compatibility when DESCRIBE'ing + COMPACT STORAGE tables with no clustering columns + * Warn when an extra-large partition is compacted (CASSANDRA-9643) + * Eliminate strong self-reference chains in sstable ref tidiers (CASSANDRA-9656) + * Ensure StreamSession uses canonical sstable reader instances (CASSANDRA-9700) + * Ensure memtable book keeping is not 
corrupted in the event we shrink usage (CASSANDRA-9681) + * Update internal python driver for cqlsh (CASSANDRA-9064) + * Fix IndexOutOfBoundsException when inserting tuple with too many + elements using the string literal notation (CASSANDRA-9559) + * Allow JMX over SSL directly from nodetool (CASSANDRA-9090) + * Fix incorrect result for IN queries where column not found (CASSANDRA-9540) + * Enable describe on indices (CASSANDRA-7814) + * ColumnFamilyStore.selectAndReference may block during compaction (CASSANDRA-9637) +Merged from 2.0: * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) * Add listen_address to system.local (CASSANDRA-9603) * Bug fixes to resultset metadata construction (CASSANDRA-9636)
[4/5] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22c97bc5 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22c97bc5 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22c97bc5 Branch: refs/heads/cassandra-2.2 Commit: 22c97bc5ef5017663a40d25bd5b7283c09e25dd5 Parents: f74419c 2d462c0 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:36:49 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:36:49 2015 +0200 -- CHANGES.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/22c97bc5/CHANGES.txt -- diff --cc CHANGES.txt index e6c093d,c6774c2..9a262dc --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,17 -1,14 +1,17 @@@ -2.1.9 +2.2.0-rc3 + * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) + * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) +Merged from 2.1: * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) - * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Handle corrupt files on startup (CASSANDRA-9686) * Fix clientutil jar and tests (CASSANDRA-9760) * (cqlsh) Allow the SSL protocol version to be specified through the config file or environment variables (CASSANDRA-9544) Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) - * Scrub (recover) sstables even when -Index.db is missing, (CASSANDRA-9591) + * Scrub (recover) sstables even when -Index.db is missing (CASSANDRA-9591) * Fix growing pending background compaction (CASSANDRA-9662)
[jira] [Updated] (CASSANDRA-9798) Cassandra seems to have deadlocks during flush operations
[ https://issues.apache.org/jira/browse/CASSANDRA-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9798: --- Reproduced In: 2.1.6 Fix Version/s: 2.1.x Cassandra seems to have deadlocks during flush operations - Key: CASSANDRA-9798 URL: https://issues.apache.org/jira/browse/CASSANDRA-9798 Project: Cassandra Issue Type: Bug Components: Core Environment: 4x HP Gen9 dl 360 servers 2x8 cpu each (Intel(R) Xeon E5-2667 v3 @ 3.20GHz) 6x900GB 10kRPM disk for data 1x900GB 10kRPM disk for commitlog 64GB ram ETH: 10Gb/s Red Hat Enterprise Linux Server release 6.6 (Santiago) 2.6.32-504.el6.x86_64 java build 1.8.0_45-b14 (openjdk) (tested on oracle java 8 too) Reporter: Łukasz Mrożkiewicz Fix For: 2.1.x Attachments: cassandra.log, cassandra.yaml, gc.log.0.current Hi, We noticed a problem with dropped MutationStages. Usually on one random node the situation is: MutationStage active is full, pending is increasing, completed is stalled. MemtableFlushWriter active 6, pending: 25, completed: stalled. MemtablePostFlush active is 1, pending 29, completed: stalled. After some time (30s-10min) pending mutations are dropped and everything works again. When it happened: 1. CPU idle is ~95% 2. no long GC pauses or extra activity. 3. memory usage 3.5GB from 8GB 4. only writes are processed by cassandra 5. when LOAD > 400GB/node problems appeared 6. 
cassandra 2.1.6 There is a gap in the logs: INFO 08:47:01 Timed out replaying hints to /192.168.100.83; aborting (0 delivered) INFO 08:47:01 Enqueuing flush of hints: 7870567 (0%) on-heap, 0 (0%) off-heap INFO 08:47:30 Enqueuing flush of table1: 95301807 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 60462632 (3%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table2: 76973746 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 84290135 (4%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table3: 56926652 (3%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table1: 85124218 (4%) on-heap, 0 (0%) off-heap INFO 08:47:33 Enqueuing flush of table2: 95663415 (4%) on-heap, 0 (0%) off-heap INFO 08:47:58 CompactionManager 239 INFO 08:47:58 Writing Memtable-table2@1767938721(13843064 serialized bytes, 162359 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-hints@1433125911(478703 serialized bytes, 424 ops, 0%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table2@1318583275(11783615 serialized bytes, 137378 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Enqueuing flush of compactions_in_progress: 969 (0%) on-heap, 0 (0%) off-heap INFO 08:47:58 Writing Memtable-table1@541175113(17221327 serialized bytes, 180792 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table1@1361154669(27138519 serialized bytes, 273472 ops, 6%/0% of on/off-heap limit) INFO 08:48:03 2176 MUTATION messages dropped in last 5000ms use case: 100% write - 100Mb/s, a couple of CFs, ~10 columns each. max cell size 100B CMS and G1GC tested - no difference -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631360#comment-14631360 ] Benedict commented on CASSANDRA-6477: - bq. We could support both approaches in terms of a new flag? I think _permitting_ faster operation with only eventual consistency guarantees is a good thing, since most users doing their own denormalisation probably get no better than that. A flag on construction? Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[5/6] cassandra git commit: Don't wrap byte arrays in SequentialWriter
Don't wrap byte arrays in SequentialWriter patch by slebresne; reviewed by snazy & benedict for CASSANDRA-9797 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f60e4ad4 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f60e4ad4 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f60e4ad4 Branch: refs/heads/trunk Commit: f60e4ad4298725dac57c36da8427d992be19eb8a Parents: 22c97bc Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:39:32 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:39:32 2015 +0200 -- CHANGES.txt | 1 + .../cassandra/io/util/SequentialWriter.java | 22 ++-- 2 files changed, 21 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/f60e4ad4/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9a262dc..47d1db5 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.2.0-rc3 + * Don't wrap byte arrays in SequentialWriter (CASSANDRA-9797) * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) Merged from 2.1: http://git-wip-us.apache.org/repos/asf/cassandra/blob/f60e4ad4/src/java/org/apache/cassandra/io/util/SequentialWriter.java -- diff --git a/src/java/org/apache/cassandra/io/util/SequentialWriter.java b/src/java/org/apache/cassandra/io/util/SequentialWriter.java index f3268a2..915133f 100644 --- a/src/java/org/apache/cassandra/io/util/SequentialWriter.java +++ b/src/java/org/apache/cassandra/io/util/SequentialWriter.java @@ -185,12 +185,30 @@ public class SequentialWriter extends OutputStream implements WritableByteChannel public void write(byte[] buffer) throws IOException { -write(ByteBuffer.wrap(buffer, 0, buffer.length)); +write(buffer, 0, buffer.length); } public void write(byte[] data, int offset, int length) throws IOException { -write(ByteBuffer.wrap(data, 
offset, length)); +if (buffer == null) +throw new ClosedChannelException(); + +int position = offset; +int remaining = length; +while (remaining > 0) +{ +if (!buffer.hasRemaining()) +reBuffer(); + +int toCopy = Math.min(remaining, buffer.remaining()); +buffer.put(data, position, toCopy); + +remaining -= toCopy; +position += toCopy; + +isDirty = true; +syncNeeded = true; +} } public int write(ByteBuffer src) throws IOException
[jira] [Commented] (CASSANDRA-9795) Fix cqlsh dtests on windows
[ https://issues.apache.org/jira/browse/CASSANDRA-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631366#comment-14631366 ] Joshua McKenzie commented on CASSANDRA-9795: +1 on C* changes, ccm changes look to already be merged (not showing diff on github), and my only concern: on the dtests, could we manually delete the NamedTemporaryFile when we're done with it rather than leaving them littered around? Fix cqlsh dtests on windows --- Key: CASSANDRA-9795 URL: https://issues.apache.org/jira/browse/CASSANDRA-9795 Project: Cassandra Issue Type: Sub-task Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 2.2.x There are a number of portability problems with python on win32 as I've learned over the past few days. * Our use of multiprocessing is broken in cqlsh for windows. https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming The code was passing self to the sub-process, which on windows must be pickleable (it's not). So I refactored it to be a class which is initialized in the parent. Also, when the windows process starts it needs to load our cqlsh as a module. So I moved cqlsh -> cqlsh.py and added a tiny wrapper for bin/cqlsh * Our use of strftime is broken on windows The default timezone information %z in strftime isn't valid on windows. I added code to the date format parser in C* to support windows timezone labels. * We have a number of file access issues in dtest * csv import/export is broken on windows and requires all files be opened with mode 'wb' or 'rb' http://stackoverflow.com/questions/1170214/pythons-csv-writer-produces-wrong-line-terminator/1170297#1170297 * CCM's use of popen required the universal_newlines=True flag to work on windows -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9839) Move crc_check_chance out of compressions options
[ https://issues.apache.org/jira/browse/CASSANDRA-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-9839: - Labels: client-impacting docs-impacting (was: ) Move crc_check_chance out of compressions options - Key: CASSANDRA-9839 URL: https://issues.apache.org/jira/browse/CASSANDRA-9839 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Priority: Minor Labels: client-impacting, docs-impacting Fix For: 3.0.0 rc1 Follow up to CASSANDRA-8384. The option doesn't belong to compression params - it doesn't affect compression itself, and isn't passed to compressors upon initialization. While it's true that it is (currently) only being honored when reading compressed sstables, it still doesn't belong to compression params (and is causing CASSANDRA-7978-like issues). [~tjake] suggested we should make it an option of its own, and I think we should. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] cassandra git commit: Move CASSANDRA-9519 test in long tests (and reduce the size of the list used)
Move CASSANDRA-9519 test in long tests (and reduce the size of the list used) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ef18886 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ef18886 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ef18886 Branch: refs/heads/cassandra-2.0 Commit: 0ef188869049ec6233d115f7a46c25f492e8fa42 Parents: a9b9e62 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:14:54 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:24 2015 +0200 -- .../locator/DynamicEndpointSnitchLongTest.java | 104 +++ .../locator/DynamicEndpointSnitchTest.java | 64 2 files changed, 104 insertions(+), 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ef18886/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java -- diff --git a/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java new file mode 100644 index 000..1c628fa --- /dev/null +++ b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java @@ -0,0 +1,104 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one +* or more contributor license agreements. See the NOTICE file +* distributed with this work for additional information +* regarding copyright ownership. The ASF licenses this file +* to you under the Apache License, Version 2.0 (the +* License); you may not use this file except in compliance +* with the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, +* software distributed under the License is distributed on an +* AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +* KIND, either express or implied. 
See the License for the +* specific language governing permissions and limitations +* under the License. +*/ + +package org.apache.cassandra.locator; + +import java.io.IOException; +import java.net.InetAddress; +import java.util.*; + +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.exceptions.ConfigurationException; +import org.apache.cassandra.service.StorageService; +import org.junit.Test; + +import org.apache.cassandra.utils.FBUtilities; + +import static org.junit.Assert.assertEquals; + +public class DynamicEndpointSnitchLongTest +{ +@Test +public void testConcurrency() throws InterruptedException, IOException, ConfigurationException +{ +// The goal of this test is to check for CASSANDRA-8448/CASSANDRA-9519 +double badness = DatabaseDescriptor.getDynamicBadnessThreshold(); +DatabaseDescriptor.setDynamicBadnessThreshold(0.0); + +try +{ +final int ITERATIONS = 1; + +// do this because SS needs to be initialized before DES can work properly. +StorageService.instance.initClient(0); +SimpleSnitch ss = new SimpleSnitch(); +DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode())); +InetAddress self = FBUtilities.getBroadcastAddress(); + +List<InetAddress> hosts = new ArrayList<InetAddress>(); +// We want a big list of hosts so sorting takes time, making it much more likely to reproduce the +// problem we're looking for. 
+for (int i = 0; i < 100; i++) +for (int j = 0; j < 256; j++) +hosts.add(InetAddress.getByAddress(new byte[]{127, 0, (byte)i, (byte)j})); + +ScoreUpdater updater = new ScoreUpdater(dsnitch, hosts); +updater.start(); + +List<InetAddress> result = null; +for (int i = 0; i < ITERATIONS; i++) +result = dsnitch.getSortedListByProximity(self, hosts); + +updater.stopped = true; +updater.join(); +} +finally +{ +DatabaseDescriptor.setDynamicBadnessThreshold(badness); +} +} + +public static class ScoreUpdater extends Thread +{ +private static final int SCORE_RANGE = 100; + +public volatile boolean stopped; + +private final DynamicEndpointSnitch dsnitch; +private final List<InetAddress> hosts; +private final Random random = new Random(); + +public ScoreUpdater(DynamicEndpointSnitch dsnitch, List<InetAddress> hosts) +{ +this.dsnitch = dsnitch; +this.hosts = hosts; +} + +public void run() +{ +while (!stopped) +{ +
[jira] [Created] (CASSANDRA-9839) Move crc_check_chance out of compressions options
Aleksey Yeschenko created CASSANDRA-9839: Summary: Move crc_check_chance out of compressions options Key: CASSANDRA-9839 URL: https://issues.apache.org/jira/browse/CASSANDRA-9839 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Priority: Minor Fix For: 3.0.0 rc1 Follow up to CASSANDRA-8384. The option doesn't belong to compression params - it doesn't affect compression itself, and isn't passed to compressors upon initialization. While it's true that it is (currently) only being honored when reading compressed sstables, it still doesn't belong to compression params (and is causing CASSANDRA-7978-like issues). [~tjake] suggested we should make it an option of its own, and I think we should. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9831) Hanging dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-9831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-9831: -- Description: This is the current list of dtests over the last week that have completely hung a test server, ending in an aborted job and incomplete test results in jenkins. I'll be excluding these tests in the cassandra-dtest repo with {{@require}} annotations (or fully excluding paging_test in trunk, I think) *2.0* metadata_reset_while_compact_test (metadata_tests.TestMetadata) *2.1* metadata_reset_while_compact_test (metadata_tests.TestMetadata) *2.2* metadata_reset_while_compact_test (metadata_tests.TestMetadata) test_ttl_deletions (paging_test.TestPagingWithDeletions) test_network_topology_strategy (consistency_test.TestAvailability) test_network_topology_strategy_counters (consistency_test.TestAccuracy) *trunk* metadata_reset_while_compact_test (metadata_tests.TestMetadata) putget_2dc_rf2_test (multidc_putget_test.TestMultiDCPutGet) test_network_topology_strategy_users (consistency_test.TestAccuracy) test_network_topology_strategy (consistency_test.TestAvailability) test_single_row_deletions (paging_test.TestPagingWithDeletions) test_with_more_results_than_page_size (paging_test.TestPagingSize) test_query_isolation (paging_test.TestPagingQueryIsolation) test_node_unavailabe_during_paging (paging_test.TestPagingDatasetChanges) test_with_no_results (paging_test.TestPagingSize) test_data_change_impacting_later_page (paging_test.TestPagingDatasetChanges) test_multiple_partition_deletions (paging_test.TestPagingWithDeletions) sstableloader_compression_snappy_to_none_test (sstable_generation_loading_test.TestSSTableGenerationAndLoading) (edit: new hung test nodes 07/16-17) *2.2* dc_repair_test (repair_test.TestRepair) wide_row_test (putget_test.TestPutGet) putget_snappy_test (putget_test.TestPutGet) *trunk* force_repair_async_1_test (deprecated_repair_test.TestDeprecatedRepairAPI) test_nested_user_types 
(user_types_test.TestUserTypes) sstableloader_compression_snappy_to_deflate_test (sstable_generation_loading_test.TestSSTableGenerationAndLoading) test_paging_across_multi_wide_rows (paging_test.TestPagingData) resumable_replace_test (replace_address_test.TestReplaceAddress) was: This is the current list of dtests over the last week that have completely hung a test server, ending in an aborted job and incomplete test results in jenkins. I'll be excluding these tests in the cassandra-dtest repo with {{@require}} annotations (or fully excluding paging_test in trunk, I think) *2.0* metadata_reset_while_compact_test (metadata_tests.TestMetadata) *2.1* metadata_reset_while_compact_test (metadata_tests.TestMetadata) *2.2* metadata_reset_while_compact_test (metadata_tests.TestMetadata) test_ttl_deletions (paging_test.TestPagingWithDeletions) test_network_topology_strategy (consistency_test.TestAvailability) test_network_topology_strategy_counters (consistency_test.TestAccuracy) *trunk* metadata_reset_while_compact_test (metadata_tests.TestMetadata) putget_2dc_rf2_test (multidc_putget_test.TestMultiDCPutGet) test_network_topology_strategy_users (consistency_test.TestAccuracy) test_network_topology_strategy (consistency_test.TestAvailability) test_single_row_deletions (paging_test.TestPagingWithDeletions) test_with_more_results_than_page_size (paging_test.TestPagingSize) test_query_isolation (paging_test.TestPagingQueryIsolation) test_node_unavailabe_during_paging (paging_test.TestPagingDatasetChanges) test_with_no_results (paging_test.TestPagingSize) test_data_change_impacting_later_page (paging_test.TestPagingDatasetChanges) test_multiple_partition_deletions (paging_test.TestPagingWithDeletions) sstableloader_compression_snappy_to_none_test (sstable_generation_loading_test.TestSSTableGenerationAndLoading) (edit: new hung test nodes 07/16) *2.2* dc_repair_test (repair_test.TestRepair) *trunk* force_repair_async_1_test (deprecated_repair_test.TestDeprecatedRepairAPI) 
test_nested_user_types (user_types_test.TestUserTypes) sstableloader_compression_snappy_to_deflate_test (sstable_generation_loading_test.TestSSTableGenerationAndLoading) test_paging_across_multi_wide_rows (paging_test.TestPagingData) Hanging dtests -- Key: CASSANDRA-9831 URL: https://issues.apache.org/jira/browse/CASSANDRA-9831 Project: Cassandra Issue Type: Bug Components: Tests Reporter: Michael Shuler Assignee: Michael Shuler Labels: test-failure This is the current list of dtests over the last week that have completely hung a test server, ending in an aborted job and incomplete test results in jenkins. I'll be excluding these tests in the cassandra-dtest repo with {{@require}} annotations (or fully excluding paging_test in trunk, I think) *2.0* metadata_reset_while_compact_test
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631295#comment-14631295 ] Benedict commented on CASSANDRA-8894: - Sounds perfect, thanks. Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size -- Key: CASSANDRA-8894 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Labels: benedict-to-commit Fix For: 3.x Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml A large contributor to slower buffered reads than mmapped is likely that we read a full 64Kb at once, when average record sizes may be as low as 140 bytes on our stress tests. The TLB has only 128 entries on a modern core, and each read will touch 32 of these, meaning we will almost never hit the TLB, and will incur at least 30 unnecessary misses each time (as well as the other costs of larger-than-necessary accesses). When working with an SSD there is little to no benefit to reading more than 4Kb at once, and in either case reading more data than we need is wasteful. So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4Kb), so that we expect to satisfy each read in one operation. I also propose that we create a pool of these buffers up-front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4Kb of expected record size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
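The sizing rule proposed above (round the average record size up to a power of two, floored at 4Kb) can be sketched as follows. `bufferSizeFor` is a hypothetical helper for illustration, not an actual Cassandra API, and treating an exact power of two as already large enough is one plausible reading of "next larger power of 2":

```java
public class BufferSizeSketch {
    static final int MIN_BUFFER_SIZE = 4096; // 4Kb floor, per the proposal

    // Round avgRecordSize up to the nearest power of two, never below 4Kb.
    static int bufferSizeFor(int avgRecordSize) {
        int size = Math.max(avgRecordSize, MIN_BUFFER_SIZE);
        // Integer.highestOneBit returns the largest power of two <= size;
        // if size is not already a power of two, go one step higher.
        int pow = Integer.highestOneBit(size);
        return (pow == size) ? pow : pow << 1;
    }

    public static void main(String[] args) {
        // 140-byte records (the stress-test average quoted above) floor at 4Kb.
        System.out.println(bufferSizeFor(140));
        System.out.println(bufferSizeFor(5000));
    }
}
```

With 140-byte records this yields a 4Kb buffer instead of the 64Kb default, so a single read touches one virtual page rather than 32.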
[1/2] cassandra git commit: Fix comparison contract violation in the dynamic snitch sorting
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 1d54fc339 - 0ef188869 Fix comparison contract violation in the dynamic snitch sorting patch by slebresne; reviewed by benedict for CASSANDRA-9519 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9b9e627 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9b9e627 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9b9e627 Branch: refs/heads/cassandra-2.0 Commit: a9b9e627b0256a7b55dbfefa6960e1e5b8379e64 Parents: 1d54fc3 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 9 13:28:38 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:07 2015 +0200 -- CHANGES.txt | 1 + .../locator/DynamicEndpointSnitch.java | 34 -- .../locator/DynamicEndpointSnitchTest.java | 69 +++- 3 files changed, 95 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a755cb9..f20fad8 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.17 + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java -- diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java index 3469847..f226989 100644 --- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java +++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java @@ -42,9 +42,9 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa private static final double ALPHA = 0.75; // set to 0.75 
to make EDS more biased towards the newer values private static final int WINDOW_SIZE = 100; -private int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); -private int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); -private double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); +private final int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); +private final int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); +private final double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); // the score for a merged set of endpoints must be this much worse than the score for separate endpoints to // warrant not merging two ranges into a single range @@ -154,7 +154,18 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa private void sortByProximityWithScore(final InetAddress address, List<InetAddress> addresses) { -super.sortByProximity(address, addresses); +// Scores can change concurrently from a call to this method. But Collections.sort() expects +// its comparator to be stable, that is, 2 endpoints should compare the same way for the duration +// of the sort() call. As we copy the scores map on write, it is thus enough to alias the current +// version of it during this call. 
+final HashMap<InetAddress, Double> scores = this.scores; +Collections.sort(addresses, new Comparator<InetAddress>() +{ +public int compare(InetAddress a1, InetAddress a2) +{ +return compareEndpoints(address, a1, a2, scores); +} +}); } private void sortByProximityWithBadness(final InetAddress address, List<InetAddress> addresses) @@ -163,6 +174,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa return; subsnitch.sortByProximity(address, addresses); +HashMap<InetAddress, Double> scores = this.scores; // Make sure the scores don't change in the middle of the loop below + // (which wouldn't really matter here but it's cleaner that way). ArrayList<Double> subsnitchOrderedScores = new ArrayList<>(addresses.size()); for (InetAddress inet : addresses) { @@ -189,7 +202,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa } } -public int
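The aliasing technique in the patch generalizes to any copy-on-write map: snapshot the reference once before sorting, and the comparator stays stable for the entire `Collections.sort()` call even if a writer publishes a new map mid-sort. A minimal standalone sketch, using String keys instead of InetAddress and hypothetical names:

```java
import java.util.*;

public class SnapshotSortSketch {
    // Copy-on-write publication: writers replace the whole map rather than mutating it.
    public static volatile Map<String, Double> scores = new HashMap<>();

    public static void sortByScore(List<String> endpoints) {
        // Alias the current version once; every comparison during this sort
        // then sees the same scores, satisfying the comparator contract.
        final Map<String, Double> snapshot = scores;
        Collections.sort(endpoints, new Comparator<String>() {
            public int compare(String a, String b) {
                return Double.compare(snapshot.getOrDefault(a, 0.0),
                                      snapshot.getOrDefault(b, 0.0));
            }
        });
    }

    public static void main(String[] args) {
        Map<String, Double> m = new HashMap<>();
        m.put("a", 2.0);
        m.put("b", 1.0);
        m.put("c", 3.0);
        scores = m;
        List<String> hosts = new ArrayList<>(Arrays.asList("a", "b", "c"));
        sortByScore(hosts);
        System.out.println(hosts); // sorted ascending by score: [b, a, c]
    }
}
```

Reading `scores` through a field on each comparison, by contrast, can observe two different maps within one sort and trigger the "Comparison method violates its general contract!" failure that CASSANDRA-9519 fixes.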
[1/3] cassandra git commit: Fix comparison contract violation in the dynamic snitch sorting
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 1eda7cb55 - 2d462c049 Fix comparison contract violation in the dynamic snitch sorting patch by slebresne; reviewed by benedict for CASSANDRA-9519 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9b9e627 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9b9e627 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9b9e627 Branch: refs/heads/cassandra-2.1 Commit: a9b9e627b0256a7b55dbfefa6960e1e5b8379e64 Parents: 1d54fc3 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 9 13:28:38 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:07 2015 +0200 -- CHANGES.txt | 1 + .../locator/DynamicEndpointSnitch.java | 34 -- .../locator/DynamicEndpointSnitchTest.java | 69 +++- 3 files changed, 95 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a755cb9..f20fad8 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.17 + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java -- diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java index 3469847..f226989 100644 --- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java +++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java @@ -42,9 +42,9 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa private static final double ALPHA = 0.75; // set to 0.75 
to make EDS more biased towards the newer values private static final int WINDOW_SIZE = 100; -private int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); -private int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); -private double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); +private final int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); +private final int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); +private final double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); // the score for a merged set of endpoints must be this much worse than the score for separate endpoints to // warrant not merging two ranges into a single range @@ -154,7 +154,18 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa private void sortByProximityWithScore(final InetAddress address, List<InetAddress> addresses) { -super.sortByProximity(address, addresses); +// Scores can change concurrently from a call to this method. But Collections.sort() expects +// its comparator to be stable, that is, 2 endpoints should compare the same way for the duration +// of the sort() call. As we copy the scores map on write, it is thus enough to alias the current +// version of it during this call. 
+final HashMap<InetAddress, Double> scores = this.scores; +Collections.sort(addresses, new Comparator<InetAddress>() +{ +public int compare(InetAddress a1, InetAddress a2) +{ +return compareEndpoints(address, a1, a2, scores); +} +}); } private void sortByProximityWithBadness(final InetAddress address, List<InetAddress> addresses) @@ -163,6 +174,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa return; subsnitch.sortByProximity(address, addresses); +HashMap<InetAddress, Double> scores = this.scores; // Make sure the scores don't change in the middle of the loop below + // (which wouldn't really matter here but it's cleaner that way). ArrayList<Double> subsnitchOrderedScores = new ArrayList<>(addresses.size()); for (InetAddress inet : addresses) { @@ -189,7 +202,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa } } -public int
[3/3] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2d462c04 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2d462c04 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2d462c04 Branch: refs/heads/cassandra-2.1 Commit: 2d462c04973a15e84ca550ce3913d08d7c5ee8c8 Parents: 1eda7cb 0ef1888 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:36:24 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:36:24 2015 +0200 -- CHANGES.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/2d462c04/CHANGES.txt -- diff --cc CHANGES.txt index 49cc850,f20fad8..c6774c2 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,32 -1,7 +1,32 @@@ -2.0.17 +2.1.9 + * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) - * Complete CASSANDRA-8448 fix (CASSANDRA-9519) + * Handle corrupt files on startup (CASSANDRA-9686) + * Fix clientutil jar and tests (CASSANDRA-9760) + * (cqlsh) Allow the SSL protocol version to be specified through the + config file or environment variables (CASSANDRA-9544) +Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) + * Scrub (recover) sstables even when -Index.db is missing, (CASSANDRA-9591) + * Fix growing pending background compaction (CASSANDRA-9662) + + +2.1.8 + * (cqlsh) Fix bad check for CQL compatibility when DESCRIBE'ing + COMPACT STORAGE tables with no clustering columns + * Warn when an extra-large partition is compacted (CASSANDRA-9643) + * Eliminate strong self-reference chains in sstable ref tidiers (CASSANDRA-9656) + * Ensure StreamSession uses canonical sstable reader instances (CASSANDRA-9700) + * Ensure memtable book keeping is not 
corrupted in the event we shrink usage (CASSANDRA-9681) + * Update internal python driver for cqlsh (CASSANDRA-9064) + * Fix IndexOutOfBoundsException when inserting tuple with too many + elements using the string literal notation (CASSANDRA-9559) + * Allow JMX over SSL directly from nodetool (CASSANDRA-9090) + * Fix incorrect result for IN queries where column not found (CASSANDRA-9540) + * Enable describe on indices (CASSANDRA-7814) + * ColumnFamilyStore.selectAndReference may block during compaction (CASSANDRA-9637) +Merged from 2.0: * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) * Add listen_address to system.local (CASSANDRA-9603) * Bug fixes to resultset metadata construction (CASSANDRA-9636)
[2/3] cassandra git commit: Move CASSANDRA-9519 test in long tests (and reduce the size of the list used)
Move CASSANDRA-9519 test in long tests (and reduce the size of the list used) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ef18886 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ef18886 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ef18886 Branch: refs/heads/cassandra-2.1 Commit: 0ef188869049ec6233d115f7a46c25f492e8fa42 Parents: a9b9e62 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:14:54 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:24 2015 +0200 -- .../locator/DynamicEndpointSnitchLongTest.java | 104 +++ .../locator/DynamicEndpointSnitchTest.java | 64 2 files changed, 104 insertions(+), 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ef18886/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java -- diff --git a/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java new file mode 100644 index 000..1c628fa --- /dev/null +++ b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java @@ -0,0 +1,104 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one +* or more contributor license agreements. See the NOTICE file +* distributed with this work for additional information +* regarding copyright ownership. The ASF licenses this file +* to you under the Apache License, Version 2.0 (the +* License); you may not use this file except in compliance +* with the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, +* software distributed under the License is distributed on an +* AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +* KIND, either express or implied. 
See the License for the +* specific language governing permissions and limitations +* under the License. +*/ + +package org.apache.cassandra.locator; + +import java.io.IOException; +import java.net.InetAddress; +import java.util.*; + +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.exceptions.ConfigurationException; +import org.apache.cassandra.service.StorageService; +import org.junit.Test; + +import org.apache.cassandra.utils.FBUtilities; + +import static org.junit.Assert.assertEquals; + +public class DynamicEndpointSnitchLongTest +{ +@Test +public void testConcurrency() throws InterruptedException, IOException, ConfigurationException +{ +// The goal of this test is to check for CASSANDRA-8448/CASSANDRA-9519 +double badness = DatabaseDescriptor.getDynamicBadnessThreshold(); +DatabaseDescriptor.setDynamicBadnessThreshold(0.0); + +try +{ +final int ITERATIONS = 1; + +// do this because SS needs to be initialized before DES can work properly. +StorageService.instance.initClient(0); +SimpleSnitch ss = new SimpleSnitch(); +DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode())); +InetAddress self = FBUtilities.getBroadcastAddress(); + +List<InetAddress> hosts = new ArrayList<InetAddress>(); +// We want a big list of hosts so sorting takes time, making it much more likely to reproduce the +// problem we're looking for. 
+for (int i = 0; i < 100; i++) +for (int j = 0; j < 256; j++) +hosts.add(InetAddress.getByAddress(new byte[]{127, 0, (byte)i, (byte)j})); + +ScoreUpdater updater = new ScoreUpdater(dsnitch, hosts); +updater.start(); + +List<InetAddress> result = null; +for (int i = 0; i < ITERATIONS; i++) +result = dsnitch.getSortedListByProximity(self, hosts); + +updater.stopped = true; +updater.join(); +} +finally +{ +DatabaseDescriptor.setDynamicBadnessThreshold(badness); +} +} + +public static class ScoreUpdater extends Thread +{ +private static final int SCORE_RANGE = 100; + +public volatile boolean stopped; + +private final DynamicEndpointSnitch dsnitch; +private final List<InetAddress> hosts; +private final Random random = new Random(); + +public ScoreUpdater(DynamicEndpointSnitch dsnitch, List<InetAddress> hosts) +{ +this.dsnitch = dsnitch; +this.hosts = hosts; +} + +public void run() +{ +while (!stopped) +{ +
[jira] [Commented] (CASSANDRA-9838) Unable to update an element in a static list
[ https://issues.apache.org/jira/browse/CASSANDRA-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631429#comment-14631429 ] Tyler Hobbs commented on CASSANDRA-9838: I think we recently changed the error message around that, which would explain why you are seeing a slightly different error. Regardless, this should be working (especially since the same thing works for non-static columns), so it's definitely a bug. Unable to update an element in a static list Key: CASSANDRA-9838 URL: https://issues.apache.org/jira/browse/CASSANDRA-9838 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.1.5 on Linux Reporter: Mahesh Datt Fix For: 2.1.x I created a table in cassandra (my_table) which has a static list column sizes_list. I created a new row and initialized the list sizes_list as having one element. {{UPDATE my_table SET sizes_list = sizes_list + [0] WHERE view_id = 0x01}} Now I'm trying to update the element at index '0' with a statement like this {code}insert into my_table (my_id, is_deleted , col_id1, col_id2) values (0x01, False, 0x00, 0x00); UPDATE my_table SET sizes_list[0] = 100 WHERE my_id = 0x01 ; {code} Now I see an error like this: {{InvalidRequest: code=2200 [Invalid query] message=List index 0 out of bound, list has size 0}} If I change my list to a non-static list, it works fine! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9798) Cassandra seems to have deadlocks during flush operations
[ https://issues.apache.org/jira/browse/CASSANDRA-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9798: --- Description: Hi, We noticed some problem with dropped mutationstages. Usually on one random node there is a situation that: MutationStage active is full, pending is increasing, completed is stalled. MemtableFlushWriter active 6, pending: 25 completed: stalled MemtablePostFlush active is 1, pending 29 completed: stalled after some time (30s-10min) pending mutations are dropped and everything is working. When it happened: 1. CPU idle is ~95% 2. no gc long pauses or more activity. 3. memory usage 3.5GB from 8GB 4. only writes are processed by cassandra 5. when LOAD > 400GB/node problems appeared 6. cassandra 2.1.6 There is a gap in the logs: {code} INFO 08:47:01 Timed out replaying hints to /192.168.100.83; aborting (0 delivered) INFO 08:47:01 Enqueuing flush of hints: 7870567 (0%) on-heap, 0 (0%) off-heap INFO 08:47:30 Enqueuing flush of table1: 95301807 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 60462632 (3%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table2: 76973746 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 84290135 (4%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table3: 56926652 (3%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table1: 85124218 (4%) on-heap, 0 (0%) off-heap INFO 08:47:33 Enqueuing flush of table2: 95663415 (4%) on-heap, 0 (0%) off-heap INFO 08:47:58 CompactionManager 239 INFO 08:47:58 Writing Memtable-table2@1767938721(13843064 serialized bytes, 162359 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-hints@1433125911(478703 serialized bytes, 424 ops, 0%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table2@1318583275(11783615 serialized bytes, 137378 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Enqueuing flush of compactions_in_progress: 969 (0%) on-heap, 0 (0%)
off-heap INFO 08:47:58 Writing Memtable-table1@541175113(17221327 serialized bytes, 180792 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table1@1361154669(27138519 serialized bytes, 273472 ops, 6%/0% of on/off-hea p limit) INFO 08:48:03 2176 MUTATION messages dropped in last 5000ms {code} use case: 100% write - 100Mb/s, couples of CF ~10column each. max cell size 100B CMS and G1GC tested - no difference was: Hi, We noticed some problem with dropped mutationstages. Usually on one random node there is a situation that: MutationStage active is full, pending is increasing completed is stalled. MemtableFlushWriter active 6, pending: 25 completed: stalled MemtablePostFlush active is 1, pending 29 completed: stalled after a some time (30s-10min) pending mutations are dropped and everything is working. When it happened: 1. Cpu idle is ~95% 2. no gc long pauses or more activity. 3. memory usage 3.5GB form 8GB 4. only writes is processed by cassandra 5. when LOAD 400GB/node problems appeared 6. 
cassandra 2.1.6 There is gap in logs: INFO 08:47:01 Timed out replaying hints to /192.168.100.83; aborting (0 delivered) INFO 08:47:01 Enqueuing flush of hints: 7870567 (0%) on-heap, 0 (0%) off-heap INFO 08:47:30 Enqueuing flush of table1: 95301807 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 60462632 (3%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table2: 76973746 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 84290135 (4%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table3: 56926652 (3%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table1: 85124218 (4%) on-heap, 0 (0%) off-heap INFO 08:47:33 Enqueuing flush of table2: 95663415 (4%) on-heap, 0 (0%) off-heap INFO 08:47:58 CompactionManager 239 INFO 08:47:58 Writing Memtable-table2@1767938721(13843064 serialized bytes, 162359 ops, 4%/0% of on/off-heap l imit) INFO 08:47:58 Writing Memtable-hints@1433125911(478703 serialized bytes, 424 ops, 0%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table2@1318583275(11783615 serialized bytes, 137378 ops, 4%/0% of on/off-heap l imit) INFO 08:47:58 Enqueuing flush of compactions_in_progress: 969 (0%) on-heap, 0 (0%) off-heap INFO 08:47:58 Writing Memtable-table1@541175113(17221327 serialized bytes, 180792 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table1@1361154669(27138519 serialized bytes, 273472 ops, 6%/0% of on/off-hea p limit) INFO 08:48:03 2176 MUTATION messages dropped in last 5000ms use case: 100% write - 100Mb/s, couples of CF ~10column each. max cell size 100B CMS and G1GC tested - no difference Cassandra seems to have deadlocks during flush operations - Key: CASSANDRA-9798
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631435#comment-14631435 ] T Jake Luciani commented on CASSANDRA-6477: --- 1. This goes way back to Benedict's main concept https://issues.apache.org/jira/browse/CASSANDRA-6477?focusedCommentId=14039757page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14039757 We have each replica on the base table send the MV update to a single MV replica, so replicas are paired 1:1. 2. Since the coordinator issues a batchlog (BL) write against a QUORUM of all base replicas, which will always send to the MV replicas, we have a lot more work to do than only sending a failed base-to-view update. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
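The 1:1 pairing described above can be sketched as follows. This is an illustrative sketch only, not Cassandra's actual implementation; the class and method names (ReplicaPairing, pairedViewReplica) are invented for the example.

```java
import java.util.List;

// Hypothetical sketch of the 1:1 base-to-view replica pairing discussed above:
// the i-th replica in the base table's replica list is made responsible for
// forwarding the MV update to the i-th replica in the view's replica list.
public class ReplicaPairing {
    static String pairedViewReplica(String self, List<String> baseReplicas,
                                    List<String> viewReplicas) {
        int i = baseReplicas.indexOf(self);
        if (i < 0)
            throw new IllegalArgumentException("not a base replica: " + self);
        // wrap around in case the lists differ in length
        return viewReplicas.get(i % viewReplicas.size());
    }

    public static void main(String[] args) {
        List<String> base = List.of("10.0.0.1", "10.0.0.2", "10.0.0.3");
        List<String> view = List.of("10.0.1.1", "10.0.1.2", "10.0.1.3");
        // each base replica forwards the view update to exactly one peer
        System.out.println(pairedViewReplica("10.0.0.2", base, view));
    }
}
```

Because each base replica forwards to exactly one view replica, the per-write fan-out stays constant instead of growing with the view's replication factor, which is the write-amplification point made later in the thread.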
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631436#comment-14631436 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. Pedantically you are correct. Which is why I said effectively and not literally. Well, I mean, CL has always been more about when we answer the client than about how much work we do internally. Every write is always sent to every replica, for instance; the CL is just a matter of how long we wait before answering the client. I'm arguing this is exactly the case here too. Anyway, your "other side of that coin" made it sound like we were doing something unusual regarding the CL, something that may not be desirable, and I don't understand what that would be if that's the case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631440#comment-14631440 ] Sylvain Lebresne commented on CASSANDRA-6477: - I don't really think the cost of replaying the coordinator BL matters that much. We'll only replay it if less than a quorum of nodes answered a particular query, which should be pretty rare unless you have bigger problems with your cluster. And given that the local BL has a cost on every write, even if small, I don't think that from a performance perspective a local BL is a win. That said, I hadn't seen that we'd decided to go with pairing of base replica to MV replica. Doing so does justify a local BL (another option has always been to fan out to every MV replica, and since this ticket desperately misses a good description of what exact algorithm is actually implemented, I wasn't sure which option we went with). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9798) Cassandra seems to have deadlocks during flush operations
[ https://issues.apache.org/jira/browse/CASSANDRA-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631442#comment-14631442 ] Philip Thompson commented on CASSANDRA-9798: Have you encountered the error described in CASSANDRA-7275 in your logs at all? Cassandra seems to have deadlocks during flush operations - Key: CASSANDRA-9798 URL: https://issues.apache.org/jira/browse/CASSANDRA-9798 Project: Cassandra Issue Type: Bug Components: Core Environment: 4x HP Gen9 dl 360 servers 2x8 cpu each (Intel(R) Xeon E5-2667 v3 @ 3.20GHz) 6x900GB 10kRPM disk for data 1x900GB 10kRPM disk for commitlog 64GB ram ETH: 10Gb/s Red Hat Enterprise Linux Server release 6.6 (Santiago) 2.6.32-504.el6.x86_64 java build 1.8.0_45-b14 (openjdk) (tested on oracle java 8 too) Reporter: Łukasz Mrożkiewicz Fix For: 2.1.x Attachments: cassandra.log, cassandra.yaml, gc.log.0.current -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9798) Cassandra seems to have deadlocks during flush operations
[ https://issues.apache.org/jira/browse/CASSANDRA-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631444#comment-14631444 ] Benedict commented on CASSANDRA-9798: - [http://www.infoq.com/news/2015/05/redhat-futex] There is a serious bug in some recent kernels, more commonly encountered on certain CPUs. The result is lost thread signals, and hence stalled threads. GC activity may result in these wakeups finally being received, which could explain why the recovery coincides with the dropped-mutation message (which occurs under GC load). Cassandra seems to have deadlocks during flush operations - Key: CASSANDRA-9798 URL: https://issues.apache.org/jira/browse/CASSANDRA-9798 Project: Cassandra Issue Type: Bug Components: Core Environment: 4x HP Gen9 dl 360 servers 2x8 cpu each (Intel(R) Xeon E5-2667 v3 @ 3.20GHz) 6x900GB 10kRPM disk for data 1x900GB 10kRPM disk for commitlog 64GB ram ETH: 10Gb/s Red Hat Enterprise Linux Server release 6.6 (Santiago) 2.6.32-504.el6.x86_64 java build 1.8.0_45-b14 (openjdk) (tested on oracle java 8 too) Reporter: Łukasz Mrożkiewicz Fix For: 2.1.x Attachments: cassandra.log, cassandra.yaml, gc.log.0.current -- This message was sent by Atlassian JIRA (v6.3.4#6332)
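The failure mode Benedict describes (a lost futex wakeup leaving a thread parked indefinitely) can be illustrated with a small sketch. This is not Cassandra code; it only shows why a bounded wait recovers from a lost signal where an unbounded park() would hang. The simulation below deliberately sets the flag without ever calling unpark(), standing in for a lost wakeup.

```java
import java.util.concurrent.locks.LockSupport;

// Illustrative sketch (not Cassandra code): a consumer that uses a timed park
// instead of an indefinite one, so a lost wakeup signal only stalls it for one
// timeout interval before it re-checks its condition.
public class TimedParkDemo {
    private static volatile boolean work;

    static String runOnce() throws InterruptedException {
        work = false;
        StringBuilder result = new StringBuilder();
        Thread consumer = new Thread(() -> {
            while (!work) {
                // Bounded wait: even if the wakeup is lost, we resume within
                // ~10ms and re-check the flag instead of hanging forever.
                LockSupport.parkNanos(10_000_000L);
            }
            result.append("recovered");
        });
        consumer.start();
        Thread.sleep(50);
        work = true; // note: we never unpark() -- simulating a lost wakeup
        consumer.join();
        return result.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnce());
    }
}
```

With an unbounded `LockSupport.park()` in the loop, the same lost signal would leave the consumer stalled until some unrelated wakeup (such as the GC-related activity mentioned above) happened to arrive.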
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631446#comment-14631446 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. why not also pair it with RF-2 (or 1, and only support RF=3 for now) partners, to whom it requires the first write to be propagated, without which it does not acknowledge? This could be done with a specialised batchlog write, that goes to the local node and the paired MV node. I _think_ I get a vague idea of what you mean but I'm not fully sure (and I'm not fully sure it's practical). So let's first make sure I understand. Is the suggestion that, to guarantee that if a base-table replica applies an update then RF/2 other ones also do, we'd send the update to all base table replicas normally (without a coordinator batchlog), but each replica would 1) write the update to a local-only batchlog and 2) forward the update to RF/2 other base table replicas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631459#comment-14631459 ] T Jake Luciani commented on CASSANDRA-6477: --- bq. We'll only replay it if less than a quorum of nodes answered a particular query, which should be pretty rare unless you have bigger problems with your cluster. bq. That said, I hadn't seen that we'd decided to go with pairing of base replica to MV replica. If we replicate to every MV replica from every base replica, the write amplification gets much worse, causing more timeouts. So it makes sense to have replication paired. I do think making the MV updates synchronous will cause a lot more timeouts and write latency (on top of what we have now). But if it's optional, then people can choose. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631466#comment-14631466 ] Jonathan Ellis commented on CASSANDRA-6477: --- No, you're right. Synchronous MV updates are a terrible idea, which is more obvious when considering the case of more than one MV. In the extreme case you could touch every node in the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631472#comment-14631472 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. So it makes sense to have replication paired. Sure, I didn't imply otherwise; I just wasn't aware we were doing it. bq. I do think making the MV updates synchronous will cause a lot more timeouts and write latency (on top of what we have now). But if it's optional then people can choose. Frankly, I'm pretty negative on adding such an option. I think there are some basic guarantees that shouldn't be optional, and the CL ones are amongst those. Making it optional will have people shoot themselves in the foot all the time. At the very least, I would ask that we don't include such an option on this ticket (there is enough stuff to deal with) and open a separate ticket to discuss it (one on which we'd actually benchmark things before assuming there will be timeouts). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631471#comment-14631471 ] Jonathan Ellis commented on CASSANDRA-6477: --- If there are multiple MVs being updated, do they get merged into a single set of batchlogs? (I.e., just one on the coordinator and one on each base replica, instead of one per MV.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9837) Bad logging interpolation string in Memtable:
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630846#comment-14630846 ] Robert Stupp commented on CASSANDRA-9837: - [~mckibben] thanks for the patch! Will review it soon. Bad logging interpolation string in Memtable -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs: "Completed flushing %s." Looking at the source, it appears to be a mix of logback-style {} substitution and String.format-style %s substitution. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
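The bug class reported above is easy to reproduce: an SLF4J/logback-style logger substitutes {} placeholders, so a String.format-style %s passes through verbatim and the arguments land in the wrong (or no) placeholder. The toy format method below mimics the {}-substitution; it is not the real SLF4J implementation, and the message strings are simplified versions of the one in Memtable.

```java
// Minimal illustration of the bug class: an SLF4J-style logger substitutes
// "{}" placeholders, so a String.format-style "%s" survives verbatim and the
// first argument fills a later "{}" instead.
public class PlaceholderDemo {
    // toy stand-in for slf4j's parameterized message formatting
    static String format(String msg, Object... args) {
        StringBuilder sb = new StringBuilder();
        int argIdx = 0, i = 0;
        while (i < msg.length()) {
            if (argIdx < args.length && msg.startsWith("{}", i)) {
                sb.append(args[argIdx++]);
                i += 2;
            } else {
                sb.append(msg.charAt(i++));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // buggy: "%s" is not a placeholder; the filename fills the wrong "{}"
        System.out.println(format("Completed flushing %s. Position was {}", "file.db", 42));
        // fixed: both placeholders are "{}", so both arguments land correctly
        System.out.println(format("Completed flushing {}. Position was {}", "file.db", 42));
    }
}
```

In the buggy variant, "%s" is printed literally and the filename consumes the placeholder meant for the position, which is exactly the shape of the "Completed flushing %s" entries the reporter saw.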
[3/6] cassandra git commit: Fix broken logging for empty flushes in Memtable
Fix broken logging for empty flushes in Memtable

patch by Michael McKibben; reviewed by Robert Stupp for CASSANDRA-9837

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1eda7cb5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1eda7cb5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1eda7cb5
Branch: refs/heads/trunk
Commit: 1eda7cb55c5876046cbc3f4ace3c7812ca032f69
Parents: 4fcd7d4
Author: Michael McKibben mikemckib...@gmail.com
Authored: Fri Jul 17 08:36:00 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:36:12 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index e950f3b..49cc850 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.9
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/src/java/org/apache/cassandra/db/Memtable.java

diff --git a/src/java/org/apache/cassandra/db/Memtable.java b/src/java/org/apache/cassandra/db/Memtable.java
index 9f6cf9b..375195f 100644
--- a/src/java/org/apache/cassandra/db/Memtable.java
+++ b/src/java/org/apache/cassandra/db/Memtable.java
@@ -390,7 +390,7 @@ public class Memtable
         }
         else
         {
-            logger.info("Completed flushing %s; nothing needed to be retained. Commitlog position was {}",
+            logger.info("Completed flushing {}; nothing needed to be retained. Commitlog position was {}",
                         writer.getFilename(), context);
             writer.abort();
             ssTable = null;
[6/6] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f5f3ae1d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f5f3ae1d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f5f3ae1d
Branch: refs/heads/trunk
Commit: f5f3ae1da45d633c5eb03b3fe760b4e866dca9d7
Parents: 689582c f74419c
Author: Robert Stupp sn...@snazy.de
Authored: Fri Jul 17 08:37:49 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:37:49 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5f3ae1d/CHANGES.txt

diff --cc CHANGES.txt
index 4e2c22e,e6c093d..76d6e92
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,27 -1,8 +1,28 @@@
+3.0
+ * Metrics should use up to date nomenclature (CASSANDRA-9448)
+ * Change CREATE/ALTER TABLE syntax for compression (CASSANDRA-8384)
+ * Cleanup crc and adler code for java 8 (CASSANDRA-9650)
+ * Storage engine refactor (CASSANDRA-8099, 9743, 9746, 9759, 9781, 9808)
+ * Update Guava to 18.0 (CASSANDRA-9653)
+ * Bloom filter false positive ratio is not honoured (CASSANDRA-8413)
+ * New option for cassandra-stress to leave a ratio of columns null (CASSANDRA-9522)
+ * Change hinted_handoff_enabled yaml setting, JMX (CASSANDRA-9035)
+ * Add algorithmic token allocation (CASSANDRA-7032)
+ * Add nodetool command to replay batchlog (CASSANDRA-9547)
+ * Make file buffer cache independent of paths being read (CASSANDRA-8897)
+ * Remove deprecated legacy Hadoop code (CASSANDRA-9353)
+ * Decommissioned nodes will not rejoin the cluster (CASSANDRA-8801)
+ * Change gossip stabilization to use endpoit size (CASSANDRA-9401)
+ * Change default garbage collector to G1 (CASSANDRA-7486)
+ * Populate TokenMetadata early during startup (CASSANDRA-9317)
+ * undeprecate cache recentHitRate (CASSANDRA-6591)
+ * Add support for selectively varint encoding fields (CASSANDRA-9499)
+
+ 2.2.0-rc3
- * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671)
  * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771)
 Merged from 2.1:
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5f3ae1d/src/java/org/apache/cassandra/db/Memtable.java --
[5/6] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f74419cd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f74419cd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f74419cd
Branch: refs/heads/cassandra-2.2
Commit: f74419cd2b13c3c8fe01d09df16f7edae583fe35
Parents: 2b99b5d 1eda7cb
Author: Robert Stupp sn...@snazy.de
Authored: Fri Jul 17 08:37:37 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:37:37 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f74419cd/CHANGES.txt

diff --cc CHANGES.txt
index b4ea4b4,49cc850..e6c093d
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,7 -1,5 +1,8 @@@
-2.1.9
+2.2.0-rc3
+ * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671)
+ * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771)
+Merged from 2.1:
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f74419cd/src/java/org/apache/cassandra/db/Memtable.java --
[1/6] cassandra git commit: Fix broken logging for empty flushes in Memtable
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 4fcd7d4d3 -> 1eda7cb55
  refs/heads/cassandra-2.2 2b99b5d35 -> f74419cd2
  refs/heads/trunk 689582c04 -> f5f3ae1da

Fix broken logging for empty flushes in Memtable

patch by Michael McKibben; reviewed by Robert Stupp for CASSANDRA-9837

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1eda7cb5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1eda7cb5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1eda7cb5
Branch: refs/heads/cassandra-2.1
Commit: 1eda7cb55c5876046cbc3f4ace3c7812ca032f69
Parents: 4fcd7d4
Author: Michael McKibben mikemckib...@gmail.com
Authored: Fri Jul 17 08:36:00 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:36:12 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index e950f3b..49cc850 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.9
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/src/java/org/apache/cassandra/db/Memtable.java

diff --git a/src/java/org/apache/cassandra/db/Memtable.java b/src/java/org/apache/cassandra/db/Memtable.java
index 9f6cf9b..375195f 100644
--- a/src/java/org/apache/cassandra/db/Memtable.java
+++ b/src/java/org/apache/cassandra/db/Memtable.java
@@ -390,7 +390,7 @@ public class Memtable
         }
         else
         {
-            logger.info("Completed flushing %s; nothing needed to be retained. Commitlog position was {}",
+            logger.info("Completed flushing {}; nothing needed to be retained. Commitlog position was {}",
                         writer.getFilename(), context);
             writer.abort();
             ssTable = null;
[4/6] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f74419cd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f74419cd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f74419cd
Branch: refs/heads/trunk
Commit: f74419cd2b13c3c8fe01d09df16f7edae583fe35
Parents: 2b99b5d 1eda7cb
Author: Robert Stupp sn...@snazy.de
Authored: Fri Jul 17 08:37:37 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:37:37 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f74419cd/CHANGES.txt

diff --cc CHANGES.txt
index b4ea4b4,49cc850..e6c093d
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,7 -1,5 +1,8 @@@
-2.1.9
+2.2.0-rc3
+ * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671)
+ * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771)
+Merged from 2.1:
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f74419cd/src/java/org/apache/cassandra/db/Memtable.java --
[2/6] cassandra git commit: Fix broken logging for empty flushes in Memtable
Fix broken logging for empty flushes in Memtable

patch by Michael McKibben; reviewed by Robert Stupp for CASSANDRA-9837

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1eda7cb5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1eda7cb5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1eda7cb5
Branch: refs/heads/cassandra-2.2
Commit: 1eda7cb55c5876046cbc3f4ace3c7812ca032f69
Parents: 4fcd7d4
Author: Michael McKibben mikemckib...@gmail.com
Authored: Fri Jul 17 08:36:00 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:36:12 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index e950f3b..49cc850 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.9
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/src/java/org/apache/cassandra/db/Memtable.java

diff --git a/src/java/org/apache/cassandra/db/Memtable.java b/src/java/org/apache/cassandra/db/Memtable.java
index 9f6cf9b..375195f 100644
--- a/src/java/org/apache/cassandra/db/Memtable.java
+++ b/src/java/org/apache/cassandra/db/Memtable.java
@@ -390,7 +390,7 @@ public class Memtable
         }
         else
         {
-            logger.info("Completed flushing %s; nothing needed to be retained. Commitlog position was {}",
+            logger.info("Completed flushing {}; nothing needed to be retained. Commitlog position was {}",
                         writer.getFilename(), context);
             writer.abort();
             ssTable = null;
[jira] [Commented] (CASSANDRA-9797) Don't wrap byte arrays in SequentialWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631017#comment-14631017 ] Sylvain Lebresne commented on CASSANDRA-9797: - bq. I mean reintroduce the write(byte[]...) implementation as we had it then, since it likely introduces fewer risks to restore behaviour as it was than to rewrite it. That would make sense, though looking at it more closely, there have been enough changes to {{SequentialWriter}} that just copy-pasting the old version doesn't work at all (that old version uses {{bufferCursor()}}, which doesn't exist anymore, it sets {{current}} even though that's now a method, not a field, and it sets {{validBufferBytes}}, which also doesn't exist anymore). So we would need to revert a little bit more than just that one method, or modify it to fit the new code, but both of those look a lot more risky than the very simple version attached. That said, I'm also not the most familiar with the various evolutions of {{SequentialWriter}}, so if someone feels more confident with one of those two previous options, I'm happy to let you have a shot at it. Don't wrap byte arrays in SequentialWriter -- Key: CASSANDRA-9797 URL: https://issues.apache.org/jira/browse/CASSANDRA-9797 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Labels: performance Fix For: 3.x Attachments: 9797.txt While profiling a simple stress write run ({{cassandra-stress write n=200 -rate threads=50}} to be precise) with Mission Control, I noticed that a non-trivial amount of heap pressure was due to the {{ByteBuffer.wrap()}} call in {{SequentialWriter.write(byte[])}}. Basically, when writing a byte array, we wrap it in a ByteBuffer to reuse the {{SequentialWriter.write(ByteBuffer)}} method. One could have hoped this wrapping would be stack allocated, but if Mission Control isn't lying (and I was told it's fairly honest on that front), it's not. 
And we do use that {{write(byte[])}} method quite a bit, especially with the new vint encodings, since they use a {{byte[]}} thread-local buffer and call that method. Anyway, it sounds very simple to me to have a more direct {{write(byte[])}} method, so I'm attaching a patch to do that. A very quick local benchmark seems to show a little less allocation and a slight edge for the branch with this patch (on top of CASSANDRA-9705, I must add), but that local bench was far from scientific, so I'm happy if someone who knows how to use our perf service wants to give that patch a shot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
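A hedged sketch of the idea discussed above — a direct {{write(byte[], off, len)}} that copies into the writer's internal buffer with {{System.arraycopy}} instead of allocating a {{ByteBuffer.wrap()}} per call. The class and method names below are illustrative, not the actual {{SequentialWriter}} API, and a ByteArrayOutputStream stands in for the file channel:

```java
import java.io.ByteArrayOutputStream;

// Illustrative mini-writer: write(byte[]) copies directly into the internal
// buffer, so no per-call ByteBuffer wrapper object is ever allocated.
class MiniSequentialWriter {
    private final byte[] buffer;
    private int position;
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the file

    MiniSequentialWriter(int bufferSize) { buffer = new byte[bufferSize]; }

    void write(byte[] src, int off, int len) {
        while (len > 0) {
            int toCopy = Math.min(len, buffer.length - position);
            System.arraycopy(src, off, buffer, position, toCopy); // no ByteBuffer.wrap()
            position += toCopy;
            off += toCopy;
            len -= toCopy;
            if (position == buffer.length)
                flushInternal(); // buffer full: push to the sink and reset
        }
    }

    private void flushInternal() {
        sink.write(buffer, 0, position);
        position = 0;
    }

    byte[] finish() {
        flushInternal();
        return sink.toByteArray();
    }
}
```

The loop handles arrays larger than the internal buffer by flushing between chunks, which is the behaviour a direct byte-array path has to preserve.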
[jira] [Updated] (CASSANDRA-9837) Fix broken logging for empty flushes in Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9837: Summary: Fix broken logging for empty flushes in Memtable (was: Bad logging interpolation string in Memtable: ) Fix broken logging for empty flushes in Memtable -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs Completed flushing %s. Looking at the source it appears to be a mix between logback style {} substitution vs String.format %s style. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630910#comment-14630910 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. Why do we need this at all? Since replicas are in charge of updating MV then normal hints should perform the same function as batchlog except without the performance hit in the normal case. Allow me to sum up how we deal with consistency guarantees, why we do it this way and why I don't think hints work. I'm sorry if this response is a bit verbose, but as this is the most important thing of this ticket imo, I think it bears repeating and making sure we're all on the same page. The main guarantee we have to provide here is that MVs are eventually consistent with their base table. In other words, whatever failure scenarios we run into, we should never have an inconsistency that never gets resolved. The canonical example of why this is not a given: we have a column {{c = 2}} in the base table that is also in a MV PK, and we have 2 concurrent updates A (sets {{c = 3}}) and B (sets {{c = 4}}). Without any kind of protection, we could end up with the MV permanently having 2 entries, one for A and one for B, which is incorrect (the MV should eventually converge to the update that has the biggest timestamp, since that's what the base table will keep). To the best of my knowledge, there are 2 fundamental components to avoiding such permanent inconsistency in the currently written patch/approach: # On each replica, we synchronize/serialize the read-before-write done on the base table. This guarantees that we won't have A and B racing on a single base-table replica. Or, in other words, *if* the same replica sees both updates (where sees means does the read-before-write-and-update-MV-accordingly dance), then it will properly update the MV. 
And since each base-table replica updates all MV-table replicas, it's enough that a single base-table replica sees both updates to guarantee eventual consistency of the MV. But we do need to guarantee that _at least_ one such base-table replica sees both updates, and that's the 2nd component. # To provide that latter guarantee, we first put each base-table update that includes MV updates in the batchlog on the coordinator, and we only remove it from the batchlog once a _QUORUM_ of replicas has acknowledged the write (this is importantly not dependent on the CL; eventual consistency must be guaranteed whatever CL you use). That guarantees us that until a QUORUM of replicas has seen the update, we'll keep replaying it, which in turn guarantees that for any 2 updates, at least one replica will have seen them both. Now, the latter guarantee cannot be provided by hints, because we can't guarantee hint delivery in the face of failures. Typically, if I write a hint on a node and that node dies in a fire before that hint is delivered, it will never be delivered. We need a distributed hint mechanism if you will, and that's what the batchlog gives us. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
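The batchlog rule in the second component above — keep replaying the entry until a QUORUM of base-table replicas (not the user's CL) has acknowledged — can be sketched as a tiny coordinator-side bookkeeping class. All names here are illustrative, not Cassandra's batchlog implementation:

```java
// Illustrative sketch of the coordinator-side retention rule described above:
// a batchlog entry may only be removed once a majority (quorum) of the
// base-table replicas have acked, regardless of the consistency level the
// client wrote at. Until then the coordinator keeps replaying the mutation.
class BatchlogEntry {
    private final int replicationFactor;
    private int acks;

    BatchlogEntry(int replicationFactor) { this.replicationFactor = replicationFactor; }

    void onReplicaAck() { acks++; }

    // Quorum = floor(RF / 2) + 1; deliberately independent of the user CL.
    boolean canRemove() { return acks >= replicationFactor / 2 + 1; }
}
```

With RF=3, the entry stays replayable after one ack and becomes removable only after the second, which is exactly what guarantees that any two updates share at least one witnessing replica.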
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630928#comment-14630928 ] Stefania commented on CASSANDRA-8894: - [~benedict] I went ahead and implemented the latest suggested optimization in this commit [here|https://github.com/stef1927/cassandra/commit/ad6712cdc12380ef0529a13ed6e9bd1c5cecebad]. I've also attached tentative stress yaml profiles, which I intend to run like this: {code} user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(insert=1,\) n=10 -rate threads=50 user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(singleblob=1,\) n=10 -rate threads=50 {code} Can you confirm the profiles are what you intended, basically a partition id and a blob column with the size distributed as you previously indicated? I'm not sure if there is anything else I should do to ensure reads mostly hit disk - other than spreading the partition id across a big interval? I created these additional branches: - trunk-pre-8099 - 8894-pre-8099 - 8894-pre-8099-first-optim - 8894-first-optim The names are self-describing, except for first-optim, which means before implementing the latest optimization. A tag would have been enough, but cstar perf does not support it. Unfortunately cstar perf has been giving me more problems than just tags, cc [~enigmacurry]: * The old trunk branches pre 8099 fail due to the schema tables changes (http://cstar.datastax.com/tests/id/e134ee7e-2c46-11e5-a180-42010af0688f) : InvalidQueryException: Keyspace system_schema does not exist. However I think if we fake version 2.2 in build.xml we should be OK. * The new branches either fail because of a nodetool failure (http://cstar.datastax.com/tests/id/86abc144-2c55-11e5-87b9-42010af0688f) or the graphs are wrong (http://cstar.datastax.com/tests/id/11fe9c5a-2c45-11e5-9760-42010af0688f). 
Here is the nodetool failure: {code} [10.200.241.104] Executing task 'ensure_running' [10.200.241.104] run: JAVA_HOME=~/fab/jvms/jdk1.8.0_45 ~/fab/cassandra/bin/nodetool ring [10.200.241.104] out: error: null [10.200.241.104] out: -- StackTrace -- [10.200.241.104] out: java.util.NoSuchElementException [10.200.241.104] out: at com.google.common.collect.LinkedHashMultimap$1.next(LinkedHashMultimap.java:506) [10.200.241.104] out: at com.google.common.collect.LinkedHashMultimap$1.next(LinkedHashMultimap.java:494) [10.200.241.104] out: at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) [10.200.241.104] out: at java.util.Collections.max(Collections.java:708) [10.200.241.104] out: at org.apache.cassandra.tools.nodetool.Ring.execute(Ring.java:63) [10.200.241.104] out: at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:240) [10.200.241.104] out: at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:154) [10.200.241.104] out: [10.200.241.104] out: {code} I'll resume the performance tests once cstar perf is stable again. Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size -- Key: CASSANDRA-8894 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Labels: benedict-to-commit Fix For: 3.x Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml A large contributor to slower buffered reads than mmapped is likely that we read a full 64Kb at once, when average record sizes may be as low as 140 bytes on our stress tests. The TLB has only 128 entries on a modern core, and each read will touch 32 of these, meaning we are unlikely to almost ever be hitting the TLB, and will be incurring at least 30 unnecessary misses each time (as well as the other costs of larger than necessary accesses). 
When working with an SSD there is little to no benefit in reading more than 4Kb at once, and in either case reading more data than we need is wasteful. So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4Kb), so that we can expect to read each record in one operation. I also propose that we create a pool of these buffers up-front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4Kb of expected record size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
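The sizing rule proposed in the ticket description — the smallest power of two at or above the mean record size, floored at 4Kb — is a one-liner. A sketch (class and method names are illustrative, not the Cassandra API):

```java
// Illustrative sketch of the proposed buffer sizing rule: pick the smallest
// power of two that is >= the mean record size, with a 4KiB minimum, so a
// typical record is read in a single page-aligned operation.
class BufferSizeChooser {
    static final int MIN_BUFFER_SIZE = 4096; // 4KiB floor from the proposal

    static int chooseBufferSize(long meanRecordSize) {
        int size = MIN_BUFFER_SIZE;
        while (size < meanRecordSize)
            size <<= 1; // next power of two
        return size;
    }
}
```

For the 140-byte average records mentioned above this yields 4096, i.e. a single virtual page instead of the 16 pages a 64Kb read touches.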
[jira] [Commented] (CASSANDRA-6492) Have server pick query page size by default
[ https://issues.apache.org/jira/browse/CASSANDRA-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630977#comment-14630977 ] Sylvain Lebresne commented on CASSANDRA-6492: - bq. I'm just worried about not being able to meet user expectations when we first expose a page size in bytes. I understand, and it's a valid concern. But I don't know, I'm just not a fan of hard-coded magic constants. Even if we hide that bytes target from view, we might still be really off on our stats and miss it, which can still have user-visible consequences, so I'm not sure this ultimately helps users' comprehension of what is going on. The other aspect is that if we do that (just have a default mode), users for which the default doesn't work are still stuck with providing the page size in number of rows, which still requires them to guess-estimate their average row size, which is annoying to do when we can probably do a pretty good job of guess-estimating server-side automatically. But I totally agree we should be very clear initially that this is a very soft target. And maybe we can experiment a bit to get a better sense of how bad that estimate will be in practice. That is, we can try different schemas and workloads (even actively try to game the estimate), and if it proves very easy to get an estimate that is very off, then I can agree that exposing the size is probably not a good idea (though if that's the case, it will also be worth asking ourselves if even a default is going to help more than it hurts). If it's quite hard, however, to get an estimate that is far off reality, then we'll still warn users that it's not precise, but that's probably good enough in practice. 
Have server pick query page size by default --- Key: CASSANDRA-6492 URL: https://issues.apache.org/jira/browse/CASSANDRA-6492 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Benjamin Lerer Priority: Minor Labels: client-impacting We're almost always going to do a better job picking a page size based on sstable stats, than users will guesstimating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
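The server-side guess-estimation discussed above amounts to dividing a (soft) byte target by the mean row size taken from sstable stats. A hedged sketch — the class, method, and fallback value are all illustrative assumptions, not Cassandra's paging code:

```java
// Illustrative sketch only: derive a row-count page size from a soft byte
// target and the mean row size estimated from sstable statistics.
class PageSizeEstimator {
    static int rowsPerPage(long targetBytes, long meanRowSizeBytes) {
        if (meanRowSizeBytes <= 0)
            return 100; // arbitrary fallback when no stats are available (assumption)
        // At least one row per page, otherwise target / mean row size.
        return (int) Math.max(1, targetBytes / meanRowSizeBytes);
    }
}
```

Because the mean row size is only an estimate, the resulting page size is exactly the "very soft target" the comment warns about: a badly skewed estimate directly inflates or shrinks the bytes actually returned per page.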
cassandra git commit: Fix handling of thrift non-string comparators
Repository: cassandra Updated Branches: refs/heads/trunk f5f3ae1da - 412e8743d Fix handling of thrift non-string comparators patch by slebresne; reviewed by iamaleksey for CASSANDRA-9825 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/412e8743 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/412e8743 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/412e8743 Branch: refs/heads/trunk Commit: 412e8743d7e933e5b3008242f74007f7ddd435cb Parents: f5f3ae1 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:38:02 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 10:39:10 2015 +0200 -- CHANGES.txt | 2 +- src/java/org/apache/cassandra/config/CFMetaData.java | 5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/412e8743/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 76d6e92..db306ea 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -2,7 +2,7 @@ * Metrics should use up to date nomenclature (CASSANDRA-9448) * Change CREATE/ALTER TABLE syntax for compression (CASSANDRA-8384) * Cleanup crc and adler code for java 8 (CASSANDRA-9650) - * Storage engine refactor (CASSANDRA-8099, 9743, 9746, 9759, 9781, 9808) + * Storage engine refactor (CASSANDRA-8099, 9743, 9746, 9759, 9781, 9808, 9825) * Update Guava to 18.0 (CASSANDRA-9653) * Bloom filter false positive ratio is not honoured (CASSANDRA-8413) * New option for cassandra-stress to leave a ratio of columns null (CASSANDRA-9522) http://git-wip-us.apache.org/repos/asf/cassandra/blob/412e8743/src/java/org/apache/cassandra/config/CFMetaData.java -- diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java index 84639dc..ee1ed25 100644 --- a/src/java/org/apache/cassandra/config/CFMetaData.java +++ b/src/java/org/apache/cassandra/config/CFMetaData.java @@ 
-1117,7 +1117,7 @@ public final class CFMetaData interval (%d).", maxIndexInterval, minIndexInterval)); } -// The comparator to validate the definition name. +// The comparator to validate the definition name with thrift. public AbstractType<?> thriftColumnNameType() { if (isSuper()) @@ -1127,7 +1127,8 @@ public final class CFMetaData return ((MapType)def.type).nameComparator(); } -return UTF8Type.instance; +assert isStaticCompactTable(); +return clusteringColumns.get(0).type; } public CFMetaData addAllColumnDefinitions(Collection<ColumnDefinition> defs)
[jira] [Updated] (CASSANDRA-9797) Don't wrap byte arrays in SequentialWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-9797: Fix Version/s: 2.2.x Don't wrap byte arrays in SequentialWriter -- Key: CASSANDRA-9797 URL: https://issues.apache.org/jira/browse/CASSANDRA-9797 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Labels: performance Fix For: 3.x, 2.2.x Attachments: 9797.txt While profiling a simple stress write run ({{cassandra-stress write n=200 -rate threads=50}} to be precise) with Mission Control, I noticed that a non trivial amount of heap pressure was due to the {{ByteBuffer.wrap()}} call in {{SequentialWriter.write(byte[])}}. Basically, when writing a byte array, we wrap it in a ByteBuffer to reuse the {{SequentialWriter.write(ByteBuffer)}} method. One could have hoped this wrapping would be stack allocated, but if Mission Control isn't lying (and I was told it's fairly honest on that front), it's not. And we do use that {{write(byte[])}} method quite a bit, especially with the new vint encodings since they use a {{byte[]}} thread local buffer and call that method. Anyway, it sounds very simple to me to have a more direct {{write(byte[])}} method, so attaching a patch to do that. A very quick local benchmark seems to show a little bit less allocation and a slight edge for the branch with this patch (on top of CASSANDRA-9705 I must add), but that local bench was far from scientific so happy if someone that knows how to use our perf service want to give that patch a shot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631066#comment-14631066 ] Stefania commented on CASSANDRA-7066: - bq. Also, I'd like to propose we hide TransactionLogs a little, by making its class constructor package-private, and ensuring it only ever exists as part of a LifecycleTransaction. [~benedict], we can make it package-private, but I think there is one legitimate case where we need the transaction logs without a lifecycle transaction, and that is when we only have a writer, like for {{SSTableTxnWriter}}. Do you think we should extend the lifecycle transaction to handle the case of no existing readers and no cfs, but only one new writer? It seems kind of heavy to me and I would prefer to just move SSTableTxnWriter to the lifecycle package, perhaps with a better name? Also, the transaction logs must be created before the writer, because it must register the new file in its constructor before creating it. Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so can result in duplication if we fail before deleting the replacements. 
I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
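The startup rule proposed above — delete every sstable that appears in the union of all ancestor sets — can be sketched over generation numbers. Everything here is an illustrative model, not Cassandra's sstable metadata API:

```java
import java.util.*;

// Illustrative sketch of the proposed cleanup: each live sstable (keyed by
// generation number) records its direct ancestors; any on-disk generation
// found in the union of those ancestor sets was superseded by a finished
// compaction and is a leftover that can be deleted at startup.
class LeftoverCleanup {
    static Set<Integer> leftovers(Map<Integer, Set<Integer>> ancestorsByGeneration) {
        Set<Integer> union = new HashSet<>();
        for (Set<Integer> ancestors : ancestorsByGeneration.values())
            union.addAll(ancestors);
        // Only generations actually present on disk can be deleted.
        union.retainAll(ancestorsByGeneration.keySet());
        return union;
    }
}
```

For example, if sstable 3 lists ancestors {1, 2} and sstables 1 and 2 are still on disk, both are identified as leftovers as soon as 3's write is complete — no system-table bookkeeping required.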
[jira] [Updated] (CASSANDRA-9837) Bad logging interpolation string in Memtable:
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9837: Reviewer: Robert Stupp Bad logging interpolation string in Memtable: -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs Completed flushing %s. Looking at the source it appears to be a mix between logback style {} substitution vs String.format %s style. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630926#comment-14630926 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. Made the base -> view mutations async. Once we write to the local batchlog, we don't care if the actual mutations are sent, it's best effort. So we can fire and forget these and update the base memtable. That's correct from an eventual consistency point of view, but I'm pretty sure this breaks the CL guarantees for the user. What we want is that if I write the base table at {{CL.QUORUM}}, and then read my MV at {{CL.QUORUM}}, then I'm guaranteed to see my previous update. But that requires that each replica does synchronous updates to the MV, and with the user CL. Writing to a local batchlog is not enough, in particular since it doesn't give any kind of guarantee of the visibility of the update. See my next comment though on that local batchlog. bq. Made the Base -> View batchlog update local only I've actually never understood why we do a batchlog update on the base table replicas (and so I think we should remove it, even though that's likely not the most costly one). Why do we need it? If my reasoning above is correct, the coordinator batchlog is enough to guarantee durability and eventual consistency, because we will replay the whole mutation until a QUORUM of replicas acknowledges success. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
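The read-your-writes argument in the comment above rests on standard quorum-overlap arithmetic: reads at R and writes at W intersect on at least one replica whenever R + W > N. A minimal sketch (illustrative names, not Cassandra code):

```java
// Illustrative quorum arithmetic behind the CL.QUORUM discussion above:
// with replication factor n, a read at CL r and a write at CL w are
// guaranteed to overlap on at least one replica iff r + w > n.
class QuorumMath {
    static int quorum(int n) { return n / 2 + 1; }

    static boolean readsSeeWrites(int n, int r, int w) { return r + w > n; }
}
```

With RF=3, QUORUM is 2, and QUORUM writes plus QUORUM reads (2 + 2 > 3) overlap — which is exactly the guarantee that fire-and-forget view updates would break.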
[jira] [Commented] (CASSANDRA-9785) dtests-offheap: leak detected
[ https://issues.apache.org/jira/browse/CASSANDRA-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630866#comment-14630866 ] Robert Stupp commented on CASSANDRA-9785: - [~benedict] out of curiosity: do you know whether this _LEAK DETECTED_ is already handled in another ticket? dtests-offheap: leak detected - Key: CASSANDRA-9785 URL: https://issues.apache.org/jira/browse/CASSANDRA-9785 Project: Cassandra Issue Type: Bug Reporter: Robert Stupp Following dtests fail with LEAK DETECTED with {{OFFHEAP_MEMTABLES=yes}}: * repair_test.py:TestRepair.dc_repair_test * repair_test.TestRepair.simple_sequential_repair_test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630928#comment-14630928 ] Stefania edited comment on CASSANDRA-8894 at 7/17/15 7:55 AM: -- [~benedict] I went ahead and implemented the latest suggested optimization in this commit [here|https://github.com/stef1927/cassandra/commit/ad6712cdc12380ef0529a13ed6e9bd1c5cecebad]. I've also attached tentative stress yaml profiles, which I intend to run like this: {code} user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(insert=1,\) n=10 -rate threads=50 user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(singleblob=1,\) n=10 -rate threads=50 {code} Can you confirm the profiles are what you intended, basically a partition id and a blob column with the size distributed as you previously indicated. I'm not sure if there is anything else I should do to ensure reads mostly hit disk - other than spreading the partition id across a big interval? I created these additional branches: - trunk-pre-8099 - 8894-pre-8099 - 8894-pre-8099-first-optim - 8894-first-optim The names are self describing except for first-optim which means before implementing the latest optimization. A tag would have been enough but cstar perf does not support it. Unfortunately cstar perf has been giving me more problems other than tags, cc [~enigmacurry]: * The old trunk branches pre 8099 fail due to the schema tables changes (http://cstar.datastax.com/tests/id/e134ee7e-2c46-11e5-a180-42010af0688f) : InvalidQueryException: Keyspace system_schema does not exist. However I think if we fake version 2.2 in build.xml we should be OK. * The new branches either fail because of a nodetool failure (http://cstar.datastax.com/tests/id/86abc144-2c55-11e5-87b9-42010af0688f) or the graphs are wrong (http://cstar.datastax.com/tests/id/11fe9c5a-2c45-11e5-9760-42010af0688f). 
Here is the nodetool failure: {code} [10.200.241.104] Executing task 'ensure_running' [10.200.241.104] run: JAVA_HOME=~/fab/jvms/jdk1.8.0_45 ~/fab/cassandra/bin/nodetool ring [10.200.241.104] out: error: null [10.200.241.104] out: -- StackTrace -- [10.200.241.104] out: java.util.NoSuchElementException [10.200.241.104] out: at com.google.common.collect.LinkedHashMultimap$1.next(LinkedHashMultimap.java:506) [10.200.241.104] out: at com.google.common.collect.LinkedHashMultimap$1.next(LinkedHashMultimap.java:494) [10.200.241.104] out: at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) [10.200.241.104] out: at java.util.Collections.max(Collections.java:708) [10.200.241.104] out: at org.apache.cassandra.tools.nodetool.Ring.execute(Ring.java:63) [10.200.241.104] out: at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:240) [10.200.241.104] out: at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:154) [10.200.241.104] out: [10.200.241.104] out: {code} I'll resume the performance tests once cstar perf is stable again. was (Author: stefania): [~benedict] I went ahead and implemented the latest suggested optimization in this commit [here|https://github.com/stef1927/cassandra/commit/ad6712cdc12380ef0529a13ed6e9bd1c5cecebad]. I've also attached tentative stress yaml profiles, which I intend to run like this: {code} user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(insert=1,\) n=10 -rate threads=50 user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(singleblob=1,\) n=10 -rate threads=50 {code} Can you confirm the profiles are what you intended, basically a partition id and a blob column with the size distributed as you previously indicated. I'm not sure if there is anything else I should do to ensure reads mostly hit disk - other than spreading the partition id across a bit interval? 
I created these additional branches: - trunk-pre-8099 - 8894-pre-8099 - 8894-pre-8099-first-optim - 8894-first-optim The names are self describing except for first-optim which means before implementing the latest optimization. A tag would have been enough but cstar perf does not support it. Unfortunately cstar perf has been giving me more problems other than tags, cc [~enigmacurry]: * The old trunk branches pre 8099 fail due to the schema tables changes (http://cstar.datastax.com/tests/id/e134ee7e-2c46-11e5-a180-42010af0688f) : InvalidQueryException: Keyspace system_schema does not exist. However I think if we fake version 2.2 in build.xml we should be OK. * The new branches either fail because of a nodetool failure (http://cstar.datastax.com/tests/id/86abc144-2c55-11e5-87b9-42010af0688f) or the graphs are wrong (http://cstar.datastax.com/tests/id/11fe9c5a-2c45-11e5-9760-42010af0688f). Here is the nodetool failure: {code} [10.200.241.104] Executing task
[jira] [Updated] (CASSANDRA-9837) Bad logging interpolation string in Memtable:
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9837: Assignee: (was: Robert Stupp) Bad logging interpolation string in Memtable: -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs Completed flushing %s. Looking at the source it appears to be a mix between logback style {} substitution vs String.format %s style. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-9837) Bad logging interpolation string in Memtable:
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp reassigned CASSANDRA-9837: --- Assignee: Robert Stupp Bad logging interpolation string in Memtable: -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Assignee: Robert Stupp Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs Completed flushing %s. Looking at the source it appears to be a mix between logback style {} substitution vs String.format %s style. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9670) Cannot run CQL scripts on Windows AND having error Ubuntu Linux
[ https://issues.apache.org/jira/browse/CASSANDRA-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631485#comment-14631485 ] Philip Thompson commented on CASSANDRA-9670: Sorry, [~bholya], I was away from the office for a few weeks. I'm looking at these scripts now, and while the problem is definitely in the importing of one of these characters, the sp_setup.cql file you attached does not define the schema for the cities table, which is the one with the failing imports from the attached .csv files. Could you please give me that schema so I can repro the issue? Cannot run CQL scripts on Windows AND having error Ubuntu Linux --- Key: CASSANDRA-9670 URL: https://issues.apache.org/jira/browse/CASSANDRA-9670 Project: Cassandra Issue Type: Bug Components: Core Environment: DataStax Community Edition on Windows 7, 64 Bit and Ubuntu Reporter: Sanjay Patel Assignee: Philip Thompson Labels: cqlsh Fix For: 2.1.x Attachments: cities.cql, germany_cities.cql, germany_cities.cql, india_cities.csv, india_states.csv, sp_setup.cql After installation of 2.1.6 and 2.1.7 it is not possible to execute CQL scripts that were earlier executed successfully on Windows and Linux environments. I have tried installing the latest Python 2 version and executing again, but I get the same error. Attaching cities.cql for reference. 
--- {code} cqlsh source 'shoppoint_setup.cql' ; shoppoint_setup.cql:16:InvalidRequest: code=2200 [Invalid query] message=Keyspace 'shopping' does not exist shoppoint_setup.cql:647:'ascii' codec can't decode byte 0xc3 in position 57: ordinal not in range(128) cities.cql:9:'ascii' codec can't decode byte 0xc3 in position 51: ordinal not in range(128) cities.cql:14: Error starting import process: cities.cql:14:Can't pickle type 'thread.lock': it's not found as thread.lock cities.cql:14:can only join a started process cities.cql:16: Error starting import process: cities.cql:16:Can't pickle type 'thread.lock': it's not found as thread.lock cities.cql:16:can only join a started process Traceback (most recent call last): File string, line 1, in module File I:\programm\python2710\lib\multiprocessing\forking.py, line 380, in main prepare(preparation_data) File I:\programm\python2710\lib\multiprocessing\forking.py, line 489, in prepare Traceback (most recent call last): File string, line 1, in module file, path_name, etc = imp.find_module(main_name, dirs) ImportError: No module named cqlsh File I:\programm\python2710\lib\multiprocessing\forking.py, line 380, in main prepare(preparation_data) File I:\programm\python2710\lib\multiprocessing\forking.py, line 489, in prepare file, path_name, etc = imp.find_module(main_name, dirs) ImportError: No module named cqlsh shoppoint_setup.cql:663:'ascii' codec can't decode byte 0xc3 in position 18: ordinal not in range(128) ipcache.cql:28:ServerError: ErrorMessage code= [Server error] message=java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.FileNotFoundException: I:\var\lib\cassandra\data\syste m\schema_columns-296e9c049bec3085827dc17d3df2122a\system-schema_columns-ka-300-Data.db (The process cannot access the file because it is being used by another process) ccavn_bulkupdate.cql:75:ServerError: ErrorMessage code= [Server error] message=java.lang.RuntimeException: 
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.FileNotFoundException: I:\var\lib\cassandra\d ata\system\schema_columns-296e9c049bec3085827dc17d3df2122a\system-schema_columns-tmplink-ka-339-Data.db (The process cannot access the file because it is being used by another process) shoppoint_setup.cql:680:'ascii' codec can't decode byte 0xe2 in position 14: ordinal not in range(128){code} - In one of Ubuntu development environment we have similar errors. - {code} shoppoint_setup.cql:647:'ascii' codec can't decode byte 0xc3 in position 57: ordinal not in range(128) cities.cql:9:'ascii' codec can't decode byte 0xc3 in position 51: ordinal not in range(128) (corresponding line) COPY cities (city,country_code,state,isactive) FROM 'testdata/india_cities.csv' ; [19:53:18] j.basu: shoppoint_setup.cql:663:'ascii' codec can't decode byte 0xc3 in position 18: ordinal not in range(128) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
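The 'ascii' codec errors above are the classic default-encoding problem: cqlsh decodes the script bytes as ASCII, and the first multi-byte UTF-8 character aborts the import (0xc3 is the lead byte of UTF-8 characters such as 'ü' or 'ö'). A small Python sketch with a hypothetical cities CSV row:

```python
# A hypothetical row from a cities CSV containing a non-ASCII name.
line = b"M\xc3\xbcnchen,DE,Bavaria,true"

try:
    line.decode("ascii")
    raised = False
except UnicodeDecodeError as exc:
    raised = True
    assert line[exc.start] == 0xc3  # the byte called out in the errors above

assert raised
# Decoding with the file's real encoding works fine.
assert line.decode("utf-8") == "M\u00fcnchen,DE,Bavaria,true"
```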
[jira] [Commented] (CASSANDRA-9795) Fix cqlsh dtests on windows
[ https://issues.apache.org/jira/browse/CASSANDRA-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631532#comment-14631532 ] T Jake Luciani commented on CASSANDRA-9795: --- Yeah, but the fix for windows here was to set (delete=False), since it was keeping other processes from opening the file Fix cqlsh dtests on windows --- Key: CASSANDRA-9795 URL: https://issues.apache.org/jira/browse/CASSANDRA-9795 Project: Cassandra Issue Type: Sub-task Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 2.2.x There are a number of portability problems with python on win32 as I've learned over the past few days. * Our use of multiprocessing is broken in cqlsh for windows. https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming The code was passing self to the sub-process, which on windows must be pickleable (it's not). So I refactored it to be a class which is initialized in the parent. Also, when the windows process starts it needs to load our cqlsh as a module. So I moved cqlsh to cqlsh.py and added a tiny wrapper for bin/cqlsh * Our use of strftime is broken on windows The default timezone information %z in strftime isn't valid on windows. I added code to the date format parser in C* to support windows timezone labels. * We have a number of file access issues in dtest * csv import/export is broken on windows and requires all files be opened with mode 'wb' or 'rb' http://stackoverflow.com/questions/1170214/pythons-csv-writer-produces-wrong-line-terminator/1170297#1170297 * CCM's use of popen required the universal_newlines=True flag to work on windows -- This message was sent by Atlassian JIRA (v6.3.4#6332)
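The pickling constraint is the crux of the multiprocessing fix: on Windows, workers are spawned as fresh interpreters and their arguments are shipped over by pickling. A minimal Python sketch of the broken and refactored shapes (class names are hypothetical, not the actual cqlsh code):

```python
import pickle
import threading

# What broke: the object handed to the worker held unpicklable state
# (compare the "Can't pickle type 'thread.lock'" traceback above).
class BadImporter:
    def __init__(self):
        self.lock = threading.Lock()

# The fix pattern: keep only plain, picklable data on the object that
# crosses the process boundary; create locks/connections in the child.
class GoodImporter:
    def __init__(self, csv_path, columns):
        self.csv_path = csv_path
        self.columns = columns

def can_pickle(obj):
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False

assert not can_pickle(BadImporter())
assert can_pickle(GoodImporter("india_cities.csv", ["city", "state"]))
```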
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631496#comment-14631496 ] Jack Krupansky commented on CASSANDRA-6477: --- bq. multiple MVs being updated It would be good to get a handle on what the scalability of MVs per base table is in terms of recommended best practice. Hundreds? Thousands? A few dozen? Maybe just a handful, like 5 or 10 or a dozen? I hate it when a feature like this gets implemented without scalability in mind and then some poor/idiot user comes along and tries a use case which is way out of line with the implemented architecture but we provide no guidance as to what the practical limits really are (e.g., number of tables - thousands vs. hundreds.) It seems to me that the primary use case is for query tables, where an app might typically have a handful of queries and probably not more than a small number of dozens in even extreme cases. In any case, it would be great to be clear about the design limit for number of MVs per base table - and to make sure some testing gets done to assure that the number is practical. And by design limit I don't mean a hard limit where more will cause an explicit error, but where performance is considered acceptable. Are the MV updates occurring in parallel with each other, or are they serial? How many MVs could a base table have before the MV updates effectively become serialized? Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. 
However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631500#comment-14631500 ] Tupshin Harper commented on CASSANDRA-6477: --- Just a reminder (since it was a loong time ago in this ticket), that we were going to target immediate consistency once we could leverage RAMP, and not before. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631506#comment-14631506 ] Carl Yeksigian commented on CASSANDRA-6477: --- [~tupshin] Because we are no longer implementing this as a non-denormalized global index, we don't have multiple partitions to read, so RAMP unfortunately won't solve problems in a materialized view. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9683) Get much higher load and latencies after upgrading from 2.1.6 to cassandra 2.1.7
[ https://issues.apache.org/jira/browse/CASSANDRA-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631513#comment-14631513 ] Ariel Weisberg commented on CASSANDRA-9683: --- OK, so that might explain it. The metric you are looking at for writes might not include replication. These are screenshots from OpsCenter? I will look into how those metrics are collected. Disabling durable writes doesn't impact replication much, since replication still has to occur; all it does is disable writing to the commit log at each node. I will try again with multiple nodes and keep an eye on the OpsCenter metrics. Get much higher load and latencies after upgrading from 2.1.6 to cassandra 2.1.7 -- Key: CASSANDRA-9683 URL: https://issues.apache.org/jira/browse/CASSANDRA-9683 Project: Cassandra Issue Type: Bug Environment: Ubuntu 12.04 (3.13 Kernel) * 3 JDK: Oracle JDK 7 RAM: 32GB Cores 4 (+4 HT) Reporter: Loic Lambiel Assignee: Ariel Weisberg Fix For: 2.1.x Attachments: cassandra-env.sh, cassandra.yaml, cfstats.txt, os_load.png, pending_compactions.png, read_latency.png, schema.txt, system.log, write_latency.png After upgrading our cassandra staging cluster version from 2.1.6 to 2.1.7, the average load grows from 0.1-0.3 to 1.8. Latencies did increase as well. We see an increase of pending compactions, probably due to CASSANDRA-9592. This cluster has almost no workload (staging environment) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631507#comment-14631507 ] T Jake Luciani commented on CASSANDRA-6477: --- bq. Frankly, I'm pretty negative on adding such option. But then why do we even offer the batchlog at all? Hand-rolled materialized views use them. And if you feel we should guarantee a consistency level, then you would never use a batchlog, since any timeout would mean you didn't achieve your consistency level and must retry. If you are talking about just the UE, then we could check the MV replica UP/Down status in the coordinator as well as the base. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9795) Fix cqlsh dtests on windows
[ https://issues.apache.org/jira/browse/CASSANDRA-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631527#comment-14631527 ] Philip Thompson commented on CASSANDRA-9795: [~JoshuaMcKenzie], [~tjake], NamedTemporaryFiles are automatically deleted at the end of tests already, so there is no need to explicitly remove them. See https://docs.python.org/2/library/tempfile.html#tempfile.NamedTemporaryFile Fix cqlsh dtests on windows --- Key: CASSANDRA-9795 URL: https://issues.apache.org/jira/browse/CASSANDRA-9795 Project: Cassandra Issue Type: Sub-task Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 2.2.x There are a number of portability problems with python on win32 as I've learned over the past few days. * Our use of multiprocessing is broken in cqlsh for windows. https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming The code was passing self to the sub-process, which on windows must be pickleable (it's not). So I refactored it to be a class which is initialized in the parent. Also, when the windows process starts it needs to load our cqlsh as a module. So I moved cqlsh to cqlsh.py and added a tiny wrapper for bin/cqlsh * Our use of strftime is broken on windows The default timezone information %z in strftime isn't valid on windows. I added code to the date format parser in C* to support windows timezone labels. * We have a number of file access issues in dtest * csv import/export is broken on windows and requires all files be opened with mode 'wb' or 'rb' http://stackoverflow.com/questions/1170214/pythons-csv-writer-produces-wrong-line-terminator/1170297#1170297 * CCM's use of popen required the universal_newlines=True flag to work on windows -- This message was sent by Atlassian JIRA (v6.3.4#6332)
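A short Python sketch of the two delete modes under discussion (run on a POSIX system here; the Windows file-sharing restriction is why delete=False was needed there):

```python
import os
import tempfile

# delete=True (the default): the file goes away when closed -- which is
# why explicit removal in the dtests is redundant.
f = tempfile.NamedTemporaryFile(mode="w", delete=True)
name = f.name
f.close()
assert not os.path.exists(name)

# delete=False: the file survives close(), so another process (e.g. a
# cqlsh COPY on Windows, where an already-open file can't be reopened)
# can use it -- but now someone must unlink it explicitly.
f = tempfile.NamedTemporaryFile(mode="w", delete=False)
name = f.name
f.close()
assert os.path.exists(name)
os.unlink(name)
```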
[jira] [Commented] (CASSANDRA-9519) CASSANDRA-8448 Doesn't seem to be fixed
[ https://issues.apache.org/jira/browse/CASSANDRA-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631083#comment-14631083 ] Sylvain Lebresne commented on CASSANDRA-9519: - I'll admit I just went with the 'fix version' of that issue. If CASSANDRA-8448 wasn't committed to 2.0 as the fix version implies, we would need to fix that too since it has part of the fix of the actual problem, but I don't know what the rationale was for not committing it to 2.0 in the first place (if it was deemed not worth the risk, no reason for that to have changed). CASSANDRA-8448 Doesn't seem to be fixed --- Key: CASSANDRA-9519 URL: https://issues.apache.org/jira/browse/CASSANDRA-9519 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jeremiah Jordan Assignee: Sylvain Lebresne Fix For: 2.1.9, 2.2.0 Attachments: 9519.txt Still seeing the "Comparison method violates its general contract!" error in 2.1.5 {code} java.lang.IllegalArgumentException: Comparison method violates its general contract! 
at java.util.TimSort.mergeHi(TimSort.java:895) ~[na:1.8.0_45] at java.util.TimSort.mergeAt(TimSort.java:512) ~[na:1.8.0_45] at java.util.TimSort.mergeCollapse(TimSort.java:437) ~[na:1.8.0_45] at java.util.TimSort.sort(TimSort.java:241) ~[na:1.8.0_45] at java.util.Arrays.sort(Arrays.java:1512) ~[na:1.8.0_45] at java.util.ArrayList.sort(ArrayList.java:1454) ~[na:1.8.0_45] at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_45] at org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:158) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:187) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1530) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1688) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:63) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:260) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at 
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:272) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
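One classic way a double-valued comparator violates the total-order contract that TimSort checks for is NaN handling (whether that is the exact cause here isn't established above; CASSANDRA-8448 involved snitch score comparisons). With raw less-than/greater-than tests, NaN compares "equal" to everything, breaking transitivity. A Python sketch:

```python
def score_cmp(a, b):
    # Mimics comparing doubles with raw < / > tests; with NaN in play,
    # both tests are False, so NaN compares "equal" to everything.
    if a < b:
        return -1
    if a > b:
        return 1
    return 0

nan = float("nan")

# The contract requires consistency: if cmp(x, y) == 0 and
# cmp(y, z) == 0, then cmp(x, z) must be 0. NaN breaks it:
assert score_cmp(1.0, nan) == 0
assert score_cmp(nan, 2.0) == 0
assert score_cmp(1.0, 2.0) == -1  # contradiction; Java's TimSort throws
```

CPython's sort silently tolerates such a comparator (it may just mis-order elements), whereas Java's TimSort detects the inconsistency and throws the IllegalArgumentException shown above.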
[jira] [Commented] (CASSANDRA-9472) Reintroduce off heap memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-9472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631112#comment-14631112 ] Sylvain Lebresne commented on CASSANDRA-9472: - I'll note that unless I'm forgetting something, re-introducing this is not terribly complex (especially post-CASSANDRA-9705). CASSANDRA-8099 hasn't touched the off-heap memtable machinery much, so I think all we need is to implement the {{NativeAllocator.rowBuilder}} method (referencing CASSANDRA-9705 patch here). Which in turn mainly means writing a {{NativeRow}} implementation that is the counterpart of our previous {{NativeCell}} implementations (and we can likely salvage some of the code of said {{NativeCell}}). This thus requires coming up with a reasonable serialization format for offheap rows, but that's hardly rocket science. Note that I'm just talking here of doing what this ticket is actually about, which is to re-introduce off-heap memtables in a way that is as close as possible to what we had pre-CASSANDRA-8099. And I don't think we should wait on Java 9 or anything to do that. Of course we will improve on all this in the future, but let's please leave that to some future ticket. Reintroduce off heap memtables -- Key: CASSANDRA-9472 URL: https://issues.apache.org/jira/browse/CASSANDRA-9472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.x CASSANDRA-8099 removes off heap memtables. We should reintroduce them ASAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631119#comment-14631119 ] Stefania commented on CASSANDRA-7066: - Then I need to add a new offline method to pass the metadata to the transaction logs and that's all? Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
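The proposed startup cleanup can be sketched in a few lines (Python used for illustration rather than the actual Java, and the file names are hypothetical): every sstable records its direct ancestors, and anything appearing in the union of all ancestor sets is a superseded leftover.

```python
def leftovers(sstables):
    """sstables: dict mapping sstable name -> set of direct ancestor names.
    Returns the on-disk sstables that some surviving sstable supersedes."""
    superseded = set().union(*sstables.values()) if sstables else set()
    return superseded & set(sstables)  # only delete files actually present

live = {
    "ka-1-Data.db": set(),
    "ka-2-Data.db": set(),
    "ka-3-Data.db": {"ka-1-Data.db", "ka-2-Data.db"},  # compacted from 1+2
}
# 1 and 2 were compacted into 3 but never deleted -> clean them up on startup.
assert leftovers(live) == {"ka-1-Data.db", "ka-2-Data.db"}
```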
[jira] [Commented] (CASSANDRA-9472) Reintroduce off heap memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-9472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631147#comment-14631147 ] Sylvain Lebresne commented on CASSANDRA-9472: - bq. however if we're shooting for a beta release ASAP, it will not likely be done in time. Totally agreed. I did not mean to imply that we should attempt it for 3.0. Since, as you said, it hasn't been taken out of experimental status, we've agreed that re-introducing it in 3.1/3.2 is good enough and I think we should stick to that. Though I also agree that since it is still experimental _and_ the fix here is totally isolated, if beta1 goes surprisingly well and someone finds time to get this ready for the RC, then I wouldn't personally oppose a late inclusion. Reintroduce off heap memtables -- Key: CASSANDRA-9472 URL: https://issues.apache.org/jira/browse/CASSANDRA-9472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.x CASSANDRA-8099 removes off heap memtables. We should reintroduce them ASAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631203#comment-14631203 ] Benedict commented on CASSANDRA-8894: - I may aim to integrate this work prior to our running full performance tests, as I would like to see this safely hit pre-3.0, and we know it is effective already for in-memory workloads. The question now is more the tuning parameters, and how we might yet tweak them, and that's something that can be done much closer to release if we need to. Some quick feedback: * crossingProbability is always zero, I think? Need to use {{ / 4096d}} * disk_optimization_record_size_percentile: disk_optimization_estimate_percentile? * disk_optimization_crossing_chance: disk_optimization_page_cross_chance? No super strong feelings about the names, though. Just suggestions; not 100% certain they're even better from my POV, nor that it's important. Otherwise this all LGTM, and I'm keen to commit. When performance testing, we should figure out how (via cstar) we can tweak read ahead settings on the machine. [~enigmacurry]: is there any way we could have that as a GUI option? This new code should make read ahead a bad idea for SSD clusters, and disabling it will likely see standard mode become a superior option to mmap, since we can predict exactly how much we should read better than the OS can. Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size -- Key: CASSANDRA-8894 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Labels: benedict-to-commit Fix For: 3.x Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml A large contributor to buffered reads being slower than mmapped is likely that we read a full 64Kb at once, when average record sizes may be as low as 140 bytes on our stress tests. 
The TLB has only 128 entries on a modern core, and each read will touch 32 of these, meaning we are almost never hitting the TLB, and will be incurring at least 30 unnecessary misses each time (as well as the other costs of larger than necessary accesses). When working with an SSD there is little to no benefit reading more than 4Kb at once, and in either case reading more data than we need is wasteful. So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4Kb), so that we can expect to read each record in one operation. I also propose that we create a pool of these buffers up-front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4Kb of expected record size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
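The proposed sizing rule (the next power of two at or above the average record size, floored at 4Kb) can be sketched as follows (a Python illustration with a hypothetical helper, not the actual implementation):

```python
def buffer_size(avg_record_size, minimum=4096):
    """Next power of two >= avg_record_size, floored at 4 KiB, so a
    typical record is fetched in a single read of one or more whole
    virtual pages."""
    size = minimum
    while size < avg_record_size:
        size *= 2
    return size

assert buffer_size(140) == 4096        # tiny records: one 4 KiB page
assert buffer_size(4096) == 4096
assert buffer_size(5000) == 8192
assert buffer_size(70_000) == 131072   # large records still round up
```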
[jira] [Commented] (CASSANDRA-9835) Restore collectTimeOrderedData behaviour post-8099
[ https://issues.apache.org/jira/browse/CASSANDRA-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631123#comment-14631123 ] Sylvain Lebresne commented on CASSANDRA-9835: - What makes you think that? We do still do this, it's now in {{SinglePartitionNamesCommand}} (in the {{queryMemtableAndDiskInternal}} method to be precise). In fact, if anything, CASSANDRA-8099 makes this a lot more useful since pre-CASSANDRA-8099, {{collectTimeOrderedData}} is _never_ used for non-compact tables (we just never create names queries for non-compact tables for reasons explained at length in CASSANDRA-7085 and related issues). Restore collectTimeOrderedData behaviour post-8099 -- Key: CASSANDRA-9835 URL: https://issues.apache.org/jira/browse/CASSANDRA-9835 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.0.x AFAICT, we no longer prune the sstables we iterate once we know we've satisfied a query. This is not only still possible, but possible in more scenarios (since we can do it for any single CQL-row lookup). Affected workloads may have noticeably degraded behaviour, and this will impact CASSANDRA-6477. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631129#comment-14631129 ] Stefania commented on CASSANDRA-7066: - Sounds good. :) Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9835) Restore collectTimeOrderedData behaviour post-8099
[ https://issues.apache.org/jira/browse/CASSANDRA-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-9835. - Resolution: Invalid Restore collectTimeOrderedData behaviour post-8099 -- Key: CASSANDRA-9835 URL: https://issues.apache.org/jira/browse/CASSANDRA-9835 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.0.x AFAICT, we no longer prune the sstables we iterate once we know we've satisfied a query. This is not only still possible, but possible in more scenarios (since we can do it for any single CQL-row lookup). Affected workloads may have noticeably degraded behaviour, and this will impact CASSANDRA-6477. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9835) Restore collectTimeOrderedData behaviour post-8099
[ https://issues.apache.org/jira/browse/CASSANDRA-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631143#comment-14631143 ] Benedict commented on CASSANDRA-9835: - Hmm. I guess a between-keyboard-and-chair error. I only found SinglePartitionReadCommand when finding implementations of queryMemtableAndDisk. Looking at it, it could perhaps do with a little TLC still. Restore collectTimeOrderedData behaviour post-8099 -- Key: CASSANDRA-9835 URL: https://issues.apache.org/jira/browse/CASSANDRA-9835 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.0.x AFAICT, we no longer prune the sstables we iterate once we know we've satisfied a query. This is not only still possible, but possible in more scenarios (since we can do it for any single CQL-row lookup). Affected workloads may have noticeably degraded behaviour, and this will impact CASSANDRA-6477. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6237) Allow range deletions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631193#comment-14631193 ] Benjamin Lerer commented on CASSANDRA-6237: --- {quote}It's not clear to me why some user function tests changed, e.g. here and here. Are these in scope for this ticket?{quote} Statements are processed in 2 phases: they are prepared and then executed. We try to perform as much of the validation as possible in the preparation phase. That was not always the case for Insert/Update/Delete statements; I fixed that in my patch. Some of the unit tests were using invalid statements, but as they were only testing the preparation phase, no errors were thrown. After my patch that was no longer the case, so I had to make sure that the statements were valid. Allow range deletions in CQL Key: CASSANDRA-6237 URL: https://issues.apache.org/jira/browse/CASSANDRA-6237 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benjamin Lerer Priority: Minor Labels: cql, docs Fix For: 3.0.0 rc1 Attachments: CASSANDRA-6237.txt We use RangeTombstones internally in a number of places, but we could expose them more directly too. Typically, given a table like: {noformat} CREATE TABLE events ( id text, created_at timestamp, content text, PRIMARY KEY (id, created_at) ) {noformat} we could allow queries like: {noformat} DELETE FROM events WHERE id='someEvent' AND created_at < 'Jan 3, 2013'; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631108#comment-14631108 ] Benedict commented on CASSANDRA-8894: - A few comments on the stress testing: * The blob_id population doesn't need to be constrained (it defaults to something like 1..100B) * To perform the inserts, we want to ensure we construct a dataset large enough to spill to disk, i.e. we probably want to insert at least 100M items (perhaps 200M+) if they're only ~50 bytes each. * We probably want to run with slightly more threads, say 300 The graphs that were produced don't actually appear to be broken: the stress run was simply extremely brief, since it only operated over 100K items :) At risk of sounding like a broken record to everyone, it can help to use K, M, B syntax for your numbers in the profile/command line. Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size -- Key: CASSANDRA-8894 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Labels: benedict-to-commit Fix For: 3.x Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml A large contributor to buffered reads being slower than mmapped is likely that we read a full 64Kb at once, when average record sizes may be as low as 140 bytes on our stress tests. The TLB has only 128 entries on a modern core, and each read will touch 32 of these, meaning we are almost never hitting the TLB, and will be incurring at least 30 unnecessary misses each time (as well as the other costs of larger than necessary accesses). When working with an SSD there is little to no benefit reading more than 4Kb at once, and in either case reading more data than we need is wasteful. 
So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4Kb), so that we expect to complete each read in one operation. I also propose that we create a pool of these buffers up-front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4Kb of expected record size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
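The proposed sizing rule above can be sketched as follows. This is a hypothetical helper, not Cassandra's actual code; true page alignment also needs the buffer's native address (e.g. via sun.misc.Unsafe), which this sketch omits.

```java
import java.nio.ByteBuffer;

// Illustrative sketch of the proposal: pick the next power of two >= the
// average record size, floored at one 4Kb virtual page, so a typical read
// completes in a single operation.
public class ReadBufferSizing
{
    static final int PAGE_SIZE = 4096;

    // Next power of two >= avgRecordSize, but never below one page.
    public static int bufferSize(int avgRecordSize)
    {
        if (avgRecordSize <= PAGE_SIZE)
            return PAGE_SIZE;
        return Integer.highestOneBit(avgRecordSize - 1) << 1;
    }

    // The proposal also pools these buffers up-front and page-aligns them;
    // this sketch only allocates, omitting the alignment machinery.
    public static ByteBuffer allocate(int avgRecordSize)
    {
        return ByteBuffer.allocateDirect(bufferSize(avgRecordSize));
    }
}
```

For a 140-byte average record this yields the 4Kb minimum, while a 5Kb record rounds up to 8Kb rather than the old fixed 64Kb.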
[jira] [Commented] (CASSANDRA-9472) Reintroduce off heap memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-9472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631117#comment-14631117 ] Aleksey Yeschenko commented on CASSANDRA-9472: -- [~slebresne] I was referring to the 'we should make memtables completely off-heap' comment. Reintroduce off heap memtables -- Key: CASSANDRA-9472 URL: https://issues.apache.org/jira/browse/CASSANDRA-9472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.x CASSANDRA-8099 removes off heap memtables. We should reintroduce them ASAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9472) Reintroduce off heap memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-9472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631121#comment-14631121 ] Benedict commented on CASSANDRA-9472: - bq. I don't think we should wait on Java 9 or anything to do that. Of course we will improve on all this in the future, but let's please leave that to some future ticket. Yes, I think we're all on the same page there. bq. re-introducing this is not terribly complex Also agreed; however, if we're shooting for a beta release ASAP, it will likely not be done in time. We could perhaps sneak it in before RC or at .0 if we're willing to do that, of course. Perhaps since it was never taken out of experimental status that would be acceptable. But there is still a lot of other follow-on work from 8099 to get through. Reintroduce off heap memtables -- Key: CASSANDRA-9472 URL: https://issues.apache.org/jira/browse/CASSANDRA-9472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.x CASSANDRA-8099 removes off heap memtables. We should reintroduce them ASAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631124#comment-14631124 ] Benedict commented on CASSANDRA-7066: - Should be. In fact we'll need a metadata-accepting method for the approach I've taken in CASSANDRA-9669, which is to use a {{LifecycleTransaction}} for memtable flush; this means it needs to be constructed empty (but is an online operation). So there is synergy :) Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so they can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way, as soon as we finish writing, we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
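The startup cleanup described in the ticket — delete any sstable whose generation appears in the union of all ancestor sets — can be sketched as below. The class and field names are illustrative, not Cassandra's real API.

```java
import java.util.*;

// Hypothetical sketch: every sstable records the generations of its direct
// ancestors; any on-disk sstable whose generation appears in the union of
// all ancestor sets was superseded by a completed compaction and is a
// leftover to delete on startup.
public class LeftoverCleanup
{
    public static final class SSTable
    {
        public final int generation;
        public final Set<Integer> ancestors;

        public SSTable(int generation, Set<Integer> ancestors)
        {
            this.generation = generation;
            this.ancestors = ancestors;
        }
    }

    // Returns the sstables that should be deleted on startup.
    public static List<SSTable> findLeftovers(Collection<SSTable> onDisk)
    {
        // Union of all ancestor sets across live sstables.
        Set<Integer> superseded = new HashSet<>();
        for (SSTable t : onDisk)
            superseded.addAll(t.ancestors);

        List<SSTable> toDelete = new ArrayList<>();
        for (SSTable t : onDisk)
            if (superseded.contains(t.generation))
                toDelete.add(t);
        return toDelete;
    }
}
```

For example, if generation 3 lists ancestors {1, 2} and both 1 and 2 are still on disk, both get cleaned up, regardless of which compaction type produced generation 3.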
[jira] [Commented] (CASSANDRA-9519) CASSANDRA-8448 Doesn't seem to be fixed
[ https://issues.apache.org/jira/browse/CASSANDRA-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631184#comment-14631184 ] Jeremiah Jordan commented on CASSANDRA-9519: 8448 went into 2.0.13 per the fix version? CASSANDRA-8448 Doesn't seem to be fixed --- Key: CASSANDRA-9519 URL: https://issues.apache.org/jira/browse/CASSANDRA-9519 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jeremiah Jordan Assignee: Sylvain Lebresne Fix For: 2.1.9, 2.2.0 Attachments: 9519.txt Still seeing the "Comparison method violates its general contract!" error in 2.1.5
{code}
java.lang.IllegalArgumentException: Comparison method violates its general contract!
	at java.util.TimSort.mergeHi(TimSort.java:895) ~[na:1.8.0_45]
	at java.util.TimSort.mergeAt(TimSort.java:512) ~[na:1.8.0_45]
	at java.util.TimSort.mergeCollapse(TimSort.java:437) ~[na:1.8.0_45]
	at java.util.TimSort.sort(TimSort.java:241) ~[na:1.8.0_45]
	at java.util.Arrays.sort(Arrays.java:1512) ~[na:1.8.0_45]
	at java.util.ArrayList.sort(ArrayList.java:1454) ~[na:1.8.0_45]
	at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_45]
	at org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:158) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:187) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1530) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1688) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:63) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:260) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:272) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
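TimSort's "Comparison method violates its general contract!" failure is characteristic of sorting with a comparator whose ordering can change mid-sort — for instance, a snitch comparator that reads a live score map while another thread updates it. The sketch below illustrates the unsafe pattern and the snapshot-based fix; the names are mine, not the actual DynamicEndpointSnitch code.

```java
import java.util.*;

// Illustrative only: sorting endpoints by a score map.
public class SnitchSortSketch
{
    // Unsafe: consults the live, mutable score map on every comparison.
    // If another thread updates a score mid-sort, TimSort may observe an
    // inconsistent ordering and throw IllegalArgumentException.
    public static void sortUnsafe(List<String> endpoints, Map<String, Double> liveScores)
    {
        endpoints.sort(Comparator.comparingDouble((String e) -> liveScores.get(e)));
    }

    // Safe: snapshot the scores once before sorting, so the ordering the
    // comparator exposes cannot change for the duration of the sort.
    public static void sortSafe(List<String> endpoints, Map<String, Double> liveScores)
    {
        final Map<String, Double> snapshot = new HashMap<>(liveScores);
        endpoints.sort(Comparator.comparingDouble((String e) -> snapshot.get(e)));
    }
}
```

The snapshot costs one map copy per sort but guarantees the comparator satisfies TimSort's consistency requirement.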
[jira] [Commented] (CASSANDRA-9797) Don't wrap byte arrays in SequentialWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631095#comment-14631095 ] Benedict commented on CASSANDRA-9797: - Nah. If CI is happy, patch LGTM Don't wrap byte arrays in SequentialWriter -- Key: CASSANDRA-9797 URL: https://issues.apache.org/jira/browse/CASSANDRA-9797 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Labels: performance Fix For: 3.x, 2.2.x Attachments: 9797.txt While profiling a simple stress write run ({{cassandra-stress write n=200 -rate threads=50}} to be precise) with Mission Control, I noticed that a non-trivial amount of heap pressure was due to the {{ByteBuffer.wrap()}} call in {{SequentialWriter.write(byte[])}}. Basically, when writing a byte array, we wrap it in a ByteBuffer to reuse the {{SequentialWriter.write(ByteBuffer)}} method. One could have hoped this wrapping would be stack allocated, but if Mission Control isn't lying (and I was told it's fairly honest on that front), it's not. And we do use that {{write(byte[])}} method quite a bit, especially with the new vint encodings since they use a {{byte[]}} thread-local buffer and call that method. Anyway, it sounds very simple to me to have a more direct {{write(byte[])}} method, so attaching a patch to do that. A very quick local benchmark seems to show a little bit less allocation and a slight edge for the branch with this patch (on top of CASSANDRA-9705 I must add), but that local bench was far from scientific, so I'd be happy if someone who knows how to use our perf service wants to give that patch a shot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
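The idea of a direct {{write(byte[])}} that avoids a per-call {{ByteBuffer.wrap()}} can be sketched as follows. This is a simplified stand-in, not the actual SequentialWriter code; the field and method names are illustrative.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: instead of wrapping the caller's byte[] in a new
// ByteBuffer (one allocation per call), copy it straight into the writer's
// long-lived internal buffer, flushing whenever the buffer fills.
public class DirectByteArrayWriter
{
    private final ByteBuffer buffer;
    private long position = 0;

    public DirectByteArrayWriter(int bufferCapacity)
    {
        this.buffer = ByteBuffer.allocate(bufferCapacity);
    }

    // Direct path: no ByteBuffer.wrap(b), so no per-call garbage.
    public void write(byte[] b, int off, int len)
    {
        while (len > 0)
        {
            int n = Math.min(len, buffer.remaining());
            buffer.put(b, off, n);
            off += n;
            len -= n;
            position += n;
            if (!buffer.hasRemaining())
                flushInternal();
        }
    }

    public long position()
    {
        return position;
    }

    protected void flushInternal()
    {
        buffer.clear(); // a real writer would write the buffer to disk first
    }
}
```

The bulk {{ByteBuffer.put(byte[], int, int)}} copy used here is the same primitive the wrapping path ultimately performs, just without the intermediate wrapper object.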
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631109#comment-14631109 ] Benedict commented on CASSANDRA-7066: - LifecycleTransaction already supports this, by constructing it via the {{offline}} method call. Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so they can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way, as soon as we finish writing, we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9669) Commit Log Replay is Broken
[ https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631186#comment-14631186 ] Benedict commented on CASSANDRA-9669: - So, I am liking this approach less and less. It may be the least effort, but it has too many sharp edges, in critical portions of the system. It's also literally a custom endeavour for 2.0, 2.1, 2.2 _and_ 3.0. I think I will introduce a new commit log expiration ledger, and just write to it whenever we perform a {{discardCompletedSegments()}} call. This is then replayed prior to CL replay, to build the state of what records we consider replayable. Initially, I will limit this to a simple statement of the latest replay position we can be certain to have replayed to, since this is uniform behaviour for 2.0+. 2.1+ easily supports ranges, which can be implemented when we deliver CASSANDRA-8496. Commit Log Replay is Broken --- Key: CASSANDRA-9669 URL: https://issues.apache.org/jira/browse/CASSANDRA-9669 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Critical Labels: correctness Fix For: 3.x, 2.1.x, 2.2.x, 3.0.x While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, on restart we simply take the maximum replay position of any sstable on disk, and ignore anything prior. It is quite possible for there to be two flushes triggered for a given table, and for the second to finish first by virtue of containing a much smaller quantity of live data (or perhaps the disk is just under less pressure). If we crash before the first sstable has been written, then on restart the data it would have represented will disappear, since we will not replay the CL records. This looks to be a bug present since time immemorial, and also seems pretty serious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
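A minimal sketch of how such an expiration ledger might look, under the "simple statement of the latest replay position" variant described above. All types and names here are hypothetical, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: every discardCompletedSegments() call appends the
// replay position it is certain has been durably flushed; on restart the
// ledger is read first, and only commit log records strictly after the
// last recorded position are re-applied.
public class ExpirationLedger
{
    // (segmentId, positionInSegment) pairs, appended in discard order.
    private final List<long[]> entries = new ArrayList<>();

    public void recordDiscard(long segmentId, long position)
    {
        entries.add(new long[]{ segmentId, position });
    }

    // The latest position we are certain was flushed: the last entry,
    // since discards are recorded in order.
    public long[] replayableFrom()
    {
        return entries.isEmpty() ? new long[]{ -1, -1 }
                                 : entries.get(entries.size() - 1);
    }

    // A record must be replayed only if it lies after the ledger position.
    public boolean shouldReplay(long segmentId, long position)
    {
        long[] from = replayableFrom();
        return segmentId > from[0] || (segmentId == from[0] && position > from[1]);
    }
}
```

The point of the ledger is that it is ordered by the flush machinery itself, so unlike the per-sstable maximum it cannot be confused by a later-started flush completing first.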
[jira] [Commented] (CASSANDRA-9519) CASSANDRA-8448 Doesn't seem to be fixed
[ https://issues.apache.org/jira/browse/CASSANDRA-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631190#comment-14631190 ] Sylvain Lebresne commented on CASSANDRA-9519: - Hum, indeed. Who does put 2.0.13 _after_ 2.1.3?! That's outrageous, I can't work in these conditions! I'll commit this to 2.0. CASSANDRA-8448 Doesn't seem to be fixed --- Key: CASSANDRA-9519 URL: https://issues.apache.org/jira/browse/CASSANDRA-9519 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jeremiah Jordan Assignee: Sylvain Lebresne Fix For: 2.1.9, 2.2.0 Attachments: 9519.txt Still seeing the Comparison method violates its general contract! in 2.1.5
{code}
java.lang.IllegalArgumentException: Comparison method violates its general contract!
	at java.util.TimSort.mergeHi(TimSort.java:895) ~[na:1.8.0_45]
	at java.util.TimSort.mergeAt(TimSort.java:512) ~[na:1.8.0_45]
	at java.util.TimSort.mergeCollapse(TimSort.java:437) ~[na:1.8.0_45]
	at java.util.TimSort.sort(TimSort.java:241) ~[na:1.8.0_45]
	at java.util.Arrays.sort(Arrays.java:1512) ~[na:1.8.0_45]
	at java.util.ArrayList.sort(ArrayList.java:1454) ~[na:1.8.0_45]
	at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_45]
	at org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:158) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:187) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1530) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1688) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:63) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:260) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:272) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-6237) Allow range deletions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631193#comment-14631193 ] Benjamin Lerer edited comment on CASSANDRA-6237 at 7/17/15 11:07 AM: - {quote}It's not clear to me why some user function tests changed, e.g. here and here. Are these in scope for this ticket?{quote} Statements are processed in 2 phases. They are prepared and then executed. We try to perform the validation, as much as possible, in the preparation phase. This was not always the case for Insert/Update/Delete statements. I fixed that in my patch. Some of the unit tests were using invalid statements, but as they were only testing the preparation phase, no errors were thrown. After my patch it was no longer the case. So, I had to make sure that the statements were valid. was (Author: blerer): {quote}It's not clear to me why some user function tests changed, e.g. here and here. Are these in scope for this ticket?{quote} Statements are processed in 2 phases. They are prepared and then executed. We try to perform the validation as much as possible in the preparation phase. It was not always the case for Insert/Update/Delete statements. I fixed that in my patch. Some of the unit tests were using invalid statements but as they were only testing the preparation phase no errors were thrown. After my patch it was no longer the case. So I had to make sure that the statements were valid. Allow range deletions in CQL Key: CASSANDRA-6237 URL: https://issues.apache.org/jira/browse/CASSANDRA-6237 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benjamin Lerer Priority: Minor Labels: cql, docs Fix For: 3.0.0 rc1 Attachments: CASSANDRA-6237.txt We use RangeTombstones internally in a number of places, but we could expose them more directly too.
Typically, given a table like:
{noformat}
CREATE TABLE events (
    id text,
    created_at timestamp,
    content text,
    PRIMARY KEY (id, created_at)
)
{noformat}
we could allow queries like:
{noformat}
DELETE FROM events WHERE id='someEvent' AND created_at < 'Jan 3, 2013';
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)