[jira] [Created] (CASSANDRA-7048) Cannot get comparator 2 in CompositeType
Ben Hood created CASSANDRA-7048:
---
Summary: Cannot get comparator 2 in CompositeType
Key: CASSANDRA-7048
URL: https://issues.apache.org/jira/browse/CASSANDRA-7048
Project: Cassandra
Issue Type: Bug
Environment: Archlinux, AWS m1.large
Reporter: Ben Hood
Attachments: cassandra.log.zip

I've left a Cassandra instance in limbo for the last few days, meaning that it has been happily serving read requests while I did some read-only development, but I had cut off the data ingress. After not writing anything to Cassandra for a few days, I got the following error on the first write:

Caused by: java.lang.RuntimeException: Cannot get comparator 2 in org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.TimestampType,org.apache.cassandra.db.marshal.UTF8Type). This might due to a mismatch between the schema and the data read
    at org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:133)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.split(AbstractCompositeType.java:137)
    at org.apache.cassandra.db.filter.ColumnCounter$GroupByPrefix.count(ColumnCounter.java:115)
    at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:192)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1551)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1380)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:327)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1341)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1896)
    ... 3 more
Caused by: java.lang.IndexOutOfBoundsException: index (2) must be less than size (2)
    at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:306)
    at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:285)
    at com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:65)
    at org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:124)
    ... 17 more

I'm not sure whether this is the root cause, so I'm attaching the server log file. I'm going to investigate a bit further to see what changes, if any, the application driver introduced.

--
This message was sent by Atlassian JIRA (v6.2#6252)
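The trace shows a request for component index 2 against a composite type declared with only two component types (indexes 0 and 1). The following is a minimal, self-contained sketch of that failure mode, not Cassandra's code: `CompositeSketch`, the `:`-separated encoding, and the method names are hypothetical stand-ins chosen for illustration. It only shows how a name carrying more components than the declared schema produces this kind of wrapped `IndexOutOfBoundsException`; it does not establish why the stored data and the schema disagree in this report.

```java
import java.util.List;

// Hypothetical stand-in for a composite type declared with a fixed list of component types.
final class CompositeSketch {
    private final List<String> componentTypes;

    CompositeSketch(List<String> componentTypes) {
        this.componentTypes = componentTypes;
    }

    // Returns the comparator (here just a type name) for the i-th component, 0-based.
    // A bounds failure is wrapped in a RuntimeException, mirroring the shape of the trace.
    String getComparator(int i) {
        try {
            return componentTypes.get(i);
        } catch (IndexOutOfBoundsException e) {
            throw new RuntimeException("Cannot get comparator " + i + " in " + componentTypes, e);
        }
    }

    // Assumed toy encoding: components are ':'-separated. Each component needs a comparator,
    // so a name with more components than the schema declares fails on the extra one.
    int countComponents(String serializedName) {
        String[] parts = serializedName.split(":");
        for (int i = 0; i < parts.length; i++) {
            getComparator(i);
        }
        return parts.length;
    }
}
```

With a two-component schema, `countComponents("a:b")` succeeds, while `countComponents("a:b:c")` fails when it asks for comparator 2.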
[jira] [Created] (CASSANDRA-7049) Rows with dynamic columns inserted via cassandra-cli are not shown via cqlsh
Sergey Bushik created CASSANDRA-7049:

Summary: Rows with dynamic columns inserted via cassandra-cli are not shown via cqlsh
Key: CASSANDRA-7049
URL: https://issues.apache.org/jira/browse/CASSANDRA-7049
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Sergey Bushik
Fix For: 2.0.6

# use cli/thrift interface
bin/cassandra-cli
{code}
create keyspace test;
use test;

# create column family first
create column family t1 with comparator = UTF8Type and key_validation_class = UTF8Type and default_validation_class=UTF8Type and column_metadata = [{column_name: column1, validation_class: UTF8Type}];

# insert few rows into t1 with dynamic columns
set t1['1']['column2'] = 'value2';
set t1['2']['column3'] = 'value3';

# list rows
list t1;
---
RowKey: 2
=> (name=column3, value=value3, timestamp=1397717445436000)
---
RowKey: 1
=> (name=column2, value=value2, timestamp=1397717447253000)

2 Rows Returned.
{code}
# check rows are visible from cqlsh
bin/cqlsh
{code}
use test;
select * from t1;

(0 rows)
{code}
[jira] [Updated] (CASSANDRA-7049) Rows with dynamic columns inserted via cassandra-cli are not shown via cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Bushik updated CASSANDRA-7049:
-
Labels: cassandra-cli, cqlsh

Description:
{noformat}
# use cli/thrift interface to create column family and insert 2 rows
bin/cassandra-cli
{noformat}
{code}
create keyspace test;
use test;

# create column family first
create column family t1 with comparator = UTF8Type and key_validation_class = UTF8Type and default_validation_class=UTF8Type and column_metadata = [{column_name: column1, validation_class: UTF8Type}];

# insert few rows into t1 with dynamic columns
set t1['1']['column2'] = 'value2';
set t1['2']['column3'] = 'value3';

# list rows
list t1;
---
RowKey: 2
=> (name=column3, value=value3, timestamp=1397717445436000)
---
RowKey: 1
=> (name=column2, value=value2, timestamp=1397717447253000)

2 Rows Returned.
{code}
{noformat}
# check that 2 rows are visible from cqlsh
{noformat}
bin/cqlsh
{code}
use test;
select * from t1;

(0 rows)
{code}
Expected result: Rows are shown from CQL
Actual result: No rows are displayed

was:
# use cli/thrift interface
bin/cassandra-cli
{code}
create keyspace test;
use test;

# create column family first
create column family t1 with comparator = UTF8Type and key_validation_class = UTF8Type and default_validation_class=UTF8Type and column_metadata = [{column_name: column1, validation_class: UTF8Type}];

# insert few rows into t1 with dynamic columns
set t1['1']['column2'] = 'value2';
set t1['2']['column3'] = 'value3';

# list rows
list t1;
---
RowKey: 2
=> (name=column3, value=value3, timestamp=1397717445436000)
---
RowKey: 1
=> (name=column2, value=value2, timestamp=1397717447253000)

2 Rows Returned.
{code}
# check rows are visible from cqlsh
bin/cqlsh
{code}
use test;
select * from t1;

(0 rows)
{code}
[jira] [Updated] (CASSANDRA-7049) Rows with dynamic columns inserted via cassandra-cli are not shown via cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Bushik updated CASSANDRA-7049:
-
Description:
{noformat}
# use cli/thrift interface to create column family and insert 2 rows
bin/cassandra-cli
{noformat}
{code}
create keyspace test;
use test;

# create column family first
create column family t1 with comparator = UTF8Type and key_validation_class = UTF8Type and default_validation_class=UTF8Type and column_metadata = [{column_name: column1, validation_class: UTF8Type}];

# insert few rows into t1 with dynamic columns
set t1['1']['column2'] = 'value2';
set t1['2']['column3'] = 'value3';

# list rows
list t1;
---
RowKey: 2
=> (name=column3, value=value3, timestamp=1397717445436000)
---
RowKey: 1
=> (name=column2, value=value2, timestamp=1397717447253000)

2 Rows Returned.
{code}
{noformat}
# check that 2 rows are visible from cqlsh
bin/cqlsh
{noformat}
{code}
use test;
select * from t1;

(0 rows)
{code}
Expected result: Rows are shown from CQL
Actual result: No rows are displayed

was:
{noformat}
# use cli/thrift interface to create column family and insert 2 rows
bin/cassandra-cli
{noformat}
{code}
create keyspace test;
use test;

# create column family first
create column family t1 with comparator = UTF8Type and key_validation_class = UTF8Type and default_validation_class=UTF8Type and column_metadata = [{column_name: column1, validation_class: UTF8Type}];

# insert few rows into t1 with dynamic columns
set t1['1']['column2'] = 'value2';
set t1['2']['column3'] = 'value3';

# list rows
list t1;
---
RowKey: 2
=> (name=column3, value=value3, timestamp=1397717445436000)
---
RowKey: 1
=> (name=column2, value=value2, timestamp=1397717447253000)

2 Rows Returned.
{code}
{noformat}
# check that 2 rows are visible from cqlsh
{noformat}
bin/cqlsh
{code}
use test;
select * from t1;

(0 rows)
{code}
Expected result: Rows are shown from CQL
Actual result: No rows are displayed
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972524#comment-13972524 ]

Benedict commented on CASSANDRA-6694:
-
So, on the whole I really don't perceive this approach as better: there's a great deal of code duplication now (set to get worse still when you finish the refactor for DecoratedKey), between each of the correspondingly named cell implementations. Personally I think the Impl approach is neater as a result of avoiding that (this may be more pronounced if we decide to optimise equals() as you suggested). That said, if this moves us forwards I can live with it, if you can address point 1 below.

There are a few problems though:
# I am *very* opposed to a public setPeer() method. This is a deal breaker for me - but it can be avoided with a bit more refactoring.
# Your optimised updateDigest function is actually much slower than the old implementation for all but the smallest values: an optimised version needs to batch the contents into an array (stored in a ThreadLocal) and call updateDigest with the array, unless the total size is very small (there's a crossover point on my laptop of about 12 bytes, under which it's faster to call update(byte)).
# AbstractNativeCell.getBytes actually calls setBytes
# excessHeapSize... should be unsharedHeapSize...
# There should be no hashCode method in Buffer\*Cell - I removed these for a reason. Because we can have a Cell that is a CellName, and vice-versa, using a Cell as a key for a map is likely dangerous. Since we don't do it anywhere, it's safe to simply remove the methods.

There may be other minor issues; I'll hold off giving it a formal review until we decide the direction we're going.

To respond to a few of your comments:

bq. CounterUpdateCell interface is missing as well as NativeCounterUpdateCell implementation to match it.

There shouldn't be one for the time being - we can never construct one.

bq. CounterUpdateCell should be BufferCounterUpdateCell as it extends BufferCell

Same reason - it doesn't exist as either or, so I made a conscious decision to leave it as a CounterUpdateCell: the fact that it extends BufferCell is kind of unimportant. Its purpose is somewhat different, and I think it is better left named CounterUpdateCell, as that is its purpose (to carry a counter update as far as the memtable, and no further).

bq. Impl classes extends another Impl classes which doesn't make much sense as all of the methods are static.

This brings in the namespace of the extended class' static methods, which is useful.

bq. When taken out of context like that it doesn't really make sense but what I meant, there are situation where we don't really need to get BB from the CellName but can transfer bytes directly (especially for the native cell implementations).

Sure, but again: scope of ticket, and care needs to be taken when doing this (e.g. your updateDigest modifications)

Slightly More Off-Heap Memtables

Key: CASSANDRA-6694
URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Labels: performance
Fix For: 2.1 beta2

The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead).

The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting.

The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory.
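The batching strategy described in point 2 of the comment can be sketched as follows. This is an illustration under stated assumptions, not the patch's code: `DigestSketch`, the 4096-byte scratch size, and the helper `newMd5()` are hypothetical, and the 12-byte threshold is simply the crossover the comment reports for one laptop. Only the `java.security.MessageDigest` and `java.nio.ByteBuffer` calls are standard JDK API.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestSketch {
    // Assumed crossover point; below it, byte-at-a-time updates are used.
    static final int SMALL_THRESHOLD = 12;

    // Per-thread scratch array, so copying out of a (possibly off-heap) buffer allocates nothing.
    private static final ThreadLocal<byte[]> SCRATCH =
        ThreadLocal.withInitial(() -> new byte[4096]);

    private DigestSketch() {}

    // Feeds the remaining bytes of source into digest without moving source's position.
    static void update(MessageDigest digest, ByteBuffer source) {
        ByteBuffer in = source.duplicate();
        if (in.remaining() < SMALL_THRESHOLD) {
            while (in.hasRemaining()) {
                digest.update(in.get());
            }
            return;
        }
        byte[] scratch = SCRATCH.get();
        while (in.hasRemaining()) {
            int n = Math.min(scratch.length, in.remaining());
            in.get(scratch, 0, n);        // bulk copy into the scratch array
            digest.update(scratch, 0, n); // one digest call per chunk
        }
    }

    // Convenience wrapper that turns the checked exception into an unchecked one.
    static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Where the real crossover lies depends on the JVM and hardware, so the threshold would need measuring rather than copying.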
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972776#comment-13972776 ]

Pavel Yaskevich commented on CASSANDRA-6694:

To address all of your comments: this is not intended for any kind of review yet. It is just a demonstration of the idea, which is why I basically carried over all of the methods from the original implementations and didn't rename or move stuff.

Also I'm fine if methods in both implementations are going to return constant values like serializationFlags or isMarkedForDeleted. Apart from that there is not much code duplication, and the duplication will shrink further when hashCode and other methods go away, which would probably leave us only with the dataSize and serializedSize duplication, but I guess we can come up with something clever for native cells there too.

Regarding the point about updateDigest: it's meant more as a representation of the kind of things we can do if we have two different implementations of it; it is not optimized for performance yet.

bq. There shouldn't be one for the time being - we can never construct one.

and

bq. Same reason - it doesn't exist as either or, so I made a conscious decision to leave it as a CounterUpdateCell: the fact that it extends BufferCell is kind of unimportant. Its purpose is somewhat different, and I think it is better left named CounterUpdateCell, as that is its purpose (to carry a counter update as far as the memtable, and no further).

It is constructed in ColumnFamily and ColumnSerializer. If there is supposed to be only one implementation for now, let's name it appropriately and use it like all the other buffered cells.

bq. This brings in the namespace of the extended class' static methods, which is useful.

But why do we care, and what does it give us, as those interfaces are called directly and static methods don't override each other?

bq. Sure, but again: scope of ticket, and care needs to be taken when doing this (e.g. your updateDigest modifications)

I don't really follow what you are implying with that. The scope is to introduce native implementations as optimized as possible, so why do we miss out on such low-hanging fruit?...
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972796#comment-13972796 ]

Benedict commented on CASSANDRA-6694:
-
bq. the scope is to introduce native implementations -as optimized as possible-

Otherwise we need to do a lot more than the changes you are suggesting :)

bq. Also I'm fine if methods in both implementations are going to return constant values like serializationFlags or isMarkedForDeleted

Well, these are still duplication - as a result it is not clear where the definition of these behaviours lives. If the semantics change in future, it may introduce errors unnecessarily. Either way equals(), reconcile() and validateFields() will still be issues. You don't seem to have implemented most of these methods yet (it looks like your code doesn't actually compile). These methods are each non-trivial amounts of code duplication, equals() especially so if we optimise it as you want to. CounterCell.diff() will also need to be duplicated.

But, like I said, I can probably live with all of this if we address the setPeer() issue. equals() should probably still end up in a shared static method, at the very least, though.
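The suggestion that equals() end up in a shared static method can be illustrated with a small sketch. Everything here is hypothetical and heavily simplified: `CellSketch`, `CellEquality`, `HeapCellSketch`, and `DirectCellSketch` are illustration-only names, and the real Cell contract carries far more than a name, a value, and a timestamp. The point shown is only that a heap-backed and a direct-buffer-backed implementation can share one comparison routine instead of each duplicating it.

```java
import java.nio.ByteBuffer;

// Hypothetical minimal cell contract used only for this sketch.
interface CellSketch {
    ByteBuffer name();
    ByteBuffer value();
    long timestamp();
}

// Single home for the comparison logic; implementations delegate here.
final class CellEquality {
    private CellEquality() {}

    static boolean equal(CellSketch a, CellSketch b) {
        // ByteBuffer.equals compares remaining content, for heap and direct buffers alike.
        return a.timestamp() == b.timestamp()
            && a.name().equals(b.name())
            && a.value().equals(b.value());
    }
}

// Heap-backed variant: keeps separate byte arrays.
final class HeapCellSketch implements CellSketch {
    private final byte[] name;
    private final byte[] value;
    private final long timestamp;

    HeapCellSketch(byte[] name, byte[] value, long timestamp) {
        this.name = name.clone();
        this.value = value.clone();
        this.timestamp = timestamp;
    }

    public ByteBuffer name() { return ByteBuffer.wrap(name); }
    public ByteBuffer value() { return ByteBuffer.wrap(value); }
    public long timestamp() { return timestamp; }
}

// Off-heap variant: one direct buffer holding the name followed by the value.
final class DirectCellSketch implements CellSketch {
    private final ByteBuffer data;
    private final int nameLength;
    private final long timestamp;

    DirectCellSketch(byte[] name, byte[] value, long timestamp) {
        this.data = ByteBuffer.allocateDirect(name.length + value.length);
        this.data.put(name).put(value);
        this.data.flip();
        this.nameLength = name.length;
        this.timestamp = timestamp;
    }

    public ByteBuffer name() {
        ByteBuffer view = data.duplicate();
        view.limit(nameLength);    // view covers [0, nameLength)
        return view;
    }

    public ByteBuffer value() {
        ByteBuffer view = data.duplicate();
        view.position(nameLength); // view covers [nameLength, end)
        return view;
    }

    public long timestamp() { return timestamp; }
}
```

The sketch deliberately defines no equals() or hashCode() overrides on the cell classes, in line with the earlier remark in this thread that using a Cell as a map key is likely dangerous.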
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972832#comment-13972832 ]

Marcus Eriksson commented on CASSANDRA-6694:

I'm +1 on [~benedict]'s branch (I have not looked at the one by [~xedin] yet).

Nits:
* A few methods in Cell.Impl look redundant, isMarkedForDelete/isLive for example. Kept around for symmetry?
* License header in DeletedCell and ExpiringCell
* Javadoc comment in NativeAllocator looks wrong
[jira] [Created] (CASSANDRA-7050) AbstractColumnFamilyInputFormat AbstractColumnFamilyOutputFormat throw NPE if username is provided but password is null
Mike Adamson created CASSANDRA-7050:
---
Summary: AbstractColumnFamilyInputFormat AbstractColumnFamilyOutputFormat throw NPE if username is provided but password is null
Key: CASSANDRA-7050
URL: https://issues.apache.org/jira/browse/CASSANDRA-7050
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Reporter: Mike Adamson
Priority: Minor
Fix For: 2.0.7

If a username is provided to either of these classes but the password is null, the thrift layer throws an NPE because it can't handle null values for the login.
[jira] [Updated] (CASSANDRA-7050) AbstractColumnFamilyInputFormat AbstractColumnFamilyOutputFormat throw NPE if username is provided but password is null
[ https://issues.apache.org/jira/browse/CASSANDRA-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Adamson updated CASSANDRA-7050:

Attachment: 7050.patch

Patch adds conditional check for password not being null before attempting login
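The kind of guard the patch describes might look like the sketch below. It is not the contents of 7050.patch: `LoginGuardSketch`, the `credentials` method, and the `"username"`/`"password"` map keys are assumptions made for illustration. It only shows building login credentials when both values are present, so that a caller can skip the login call otherwise.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class LoginGuardSketch {
    private LoginGuardSketch() {}

    // Returns credentials only when both parts are non-null; an empty result
    // tells the caller not to attempt a login at all.
    static Optional<Map<String, String>> credentials(String username, String password) {
        if (username == null || password == null) {
            return Optional.empty();
        }
        Map<String, String> creds = new HashMap<>();
        creds.put("username", username); // key names are assumptions of this sketch
        creds.put("password", password);
        return Optional.of(creds);
    }
}
```

A caller would then write something like `credentials(user, pass).ifPresent(client::login)`, where `client::login` stands for whatever login call the surrounding code uses.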
[jira] [Commented] (CASSANDRA-6572) Workload recording / playback
[ https://issues.apache.org/jira/browse/CASSANDRA-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972845#comment-13972845 ]

Benedict commented on CASSANDRA-6572:
-
[~lyubent] a few comments/suggestions about the patch:
# In the query recorder, it would make most sense to keep a writer handle open, instead of re-opening the file every time you append a new query. Ideally, we would probably have an in-memory buffer of some kind that we write to, and that we swap when we flush to disk, so that other queries can continue to log to the buffer without being impeded by the flush.
# It's quite wasteful to convert the query string to base64-encoded bytes, and then to convert that back into a string. We should write the bytes straight into the new buffer.
# Since you're using an AtomicInteger, there's no need to use a lock there: you can simply increment the counter and check the result (modulo frequency) to see if we should log. No need to reset to zero.

Workload recording / playback
-
Key: CASSANDRA-6572
URL: https://issues.apache.org/jira/browse/CASSANDRA-6572
Project: Cassandra
Issue Type: New Feature
Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Lyuben Todorov
Fix For: 2.0.8
Attachments: 6572-trunk.diff

Write sample mode gets us part way to testing new versions against a real world workload, but we need an easy way to test the query side as well.
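The third suggestion, sampling with a lock-free counter, can be sketched roughly as below. `QuerySampler` and `shouldLog` are hypothetical names, not identifiers from the attached diff; only `AtomicInteger.incrementAndGet` and `Math.floorMod` are standard JDK calls. `floorMod` is used here, as a choice of this sketch, so that the result stays non-negative after the counter wraps past `Integer.MAX_VALUE`.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class QuerySampler {
    private final AtomicInteger counter = new AtomicInteger();
    private final int frequency;

    // frequency = N means roughly one query in every N is logged.
    QuerySampler(int frequency) {
        if (frequency <= 0) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        this.frequency = frequency;
    }

    // Lock-free: one atomic increment, then a modulo check on the returned value.
    // The counter is never reset to zero.
    boolean shouldLog() {
        return Math.floorMod(counter.incrementAndGet(), frequency) == 0;
    }
}
```

With `new QuerySampler(3)`, the third, sixth, and ninth calls to `shouldLog()` return true and the others return false.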
[jira] [Comment Edited] (CASSANDRA-7041) Select query returns inconsistent result
[ https://issues.apache.org/jira/browse/CASSANDRA-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970601#comment-13970601 ]

Ngoc Minh Vo edited comment on CASSANDRA-7041 at 4/17/14 11:37 AM:
---
We implemented a test method in the Java client to estimate the number of attempts required, for a failing query, to get an expected/non-empty result. The number of attempts is between 2 and ~40 and it is very random...

-No issue detected in other column families having more complicated schemas.-
-Hence, it might relate to CF without data columns? (i.e. all columns are part of Primary Key)-

This issue appears in multiple tables, not only the simple string_values one.

Thanks in advance for your help.
Best regards,
Minh

was (Author: vongocminh):
We implemented a test method in Java client to estimate the number of attempts required, for a failing query, to get a expected/non-empty result. The number of attempts is between 2 and ~40 and it very random...

No issue detected in other column families having more complicated schemas.
Hence, it might relate to CF without data columns? (i.e. all columns are part of Primary Key)

Thanks in advance for your help.
Best regards,
Minh

Select query returns inconsistent result

Key: CASSANDRA-7041
URL: https://issues.apache.org/jira/browse/CASSANDRA-7041
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Cassandra v2.0.6 (upgraded from v2.0.3) 4-node cluster: Windows7, 12GB JVM
Reporter: Ngoc Minh Vo
Priority: Critical

Hello,
We are running into an issue with C* v2.0.x: CQL queries randomly return an empty result. Here is the scenario:

1. Schema:
{noformat}
CREATE TABLE string_values (
  date int,
  field text,
  value text,
  PRIMARY KEY ((date, field), value)
) WITH bloom_filter_fp_chance=0.10
  AND caching='KEYS_ONLY'
  AND comment=''
  AND dclocal_read_repair_chance=0.00
  AND gc_grace_seconds=864000
  AND index_interval=128
  AND read_repair_chance=0.10
  AND replicate_on_write='true'
  AND populate_io_cache_on_flush='false'
  AND default_time_to_live=0
  AND speculative_retry='99.0PERCENTILE'
  AND memtable_flush_period_in_ms=0
  AND compaction={'class': 'LeveledCompactionStrategy'}
  AND compression={'sstable_compression': 'LZ4Compressor'};
{noformat}

2. There is no new data imported to the cluster during the test.

3. CQL query:
{noformat}
select * from string_values where date=20140122 and field='SCONYKSP1';
{noformat}

4. In cqlsh, the same query was executed several times during a short interval (~1-2 seconds). The first query results are empty and then we got the data. And from that point, we always get the correct result:
{noformat}
cqlsh:titan_test> select * from string_values where date=20140122 and field='SCONYKSP1';

(0 rows)

cqlsh:titan_test> select * from string_values where date=20140122 and field='SCONYKSP1';

(0 rows)
...
...
cqlsh:titan_test> select * from string_values where date=20140122 and field='SCONYKSP1';

(0 rows)

cqlsh:titan_test> select * from string_values where date=20140122 and field='SCONYKSP1';

(0 rows)

cqlsh:titan_test> select * from string_values where date=20140122 and field='SCONYKSP1';

 date     | field     | value
----------+-----------+-------------------------
 20140122 | SCONYKSP1 | 201401220251826297a_0_3

(1 rows)

cqlsh:titan_test> select * from string_values where date=20140122 and field='SCONYKSP1';

 date     | field     | value
----------+-----------+-------------------------
 20140122 | SCONYKSP1 | 201401220251826297a_0_3

(1 rows)
{noformat}

5. It might relate to some kind of warmup process. We tried to disable key/data caching but it does not help.

Upgrading the cluster from v2.0.3 to v2.0.6 does not fix the issue (hence, not related to CASSANDRA-6555).

A long time ago, we posted a report on the Java Driver JIRA: https://datastax-oss.atlassian.net/browse/JAVA-217. But it seems that the issue is on the server side.

Best regards,
Minh
[jira] [Commented] (CASSANDRA-7041) Select query returns inconsistent result
[ https://issues.apache.org/jira/browse/CASSANDRA-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972895#comment-13972895 ]

Sylvain Lebresne commented on CASSANDRA-7041:
-
I see nothing here that indicates that you're using QUORUM consistency (and you do have 4 nodes, though you didn't indicate your replication factor). In particular, cqlsh uses CL.ONE by default. If you don't use QUORUM consistency (and your replication factor is 1), then what you see is perfectly expected.
[jira] [Updated] (CASSANDRA-7041) Select query returns inconsistent result
[ https://issues.apache.org/jira/browse/CASSANDRA-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ngoc Minh Vo updated CASSANDRA-7041:
    Reproduced In: 2.0.6, 2.0.3 (was: 2.0.3, 2.0.6)
    Environment: Cassandra v2.0.6 (upgraded from v2.0.3) 4-node RF=3, cluster: Windows7, 12GB JVM (was: Cassandra v2.0.6 (upgraded from v2.0.3) 4-node cluster: Windows7, 12GB JVM)
[jira] [Commented] (CASSANDRA-7041) Select query returns inconsistent result
[ https://issues.apache.org/jira/browse/CASSANDRA-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972901#comment-13972901 ]

Ngoc Minh Vo commented on CASSANDRA-7041:

Hello Sylvain,

Thanks for your prompt answer. Indeed, the issue is related to discrepancies in our data for date 20140122. Queries on other dates worked fine. Changing the CL to QUORUM solved the issue!

Do we need to set the CL to QUORUM on write queries as well? With the default setting (ONE), we didn't get any errors during data import.

Best regards,
Minh
[jira] [Commented] (CASSANDRA-6591) un-deprecate cache recentHitRate and expose in o.a.c.metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972905#comment-13972905 ]

Chris Burroughs commented on CASSANDRA-6591:

What's next for this ticket?

un-deprecate cache recentHitRate and expose in o.a.c.metrics
    Key: CASSANDRA-6591
    URL: https://issues.apache.org/jira/browse/CASSANDRA-6591
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Chris Burroughs
    Assignee: Chris Burroughs
    Priority: Minor
    Attachments: j6591-1.2-v1.txt, j6591-1.2-v2.txt, j6591-1.2-v3.txt

recentHitRate metrics were not added as part of CASSANDRA-4009 because there is no obvious way to do it with the Metrics library. Instead, hitRate was added as an all-time measurement since node restart. This does not allow changes in cache rate (aka production performance problems) to be detected. Ideally there would be 1/5/15 moving averages for the hit rate, but I'm not sure how to calculate that. Instead I propose updating recentHitRate on a fixed interval and exposing that as a Gauge.
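The fixed-interval proposal above could look roughly like the following. This is only a sketch under stated assumptions: the class RecentHitRate and all of its method names are hypothetical, it does not use the Metrics library or any Cassandra class, and the caller is assumed to invoke sample() from some periodic task. It shows why a per-interval value reacts to a drop in hit rate while the all-time value barely moves.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; not the attached patch and not a Cassandra class.
final class RecentHitRate {
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong requests = new AtomicLong();
    private long lastHits;      // counter values seen at the previous sample
    private long lastRequests;
    private volatile double recent = Double.NaN;

    // Called on every cache lookup.
    void record(boolean hit) {
        requests.incrementAndGet();
        if (hit) {
            hits.incrementAndGet();
        }
    }

    // Called on a fixed interval: hit rate over the requests seen since the last call.
    synchronized void sample() {
        long h = hits.get();
        long r = requests.get();
        long deltaHits = h - lastHits;
        long deltaRequests = r - lastRequests;
        recent = deltaRequests == 0 ? Double.NaN : (double) deltaHits / deltaRequests;
        lastHits = h;
        lastRequests = r;
    }

    // The value a gauge would expose.
    double recentHitRate() {
        return recent;
    }

    // The all-time value since start, for comparison.
    double allTimeHitRate() {
        long r = requests.get();
        return r == 0 ? Double.NaN : (double) hits.get() / r;
    }
}
```

With 100 lookups at 90% hits followed by 10 lookups at 10% hits, the second sample reports 0.1 while the all-time rate is still 91/110, which is the kind of change the ticket wants to make visible.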
[jira] [Resolved] (CASSANDRA-7041) Select query returns inconsistent result
[ https://issues.apache.org/jira/browse/CASSANDRA-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-7041.
    Resolution: Invalid
    Reproduced In: 2.0.6, 2.0.3 (was: 2.0.3, 2.0.6)

Yes, you need QUORUM for both writes and reads if you want to be guaranteed to see your write right away. I strongly encourage you to read the documentation to understand how consistency levels work, as this is a pretty fundamental concept in Cassandra. Most Cassandra introductions you can easily find with Google should help you there, but you can probably start [here|http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html].
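The QUORUM advice in this thread follows from a simple overlap rule: a read is guaranteed to touch at least one replica that acknowledged the write only when the number of replicas read plus the number of replicas written exceeds the replication factor. The sketch below only illustrates that arithmetic; the class and method names are hypothetical and nothing here calls Cassandra.

```java
// Hypothetical illustration of the read/write overlap rule; not Cassandra code.
final class ConsistencyOverlap {
    // Replicas that must respond at QUORUM: floor(rf / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every set of read replicas must intersect every set of written replicas.
    static boolean readSeesLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3; // the replication factor reported for this cluster
        int q = quorum(rf);
        System.out.println("ONE/ONE guaranteed: " + readSeesLatestWrite(1, 1, rf));
        System.out.println("QUORUM/QUORUM guaranteed: " + readSeesLatestWrite(q, q, rf));
    }
}
```

With RF=3, writing and reading at ONE gives 1 + 1 = 2, which does not exceed 3, so a read can land on a replica that has not received the data yet. QUORUM on both sides gives 2 + 2 = 4, which does.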
[jira] [Created] (CASSANDRA-7051) UnsupportedOperationException
Digant Modha created CASSANDRA-7051:

    Summary: UnsupportedOperationException
    Key: CASSANDRA-7051
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7051
    Project: Cassandra
    Issue Type: Bug
    Components: API, Core
    Environment: Cassandra 2.0.6
    Reporter: Digant Modha
    Priority: Critical

An UnsupportedOperationException is thrown when using BatchStatement. This is because org.apache.cassandra.cql3.statements.BatchStatement.unzipMutations returns a collection that does not support add when the size of mutations is 1.

STACK: throws UnsupportedOperationException.

Daemon Thread [Native-Transport-Requests:1043] (Suspended (entry into method <init> in UnsupportedOperationException))
    UnsupportedOperationException.<init>() line: 42 [local variables unavailable]
    HashMap$Values(AbstractCollection<E>).add(E) line: 260
    HashMap$Values(AbstractCollection<E>).addAll(Collection<? extends E>) line: 342
    StorageProxy.mutateWithTriggers(Collection<IMutation>, ConsistencyLevel, boolean) line: 519
    BatchStatement.executeWithoutConditions(Collection<IMutation>, ConsistencyLevel) line: 210
    BatchStatement.execute(BatchStatement$BatchVariables, boolean, ConsistencyLevel, long) line: 203
    BatchStatement.executeWithPerStatementVariables(ConsistencyLevel, QueryState, List<List<ByteBuffer>>) line: 192
    QueryProcessor.processBatch(BatchStatement, ConsistencyLevel, QueryState, List<List<ByteBuffer>>, List<Object>) line: 373
    BatchMessage.execute(QueryState) line: 206
    Message$Dispatcher.messageReceived(ChannelHandlerContext, MessageEvent) line: 304
    Message$Dispatcher(SimpleChannelUpstreamHandler).handleUpstream(ChannelHandlerContext, ChannelEvent) line: 70
    DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline$DefaultChannelHandlerContext, ChannelEvent) line: 564
    DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(ChannelEvent) line: 791
    ChannelUpstreamEventRunnable.doRun() line: 43
    ChannelUpstreamEventRunnable(ChannelEventRunnable).run() line: 67
    RequestThreadPoolExecutor(ThreadPoolExecutor).runWorker(ThreadPoolExecutor$Worker) line: 1145
    ThreadPoolExecutor$Worker.run() line: 615
    Thread.run() line: 744

org.apache.cassandra.cql3.statements.BatchStatement:

    private Collection<? extends IMutation> unzipMutations(Map<String, Map<ByteBuffer, IMutation>> mutations)
    {
        // The case where all statement where on the same keyspace is pretty common
        if (mutations.size() == 1)
            return mutations.values().iterator().next().values();

        List<IMutation> ms = new ArrayList<>();
        for (Map<ByteBuffer, IMutation> ksMap : mutations.values())
            ms.addAll(ksMap.values());
        return ms;
    }
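The failure mode in this report can be reproduced with plain JDK collections: the view returned by HashMap.values() inherits AbstractCollection.add, which always throws UnsupportedOperationException. The sketch below uses String stand-ins for IMutation and ByteBuffer, and every name in it is hypothetical. Copying into an ArrayList is shown only as one possible way to hand callers a collection they can add to; the report does not say how the bug was actually fixed.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the reported code path; Strings replace IMutation and ByteBuffer.
final class UnzipSketch {
    // Mirrors the reported shortcut: with one keyspace, return the map's values() view directly.
    static Collection<String> unzipShared(Map<String, Map<String, String>> byKeyspace) {
        if (byKeyspace.size() == 1) {
            return byKeyspace.values().iterator().next().values();
        }
        return unzipCopied(byKeyspace);
    }

    // Always copies into a fresh list, so callers may add to the result.
    static List<String> unzipCopied(Map<String, Map<String, String>> byKeyspace) {
        List<String> out = new ArrayList<>();
        for (Map<String, String> perKeyspace : byKeyspace.values()) {
            out.addAll(perKeyspace.values());
        }
        return out;
    }

    // Returns false when the collection rejects add(), as a HashMap values() view does.
    static boolean supportsAdd(Collection<String> c) {
        try {
            c.add("extra-mutation");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    // A batch whose statements all target a single keyspace.
    static Map<String, Map<String, String>> singleKeyspace() {
        Map<String, String> perKey = new HashMap<>();
        perKey.put("row-1", "mutation-1");
        Map<String, Map<String, String>> byKeyspace = new HashMap<>();
        byKeyspace.put("ks", perKey);
        return byKeyspace;
    }
}
```

In the stack trace above, StorageProxy.mutateWithTriggers is the caller that calls addAll on the returned collection, which matches the single-keyspace branch of the sketch.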
[jira] [Commented] (CASSANDRA-6847) The binary transport doesn't load truststore file
[ https://issues.apache.org/jira/browse/CASSANDRA-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972980#comment-13972980 ] Jeremiah Jordan commented on CASSANDRA-6847: Can we put a line in changes.txt for this? I spent 2 days pulling my hair out from this one, and yes I probably should have done a full JIRA search, but I would expect require_client_auth being completely broken to show up in changes.txt :/ The binary transport doesn't load truststore file - Key: CASSANDRA-6847 URL: https://issues.apache.org/jira/browse/CASSANDRA-6847 Project: Cassandra Issue Type: Bug Reporter: Mikhail Stepura Assignee: Mikhail Stepura Priority: Minor Labels: ssl Fix For: 1.2.16, 2.0.7, 2.1 beta2 Attachments: cassandra-2.0-6847.patch {code:title=org.apache.cassandra.transport.Server.SecurePipelineFactory} this.sslContext = SSLFactory.createSSLContext(encryptionOptions, false); {code} {{false}} there means that {{truststore}} file won't be loaded in any case. And that means that the file will not be used to validate clients when {{require_client_auth==true}}, making http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secureNewTrustedUsers_t.html meaningless. The only way to workaround that currently is to start C* with {{-Djavax.net.ssl.trustStore=conf/.truststore}} I believe we should load {{truststore}} when {{require_client_auth==true}}, -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7047) TriggerExecutor should group mutations by row key
[ https://issues.apache.org/jira/browse/CASSANDRA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergio Bossa updated CASSANDRA-7047:
    Attachment: CASSANDRA-7047.patch

TriggerExecutor should group mutations by row key
    Key: CASSANDRA-7047
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7047
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Sergio Bossa
    Assignee: Sergio Bossa
    Attachments: CASSANDRA-7047.patch

TriggerExecutor doesn't currently group mutations returned by triggers, even if they belong to the same row key. While not harmful per se (at least, I think so), this is definitely a performance problem, because each mutation is a *cluster* mutation, generating more network traffic, more disk IO and more index calls (if present).
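To make the grouping idea concrete, here is a small sketch that merges mutations sharing the same keyspace and row key into one. It is an illustration under stated assumptions, not the attached patch: the Mutation class below is a hypothetical stand-in holding a list of update strings, and the real TriggerExecutor and mutation types are not used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of grouping trigger output by row key; not the attached patch.
final class GroupByRowKey {
    // Stand-in for a mutation: a keyspace, a row key and the updates it carries.
    static final class Mutation {
        final String keyspace;
        final String rowKey;
        final List<String> updates = new ArrayList<>();

        Mutation(String keyspace, String rowKey) {
            this.keyspace = keyspace;
            this.rowKey = rowKey;
        }
    }

    // Convenience factory for building sample mutations.
    static Mutation of(String keyspace, String rowKey, String... updates) {
        Mutation m = new Mutation(keyspace, rowKey);
        for (String u : updates) {
            m.updates.add(u);
        }
        return m;
    }

    // Merges mutations with the same (keyspace, rowKey), preserving first-seen order.
    static List<Mutation> group(Mutation... input) {
        Map<String, Mutation> merged = new LinkedHashMap<>();
        for (Mutation m : input) {
            String id = m.keyspace + '\u0000' + m.rowKey; // separator unlikely to occur in names
            Mutation target = merged.get(id);
            if (target == null) {
                target = new Mutation(m.keyspace, m.rowKey);
                merged.put(id, target);
            }
            target.updates.addAll(m.updates);
        }
        return new ArrayList<>(merged.values());
    }
}
```

Three trigger results for keys k1, k1 and k2 collapse into two mutations, so only two cluster-level writes would be sent instead of three.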
[jira] [Updated] (CASSANDRA-7047) TriggerExecutor should group mutations by row key
[ https://issues.apache.org/jira/browse/CASSANDRA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko updated CASSANDRA-7047:
    Reviewer: Aleksey Yeschenko
[jira] [Updated] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-3668:
    Reviewer: Yuki Morishita (was: Peter Schuller)

Parallel streaming for sstableloader
    Key: CASSANDRA-3668
    URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
    Project: Cassandra
    Issue Type: Improvement
    Components: API
    Reporter: Manish Zope
    Assignee: Joshua McKenzie
    Priority: Minor
    Labels: streaming
    Fix For: 2.1 beta2
    Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3668_v2.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
    Original Estimate: 48h
    Remaining Estimate: 48h

One of my colleagues reported a bug regarding the degraded performance of the sstable generator and sstableloader.
ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589
As stated in that issue, the generator performance has been rectified, but the performance of sstableloader is still a problem. 3589 is marked as a duplicate of 3618. Both issues show a resolved status, but the problem with sstableloader still exists, so I am opening another issue so that the sstableloader problem does not go unnoticed.
FYI: We have tested the generator part with the patch given in 3589. It is working fine.
Please let us know if you require further input from our side.
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973220#comment-13973220 ]

Pavel Yaskevich commented on CASSANDRA-6694:

bq. Well, these are still duplication - it is not clear as a result where the definition of these behaviours live. If the semantics change in future, it may introduce errors unnecessarily. Either way equals(), reconcile() and validateFields() will still be issues. You don't seem to have implemented most of these methods yet (looks like your code doesn't actually compile). These methods are each non-trivial amounts of code duplication, equals() especially so if we optimise it as you want to. CounterCell.diff() will also need to be duplicated.

Most of the duplicated methods are methods with static behavior which is not going to change, e.g. isMarkedForDelete, getMarkedForDeleteAt or serializationFlags. CounterCell.diff and reconcile are living in the interface for now. I will address the setPeer(long) problem and hashCode.

Slightly More Off-Heap Memtables
    Key: CASSANDRA-6694
    URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Benedict
    Assignee: Benedict
    Labels: performance
    Fix For: 2.1 beta2

The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead).

The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting.

The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory.
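The overhead figures in the description add up as follows. The sketch below just restates that arithmetic; the class and constant names are hypothetical. The addressableBytes helper is an assumption about what "alignment tricks like the VM" means: it models the compressed-pointer idea, where a 4-byte address is interpreted in units of the allocation alignment.

```java
// Hypothetical restatement of the per-Cell overhead arithmetic from the ticket description.
final class CellOverhead {
    static final int OBJECT_HEADER_BYTES = 8;   // "8-byte object overhead"
    static final int ADDRESS_BYTES = 4;         // "4-byte address"
    static final int ALLOCATION_REF_BYTES = 4;  // "4-byte object reference" for the allocation list

    // On-heap overhead per Cell, excluding the btree.
    static int perCell() {
        return OBJECT_HEADER_BYTES + ADDRESS_BYTES + ALLOCATION_REF_BYTES;
    }

    // Total overhead once the average btree share (4 to 6 bytes) is included.
    static int withBtree(int btreeBytesPerCell) {
        return perCell() + btreeBytesPerCell;
    }

    // Assumption: a 32-bit address scaled by the alignment, as compressed pointers do.
    static long addressableBytes(int alignmentBytes) {
        return (1L << 32) * alignmentBytes;
    }

    public static void main(String[] args) {
        System.out.println("per cell: " + perCell());
        System.out.println("with btree: " + withBtree(4) + " to " + withBtree(6));
        System.out.println("addressable with 8-byte alignment: " + addressableBytes(8) + " bytes");
    }
}
```

So 8 + 4 + 4 gives the 16-byte target, and adding 4 to 6 bytes of btree overhead gives the quoted 20 to 22 bytes. Under the stated assumption, 8-byte alignment would let a 4-byte address cover 32 GiB of native memory.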
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973226#comment-13973226 ]

Benedict commented on CASSANDRA-6694:

bq. CounterCell.diff and reconcile are living in the interface for now

Ah. This is a Java 8 only feature, which is why I missed it. Not really feasible.
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973297#comment-13973297 ]

Pavel Yaskevich commented on CASSANDRA-6694:

I'm not talking about default methods in interfaces, I'm just saying that I added static diff/reconcile to CounterCell for now :)
[jira] [Commented] (CASSANDRA-6962) examine shortening path length post-5202
[ https://issues.apache.org/jira/browse/CASSANDRA-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973336#comment-13973336 ]

Yuki Morishita commented on CASSANDRA-6962:

This turns out to be a bit more complex than I first thought, because secondary index CFs are flushing to the same directory. :( Any ideas?

examine shortening path length post-5202
    Key: CASSANDRA-6962
    URL: https://issues.apache.org/jira/browse/CASSANDRA-6962
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Brandon Williams
    Assignee: Yuki Morishita
    Fix For: 2.1 beta2

From CASSANDRA-5202 discussion:
{quote}
Did we give up on this? Could we clean up the redundancy a little by moving the ID into the directory name? e.g., ks/cf-uuid/version-generation-component.db

I'm worried about path length, which is limited on Windows.

Edit: to give a specific example, for KS foo Table bar we now have
/var/lib/cassandra/flush/foo/bar-2fbb89709a6911e3b7dc4d7d4e3ca4b4/foo-bar-ka-1-Data.db
I'm proposing
/var/lib/cassandra/flush/foo/bar-2fbb89709a6911e3b7dc4d7d4e3ca4b4/ka-1-Data.db
{quote}
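The saving from the proposed layout is simply the repeated keyspace and table prefix in the file name. The sketch below builds both paths from the example in the quote and compares their lengths. The class and method names are hypothetical, and the 260-character constant is the classic Windows MAX_PATH limit, included only as the assumed motivation for the concern about Windows.

```java
// Hypothetical illustration of the two sstable path layouts discussed in the quote.
final class SSTablePathLength {
    static final int WINDOWS_MAX_PATH = 260; // classic Windows path length limit

    // Current layout: the keyspace and table names are repeated in the file name.
    static String current(String dataDir, String ks, String table, String id, String component) {
        return dataDir + "/" + ks + "/" + table + "-" + id + "/" + ks + "-" + table + "-" + component;
    }

    // Proposed layout: the file name keeps only version, generation and component.
    static String proposed(String dataDir, String ks, String table, String id, String component) {
        return dataDir + "/" + ks + "/" + table + "-" + id + "/" + component;
    }

    public static void main(String[] args) {
        String id = "2fbb89709a6911e3b7dc4d7d4e3ca4b4";
        String now = current("/var/lib/cassandra/flush", "foo", "bar", id, "ka-1-Data.db");
        String then = proposed("/var/lib/cassandra/flush", "foo", "bar", id, "ka-1-Data.db");
        System.out.println(now.length() + " -> " + then.length() + " characters");
    }
}
```

The saving equals the length of the keyspace name plus the table name plus two separators, so it grows with longer keyspace and table names.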
[jira] [Created] (CASSANDRA-7052) Query on compact storage with limit returns extra rows
Stuart Freeman created CASSANDRA-7052:

    Summary: Query on compact storage with limit returns extra rows
    Key: CASSANDRA-7052
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7052
    Project: Cassandra
    Issue Type: Bug
    Reporter: Stuart Freeman

I tested this on Cassandra 2.0.6 and 2.0.3 and got the same result on both:
{code}
cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE test;
cqlsh:test> CREATE COLUMNFAMILY VerifyPagedColumnQueryStartAndEnd (keyId text, columnName text, value text, PRIMARY KEY (keyId, columnName)) WITH COMPACT STORAGE;
cqlsh:test> INSERT INTO VerifyPagedColumnQueryStartAndEnd (keyId, columnName, value) VALUES ( 'key', 'a', '1' ) ;
cqlsh:test> INSERT INTO VerifyPagedColumnQueryStartAndEnd (keyId, columnName, value) VALUES ( 'key', 'b', '1' ) ;
cqlsh:test> INSERT INTO VerifyPagedColumnQueryStartAndEnd (keyId, columnName, value) VALUES ( 'key', 'c', '1' ) ;
cqlsh:test> INSERT INTO VerifyPagedColumnQueryStartAndEnd (keyId, columnName, value) VALUES ( 'key', 'd', '1' ) ;
cqlsh:test> INSERT INTO VerifyPagedColumnQueryStartAndEnd (keyId, columnName, value) VALUES ( 'key', 'e', '1' ) ;
cqlsh:test> SELECT * FROM VerifyPagedColumnQueryStartAndEnd WHERE keyId = 'key' AND columnName > '' AND columnName <= 'e' LIMIT 2;

 keyId | columnName | value
-------+------------+-------
   key |          a |     1
   key |          b |     1
   key |          c |     1

(3 rows)

cqlsh:test>
{code}
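For reference, the result one would expect from the query in this report can be modelled with a sorted map. This is a sketch under assumptions: the comparison operators in the archived query text are damaged, so the bounds are taken to be an exclusive lower bound of '' and an inclusive upper bound of 'e'; the class and method names are hypothetical; nothing here exercises Cassandra itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of a slice over one partition's columns with a LIMIT.
final class SliceWithLimit {
    // Column names in (afterExclusive, endInclusive], at most `limit` of them, in sorted order.
    static List<String> slice(NavigableMap<String, String> columns,
                              String afterExclusive, String endInclusive, int limit) {
        List<String> out = new ArrayList<>();
        for (String name : columns.subMap(afterExclusive, false, endInclusive, true).keySet()) {
            if (out.size() == limit) {
                break; // the limit caps the number of rows returned
            }
            out.add(name);
        }
        return out;
    }

    // The five rows inserted in the report for keyId = 'key'.
    static NavigableMap<String, String> sample() {
        NavigableMap<String, String> columns = new TreeMap<>();
        for (String name : new String[] {"a", "b", "c", "d", "e"}) {
            columns.put(name, "1");
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(slice(sample(), "", "e", 2));
    }
}
```

Under those assumptions the slice with a limit of 2 contains only columns a and b, whereas the cqlsh session above returned a third row, c.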
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973391#comment-13973391 ]

Tyler Hobbs commented on CASSANDRA-6875:

Regarding prepared statements, I assume we want to support all of the following:
* {{... WHERE (k, c1) IN ?}}
* {{... WHERE (k, c1) IN (?, ?, ...)}}
* {{... WHERE (k, c1) IN ((?, ?), (?, ?), ...)}}

CQL3: select multiple CQL rows in a single partition using IN
    Key: CASSANDRA-6875
    URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
    Project: Cassandra
    Issue Type: Bug
    Components: API
    Reporter: Nicolas Favre-Felix
    Assignee: Tyler Hobbs
    Priority: Minor
    Fix For: 2.0.8

In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is important to support reading several distinct CQL rows from a given partition using a distinct set of coordinates for these rows within the partition.

CASSANDRA-4851 introduced a range scan over the multi-dimensional space of clustering keys. We also need to support a multi-get of CQL rows, potentially using the IN keyword to define a set of clustering keys to fetch at once.

(reusing the same example:)

Consider the following table:
{code}
CREATE TABLE test (
  k int,
  c1 int,
  c2 int,
  PRIMARY KEY (k, c1, c2)
);
{code}
with the following data:
{code}
 k | c1 | c2
---+----+----
 0 |  0 |  0
 0 |  0 |  1
 0 |  1 |  0
 0 |  1 |  1
{code}
We can fetch a single row or a range of rows, but not a set of them:
{code}
SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
Bad Request: line 1:54 missing EOF at ','
{code}
Supporting this syntax would return:
{code}
 k | c1 | c2
---+----+----
 0 |  0 |  0
 0 |  1 |  1
{code}
Being able to fetch these two CQL rows in a single read is important to maintain partition-level isolation.
[jira] [Comment Edited] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973391#comment-13973391 ]

Tyler Hobbs edited comment on CASSANDRA-6875 at 4/17/14 8:50 PM:
-----------------------------------------------------------------

Regarding prepared statements, I assume we want to support all of the following:
* {{... WHERE (c1, c2) IN ?}}
* {{... WHERE (c1, c2) IN (?, ?, ...)}}
* {{... WHERE (c1, c2) IN ((?, ?), (?, ?), ...)}}

was (Author: thobbs):
Regarding prepared statements, I assume we want to support all of the following:
* {{... WHERE (k, c1) IN ?}}
* {{... WHERE (k, c1) IN (?, ?, ...)}}
* {{... WHERE (k, c1) IN ((?, ?), (?, ?), ...)}}
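The semantics requested in CASSANDRA-6875 can be modelled in memory with the example table from the issue description. The following is only a sketch of what {{(c1, c2) IN ((0, 0), (1, 1))}} is expected to select; the class and method names are hypothetical and nothing here reflects how Cassandra parses or executes the query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative model only: none of these names exist in Cassandra.
final class MultiColumnIn
{
    // The four rows of the example table; each row is {k, c1, c2}.
    static List<int[]> exampleRows()
    {
        List<int[]> rows = new ArrayList<int[]>();
        rows.add(new int[]{ 0, 0, 0 });
        rows.add(new int[]{ 0, 0, 1 });
        rows.add(new int[]{ 0, 1, 0 });
        rows.add(new int[]{ 0, 1, 1 });
        return rows;
    }

    // Keep the rows whose (c1, c2) pair is one of the wanted tuples,
    // preserving the clustering order of the partition.
    static List<int[]> select(List<int[]> partition, Set<List<Integer>> wanted)
    {
        List<int[]> result = new ArrayList<int[]>();
        for (int[] row : partition)
            if (wanted.contains(Arrays.asList(row[1], row[2])))
                result.add(row);
        return result;
    }
}
```

Calling {{select}} with the tuples (0, 0) and (1, 1) keeps the first and last rows of the example data, which matches the result set shown in the issue description.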
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973395#comment-13973395 ]

Aleksey Yeschenko commented on CASSANDRA-6694:
----------------------------------------------

bq. Its purpose is somewhat different, and I think it is better left named CounterUpdateCell, as that is its purpose (to carry a counter update as far as the memtable, and no further).

FWIW it doesn't even make it to a memtable in 2.1, ever. That said, not calling it BufferCounterUpdateCell would bother my consistency OCD, a lot, and I'm not done with counters until 3.0. Can you make my OCD a tiny favor and call it consistently with the other implementations? (: Thanks.

bq. There should be no hashCode method in Buffer*Cell - I removed these for a reason. Because we can have a Cell that is a CellName, and vice-versa, using a Cell as a key for a map is likely dangerous. Since we don't do it anywhere, it's safe to simply remove the methods.

Maybe we should just throw UnsupportedOperationException then, but leave the methods? I agree that using Cell-s as keys is very unlikely, but stuff like this has bitten us before.

Haven't read either branch yet, but planning to soon, just wanted to jump at the opportunity to bikeshed a bit.

Slightly More Off-Heap Memtables
--------------------------------

Key: CASSANDRA-6694
URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Labels: performance
Fix For: 2.1 beta2

The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead).

The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting.

The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory.
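The overhead budget in the description can be checked with a little arithmetic, and the "alignment tricks like the VM" can be illustrated in the same place. The sketch below assumes 8-byte alignment (a shift of 3 bits), as the JVM uses for compressed object pointers; the ticket does not state the alignment that was actually planned, and all names are hypothetical.

```java
// Illustrative only: the constants restate the figures from the issue
// description; the 8-byte alignment is an assumption, not taken from the patch.
final class CellOverhead
{
    static final int OBJECT_HEADER_BYTES = 8;   // "8-byte object overhead"
    static final int ADDRESS_BYTES = 4;         // "4-byte address"
    static final int ALLOCATION_REF_BYTES = 4;  // "4-byte object reference"
    static final int ALIGNMENT_SHIFT = 3;       // assumed 8-byte alignment

    static int perCellOverhead()
    {
        return OBJECT_HEADER_BYTES + ADDRESS_BYTES + ALLOCATION_REF_BYTES;
    }

    // Store an aligned native address in 32 bits by dropping the low zero bits.
    static int compress(long address)
    {
        if ((address & ((1L << ALIGNMENT_SHIFT) - 1)) != 0)
            throw new IllegalArgumentException("address is not 8-byte aligned");
        if ((address >>> ALIGNMENT_SHIFT) > 0xFFFFFFFFL)
            throw new IllegalArgumentException("address does not fit in 32 bits");
        return (int) (address >>> ALIGNMENT_SHIFT);
    }

    // Recover the full address, treating the stored int as unsigned.
    static long decompress(int compressed)
    {
        return (compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }

    // Bytes reachable through a 32-bit compressed address.
    static long addressableBytes()
    {
        return 1L << (32 + ALIGNMENT_SHIFT);
    }
}
```

Under that assumption a 4-byte address reaches 2^35 bytes (32 GiB), which is why the description expects the trick to run out eventually and the address to grow to 8 bytes.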
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973405#comment-13973405 ]

Benedict commented on CASSANDRA-6694:
-------------------------------------

bq. Can you make my OCD a tiny favor and call it consistently with the other implementations? (: Thanks.

Sure. I have a preference to keep it that way, but not a strong one.

bq. Maybe we should just throw UnsupportedOperationException then, but leave the methods? I agree that using Cell-s as keys is very unlikely, but stuff like this has bitten us before.

Also sure.
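The suggestion to keep hashCode but have it throw can be sketched as follows. This is a minimal illustration with a hypothetical class, not the actual Buffer*Cell code: it shows that a hash-based map fails immediately when such an object is used as a key, instead of silently misbehaving.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cell type that must never be used as a map key.
final class GuardedCell
{
    private final String name;

    GuardedCell(String name)
    {
        this.name = name;
    }

    @Override
    public boolean equals(Object other)
    {
        return other instanceof GuardedCell && ((GuardedCell) other).name.equals(name);
    }

    // Fail loudly rather than fall back to identity hashing.
    @Override
    public int hashCode()
    {
        throw new UnsupportedOperationException("cells must not be used as hash keys");
    }

    // Returns true if inserting a GuardedCell into a HashMap is rejected.
    static boolean failsAsMapKey()
    {
        Map<GuardedCell, String> map = new HashMap<GuardedCell, String>();
        try
        {
            map.put(new GuardedCell("c"), "v");
            return false;
        }
        catch (UnsupportedOperationException e)
        {
            return true;
        }
    }
}
```

Simply removing the method would let Object.hashCode take over, so accidental use as a key would compile and run without any signal; throwing makes the mistake visible on first use.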
[jira] [Comment Edited] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973435#comment-13973435 ]

Pavel Yaskevich edited comment on CASSANDRA-6694 at 4/17/14 9:29 PM:
---------------------------------------------------------------------

Regarding hashCode that's what we do, I do it in AbstractCell now, Benedict does it in both BufferCell and NativeCell.

was (Author: xedin):
Regarding, the hashCode that's what we do, I do it in AbstractCell now, Benedict does it in both BufferCell and NativeCell.
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973435#comment-13973435 ]

Pavel Yaskevich commented on CASSANDRA-6694:
--------------------------------------------

Regarding, the hashCode that's what we do, I do it in AbstractCell now, Benedict does it in both BufferCell and NativeCell.
[jira] [Commented] (CASSANDRA-6999) Batchlog replay should account for CF truncation records
[ https://issues.apache.org/jira/browse/CASSANDRA-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973447#comment-13973447 ]

Jonathan Ellis commented on CASSANDRA-6999:
-------------------------------------------

Looks to me like the ImmutableSet copy in replayBatch is unnecessary, since mutation.without creates a new modifications map rather than modifying the original.

Who wins on ties? Should writtenAt < SystemTable.getTruncatedAt be <=?

Rest LGTM.

Batchlog replay should account for CF truncation records
--------------------------------------------------------

Key: CASSANDRA-6999
URL: https://issues.apache.org/jira/browse/CASSANDRA-6999
Project: Cassandra
Issue Type: Bug
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
Fix For: 1.2.17, 2.0.8, 2.1 beta2

Just as HHOM does, BM should properly handle column families' truncation records and not replay mutations that are younger than the last known record.
[jira] [Commented] (CASSANDRA-6999) Batchlog replay should account for CF truncation records
[ https://issues.apache.org/jira/browse/CASSANDRA-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973456#comment-13973456 ]

Aleksey Yeschenko commented on CASSANDRA-6999:
----------------------------------------------

bq. Looks to me like the ImmutableSet copy in replayBatch is unnecessary, since mutation.without creates a new modifications map rather than modifying the original.

It is necessary :( http://docs.oracle.com/javase/7/docs/api/java/util/Map.html#keySet() - if not copied, might throw a ConcurrentModificationException.

bq. Who wins on ties? Should writtenAt < SystemTable.getTruncatedAt be <=?

Probably. Will alter HHOM to use <= as well.
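The two positions in this exchange can be compared with a small, self-contained sketch. It uses plain java.util maps as a hypothetical stand-in for a mutation's modifications map: removing from a HashMap while iterating its keySet() view fails fast, whereas a without-style method that returns a fresh map leaves the iterated map untouched, so no defensive copy of the key set is needed.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: models the keySet() discussion, not the real RowMutation.
final class KeySetIteration
{
    static Map<String, String> sample()
    {
        Map<String, String> map = new HashMap<String, String>();
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3");
        return map;
    }

    // Mutating the map while iterating its key-set view: HashMap fails fast.
    static boolean removeInPlaceThrows()
    {
        Map<String, String> map = sample();
        try
        {
            for (String key : map.keySet())
                map.remove(key);
            return false;
        }
        catch (ConcurrentModificationException e)
        {
            return true;
        }
    }

    // without-style: build a new map and leave the original untouched.
    static Map<String, String> without(Map<String, String> original, String key)
    {
        Map<String, String> copy = new HashMap<String, String>(original);
        copy.remove(key);
        return copy;
    }

    // Iterate the original's key set while rebinding a local to new maps.
    static Map<String, String> dropAllExcept(Map<String, String> original, String keep)
    {
        Map<String, String> current = original;
        for (String key : original.keySet())
            if (!key.equals(keep))
                current = without(current, key);
        return current;
    }
}
```

In {{dropAllExcept}} the iterator belongs to the original map, which is never modified, so the loop completes normally; that is the situation described in the first quoted remark.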
[jira] [Commented] (CASSANDRA-6999) Batchlog replay should account for CF truncation records
[ https://issues.apache.org/jira/browse/CASSANDRA-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973484#comment-13973484 ]

Aleksey Yeschenko commented on CASSANDRA-6999:
----------------------------------------------

NVM, you were right about ImmutableSet copy in replayBatch being unnecessary, sorry.
[jira] [Created] (CASSANDRA-7053) USING TIMESTAMP for batches does not work
Robert Supencheck created CASSANDRA-7053: Summary: USING TIMESTAMP for batches does not work Key: CASSANDRA-7053 URL: https://issues.apache.org/jira/browse/CASSANDRA-7053 Project: Cassandra Issue Type: Bug Reporter: Robert Supencheck When using the USING TIMESTAMP timestamp syntax for a batch statement, the supplied timestamp is ignored. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7053) USING TIMESTAMP for batches does not work
[ https://issues.apache.org/jira/browse/CASSANDRA-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973522#comment-13973522 ]

Robert Supencheck commented on CASSANDRA-7053:
----------------------------------------------

Reproduction steps:

1) Invoke the cqlsh prompt;

2) Create a keyspace:
create keyspace test with replication = {'class':'SimpleStrategy','replication_factor':1};

3) Choose to use the keyspace, test;

4) Create a table in the test keyspace:
CREATE TABLE test_table ( key text PRIMARY KEY, data text) ;

5) Attempt a batch insert, using a timestamp, in the table, test_table:
BEGIN BATCH USING TIMESTAMP
INSERT INTO test_table (key, data ) VALUES ( 'key1', 'some data 1');
INSERT INTO test_table (key, data) VALUES ( 'key2', 'some data 2') ;
APPLY BATCH ;

6) View the timestamps on the newly inserted table entries to observe that the timestamps are not as specified:
select writetime(data), key, data from test_table;

 writetime(data)  | key  | data
------------------+------+-------------
 1397772023766000 | key1 | some data 1
 1397772023766000 | key2 | some data 2

(2 rows)

*** The expected behavior is that the timestamps in the resulting table should be .

USING TIMESTAMP for batches does not work
-----------------------------------------

Key: CASSANDRA-7053
URL: https://issues.apache.org/jira/browse/CASSANDRA-7053
Project: Cassandra
Issue Type: Bug
Reporter: Robert Supencheck
Labels: cqlsh

When using the USING TIMESTAMP timestamp syntax for a batch statement, the supplied timestamp is ignored.
git commit: Fix batchlog to account for CF truncation records
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-1.2 fe94e90f4 -> f46c6578c

Fix batchlog to account for CF truncation records

patch by Aleksey Yeschenko; reviewed by Jonathan Ellis for CASSANDRA-6999

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f46c6578
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f46c6578
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f46c6578

Branch: refs/heads/cassandra-1.2
Commit: f46c6578c2fb905cd88681d80218d89798032e03
Parents: fe94e90
Author: Aleksey Yeschenko alek...@apache.org
Authored: Fri Apr 18 01:36:08 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Fri Apr 18 01:36:08 2014 +0300

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../apache/cassandra/db/BatchlogManager.java    | 102 ---
 .../apache/cassandra/db/ColumnFamilyStore.java  |   6 --
 .../cassandra/db/HintedHandOffManager.java      |  16 +--
 .../org/apache/cassandra/db/RowMutation.java    |   6 +-
 .../org/apache/cassandra/db/SystemTable.java    |  53 +++---
 .../db/commitlog/CommitLogReplayer.java         |   4 +-
 .../apache/cassandra/service/StorageProxy.java  |   9 +-
 .../cassandra/db/BatchlogManagerTest.java       |  78 --
 .../apache/cassandra/db/HintedHandOffTest.java  |   2 +-
 10 files changed, 189 insertions(+), 88 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f46c6578/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 07c09cf..bb08a37 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -5,6 +5,7 @@
  * Schedule schema pulls on change (CASSANDRA-6971)
  * Non-droppable verbs shouldn't be dropped from OTC (CASSANDRA-6980)
  * Shutdown batchlog executor in SS#drain() (CASSANDRA-7025)
+ * Fix batchlog to account for CF truncation records (CASSANDRA-6999)

 1.2.16

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f46c6578/src/java/org/apache/cassandra/db/BatchlogManager.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/BatchlogManager.java b/src/java/org/apache/cassandra/db/BatchlogManager.java
index b8dbadd..ea32e9d 100644
--- a/src/java/org/apache/cassandra/db/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/db/BatchlogManager.java
@@ -24,10 +24,7 @@ import java.lang.management.ManagementFactory;
 import java.net.InetAddress;
 import java.nio.ByteBuffer;
 import java.util.*;
-import java.util.concurrent.CopyOnWriteArraySet;
-import java.util.concurrent.ExecutionException;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
+import java.util.concurrent.*;
 import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.concurrent.atomic.AtomicLong;
 import javax.management.MBeanServer;
@@ -36,6 +33,7 @@ import javax.management.ObjectName;
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.Iterables;
 import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
 import com.google.common.util.concurrent.RateLimiter;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -254,45 +252,72 @@ public class BatchlogManager implements BatchlogManagerMBean
     {
         DataInputStream in = new DataInputStream(ByteBufferUtil.inputStream(data));
         int size = in.readInt();
+        List<RowMutation> mutations = new ArrayList<RowMutation>(size);
+
         for (int i = 0; i < size; i++)
-            replaySerializedMutation(RowMutation.serializer.deserialize(in, VERSION), writtenAt, rateLimiter);
+        {
+            RowMutation mutation = RowMutation.serializer.deserialize(in, VERSION);
+
+            // Remove CFs that have been truncated since. writtenAt and SystemTable#getTruncatedAt() both return millis.
+            // We don't abort the replay entirely b/c this can be considered a succes (truncated is same as delivered then
+            // truncated.
+            for (UUID cfId : mutation.getColumnFamilyIds())
+                if (writtenAt <= SystemTable.getTruncatedAt(cfId))
+                    mutation = mutation.without(cfId);
+
+            if (!mutation.isEmpty())
+                mutations.add(mutation);
+        }
+
+        if (!mutations.isEmpty())
+            replayMutations(mutations, writtenAt, rateLimiter);
     }

     /*
      * We try to deliver the mutations to the replicas ourselves if they are alive and only resort to writing hints
      * when a replica is down or a write request times out.
      */
-    private void replaySerializedMutation(RowMutation mutation, long writtenAt, RateLimiter rateLimiter) throws IOException
+    private void
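The filtering step added by this patch can be restated outside Cassandra as a small sketch. The types below are hypothetical stand-ins: a mutation is modelled as a map from column family id to a payload string, and truncation records as a map from id to a timestamp in milliseconds, with a missing entry meaning "never truncated" (an assumption; the real SystemTable.getTruncatedAt may signal that differently). Ties go to the truncation record, as agreed in the review.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Illustrative model of the replay filter; not the BatchlogManager code itself.
final class TruncationFilter
{
    // Return the parts of a mutation that should still be replayed.
    static Map<UUID, String> dropTruncated(Map<UUID, String> mutation,
                                           long writtenAt,
                                           Map<UUID, Long> truncatedAt)
    {
        Map<UUID, String> kept = new HashMap<UUID, String>();
        for (Map.Entry<UUID, String> entry : mutation.entrySet())
        {
            Long truncated = truncatedAt.get(entry.getKey());
            // Keep only column families never truncated, or written strictly
            // after the last truncation (writtenAt <= truncatedAt is dropped).
            if (truncated == null || writtenAt > truncated)
                kept.put(entry.getKey(), entry.getValue());
        }
        return kept;
    }
}
```

A mutation whose result is empty would simply be skipped, which mirrors the {{isEmpty()}} checks in the diff: a batch aimed entirely at truncated column families counts as delivered rather than as a failed replay.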
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973534#comment-13973534 ]

Pavel Yaskevich commented on CASSANDRA-6694:
--------------------------------------------

Ok, hashCode and setPeer changes are now pushed to the same branch. AbstractNativeCell is independent of NativeAllocation now because NativeAllocator returns the aligned peer directly, which allows the peer field to be made final in AbstractNativeCell. Also I have pushed the set/get logic for the data size associated with the pointer to the NativeAllocator, as it's basically its metadata. IMO it's a bit cleaner compared to how that is done in Benedict's branch, where NativeAllocation tracks pointer alignment to size (internalPeer() { return peer + 4; }) but NativeAllocator takes care of allocating 4 additional bytes to the requested size.
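The layout being debated, a 4-byte size prefix followed by the cell data, can be sketched with a direct ByteBuffer standing in for native memory. All names below are hypothetical except internalPeer(), which is quoted in the comment above; the sketch takes no side on whether the allocator or the cell should own the prefix, it only shows what "peer + 4" refers to.

```java
import java.nio.ByteBuffer;

// Illustrative only: a ByteBuffer offset plays the role of the native peer address.
final class SizePrefixedBlock
{
    static final int SIZE_PREFIX_BYTES = 4;

    private final ByteBuffer memory; // stand-in for off-heap memory
    private final int peer;          // offset of the block start, where the size lives

    private SizePrefixedBlock(ByteBuffer memory, int peer)
    {
        this.memory = memory;
        this.peer = peer;
    }

    // Write the size prefix at the block start and hand back a view of the block.
    static SizePrefixedBlock allocate(ByteBuffer memory, int peer, int dataSize)
    {
        memory.putInt(peer, dataSize);
        return new SizePrefixedBlock(memory, peer);
    }

    // First byte of usable data, just past the size prefix.
    int internalPeer()
    {
        return peer + SIZE_PREFIX_BYTES;
    }

    // Size of the usable data, read back from the prefix.
    int dataSize()
    {
        return memory.getInt(peer);
    }

    // Bytes the allocation actually occupies, prefix included.
    int totalSize()
    {
        return SIZE_PREFIX_BYTES + dataSize();
    }
}
```

Whichever component writes the prefix has to request dataSize + 4 bytes, which is the coupling between NativeAllocation and NativeAllocator that the comment objects to.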
[jira] [Comment Edited] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973534#comment-13973534 ]

Pavel Yaskevich edited comment on CASSANDRA-6694 at 4/17/14 10:56 PM:
----------------------------------------------------------------------

Ok, hashCode and setPeer changes are now pushed to the same branch. AbstractNativeCell is independent of NativeAllocation now because NativeAllocator returns the aligned peer directly, which allows the peer field to be made final in AbstractNativeCell. Also I have pushed the set/get logic for the data size associated with the pointer to the NativeAllocator, as it's basically its metadata. IMO it's a bit cleaner compared to how that is done in Benedict's branch, where NativeAllocation tracks pointer alignment to size (internalPeer() \{ return peer + 4; \}) but NativeAllocator takes care of allocating 4 additional bytes to the requested size.

was (Author: xedin):
Ok, hashCode and setPeer changes are now pushed to the same branch, AbstractNativeCell is independent of NativeAllocation now because NativeAllocator returns aligned peer directly, which allows peer field to be made final in AbstractNativeCell. Also I have pushed set/get logic for data size associated with the pointer to the NativeAllocator as it's basically it's metadata, IMO it's a bit cleaner comparing to how that is done in Benedict's branch where NativeAllocation tracks pointer alignment to size (internalPeer() { return peer + 4; }) but NativeAllocator takes care of allocating 4 additional bytes to requested size.
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973538#comment-13973538 ]

Benedict commented on CASSANDRA-6694:
-------------------------------------

I don't think this is the right approach: with the changes we are making, we are pretty much precluding doing anything fancy with GC (we'll have to rely on malloc for now). As such the size is no longer providing any useful bookkeeping information to the NativeAllocator. It should be dealt with entirely in the AbstractNativeCell - its concept of size is entirely unique to it for now. This also, separately, makes packing structs of NativeCell a lot more straightforward.
[1/2] git commit: Fix batchlog to account for CF truncation records
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 7dbbe9233 -> 384de4b85

Fix batchlog to account for CF truncation records

patch by Aleksey Yeschenko; reviewed by Jonathan Ellis for CASSANDRA-6999

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/87097066
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/87097066
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/87097066

Branch: refs/heads/cassandra-2.0
Commit: 87097066e7c3c133e333804c4e4b00457b6c989d
Parents: fe94e90
Author: Aleksey Yeschenko alek...@apache.org
Authored: Fri Apr 18 01:36:08 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Fri Apr 18 01:38:55 2014 +0300

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../apache/cassandra/db/BatchlogManager.java    | 102 ---
 .../apache/cassandra/db/ColumnFamilyStore.java  |   6 --
 .../cassandra/db/HintedHandOffManager.java      |  16 +--
 .../org/apache/cassandra/db/RowMutation.java    |   6 +-
 .../org/apache/cassandra/db/SystemTable.java    |  53 +++---
 .../db/commitlog/CommitLogReplayer.java         |   4 +-
 .../apache/cassandra/service/StorageProxy.java  |   9 +-
 .../cassandra/db/BatchlogManagerTest.java       |  78 --
 .../apache/cassandra/db/HintedHandOffTest.java  |   2 +-
 10 files changed, 189 insertions(+), 88 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/87097066/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 07c09cf..bb08a37 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -5,6 +5,7 @@
  * Schedule schema pulls on change (CASSANDRA-6971)
  * Non-droppable verbs shouldn't be dropped from OTC (CASSANDRA-6980)
  * Shutdown batchlog executor in SS#drain() (CASSANDRA-7025)
+ * Fix batchlog to account for CF truncation records (CASSANDRA-6999)

 1.2.16

http://git-wip-us.apache.org/repos/asf/cassandra/blob/87097066/src/java/org/apache/cassandra/db/BatchlogManager.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/BatchlogManager.java b/src/java/org/apache/cassandra/db/BatchlogManager.java
index b8dbadd..ea32e9d 100644
--- a/src/java/org/apache/cassandra/db/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/db/BatchlogManager.java
@@ -24,10 +24,7 @@ import java.lang.management.ManagementFactory;
 import java.net.InetAddress;
 import java.nio.ByteBuffer;
 import java.util.*;
-import java.util.concurrent.CopyOnWriteArraySet;
-import java.util.concurrent.ExecutionException;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
+import java.util.concurrent.*;
 import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.concurrent.atomic.AtomicLong;
 import javax.management.MBeanServer;
@@ -36,6 +33,7 @@ import javax.management.ObjectName;
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.Iterables;
 import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
 import com.google.common.util.concurrent.RateLimiter;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -254,45 +252,72 @@ public class BatchlogManager implements BatchlogManagerMBean
     {
         DataInputStream in = new DataInputStream(ByteBufferUtil.inputStream(data));
         int size = in.readInt();
+        List<RowMutation> mutations = new ArrayList<RowMutation>(size);
+
         for (int i = 0; i < size; i++)
-            replaySerializedMutation(RowMutation.serializer.deserialize(in, VERSION), writtenAt, rateLimiter);
+        {
+            RowMutation mutation = RowMutation.serializer.deserialize(in, VERSION);
+
+            // Remove CFs that have been truncated since. writtenAt and SystemTable#getTruncatedAt() both return millis.
+            // We don't abort the replay entirely b/c this can be considered a succes (truncated is same as delivered then
+            // truncated.
+            for (UUID cfId : mutation.getColumnFamilyIds())
+                if (writtenAt <= SystemTable.getTruncatedAt(cfId))
+                    mutation = mutation.without(cfId);
+
+            if (!mutation.isEmpty())
+                mutations.add(mutation);
+        }
+
+        if (!mutations.isEmpty())
+            replayMutations(mutations, writtenAt, rateLimiter);
     }

     /*
      * We try to deliver the mutations to the replicas ourselves if they are alive and only resort to writing hints
      * when a replica is down or a write request times out.
      */
-    private void replaySerializedMutation(RowMutation mutation, long writtenAt, RateLimiter rateLimiter) throws IOException
+    private void
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973571#comment-13973571 ]

Pavel Yaskevich commented on CASSANDRA-6694:
--------------------------------------------

I just don't like that in NativeAllocation we assume that NativeAllocator has reserved 4 bytes for us. So I decided to put everything into NativeAllocator and only return useful space, so we don't have to + 4 every time we need a peer. It could be done in AbstractNativeCell, which would allocate size + 4, or it could be done in NativeAllocator, which would tell on demand how big the allocation was based on the area pointer that it returned (which is what NativeAllocator.getDataSize(areaPointer) does). Either of those places (AbstractNativeCell or NativeAllocator) works for me.

Slightly More Off-Heap Memtables
--------------------------------

Key: CASSANDRA-6694
URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Labels: performance
Fix For: 2.1 beta2

The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead).

The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting.

The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory.

--
This message was sent by Atlassian JIRA (v6.2#6252)
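The overhead budget in the issue description can be checked with a few lines of arithmetic. The sketch below is only an illustration: the byte counts are taken from the description, while the class name, the method names, and the 8-byte alignment used in the usage note are assumptions made for this example and do not come from the Cassandra code base.

```java
/**
 * Back-of-the-envelope estimate of per-cell on-heap overhead.
 * The byte counts come from the issue description; the class and
 * method names are hypothetical and exist only for this sketch.
 */
final class CellOverheadEstimate {
    static final int OBJECT_HEADER_BYTES = 8;   // JVM object overhead
    static final int NATIVE_ADDRESS_BYTES = 4;  // aligned off-heap address stored in 4 bytes
    static final int ALLOCATION_REF_BYTES = 4;  // reference held in the internal allocation list

    /** Overhead per cell, excluding the btree. */
    static int perCell() {
        return OBJECT_HEADER_BYTES + NATIVE_ADDRESS_BYTES + ALLOCATION_REF_BYTES;
    }

    /** Overhead per cell including an average btree cost (4 to 6 bytes per the issue). */
    static int perCellWithBtree(int btreeBytesPerCell) {
        return perCell() + btreeBytesPerCell;
    }

    /**
     * Bytes addressable by a 32-bit address when every allocation is aligned
     * to alignmentBytes (assumed to be a power of two).
     */
    static long addressableBytes(int alignmentBytes) {
        return (1L << 32) * alignmentBytes;
    }
}
```

Here `perCell()` gives the 16 bytes quoted in the issue, and adding 4 to 6 bytes of btree cost gives the 20 to 22 byte total. If allocations were aligned to 8 bytes (an assumption; the issue only mentions "alignment tricks like the VM"), a 4-byte address would cover 32 GiB of native memory.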
[jira] [Comment Edited] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973582#comment-13973582 ]

Benedict edited comment on CASSANDRA-6694 at 4/17/14 11:55 PM:
---------------------------------------------------------------

The only reason we were assigning a size in NativeAllocator was to support moving the peer around (in which case you need to know how much memory you're copying). NativeAllocation assuming it has (i.e. _being defined as having_) a size prefix is fine when it is tightly coupled with NativeAllocator (like it is in my branch) - but once you have it as a final field in another object, NativeAllocator should simply have no say in the matter. It never needs to know the size of the allocation, so we should just redefine what our AbstractNativeCell considers to be its size in its sizeOf() calculation, and have the NativeAllocator use that unadulterated value.

was (Author: benedict):
The only reason it was happening in NativeAllocator was to support moving the peer around (so you need to know how much memory you're copying). NativeAllocation assuming it has (i.e. _being defined as having_) a size prefix is fine when it is tightly coupled with NativeAllocator (like it is in my branch) - but once you have it as a final field in another object, NativeAllocator should simply have no say in the matter. It never needs to know the size of the allocation, so we should just redefine what our AbstractNativeCell considers to be its size in its sizeOf() calculation.
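The design point in this exchange (the cell defines its own size, including the 4-byte size prefix, and the allocator simply hands out the number of bytes it is asked for) can be sketched as follows. This is a minimal illustration under stated assumptions: `SketchAllocator` and `SizePrefixedCell` are hypothetical names, a direct `ByteBuffer` stands in for raw native memory, and none of this is the code from either branch under review.

```java
import java.nio.ByteBuffer;

/** Hypothetical allocator: hands out exactly the number of bytes requested. */
final class SketchAllocator {
    static ByteBuffer allocate(int size) {
        // A direct buffer stands in for raw native memory in this sketch.
        return ByteBuffer.allocateDirect(size);
    }
}

/** Hypothetical cell that owns its size prefix; not Cassandra's AbstractNativeCell. */
final class SizePrefixedCell {
    static final int SIZE_PREFIX_BYTES = 4;
    private final ByteBuffer peer;

    private SizePrefixedCell(ByteBuffer peer) {
        this.peer = peer;
    }

    /** The cell, not the allocator, decides that its size includes the prefix. */
    static int sizeOf(int payloadLength) {
        return SIZE_PREFIX_BYTES + payloadLength;
    }

    static SizePrefixedCell of(byte[] payload) {
        int size = sizeOf(payload.length);
        ByteBuffer peer = SketchAllocator.allocate(size);
        peer.putInt(0, size); // size prefix written by the cell itself
        for (int i = 0; i < payload.length; i++)
            peer.put(SIZE_PREFIX_BYTES + i, payload[i]);
        return new SizePrefixedCell(peer);
    }

    int allocatedSize() {
        return peer.getInt(0);
    }

    byte payloadByte(int index) {
        return peer.get(SIZE_PREFIX_BYTES + index);
    }

    /** Moving the peer needs the full size, which is read back from the prefix. */
    SizePrefixedCell relocate() {
        int size = allocatedSize();
        ByteBuffer target = SketchAllocator.allocate(size);
        for (int i = 0; i < size; i++)
            target.put(i, peer.get(i));
        return new SizePrefixedCell(target);
    }
}
```

In this arrangement the allocator never adds or subtracts 4; the only place that knows about the prefix is the cell's `sizeOf()` calculation, which is the coupling the comment argues for.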
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973595#comment-13973595 ]

Pavel Yaskevich commented on CASSANDRA-6694:
--------------------------------------------

Sure, if you like that better I will change that right away. Anyhow, if we need it in the allocator for some reason, we can change it.
[1/4] git commit: Update versions for 2.0.7 release
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 de8a479f2 -> 66af6fedc

Update versions for 2.0.7 release

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7dbbe923
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7dbbe923
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7dbbe923

Branch: refs/heads/cassandra-2.1
Commit: 7dbbe9233ce83c2a473ba2510c827a661de99400
Parents: 294c011
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Mon Apr 14 16:43:46 2014 +0200
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Mon Apr 14 16:43:46 2014 +0200

----------------------------------------------------------------------
 NEWS.txt         | 11 ++++++++++-
 build.xml        |  2 +-
 debian/changelog |  6 ++++++
 3 files changed, 17 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7dbbe923/NEWS.txt
----------------------------------------------------------------------
diff --git a/NEWS.txt b/NEWS.txt
index 18f89bc..05f9392 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -14,6 +14,15 @@ restore snapshots created with the previous major version using the
 using the provided 'sstableupgrade' tool.
 
 
+2.0.7
+=====
+
+Upgrading
+---------
+    - Nothing specific to this release, but please see 2.0.6 if you are upgrading
+      from a previous version.
+
+
 2.0.6
 =====
 
@@ -29,7 +38,7 @@ New features
 
 Upgrading
 ---------
-    - Nothing specific to this release, but please see 2.0.6 if you are upgrading
+    - Nothing specific to this release, but please see 2.0.5 if you are upgrading
       from a previous version.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7dbbe923/build.xml
----------------------------------------------------------------------
diff --git a/build.xml b/build.xml
index e6d77d8..5c6c736 100644
--- a/build.xml
+++ b/build.xml
@@ -25,7 +25,7 @@
     <property name="debuglevel" value="source,lines,vars"/>
 
     <!-- default version and SCM information -->
-    <property name="base.version" value="2.0.6"/>
+    <property name="base.version" value="2.0.7"/>
     <property name="scm.connection" value="scm:git://git.apache.org/cassandra.git"/>
     <property name="scm.developerConnection" value="scm:git://git.apache.org/cassandra.git"/>
     <property name="scm.url" value="http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7dbbe923/debian/changelog
----------------------------------------------------------------------
diff --git a/debian/changelog b/debian/changelog
index 6cc4391..37c7425 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+cassandra (2.0.7) unstable; urgency=low
+
+  * New release
+
+ -- Sylvain Lebresne <slebre...@apache.org>  Mon, 14 Apr 2014 16:42:09 +0200
+
 cassandra (2.0.6) unstable; urgency=low
 
   * New release
[4/4] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
    CHANGES.txt
    build.xml
    debian/changelog
    src/java/org/apache/cassandra/db/BatchlogManager.java
    src/java/org/apache/cassandra/db/ColumnFamilyStore.java
    src/java/org/apache/cassandra/db/HintedHandOffManager.java
    src/java/org/apache/cassandra/db/SystemKeyspace.java
    src/java/org/apache/cassandra/service/StorageProxy.java
    test/unit/org/apache/cassandra/db/BatchlogManagerTest.java

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/66af6fed
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/66af6fed
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/66af6fed

Branch: refs/heads/cassandra-2.1
Commit: 66af6fedc02eed630028043f8a6f0d3014f193d5
Parents: de8a479 384de4b
Author: Aleksey Yeschenko <alek...@apache.org>
Authored: Fri Apr 18 03:14:47 2014 +0300
Committer: Aleksey Yeschenko <alek...@apache.org>
Committed: Fri Apr 18 03:14:47 2014 +0300

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 NEWS.txt                                        |  11 +-
 .../apache/cassandra/db/BatchlogManager.java    | 102 +++
 .../apache/cassandra/db/ColumnFamilyStore.java  |   6 --
 .../cassandra/db/HintedHandOffManager.java      |  19 +---
 .../org/apache/cassandra/db/SystemKeyspace.java |  55 +++---
 .../db/commitlog/CommitLogReplayer.java         |  12 +--
 .../apache/cassandra/service/StorageProxy.java  |   9 +-
 .../cassandra/db/BatchlogManagerTest.java       |  84 +--
 .../apache/cassandra/db/HintedHandOffTest.java  |  19 ++--
 10 files changed, 214 insertions(+), 104 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/66af6fed/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index 9f34023,ad26f6d..705f1b8
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -108,6 -64,6 +108,7 @@@ Merged from 1.2
   * Schedule schema pulls on change (CASSANDRA-6971)
   * Non-droppable verbs shouldn't be dropped from OTC (CASSANDRA-6980)
   * Shutdown batchlog executor in SS#drain() (CASSANDRA-7025)
++ * Fix batchlog to account for CF truncation records (CASSANDRA-6999)
  
  
  2.0.6

http://git-wip-us.apache.org/repos/asf/cassandra/blob/66af6fed/NEWS.txt
----------------------------------------------------------------------
diff --cc NEWS.txt
index 9567ef3,05f9392..ac78a73
--- a/NEWS.txt
+++ b/NEWS.txt
@@@ -13,46 -13,16 +13,55 @@@ restore snapshots created with the prev
  'sstableloader' tool. You can upgrade the file format of your snapshots
  using the provided 'sstableupgrade' tool.
  
 +2.1
 +===
 +
 +New features
 +------------
 +   - SSTable data directory name is slightly changed. Each directory will
 +     have hex string appended after CF name, e.g.
 +     ks/cf-5be396077b811e3a3ab9dc4b9ac088d/
 +     This hex string part represents unique ColumnFamily ID.
 +     Note that existing directories are used as is, so only newly created
 +     directories after upgrade have new directory name format.
 +   - Saved key cache files also have ColumnFamily ID in their file name.
 +   - It is now possible to do incremental repairs, sstables that have been
 +     repaired are marked with a timestamp and not included in the next
 +     repair session. Use nodetool repair -par -inc to use this feature.
 +     A tool to manually mark/unmark sstables as repaired is available in
 +     tools/bin/sstablerepairedset.
 +
 +Upgrading
 +---------
 +   - Rolling upgrades from anything pre-2.0.7 is not supported. Furthermore
 +     pre-2.0 sstables are not supported. This means that before upgrading
 +     a node on 2.1, this node must be started on 2.0 and
 +     'nodetool upgdradesstables' must be run (and this even in the case
 +     of not-rolling upgrades).
 +   - For size-tiered compaction users, Cassandra now defaults to ignoring
 +     the coldest 5% of sstables. This can be customized with the
 +     cold_reads_to_omit compaction option; 0.0 omits nothing (the old
 +     behavior) and 1.0 omits everything.
 +   - Multithreaded compaction has been removed.
 +   - Counters implementation has been changed, replaced by a safer one with
 +     less caveats, but different performance characteristics. You might have
 +     to change your data model to accomodate the new implementation.
 +     (See https://issues.apache.org/jira/browse/CASSANDRA-6504 and the dev
 +     blog post at http://www.datastax.com/dev/blog/PLACEHOLDER for details).
 +   - (per-table) index_interval parameter has been replaced with
 +     min_index_interval and max_index_interval paratemeters. index_interval
 +     has been deprecated.
 +
+ 2.0.7
+ =====
+ 
+ Upgrading
+ ---------
+     - Nothing specific to this
[2/4] git commit: Fix batchlog to account for CF truncation records
Fix batchlog to account for CF truncation records

patch by Aleksey Yeschenko; reviewed by Jonathan Ellis for CASSANDRA-6999

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/87097066
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/87097066
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/87097066

Branch: refs/heads/cassandra-2.1
Commit: 87097066e7c3c133e333804c4e4b00457b6c989d
Parents: fe94e90
Author: Aleksey Yeschenko <alek...@apache.org>
Authored: Fri Apr 18 01:36:08 2014 +0300
Committer: Aleksey Yeschenko <alek...@apache.org>
Committed: Fri Apr 18 01:38:55 2014 +0300

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../apache/cassandra/db/BatchlogManager.java    | 102 ---
 .../apache/cassandra/db/ColumnFamilyStore.java  |   6 --
 .../cassandra/db/HintedHandOffManager.java      |  16 +--
 .../org/apache/cassandra/db/RowMutation.java    |   6 +-
 .../org/apache/cassandra/db/SystemTable.java    |  53 +++---
 .../db/commitlog/CommitLogReplayer.java         |   4 +-
 .../apache/cassandra/service/StorageProxy.java  |   9 +-
 .../cassandra/db/BatchlogManagerTest.java       |  78 --
 .../apache/cassandra/db/HintedHandOffTest.java  |   2 +-
 10 files changed, 189 insertions(+), 88 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/87097066/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 07c09cf..bb08a37 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -5,6 +5,7 @@
  * Schedule schema pulls on change (CASSANDRA-6971)
  * Non-droppable verbs shouldn't be dropped from OTC (CASSANDRA-6980)
  * Shutdown batchlog executor in SS#drain() (CASSANDRA-7025)
+ * Fix batchlog to account for CF truncation records (CASSANDRA-6999)
 
 
 1.2.16

http://git-wip-us.apache.org/repos/asf/cassandra/blob/87097066/src/java/org/apache/cassandra/db/BatchlogManager.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/BatchlogManager.java b/src/java/org/apache/cassandra/db/BatchlogManager.java
index b8dbadd..ea32e9d 100644
--- a/src/java/org/apache/cassandra/db/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/db/BatchlogManager.java
@@ -24,10 +24,7 @@ import java.lang.management.ManagementFactory;
 import java.net.InetAddress;
 import java.nio.ByteBuffer;
 import java.util.*;
-import java.util.concurrent.CopyOnWriteArraySet;
-import java.util.concurrent.ExecutionException;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
+import java.util.concurrent.*;
 import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.concurrent.atomic.AtomicLong;
 import javax.management.MBeanServer;
@@ -36,6 +33,7 @@ import javax.management.ObjectName;
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.Iterables;
 import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
 import com.google.common.util.concurrent.RateLimiter;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -254,45 +252,72 @@ public class BatchlogManager implements BatchlogManagerMBean
     {
         DataInputStream in = new DataInputStream(ByteBufferUtil.inputStream(data));
         int size = in.readInt();
+        List<RowMutation> mutations = new ArrayList<RowMutation>(size);
+
         for (int i = 0; i < size; i++)
-            replaySerializedMutation(RowMutation.serializer.deserialize(in, VERSION), writtenAt, rateLimiter);
+        {
+            RowMutation mutation = RowMutation.serializer.deserialize(in, VERSION);
+
+            // Remove CFs that have been truncated since. writtenAt and SystemTable#getTruncatedAt() both return millis.
+            // We don't abort the replay entirely b/c this can be considered a succes (truncated is same as delivered then
+            // truncated.
+            for (UUID cfId : mutation.getColumnFamilyIds())
+                if (writtenAt <= SystemTable.getTruncatedAt(cfId))
+                    mutation = mutation.without(cfId);
+
+            if (!mutation.isEmpty())
+                mutations.add(mutation);
+        }
+
+        if (!mutations.isEmpty())
+            replayMutations(mutations, writtenAt, rateLimiter);
     }

     /*
      * We try to deliver the mutations to the replicas ourselves if they are alive and only resort to writing hints
      * when a replica is down or a write request times out.
      */
-    private void replaySerializedMutation(RowMutation mutation, long writtenAt, RateLimiter rateLimiter) throws IOException
+    private void replayMutations(List<RowMutation> mutations, long writtenAt, RateLimiter rateLimiter) throws IOException
     {
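The filtering step this patch adds to batchlog replay (drop the parts of each mutation that target a table truncated at or after the batch was written, then replay only what is left) can be shown in isolation. The sketch below is a simplified stand-in, not the Cassandra classes: `BatchReplayFilter`, its nested `Mutation` type, and the map of truncation times are hypothetical, and a table missing from the map is assumed to have never been truncated.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

final class BatchReplayFilter {
    /** Hypothetical stand-in for a row mutation: one payload per table id. */
    static final class Mutation {
        final Map<UUID, String> updatesByTable;

        Mutation(Map<UUID, String> updatesByTable) {
            this.updatesByTable = new LinkedHashMap<>(updatesByTable);
        }

        /** Returns a copy of this mutation with the given table's updates removed. */
        Mutation without(UUID tableId) {
            Map<UUID, String> copy = new LinkedHashMap<>(updatesByTable);
            copy.remove(tableId);
            return new Mutation(copy);
        }

        boolean isEmpty() {
            return updatesByTable.isEmpty();
        }
    }

    /**
     * Keeps only the updates whose table was not truncated at or after the
     * time the batch was written. Both times are in milliseconds.
     */
    static List<Mutation> dropTruncated(List<Mutation> batch,
                                        long writtenAtMillis,
                                        Map<UUID, Long> truncatedAtMillis) {
        List<Mutation> kept = new ArrayList<>(batch.size());
        for (Mutation mutation : batch) {
            Mutation filtered = mutation;
            // Iterate over the original key set; 'filtered' is always a separate copy.
            for (UUID tableId : mutation.updatesByTable.keySet()) {
                Long truncatedAt = truncatedAtMillis.get(tableId);
                if (truncatedAt != null && writtenAtMillis <= truncatedAt)
                    filtered = filtered.without(tableId);
            }
            if (!filtered.isEmpty())
                kept.add(filtered);
        }
        return kept;
    }
}
```

As the code comment in the patch notes, the replay is not aborted when a table has been truncated: a write that was truncated away is treated the same as one that was delivered and then truncated, so only the affected portion is skipped.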
[5/5] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e1002881
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e1002881
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e1002881

Branch: refs/heads/trunk
Commit: e100288123b055021d9b8873ab86f0dbf5fc9f22
Parents: 4d06917 66af6fe
Author: Aleksey Yeschenko <alek...@apache.org>
Authored: Fri Apr 18 03:16:14 2014 +0300
Committer: Aleksey Yeschenko <alek...@apache.org>
Committed: Fri Apr 18 03:16:14 2014 +0300

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 NEWS.txt                                        |  11 +-
 .../apache/cassandra/db/BatchlogManager.java    | 102 +++
 .../apache/cassandra/db/ColumnFamilyStore.java  |   6 --
 .../cassandra/db/HintedHandOffManager.java      |  19 +---
 .../org/apache/cassandra/db/SystemKeyspace.java |  55 +++---
 .../db/commitlog/CommitLogReplayer.java         |  12 +--
 .../apache/cassandra/service/StorageProxy.java  |   9 +-
 .../cassandra/db/BatchlogManagerTest.java       |  84 +--
 .../apache/cassandra/db/HintedHandOffTest.java  |  19 ++--
 10 files changed, 214 insertions(+), 104 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e1002881/CHANGES.txt
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e1002881/NEWS.txt
----------------------------------------------------------------------
[1/5] git commit: Update versions for 2.0.7 release
Repository: cassandra
Updated Branches:
  refs/heads/trunk 4d0691759 -> e10028812

Update versions for 2.0.7 release

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7dbbe923
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7dbbe923
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7dbbe923

Branch: refs/heads/trunk
Commit: 7dbbe9233ce83c2a473ba2510c827a661de99400
Parents: 294c011
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Mon Apr 14 16:43:46 2014 +0200
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Mon Apr 14 16:43:46 2014 +0200

----------------------------------------------------------------------
 NEWS.txt         | 11 ++++++++++-
 build.xml        |  2 +-
 debian/changelog |  6 ++++++
 3 files changed, 17 insertions(+), 2 deletions(-)
----------------------------------------------------------------------
[4/5] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
    CHANGES.txt
    build.xml
    debian/changelog
    src/java/org/apache/cassandra/db/BatchlogManager.java
    src/java/org/apache/cassandra/db/ColumnFamilyStore.java
    src/java/org/apache/cassandra/db/HintedHandOffManager.java
    src/java/org/apache/cassandra/db/SystemKeyspace.java
    src/java/org/apache/cassandra/service/StorageProxy.java
    test/unit/org/apache/cassandra/db/BatchlogManagerTest.java

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/66af6fed
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/66af6fed
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/66af6fed

Branch: refs/heads/trunk
Commit: 66af6fedc02eed630028043f8a6f0d3014f193d5
Parents: de8a479 384de4b
Author: Aleksey Yeschenko <alek...@apache.org>
Authored: Fri Apr 18 03:14:47 2014 +0300
Committer: Aleksey Yeschenko <alek...@apache.org>
Committed: Fri Apr 18 03:14:47 2014 +0300

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 NEWS.txt                                        |  11 +-
 .../apache/cassandra/db/BatchlogManager.java    | 102 +++
 .../apache/cassandra/db/ColumnFamilyStore.java  |   6 --
 .../cassandra/db/HintedHandOffManager.java      |  19 +---
 .../org/apache/cassandra/db/SystemKeyspace.java |  55 +++---
 .../db/commitlog/CommitLogReplayer.java         |  12 +--
 .../apache/cassandra/service/StorageProxy.java  |   9 +-
 .../cassandra/db/BatchlogManagerTest.java       |  84 +--
 .../apache/cassandra/db/HintedHandOffTest.java  |  19 ++--
 10 files changed, 214 insertions(+), 104 deletions(-)
----------------------------------------------------------------------
[2/5] git commit: Fix batchlog to account for CF truncation records
Fix batchlog to account for CF truncation records

patch by Aleksey Yeschenko; reviewed by Jonathan Ellis for CASSANDRA-6999

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/87097066
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/87097066
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/87097066

Branch: refs/heads/trunk
Commit: 87097066e7c3c133e333804c4e4b00457b6c989d
Parents: fe94e90
Author: Aleksey Yeschenko <alek...@apache.org>
Authored: Fri Apr 18 01:36:08 2014 +0300
Committer: Aleksey Yeschenko <alek...@apache.org>
Committed: Fri Apr 18 01:38:55 2014 +0300

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../apache/cassandra/db/BatchlogManager.java    | 102 ---
 .../apache/cassandra/db/ColumnFamilyStore.java  |   6 --
 .../cassandra/db/HintedHandOffManager.java      |  16 +--
 .../org/apache/cassandra/db/RowMutation.java    |   6 +-
 .../org/apache/cassandra/db/SystemTable.java    |  53 +++---
 .../db/commitlog/CommitLogReplayer.java         |   4 +-
 .../apache/cassandra/service/StorageProxy.java  |   9 +-
 .../cassandra/db/BatchlogManagerTest.java       |  78 --
 .../apache/cassandra/db/HintedHandOffTest.java  |   2 +-
 10 files changed, 189 insertions(+), 88 deletions(-)
----------------------------------------------------------------------
[jira] [Updated] (CASSANDRA-7053) USING TIMESTAMP for batches does not work
[ https://issues.apache.org/jira/browse/CASSANDRA-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Stepura updated CASSANDRA-7053:
---------------------------------------

Attachment: cassandra-2.0-7053.patch

Attaching the patch which fixes the described scenario. The {{timestamp}} from {{Attributes}} wasn't used in {{executeWithoutConditions}}.

USING TIMESTAMP for batches does not work
-----------------------------------------

Key: CASSANDRA-7053
URL: https://issues.apache.org/jira/browse/CASSANDRA-7053
Project: Cassandra
Issue Type: Bug
Reporter: Robert Supencheck
Labels: cqlsh
Attachments: cassandra-2.0-7053.patch

When using the USING TIMESTAMP <timestamp> syntax for a batch statement, the supplied timestamp is ignored.
[jira] [Commented] (CASSANDRA-7053) USING TIMESTAMP for batches does not work
[ https://issues.apache.org/jira/browse/CASSANDRA-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973616#comment-13973616 ]

Mikhail Stepura commented on CASSANDRA-7053:
--------------------------------------------

There are a couple of places in {{executeWithConditions}} which use {{now}} instead of {{timestamp}}, though. Should those be fixed as well?
{code}
conditions = new CQL3CasConditions(statement.cfm, now);
...
UpdateParameters params = statement.makeUpdateParameters(Collections.singleton(key), clusteringPrefix, statementVariables, false, cl, now);
{code}

USING TIMESTAMP for batches does not work
-----------------------------------------
Key: CASSANDRA-7053
URL: https://issues.apache.org/jira/browse/CASSANDRA-7053
Project: Cassandra
Issue Type: Bug
Reporter: Robert Supencheck
Assignee: Mikhail Stepura
Labels: cqlsh
Attachments: cassandra-2.0-7053.patch
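The behaviour being fixed can be illustrated with a small sketch: when a batch carries a USING TIMESTAMP value, that value should win over the server-side `now`. The class and method below are hypothetical and only model the decision; they are not the Cassandra `Attributes` API, and the timestamp unit is left to the caller.

```java
// Hypothetical model of a batch's USING attributes; not the Cassandra Attributes class.
final class BatchAttributesSketch
{
    // null means no USING TIMESTAMP clause was supplied.
    private final Long userTimestamp;

    BatchAttributesSketch(Long userTimestamp)
    {
        this.userTimestamp = userTimestamp;
    }

    // Returns the user-supplied timestamp when present, otherwise the server-side default.
    long timestampOrDefault(long now)
    {
        return userTimestamp != null ? userTimestamp : now;
    }

    public static void main(String[] args)
    {
        long now = System.currentTimeMillis();
        System.out.println(new BatchAttributesSketch(42L).timestampOrDefault(now));  // user value wins
        System.out.println(new BatchAttributesSketch(null).timestampOrDefault(now)); // falls back to now
    }
}
```

The bug described in the issue corresponds to passing `now` directly where the result of such a lookup should have been used.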
[jira] [Updated] (CASSANDRA-7053) USING TIMESTAMP for batches does not work
[ https://issues.apache.org/jira/browse/CASSANDRA-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Stepura updated CASSANDRA-7053:
---------------------------------------
    Fix Version/s: 2.0.8
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973618#comment-13973618 ]

Pavel Yaskevich commented on CASSANDRA-6694:
--------------------------------------------

Done. I have force-pushed to my branch; AbstractNativeCell now handles size, and NativeAllocator has nothing to do with it.

Slightly More Off-Heap Memtables
--------------------------------
Key: CASSANDRA-6694
URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Labels: performance
Fix For: 2.1 beta2

The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead).

The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting.

The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory.
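The overhead arithmetic and the "alignment trick" from the issue description can be worked through in a short sketch. The byte counts come from the description; everything else is an assumption made for illustration: the class name, the choice of 8-byte alignment, and the idea of storing an offset relative to some base address rather than a raw pointer.

```java
// Back-of-the-envelope model of the per-cell overhead figures quoted in the issue description.
final class CellOverheadSketch
{
    static final int OBJECT_HEADER_BYTES = 8;      // on-heap object overhead
    static final int COMPRESSED_ADDRESS_BYTES = 4; // 4-byte handle to the off-heap cell
    static final int ALLOCATION_REF_BYTES = 4;     // reference kept in the internal allocation list

    // Assumption of this sketch: cells are 8-byte aligned, so the low 3 bits of an offset are always zero.
    static final int ALIGNMENT_SHIFT = 3;

    static int fixedOverheadBytes()
    {
        return OBJECT_HEADER_BYTES + COMPRESSED_ADDRESS_BYTES + ALLOCATION_REF_BYTES;
    }

    // Packs an aligned, non-negative offset into 32 bits by dropping the always-zero low bits.
    static int compress(long offset)
    {
        if ((offset & ((1L << ALIGNMENT_SHIFT) - 1)) != 0)
            throw new IllegalArgumentException("offset is not 8-byte aligned: " + offset);
        if (offset < 0 || (offset >>> ALIGNMENT_SHIFT) > 0xFFFFFFFFL)
            throw new IllegalArgumentException("offset out of range: " + offset);
        return (int) (offset >>> ALIGNMENT_SHIFT);
    }

    // Restores the full offset; the mask makes Java treat the int as unsigned.
    static long decompress(int compressed)
    {
        return (compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }

    public static void main(String[] args)
    {
        System.out.println("fixed overhead: " + fixedOverheadBytes() + " bytes");
        System.out.println("addressable bytes: " + ((1L << 32) << ALIGNMENT_SHIFT));
    }
}
```

Under these assumptions the fixed overhead is 8 + 4 + 4 = 16 bytes, and adding the 4-6 bytes of btree overhead gives the 20-22 bytes quoted. A 32-bit handle with 8-byte alignment covers 2^32 × 8 bytes = 32 GiB, which is why the description expects the trick to run out eventually.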
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973633#comment-13973633 ]

Benedict commented on CASSANDRA-6694:
-------------------------------------

Thanks. Although it looks like you haven't updated any of the offsets to work with the new layout?

As to the other changes you've made: I do not like the pollution of PoolAllocator with supportsNative(). Since this branch is supposed to be pushing idiomatic Java usage, let's stick to using interfaces for specialisation since we can.
[jira] [Comment Edited] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973633#comment-13973633 ]

Benedict edited comment on CASSANDRA-6694 at 4/18/14 12:36 AM:
---------------------------------------------------------------

Thanks. Although it looks like you haven't updated any of the offsets to work with the new layout?

As to the other changes you've made: I do not like the pollution of PoolAllocator with supportsNative() and allocateNative(). Since this branch is supposed to be pushing idiomatic Java usage, let's stick to using interfaces for specialisation since we can.

was (Author: benedict):
Thanks. Although it looks like you haven't updated any of the offsets to work with the new layout?

As to the other changes you've made: I do not like the pollution of PoolAllocator with supportsNative(). Since this branch is supposed to be pushing idiomatic Java usage, let's stick to using interfaces for specialisation since we can.
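The preference for "interfaces for specialisation" over a capability flag such as supportsNative() can be sketched as follows. All type names here are hypothetical and are not the allocator classes on either branch; a direct `ByteBuffer` stands in for native memory so the example stays self-contained.

```java
import java.nio.ByteBuffer;

// Base capability: every allocator can hand out on-heap buffers.
interface HeapCellAllocator
{
    ByteBuffer allocate(int size);
}

// Specialisation: only allocators that really support native memory implement this.
interface NativeCellAllocator extends HeapCellAllocator
{
    ByteBuffer allocateNative(int size);
}

final class OnHeapOnlyAllocator implements HeapCellAllocator
{
    public ByteBuffer allocate(int size) { return ByteBuffer.allocate(size); }
}

final class OffHeapAllocator implements NativeCellAllocator
{
    public ByteBuffer allocate(int size) { return ByteBuffer.allocate(size); }

    // A direct buffer stands in for native memory in this sketch.
    public ByteBuffer allocateNative(int size) { return ByteBuffer.allocateDirect(size); }
}

final class NativeCellWriterSketch
{
    // The parameter type states the requirement, so no supportsNative() check is needed at runtime.
    static ByteBuffer writeNativeCell(NativeCellAllocator allocator, long value)
    {
        ByteBuffer cell = allocator.allocateNative(Long.BYTES);
        cell.putLong(0, value);
        return cell;
    }
}
```

With this shape, passing an `OnHeapOnlyAllocator` to `writeNativeCell` is a compile-time error rather than a runtime branch, which is one reading of the design point made in the comment.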
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973645#comment-13973645 ]

Pavel Yaskevich commented on CASSANDRA-6694:
--------------------------------------------

Why? It does: internalPeer does + 4 and internalSize does - 4, while all get/set methods use internalPeer() + offset.

Regarding (and I was waiting for that) supportsNative() and allocateNative: I did that because I don't want to put time into adding the DataAllocator and DataPool interfaces that your code has just yet. Once it's decided which way we want to go, I will remove allocateNative and do the proper work there. This is still intended as just an idea presentation for how to handle Cell without Impl classes.
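The "+ 4 / - 4" arrangement described in the comment above can be sketched with a size-prefixed region. This is one possible reading, stated as an assumption: the region starts with a 4-byte size header whose value includes the header itself, `internalPeer()` skips the header, and `internalSize()` subtracts it. The class is hypothetical, and a direct `ByteBuffer` with integer offsets stands in for raw native addresses.

```java
import java.nio.ByteBuffer;

// Hypothetical size-prefixed region: [4-byte total size][payload...].
final class SizePrefixedRegionSketch
{
    private static final int SIZE_HEADER_BYTES = 4;

    private final ByteBuffer memory; // stand-in for raw native memory
    private final int peer;          // offset of the region start, i.e. of the size header

    SizePrefixedRegionSketch(ByteBuffer memory, int peer, int payloadSize)
    {
        this.memory = memory;
        this.peer = peer;
        // Assumption: the stored size counts the header as well as the payload.
        memory.putInt(peer, payloadSize + SIZE_HEADER_BYTES);
    }

    // Start of the payload: the raw peer plus the 4-byte header.
    int internalPeer() { return peer + SIZE_HEADER_BYTES; }

    // Payload size: the stored size minus the 4-byte header.
    int internalSize() { return memory.getInt(peer) - SIZE_HEADER_BYTES; }

    // Accessors address the payload relative to internalPeer(), so field offsets need no adjustment.
    void setLong(int offset, long value) { memory.putLong(internalPeer() + offset, value); }

    long getLong(int offset) { return memory.getLong(internalPeer() + offset); }
}
```

Because every accessor goes through `internalPeer()`, existing field offsets keep working after the header is introduced, at the cost of one extra addition per access; folding the 4 bytes into static offset constants would remove that addition.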
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973647#comment-13973647 ]

Benedict commented on CASSANDRA-6694:
-------------------------------------

bq. This is still intended as just an idea presentation for how to handle Cell without Impl classes.

OK, cool. Glad we're staying on topic :)

bq. Why? It does: internalPeer does + 4 and internalSize does - 4

My mistake. I was expecting to see the static OFFSET fields updated. We should probably optimise that before we finish up (now that we can), but it is obviously fine for now.
[jira] [Updated] (CASSANDRA-7047) TriggerExecutor should group mutations by row key
[ https://issues.apache.org/jira/browse/CASSANDRA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko updated CASSANDRA-7047:
-----------------------------------------
    Attachment: 7047-v2.txt

TriggerExecutor should group mutations by row key
-------------------------------------------------
Key: CASSANDRA-7047
URL: https://issues.apache.org/jira/browse/CASSANDRA-7047
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Sergio Bossa
Assignee: Sergio Bossa
Attachments: 7047-v2.txt, CASSANDRA-7047.patch

TriggerExecutor doesn't currently group mutations returned by triggers even if belonging to the same row key: while harmful per se (at least, I think so), this is definitely a performance problem, because each mutation is a *cluster* mutation, generating more network traffic, more disk IO and more index calls (if present).
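The grouping the issue asks for can be sketched as a merge keyed on row key. This is a simplified model, not the attached patch: `KeyedMutation` is a hypothetical stand-in for a mutation, updates are plain strings, and the sketch ignores the keyspace, which a real grouping would presumably also need to take into account.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a mutation: a row key plus the updates it carries.
final class KeyedMutation
{
    final String rowKey;
    final List<String> updates;

    KeyedMutation(String rowKey, List<String> updates)
    {
        this.rowKey = rowKey;
        this.updates = new ArrayList<>(updates); // defensive copy, so merging never touches the input
    }
}

final class TriggerGroupingSketch
{
    // Merges mutations that share a row key, preserving the order in which keys were first seen.
    static List<KeyedMutation> groupByRowKey(List<KeyedMutation> mutations)
    {
        Map<String, KeyedMutation> byKey = new LinkedHashMap<>();
        for (KeyedMutation mutation : mutations)
        {
            KeyedMutation merged = byKey.get(mutation.rowKey);
            if (merged == null)
                byKey.put(mutation.rowKey, new KeyedMutation(mutation.rowKey, mutation.updates));
            else
                merged.updates.addAll(mutation.updates);
        }
        return new ArrayList<>(byKey.values());
    }
}
```

Three trigger-generated mutations for keys `k1`, `k2`, `k1` collapse to two, so only two cluster mutations would be sent instead of three.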
[jira] [Commented] (CASSANDRA-7047) TriggerExecutor should group mutations by row key
[ https://issues.apache.org/jira/browse/CASSANDRA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973716#comment-13973716 ]

Aleksey Yeschenko commented on CASSANDRA-7047:
----------------------------------------------

Oh, well. Cleaned up TriggerExecutorTest, and only then realized that something is wrong (by TriggerExecutorTest diff being all green and lacking the license header).

We already have TriggersTest.java. Can you move the tests there? (and rewrite them to match the style of the tests there, too?) Thanks.
[jira] [Updated] (CASSANDRA-7053) USING TIMESTAMP for batches does not work
[ https://issues.apache.org/jira/browse/CASSANDRA-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Stepura updated CASSANDRA-7053:
---------------------------------------
    Reviewer: Aleksey Yeschenko

[~iamaleksey] could you please review?
[jira] [Commented] (CASSANDRA-7047) TriggerExecutor should group mutations by row key
[ https://issues.apache.org/jira/browse/CASSANDRA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973790#comment-13973790 ]

Aleksey Yeschenko commented on CASSANDRA-7047:
----------------------------------------------

Unless of course this has been an intentional decision.
[jira] [Commented] (CASSANDRA-7047) TriggerExecutor should group mutations by row key
[ https://issues.apache.org/jira/browse/CASSANDRA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973806#comment-13973806 ]

Sergio Bossa commented on CASSANDRA-7047:
-----------------------------------------

That was definitely intentional.

This patch is not meant to have any functional impact, so the TriggersTest results will be the same with or without it, as it only tests the end result of mutations being stored or not. TriggerExecutorTest is instead a true unit test, as it tests the mutations produced by the two execute() methods (hence the different style).

I can merge the two, but I think that would be the wrong thing to do.