[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787971#comment-13787971 ]

Sylvain Lebresne commented on CASSANDRA-6107:
---------------------------------------------

CASSANDRA-5981 is indeed about limiting the size at the protocol level. However, that is a global frame limitation. In particular, it is the hard limit for queries together with their values, and for that reason the current hard-coded limit is relatively high (256MB). We can bikeshed on the exact default to use, and CASSANDRA-5981 will probably allow the user to play with that limit, but in any case it will definitely have to be higher than 1MB. The other detail is that the limit in CASSANDRA-5981 is on the bytes sent, not the in-memory size of the query, but that probably doesn't matter much.

Anyway, given that a prepared statement doesn't include values, it wouldn't be absurd to have a specific, lower limit on its size. Though my own preference would be to just leave it to a global limit on the preparedStatements cache map (though it could make sense to reject statements that blow the entire limit on their own, so as to make sure we respect it). Too many hard-coded limitations make me nervous.

CQL3 Batch statement memory leak
--------------------------------

                 Key: CASSANDRA-6107
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6107
             Project: Cassandra
          Issue Type: Bug
          Components: API, Core
         Environment: - Cassandra version: 1.2.8 or 2.0.1, same issue seen in both
                      - Running on OS X on a MacBook Pro
                      - Sun JVM 1.7
                      - Single local Cassandra node
                      - Both CMS and G1 GC used
                      - We are using the Cassandra JDBC driver to submit our batches
            Reporter: Constance Eustace
            Assignee: Lyuben Todorov
            Priority: Minor
             Fix For: 1.2.11
         Attachments: 6107.patch, 6107_v2.patch, 6107_v3.patch, 6107-v4.txt, Screen Shot 2013-10-03 at 17.59.37.png

We are doing large-volume insert/update tests on Cassandra via CQL3.
Using a 4GB heap, after roughly 750,000 updates creating/updating 75,000 row keys, we run out of heap, it never dissipates, and we begin getting the infamous error that many people seem to be encountering:

{noformat}
WARN [ScheduledTasks:1] 2013-09-26 16:17:10,752 GCInspector.java (line 142) Heap is 0.9383457210434385 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2013-09-26 16:17:10,753 StorageService.java (line 3614) Unable to reduce heap usage since there are no dirty column families
{noformat}

8 and 12 GB heaps appear to delay the problem by roughly proportionate amounts: 75,000 - 100,000 row keys per 4GB. Each run of 50,000 row key creations sees the heap grow and never shrink again. We have attempted, to no effect:

- removing all secondary indexes, to see if that alleviates overuse of bloom filters
- adjusting parameters for compaction throughput
- adjusting memtable flush thresholds and other parameters

By examining heap dumps, it seems apparent that the problem is perpetual retention of CQL3 BATCH statements. We have even tried dropping the keyspaces after the updates, and the CQL3 statements are still visible in the heap dump after many, many CMS GC runs. G1 also showed this issue.

The 750,000 statements are broken into batches of roughly 200 statements.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
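Sylvain's preferred approach above, a global size limit on the prepared statement cache plus rejecting single statements that would blow the whole limit on their own, might be sketched roughly as follows. This is an illustrative stand-in, not Cassandra's actual cache: the class name and the per-statement size estimate are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: an LRU cache for prepared statements bounded by total
 *  estimated size rather than by entry count. Not Cassandra's real code. */
public class BoundedStatementCache<K> {
    private final long capacityBytes;
    private long currentBytes = 0;
    // access-order LinkedHashMap gives us LRU iteration order for eviction
    private final LinkedHashMap<K, String> cache = new LinkedHashMap<>(16, 0.75f, true);

    public BoundedStatementCache(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Returns false for a statement that would blow the entire limit on its own. */
    public synchronized boolean put(K id, String statement) {
        long size = statement.length() * 2L; // very rough in-memory estimate
        if (size > capacityBytes)
            return false;
        String old = cache.remove(id);
        if (old != null)
            currentBytes -= old.length() * 2L;
        // evict least-recently-used entries until the new statement fits
        while (currentBytes + size > capacityBytes && !cache.isEmpty()) {
            Map.Entry<K, String> eldest = cache.entrySet().iterator().next();
            cache.remove(eldest.getKey());
            currentBytes -= eldest.getValue().length() * 2L;
        }
        cache.put(id, statement);
        currentBytes += size;
        return true;
    }

    public synchronized String get(K id) { return cache.get(id); }
    public synchronized int size() { return cache.size(); }
}
```

The point of the size-based bound (as opposed to a count-based one) is exactly the scenario in this ticket: a few hundred very large BATCH statements can dominate the heap even when the entry count looks harmless.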
[jira] [Commented] (CASSANDRA-6055) 'Bad Request: Invalid null value for partition key part' on SELECT .. WHERE key IN (val,NULL)
[ https://issues.apache.org/jira/browse/CASSANDRA-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787994#comment-13787994 ]

Roman Skvazh commented on CASSANDRA-6055:
-----------------------------------------

I agree with Sylvain Lebresne. CQL is not SQL, and there is no need for this hack (ignoring NULL values in queries).

'Bad Request: Invalid null value for partition key part' on SELECT .. WHERE key IN (val,NULL)
---------------------------------------------------------------------------------------------

                 Key: CASSANDRA-6055
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6055
             Project: Cassandra
          Issue Type: Bug
         Environment: cqlsh, pdo_cassandra
            Reporter: Sergey Nagaytsev
            Priority: Minor
              Labels: cql3

Query:
{code}
SELECT * FROM user WHERE key IN(uuid,NULL);
{code}

Table:
{code}
CREATE COLUMNFAMILY user (
  KEY uuid PRIMARY KEY,
  name text, note text, avatar text, email text,
  phone text, login text, pw text, st text
);
{code}

Logs: nothing; last message was hours ago.

This query is valid in SQL and so is generated by a DB abstraction library. Fixing it on the application side multiplies the work.
[jira] [Commented] (CASSANDRA-4988) Fix concurrent addition of collection columns
[ https://issues.apache.org/jira/browse/CASSANDRA-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788002#comment-13788002 ]

Sylvain Lebresne commented on CASSANDRA-4988:
---------------------------------------------

Each separate collection already has its own entry in schema_columns, and we have all the information there, so I don't think we need a new table here. The information is already redundant. It's just that because the comparator object needs to know the collections (to implement AbstractType.compareCollectionMembers correctly), we currently include the collection names in the comparator's serialized form, and we should stop doing that; we already have all the information we need.

In fact, in 2.0 the 'comparator' field in schema_columnfamilies is entirely useless: all the information it contains can be reconstructed from schema_columns. So probably the right solution is to stop saving that field altogether and to reconstruct it from schema_columns instead. Which would also solve the other problem I've discussed above, the concurrent modification of comparator components. Of course, we'd need to be careful with backward compatibility if we do so.

Fix concurrent addition of collection columns
---------------------------------------------

                 Key: CASSANDRA-4988
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4988
             Project: Cassandra
          Issue Type: Bug
            Reporter: Sylvain Lebresne
            Assignee: Sylvain Lebresne
             Fix For: 2.0.2

It is currently not safe to update the schema by adding multiple collection columns to the same table. The reason is that with collections, the comparator embeds a map of name -> comparator for each collection column (since different maps can have different key types, for example). And when serialized on disk in the schema table, the comparator, including that map, is serialized as a string in one column. So if new collection columns are added concurrently, the additions may not be merged correctly.
One option to fix this would be to stop serializing the name -> comparator map of ColumnToCollectionType in toString(), and do one of:

# Reconstruct that map from the information stored in schema_columns. The downside I can see is that, code-wise, this may not be super clean to do.
# Change ColumnToCollectionType so that instead of having its own name -> comparator map, it just stores a pointer to the CFMetaData that contains it, and when it needs to find the exact comparator for a collection column, it uses CFMetaData.column_metadata directly. The downside is that creating a dependency from a comparator to a CFMetaData feels a bit backward.

Not sure what's the best solution of the two, honestly.

While probably more anecdotal, we also now allow changing the type of the comparator in some cases (for example, updating to BytesType is always allowed), and doing so concurrently on multiple components of a composite comparator is also not safe, for a similar reason. I'm not sure how to fix that one.
[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788015#comment-13788015 ]

Lyuben Todorov commented on CASSANDRA-4809:
-------------------------------------------

The patch works as intended. To specify a particular ks/cf to be restored, we need to add parameters when starting Cassandra.

{code}
./cassandra -Dcassandra.readKeyspaceCommitlog=exampleKS -Dcassandra.readColumnFamilyCommitlog=exampleCF
{code}

Maybe it would be a better option to add restore_keyspace and restore_columnfamily config options to commitlog_archiving.properties. If the original setup is preferable, I'll rebase the patch (it doesn't apply cleanly because of other changes on the branch).

Allow restoring specific column families from archived commitlog
----------------------------------------------------------------

                 Key: CASSANDRA-4809
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
             Project: Cassandra
          Issue Type: Improvement
    Affects Versions: 1.2.0
            Reporter: Nick Bailey
            Assignee: Lyuben Todorov
              Labels: lhf
             Fix For: 2.0.2
         Attachments: 4809.patch

Currently you can only restore the entire contents of a commit log archive. It would be useful to specify the keyspaces/column families you want to restore from an archived commitlog.
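The -D flags shown in the comment could drive replay filtering along the following lines. This is a simplified sketch, not the attached patch: the flag names come from the comment above, but the class and method are hypothetical, and (as Lyuben notes in a later edit) the filter should fall back to replaying everything when neither flag is supplied.

```java
/** Sketch of commitlog replay filtering driven by the -D system properties
 *  discussed above. Hypothetical helper, not the code in 4809.patch. */
public class ReplayFilter {
    private final String keyspace =
        System.getProperty("cassandra.readKeyspaceCommitlog");
    private final String columnFamily =
        System.getProperty("cassandra.readColumnFamilyCommitlog");

    /** Replay everything when no filter flags are supplied; otherwise
     *  replay only mutations matching the requested ks/cf. */
    public boolean shouldReplay(String ks, String cf) {
        if (keyspace == null && columnFamily == null)
            return true;
        return (keyspace == null || keyspace.equals(ks))
            && (columnFamily == null || columnFamily.equals(cf));
    }
}
```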
[jira] [Comment Edited] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788015#comment-13788015 ]

Lyuben Todorov edited comment on CASSANDRA-4809 at 10/7/13 9:54 AM:
--------------------------------------------------------------------

The patch works as intended. To specify a particular ks/cf to be restored, we need to add parameters when starting Cassandra.

{code}
./cassandra -Dcassandra.readKeyspaceCommitlog=exampleKS -Dcassandra.readColumnFamilyCommitlog=exampleCF
{code}

Maybe it would be a better option to add restore_keyspace and restore_columnfamily config options to commitlog_archiving.properties. If the original setup is preferable, I'll rebase the patch (it doesn't apply cleanly because of other changes on the branch).

The last thing I want to change is: the patch assumes that users will want to replay only part of the commitlog; we should still allow them to replay the entire log if the ks/cf parameters aren't supplied.
[jira] [Comment Edited] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788015#comment-13788015 ]

Lyuben Todorov edited comment on CASSANDRA-4809 at 10/7/13 9:58 AM:
--------------------------------------------------------------------

The patch works as intended. To specify a particular ks/cf to be restored, we need to add parameters when starting Cassandra.

{code}
./cassandra -Dcassandra.readKeyspaceCommitlog=exampleKS -Dcassandra.readColumnFamilyCommitlog=exampleCF
{code}

Maybe it would be a better option to add restore_keyspace and restore_columnfamily config options to commitlog_archiving.properties. If the original setup is preferable, I'll rebase the patch (it doesn't apply cleanly because of other changes on the branch).
[jira] [Assigned] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Brown reassigned CASSANDRA-4718:
--------------------------------------

    Assignee: Jason Brown

More-efficient ExecutorService for improved throughput
------------------------------------------------------

                 Key: CASSANDRA-4718
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Jonathan Ellis
            Assignee: Jason Brown
            Priority: Minor
              Labels: performance
         Attachments: baq vs trunk.png, op costs of various queues.ods, PerThreadQueue.java

Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue).

One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.)

BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the JDK supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.)

AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name, these are not actual ExecutorServices, although they share the most important properties of one.)
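The bulk-dequeue idea described above can be sketched with a consumer loop built around BlockingQueue.drainTo. This is an illustration of the technique only, not the eventual Cassandra implementation: block for the first task, then grab whatever else is already queued in a single call, so many tasks are dequeued per lock acquisition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Sketch of a bulk-dequeuing consumer using BlockingQueue.drainTo,
 *  illustrating the optimization proposed in this ticket. */
public class DrainingConsumer {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    public void submit(Runnable task) {
        queue.add(task);
    }

    /** One consumer iteration: take the first task (blocking), drain up to
     *  63 more in one call, then run the whole batch. Returns batch size. */
    public int processBatch() throws InterruptedException {
        List<Runnable> batch = new ArrayList<>();
        batch.add(queue.take());   // blocks until at least one task exists
        queue.drainTo(batch, 63);  // bulk grab: one lock pass, many tasks
        for (Runnable task : batch)
            task.run();
        return batch.size();
    }
}
```

Compared with one take() per task, the drain amortizes queue contention across the batch, which is exactly the producer/consumer contention the ticket describes.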
[jira] [Commented] (CASSANDRA-4988) Fix concurrent addition of collection columns
[ https://issues.apache.org/jira/browse/CASSANDRA-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788144#comment-13788144 ]

Aleksey Yeschenko commented on CASSANDRA-4988:
----------------------------------------------

IMO, this is what we should do (get rid of the comparator column in schema_columnfamilies). And get rid of the rest of the redundancies:
- the column_aliases, key_aliases, and value_alias columns - CASSANDRA-4603

And after that, implement CASSANDRA-6038 in 3.0, with all the garbage cleaned up.
[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788157#comment-13788157 ]

Mike Bulman commented on CASSANDRA-4809:
----------------------------------------

FWIW, it would be great if multiple keyspaces and/or column families could be specified. Requiring a restart of C* for each CF is tedious.

If I wanted to restore multiple column families with the current patch, is there a way to know that the restore is done? (E.g., once Thrift is available?)
[jira] [Comment Edited] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788188#comment-13788188 ]

Lyuben Todorov edited comment on CASSANDRA-4809 at 10/7/13 2:35 PM:
--------------------------------------------------------------------

bq. is there a way to know that the restore is done?

Once the replays complete you should get a message: *INFO 17:31:09,085 Log replay complete, X replayed mutations*, and this does occur before Thrift starts.

As for multiple kss/cfs, we could do what we did in [4191|https://issues.apache.org/jira/browse/CASSANDRA-4191], where a list like ks1.cf1 ks2.cf1 is supplied.
[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788188#comment-13788188 ]

Lyuben Todorov commented on CASSANDRA-4809:
-------------------------------------------

bq. is there a way to know that the restore is done?

Once the replays complete you should get a message: *INFO 17:31:09,085 Log replay complete, 0 replayed mutations*

As for multiple kss/cfs, we could do what we did in [4191|https://issues.apache.org/jira/browse/CASSANDRA-4191], where a list like ks1.cf1 ks2.cf1 is supplied.
[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788199#comment-13788199 ]

Jonathan Ellis commented on CASSANDRA-4809:
-------------------------------------------

The -D solution is a bit clunky, but putting restore properties in the archive config file feels backwards too. And -D is maybe a bit less likely to be left enabled for multiple restarts by mistake, so let's go with that. The list approach is fine.
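The "ks1.cf1 ks2.cf1"-style list borrowed from the CASSANDRA-4191 discussion could be parsed along these lines. This is a sketch only: the class and method names are hypothetical, and the committed patch may differ in format details.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch of parsing a whitespace-separated "ks.cf" filter list for
 *  commitlog restore. Hypothetical helper, not the committed patch. */
public class CommitlogFilter {
    private final Map<String, Set<String>> allowed = new HashMap<>();

    public CommitlogFilter(String spec) {
        if (spec == null || spec.trim().isEmpty())
            return; // empty filter: replay everything
        for (String entry : spec.trim().split("\\s+")) {
            String[] parts = entry.split("\\.", 2);
            // a bare keyspace (no ".cf" part) allows all of its CFs
            allowed.computeIfAbsent(parts[0], k -> new HashSet<>())
                   .add(parts.length > 1 ? parts[1] : "*");
        }
    }

    public boolean shouldReplay(String ks, String cf) {
        if (allowed.isEmpty())
            return true;
        Set<String> cfs = allowed.get(ks);
        return cfs != null && (cfs.contains("*") || cfs.contains(cf));
    }
}
```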
[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788210#comment-13788210 ]

Mike Bulman commented on CASSANDRA-4809:
----------------------------------------

bq. and this does occur before thrift starts.

Cool. Waiting for Thrift to be available works for me.

bq. List approach is fine.

+1
[jira] [Commented] (CASSANDRA-6128) Add more data mappings for Pig
[ https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788217#comment-13788217 ]

Brandon Williams commented on CASSANDRA-6128:
---------------------------------------------

Shouldn't DecimalType map to a float or double instead of a string?

Add more data mappings for Pig
------------------------------

                 Key: CASSANDRA-6128
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128
             Project: Cassandra
          Issue Type: Bug
            Reporter: Alex Liu
            Assignee: Alex Liu
         Attachments: 6128-1.2-branch.txt

We need to add more data mappings for:
{code}
DecimalType
InetAddressType
LexicalUUIDType
TimeUUIDType
UUIDType
{code}
The existing implementation throws an exception for those data types.
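Brandon's suggestion would amount to a mapping table along the following lines. This is purely illustrative: the actual patch may choose differently, and the Pig target types for the UUID and Inet cases are assumptions on my part, not something decided in this ticket.

```java
/** Sketch of a Cassandra-type -> Pig-type mapping, reflecting the
 *  suggestion above that DecimalType map to a floating-point type.
 *  Assumed mapping, not the committed 6128 patch. */
public class PigTypeMapping {
    public static String pigTypeFor(String cassandraType) {
        switch (cassandraType) {
            case "DecimalType":     return "double";    // per the comment above
            case "InetAddressType": return "chararray"; // assumed: no address type in Pig
            case "LexicalUUIDType":
            case "TimeUUIDType":
            case "UUIDType":        return "chararray"; // assumed: UUIDs as strings
            default:                return "bytearray"; // Pig's catch-all type
        }
    }
}
```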
[jira] [Resolved] (CASSANDRA-5905) Cassandra crashes on restart with IndexOutOfBoundsException: index (2) must be less than size (2)
[ https://issues.apache.org/jira/browse/CASSANDRA-5905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-5905.
---------------------------------------

       Resolution: Duplicate
    Fix Version/s: (was: 1.2.11)

As near as I can tell, this is another manifestation of the same problem as CASSANDRA-5202 -- since we reuse CFIDs for different incarnations of CFs with the same name, commitlog will try to apply mutations from an earlier definition to whatever the current one is.

Cassandra crashes on restart with IndexOutOfBoundsException: index (2) must be less than size (2)
-------------------------------------------------------------------------------------------------

                 Key: CASSANDRA-5905
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5905
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: Ubuntu 12.04 x64, 3 nodes, 1GB each
            Reporter: David Semeria

All 3 nodes crash on restart with the same error:

{noformat}
INFO 13:31:05,272 Finished reading /var/lib/cassandra/commitlog/CommitLog-2-1376754824649.log
INFO 13:31:05,272 Replaying /var/lib/cassandra/commitlog/CommitLog-2-1376754824650.log
INFO 13:31:08,267 Finished reading /var/lib/cassandra/commitlog/CommitLog-2-1376754824650.log
java.lang.IndexOutOfBoundsException: index (2) must be less than size (2)
	at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305)
	at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284)
	at com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:81)
	at org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:94)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:76)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
	at java.util.TreeMap.compare(TreeMap.java:1188)
	at java.util.TreeMap.put(TreeMap.java:531)
	at org.apache.cassandra.db.TreeMapBackedSortedColumns.addColumn(TreeMapBackedSortedColumns.java:102)
	at org.apache.cassandra.db.TreeMapBackedSortedColumns.addColumn(TreeMapBackedSortedColumns.java:88)
	at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:114)
	at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:109)
	at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:101)
	at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:376)
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:203)
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:281)
	at org.apache.cassandra.service.CassandraDaemon.init(CassandraDaemon.java:358)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:212)
Cannot load daemon
Service exit with a return value of 3
{noformat}
[jira] [Commented] (CASSANDRA-5815) NPE from migration manager
[ https://issues.apache.org/jira/browse/CASSANDRA-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788228#comment-13788228 ]

Chris Burroughs commented on CASSANDRA-5815:
--------------------------------------------

I'm seeing an NPE in the migration manager in 1.2.9, in what I think is the same spot (line numbers have changed slightly since July). This occurs on at least one node every time (about 10 attempts) I try to bootstrap into a 2-DC production cluster using the GPFS with reconnecting.

{noformat}
ERROR [OptionalTasks:1] 2013-10-07 08:06:05,658 CassandraDaemon.java (line 194) Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.NullPointerException
	at org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:130)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
{noformat}

I added a log message to confirm that Gossiper really, really thinks it's not there (off of the 1.2.10 tag, if that matters). I'm suspicious of this being a timing problem in the reconnect dance, but I'm not sure how to prove or disprove that.

{noformat}
logger.warn("[csb] Trying to get endpoint state for {} ; exists {}", new Object[] {endpoint, Gossiper.instance.isKnownEndpoint(endpoint)});

INFO [GossipTasks:1] 2013-10-07 11:19:10,565 Gossiper.java (line 803) InetAddress /208.49.103.36 is now DOWN
INFO [GossipTasks:1] 2013-10-07 11:19:13,572 Gossiper.java (line 608) FatClient /208.49.103.36 has been silent for 3ms, removing from gossip
INFO [HANDSHAKE-/208.49.103.36] 2013-10-07 11:19:13,863 OutboundTcpConnection.java (line 399) Handshaking version with /208.49.103.36
INFO [HANDSHAKE-/208.49.103.36] 2013-10-07 11:19:15,275 OutboundTcpConnection.java (line 399) Handshaking version with /208.49.103.36
WARN [OptionalTasks:1] 2013-10-07 11:19:36,696 MigrationManager.java (line 130) [csb] Trying to get endpoint state for /208.49.103.36 ; exists false
ERROR [OptionalTasks:1] 2013-10-07 11:19:36,696 CassandraDaemon.java (line 193) Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.NullPointerException
	at org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:131)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
{noformat}

NPE from migration manager
--------------------------

                 Key: CASSANDRA-5815
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5815
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.1.12
            Reporter: Vishy Kasar
            Assignee: Brandon Williams
            Priority: Minor

In one of our production clusters we see this error often. Looking through the source, Gossiper.instance.getEndpointStateForEndpoint(endpoint) is returning null for some endpoint. Do we need any config change on our end to resolve this? In any case, Cassandra should be updated to protect against this NPE.

{noformat}
ERROR [OptionalTasks:1] 2013-07-24 13:40:38,972 AbstractCassandraDaemon.java (line 132) Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.NullPointerException
	at org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:134)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
	at
{noformat}
[jira] [Assigned] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
[ https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-6151: --- Assignee: Alex Liu CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated Key: CASSANDRA-6151 URL: https://issues.apache.org/jira/browse/CASSANDRA-6151 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Russell Alexander Spitzer Assignee: Alex Liu Priority: Minor From http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig/19211546#19211546 The user was attempting to load a single partition using a where clause in a pig load statement. CQL Table {code} CREATE table data ( occurday text, seqnumber int, occurtimems bigint, unique bigint, fields map<text, text>, primary key ((occurday, seqnumber), occurtimems, unique) ) {code} Pig Load statement Query {code} data = LOAD 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27' USING CqlStorage(); {code} This results in an exception when processed by the CqlPagingRecordReader, which attempts to page this query even though it contains at most one partition key. This leads to an invalid CQL statement. CqlPagingRecordReader Query {code} SELECT * FROM data WHERE token(occurday,seqnumber) > ? AND token(occurday,seqnumber) <= ? AND occurday='A Great Day' AND seqnumber=1 LIMIT 1000 ALLOW FILTERING {code} Exception {code} InvalidRequestException(why:occurday cannot be restricted by more than one relation if it includes an Equal) {code} I'm not sure it is worth the special case, but a modification to not use the paging record reader when the entire partition key is specified would solve this issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
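Editor's note: the where_clause in the Pig LOAD location must be URL-encoded. A minimal sketch of producing the encoded clause above (the helper class and method are illustrative, not part of CqlStorage):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative helper (not part of CqlStorage): URL-encodes the
// where_clause query parameter used in the Pig LOAD location.
class WhereClauseEncoder {
    static String encode(String whereClause) {
        // URLEncoder emits '+' for spaces; the load URL above uses %20-style escapes.
        return URLEncoder.encode(whereClause, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println("cql://ks/data?where_clause="
                + encode("seqnumber=10 AND occurday='2013-10-01'"));
    }
}
```

Encoding "seqnumber=10 AND occurday='2013-10-01'" this way yields the seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27 string seen in the load statement.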
[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
[ https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788231#comment-13788231 ] Brandon Williams commented on CASSANDRA-6151: - To be fair, I'm not sure why you'd use M/R for a single partition, but I'll let Alex decide what to do based on difficulty here. CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated Key: CASSANDRA-6151 URL: https://issues.apache.org/jira/browse/CASSANDRA-6151 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Russell Alexander Spitzer Assignee: Alex Liu Priority: Minor -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6137) CQL3 SELECT IN CLAUSE inconsistent
[ https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788238#comment-13788238 ] Constance Eustace commented on CASSANDRA-6137: -- The error is reappearing. Nodetool repairs/compactions/etc triggering didn't do anything. Perhaps this is yet another caching error, this time in the key caches. We're going to try some restarts of the node or schemes to clean the caches to see if this is an actual internal data representation problem. CQL3 SELECT IN CLAUSE inconsistent -- Key: CASSANDRA-6137 URL: https://issues.apache.org/jira/browse/CASSANDRA-6137 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu AWS Cassandra 2.0.1 SINGLE NODE Reporter: Constance Eustace Fix For: 2.0.1 We are encountering inconsistent results from CQL3 queries with column keys using IN clause in WHERE. This has been reproduced in cqlsh. Rowkey is e_entid Column key is p_prop This returns roughly 21 rows for 21 column keys that match p_prop. 
cqlsh SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'; These three queries each return one row for the requested single column key in the IN clause: SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count'); SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count'); SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:fail:count'); This query returns ONLY ONE ROW (one column key), not three as I would expect from the three-column-key IN clause: cqlsh SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count'); This query does return two rows however for the requested two column keys: cqlsh SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ( 
'urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count'); cqlsh describe table internal_submission.entity_job; CREATE TABLE entity_job ( e_entid text, p_prop text, describes text, dndcondition text, e_entlinks text, e_entname text, e_enttype text, ingeststatus text, ingeststatusdetail text, p_flags text, p_propid text, p_proplinks text, p_storage text, p_subents text, p_val text, p_vallang text, p_vallinks text, p_valtype text, p_valunit text, p_vars text, partnerid text, referenceid text, size int, sourceip text, submitdate bigint, submitevent text, userid text, version text, PRIMARY KEY (e_entid, p_prop) ) WITH bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.00 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.10 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='NONE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes); CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job (dndcondition); CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job (ingeststatus); CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job (ingeststatusdetail); CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job (referenceid); CREATE INDEX
[jira] [Commented] (CASSANDRA-5815) NPE from migration manager
[ https://issues.apache.org/jira/browse/CASSANDRA-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788237#comment-13788237 ] Brandon Williams commented on CASSANDRA-5815: - It looks the same to me. The good news is the error is purely cosmetic at this point, there's nothing left to do if the gossiper has removed the node (not to mention it's a fat client) NPE from migration manager -- Key: CASSANDRA-5815 URL: https://issues.apache.org/jira/browse/CASSANDRA-5815 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.12 Reporter: Vishy Kasar Assignee: Brandon Williams Priority: Minor It turned out that the reason for NPE was we bootstrapped a node with the same token as another node.
Cassandra should not throw an NPE here but log a meaningful error message. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788243#comment-13788243 ] darion yaphets commented on CASSANDRA-4718: --- LMAX Disruptor's RingBuffer may be a good idea for a lock-free component. But it may need a bigger size to hold the structures in the ring buffer, to avoid them being overwritten by new ones, and that means using more memory ... More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Jason Brown Priority: Minor Labels: performance Attachments: baq vs trunk.png, op costs of various queues.ods, PerThreadQueue.java Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.1#6144)
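Editor's note: a minimal sketch of the drainTo-based consumer idea described in this ticket. The class and its details are illustrative, not Cassandra's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch of a single-consumer executor that drains many tasks per dequeue
// via BlockingQueue.drainTo, instead of contending on the queue once per task.
class BatchDrainExecutor {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker;
    private volatile boolean running = true;

    BatchDrainExecutor(int batchSize) {
        worker = new Thread(() -> {
            List<Runnable> batch = new ArrayList<>(batchSize);
            while (running || !queue.isEmpty()) {
                try {
                    // Wait briefly for the first task, then opportunistically
                    // drain up to batchSize-1 more in a single drainTo call.
                    Runnable first = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (first == null) continue;
                    batch.add(first);
                    queue.drainTo(batch, batchSize - 1);
                    for (Runnable r : batch) r.run();
                    batch.clear();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
        worker.start();
    }

    void execute(Runnable task) { queue.add(task); }

    // Drain remaining tasks, then stop the worker thread.
    void shutdown() {
        running = false;
        try { worker.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```

This trades a small batching delay for far fewer lock acquisitions on the queue, which is the contention the ticket describes.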
[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
[ https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788245#comment-13788245 ] Russell Alexander Spitzer commented on CASSANDRA-6151: -- [~brandon.williams], I agree, but we should at least have a better error message then. CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated Key: CASSANDRA-6151 URL: https://issues.apache.org/jira/browse/CASSANDRA-6151 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Russell Alexander Spitzer Assignee: Alex Liu Priority: Minor -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain
[ https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] J.B. Langston updated CASSANDRA-5911: - Attachment: 6528_140171_knwmuqxe9bjv5re_system.log Attached system.log showing commitlog replay. This was produced by running stress against a single-node cassandra cluster, then running drain and restarting. Commit logs are not removed after nodetool flush or nodetool drain -- Key: CASSANDRA-5911 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911 Project: Cassandra Issue Type: Bug Components: Core Reporter: J.B. Langston Assignee: Vijay Priority: Minor Fix For: 2.0.2 Attachments: 6528_140171_knwmuqxe9bjv5re_system.log Commit logs are not removed after nodetool flush or nodetool drain. This can lead to unnecessary commit log replay during startup. I've reproduced this on Apache Cassandra 1.2.8. Usually this isn't much of an issue but on a Solr-indexed column family in DSE, each replayed mutation has to be reindexed which can make startup take a long time (on the order of 20-30 min). 
Reproduction follows: {code} jblangston:bin jblangston$ ./cassandra > /dev/null jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null jblangston:bin jblangston$ du -h ../commitlog 576M ../commitlog jblangston:bin jblangston$ nodetool flush jblangston:bin jblangston$ du -h ../commitlog 576M ../commitlog jblangston:bin jblangston$ nodetool drain jblangston:bin jblangston$ du -h ../commitlog 576M ../commitlog jblangston:bin jblangston$ pkill java jblangston:bin jblangston$ du -h ../commitlog 576M ../commitlog jblangston:bin jblangston$ ./cassandra -f | grep Replaying INFO 10:03:42,915 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log INFO 10:03:42,922 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log INFO 10:03:43,907 Replaying 
/opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log INFO 10:03:43,907 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log INFO 10:03:43,907 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log INFO 10:03:43,908 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log INFO 10:03:43,908 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log INFO 10:03:43,908 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log INFO 10:03:43,909 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log INFO 10:03:43,909 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log INFO 10:03:43,909 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log INFO 10:03:43,910 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log INFO 10:03:43,910 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log INFO 10:03:43,911 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log INFO 10:03:43,911 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log INFO 10:03:43,911 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log INFO 10:03:43,912 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log INFO 10:03:43,912 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log INFO 10:03:43,912 Replaying
[jira] [Commented] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain
[ https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788250#comment-13788250 ] Jonathan Ellis commented on CASSANDRA-5911: --- {noformat} DEBUG [main] 2013-08-21 10:39:36,311 CommitLogReplayer.java (line 150) Reading mutation at 0 DEBUG [main] 2013-08-21 10:39:36,328 CommitLogReplayer.java (line 150) Reading mutation at 336 DEBUG [main] 2013-08-21 10:39:36,328 CommitLogReplayer.java (line 150) Reading mutation at 672 {noformat} This is what concerns me; I would expect it to start scanning w/ the most recent flush point, which should be the same as the end of the commitlog after drain. Commit logs are not removed after nodetool flush or nodetool drain -- Key: CASSANDRA-5911 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911 Project: Cassandra Issue Type: Bug Components: Core Reporter: J.B. Langston Assignee: Vijay Priority: Minor Fix For: 2.0.2 Attachments: 6528_140171_knwmuqxe9bjv5re_system.log Commit logs are not removed after nodetool flush or nodetool drain. This can lead to unnecessary commit log replay during startup. I've reproduced this on Apache Cassandra 1.2.8. Usually this isn't much of an issue but on a Solr-indexed column family in DSE, each replayed mutation has to be reindexed which can make startup take a long time (on the order of 20-30 min). 
[jira] [Commented] (CASSANDRA-6124) Ability to specify a DC to consume from when using ColumnFamilyInputFormat externally
[ https://issues.apache.org/jira/browse/CASSANDRA-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788254#comment-13788254 ] Brandon Williams commented on CASSANDRA-6124: - Worth noting this is another case where a CL.LOCALONE would be useful (cc [~jjordan]) Ability to specify a DC to consume from when using ColumnFamilyInputFormat externally - Key: CASSANDRA-6124 URL: https://issues.apache.org/jira/browse/CASSANDRA-6124 Project: Cassandra Issue Type: Improvement Components: Hadoop Reporter: Patricio Echague Priority: Minor Labels: hadoop Fix For: 1.2.11 Attachments: CASSANDRA-6124.diff, CASSANDRA-6124-v2.diff Our production environment looks like this: - 6 cassandra nodes (online DC) - 3 cassandra nodes (offline DC) - Hadoop cluster. we are interested in connecting to the offline DC from hadoop (not colocated with cassandra offline dc) I've tested this patch and seems to work with our 1.2.5 deployment. Kindly review. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6124) Ability to specify a DC to consume from when using ColumnFamilyInputFormat externally
[ https://issues.apache.org/jira/browse/CASSANDRA-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788259#comment-13788259 ] Jonathan Ellis commented on CASSANDRA-6124: --- I'd favor LOCAL_ONE as a slightly more generally useful solution. Ability to specify a DC to consume from when using ColumnFamilyInputFormat externally - Key: CASSANDRA-6124 URL: https://issues.apache.org/jira/browse/CASSANDRA-6124 Project: Cassandra Issue Type: Improvement Components: Hadoop Reporter: Patricio Echague Priority: Minor Labels: hadoop Fix For: 1.2.11 Attachments: CASSANDRA-6124.diff, CASSANDRA-6124-v2.diff -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6124) Ability to specify a DC to consume from when using ColumnFamilyInputFormat externally
[ https://issues.apache.org/jira/browse/CASSANDRA-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788265#comment-13788265 ] Jeremiah Jordan commented on CASSANDRA-6124: +1 to LOCAL_ONE, I still think it is kind of silly, but I have been thinking of more and more fairly valid use cases for it... Ability to specify a DC to consume from when using ColumnFamilyInputFormat externally - Key: CASSANDRA-6124 URL: https://issues.apache.org/jira/browse/CASSANDRA-6124 Project: Cassandra Issue Type: Improvement Components: Hadoop Reporter: Patricio Echague Priority: Minor Labels: hadoop Fix For: 1.2.11 Attachments: CASSANDRA-6124.diff, CASSANDRA-6124-v2.diff -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-5815) NPE from migration manager
[ https://issues.apache.org/jira/browse/CASSANDRA-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788284#comment-13788284 ] Chris Burroughs commented on CASSANDRA-5815: Whoops, missed the important part for the case I am seeing, which might not be part of the original (bootstrapping with the same token would presumably fail anyway). The situation I am seeing post NPE is: * Bootstrapping node expects streams from NPE-node * NPE-node says it has no outstanding streams And thus bootstrap never completes. NPE from migration manager -- Key: CASSANDRA-5815 URL: https://issues.apache.org/jira/browse/CASSANDRA-5815 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.12 Reporter: Vishy Kasar Assignee: Brandon Williams Priority: Minor In one of our production clusters we see this error often. Looking through the source, Gossiper.instance.getEndpointStateForEndpoint(endpoint) is returning null for some endpoint. Do we need any config change on our end to resolve this? In any case, Cassandra should be updated to protect against this NPE. 
It turned out that the reason for NPE was we bootstrapped a node with the same token as another node. Cassandra should not throw an NPE here but log a meaningful error message. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (CASSANDRA-6153) Stress stopped calculating latency stats
Ryan McGuire created CASSANDRA-6153: --- Summary: Stress stopped calculating latency stats Key: CASSANDRA-6153 URL: https://issues.apache.org/jira/browse/CASSANDRA-6153 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Ryan McGuire In trunk, cassandra-stress has stopped calculating all latency information: From trunk: {code}
$ ccm node1 stress
Created keyspaces. Sleeping 1s for propagation.
total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
89995,8999,8999,0.0,0.0,0.0,10
304267,21427,21427,0.0,0.0,0.0,20
514791,21052,21052,0.0,0.0,0.0,30
727471,21268,21268,0.0,0.0,0.0,40
926467,19899,19899,0.0,0.0,0.0,50
100,7353,7353,0.0,0.0,0.0,54
Averages from the middle 80% of values:
interval_op_rate : 21249
interval_key_rate : 21249
latency median: 0.0
latency 95th percentile : 0.0
latency 99.9th percentile : 0.0
Total operation time : 00:00:54
END
{code} From 2.0: {code}
$ ccm node1 stress
Created keyspaces. Sleeping 1s for propagation.
total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
66720,6672,6672,0.2,25.6,201.6,10
289577,22285,22285,0.2,3.4,201.1,20
489105,19952,19952,0.2,1.8,201.2,30
660916,17181,17181,0.2,1.6,87.9,40
847452,18653,18653,0.2,1.6,108.8,50
100,15254,15254,0.2,1.6,108.9,59
Averages from the middle 80% of values:
interval_op_rate : 19517
interval_key_rate : 19517
latency median: 0.2
latency 95th percentile : 2.1
latency 99.9th percentile : 149.8
Total operation time : 00:00:59
END
{code} -- This message was sent by Atlassian JIRA (v6.1#6144)
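Editor's note: the median/95th/99.9th columns above are percentile summaries over recorded operation latencies. An illustrative nearest-rank computation (not cassandra-stress's actual implementation; the class and method names are hypothetical):

```java
import java.util.Arrays;

// Illustrative nearest-rank percentile over a latency sample, like the
// median/95th/99.9th columns in the stress output above.
class LatencyStats {
    static double percentile(double[] sortedLatencies, double pct) {
        // Nearest-rank: index of the smallest value covering pct% of the sample.
        int idx = (int) Math.ceil(pct * sortedLatencies.length / 100.0) - 1;
        return sortedLatencies[Math.max(idx, 0)];
    }

    public static void main(String[] args) {
        double[] ms = {0.2, 0.1, 25.6, 0.3, 0.2};
        Arrays.sort(ms);  // percentile() expects a sorted sample
        System.out.println("median = " + percentile(ms, 50.0));
        System.out.println("95th   = " + percentile(ms, 95.0));
    }
}
```

All-zero latency columns, as in the trunk output, therefore indicate the sample itself is empty or never recorded, not a percentile-math problem.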
[jira] [Commented] (CASSANDRA-6137) CQL3 SELECT IN CLAUSE inconsistent
[ https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788294#comment-13788294 ] Constance Eustace commented on CASSANDRA-6137: -- nodetool invalidatekeycache did nothing, and we have already turned off row caching for another bug... A wider audit of 15,000 rows in other tables that have quite complex column key names produced no errors. It must be the :count at the end of these column keys...

CQL3 SELECT IN CLAUSE inconsistent -- Key: CASSANDRA-6137 URL: https://issues.apache.org/jira/browse/CASSANDRA-6137 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu AWS Cassandra 2.0.1 SINGLE NODE Reporter: Constance Eustace Fix For: 2.0.1

We are encountering inconsistent results from CQL3 queries with column keys using an IN clause in WHERE. This has been reproduced in cqlsh. The row key is e_entid; the column key is p_prop.

This returns roughly 21 rows for the 21 column keys that match p_prop:

cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';

These three queries each return one row for the requested single column key in the IN clause:

SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count');

SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count');

SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:fail:count');

This query returns ONLY ONE ROW (one column key), not three as I would expect from the three-column-key IN clause:

cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');

This query does, however, return two rows for the requested two column keys:

cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');

cqlsh> describe table internal_submission.entity_job;

CREATE TABLE entity_job (
  e_entid text,
  p_prop text,
  describes text,
  dndcondition text,
  e_entlinks text,
  e_entname text,
  e_enttype text,
  ingeststatus text,
  ingeststatusdetail text,
  p_flags text,
  p_propid text,
  p_proplinks text,
  p_storage text,
  p_subents text,
  p_val text,
  p_vallang text,
  p_vallinks text,
  p_valtype text,
  p_valunit text,
  p_vars text,
  partnerid text,
  referenceid text,
  size int,
  sourceip text,
  submitdate bigint,
  submitevent text,
  userid text,
  version text,
  PRIMARY KEY (e_entid, p_prop)
) WITH bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.00 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.10 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='NONE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'};

CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes);
CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job (dndcondition);
CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job (ingeststatus);
CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job (ingeststatusdetail);
CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job (referenceid);
CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid);
CREATE
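The expected IN-clause semantics in the report above can be illustrated with a small simulation (plain Python with made-up values, not Cassandra internals): selecting with a multi-key IN should return one row per matching column key, so the three-key query ought to yield three rows, not one.

```python
# Illustrative simulation of expected IN-clause semantics (not Cassandra code).
# Rows of one partition, keyed by their clustering column p_prop.
rows = {
    "urn:bby:pcm:job:ingest:content:complete:count": {"p_val": "3"},
    "urn:bby:pcm:job:ingest:content:all:count": {"p_val": "5"},
    "urn:bby:pcm:job:ingest:content:fail:count": {"p_val": "0"},
}

def select_in(rows, keys):
    """Return one row per requested key that exists, like WHERE p_prop IN (...)."""
    return [(k, rows[k]) for k in keys if k in rows]

# Each single-key query matches one row...
assert len(select_in(rows, ["urn:bby:pcm:job:ingest:content:complete:count"])) == 1
# ...so the three-key IN should match three rows, not the one row observed in the bug.
assert len(select_in(rows, list(rows))) == 3
```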
[jira] [Commented] (CASSANDRA-5815) NPE from migration manager
[ https://issues.apache.org/jira/browse/CASSANDRA-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788296#comment-13788296 ] Brandon Williams commented on CASSANDRA-5815: - [~cburroughs] I think your problem is something else, since the bootstrapping node has not only been marked down, but it's been down long enough to get removed (which is the race between the gossiper and MM causing this NPE). I will note for myself, though, that the fat client removal should also wait until the node has been marked down before beginning the 30s countdown to removal. If the node has connected but the gossiper doesn't know about it, they haven't gossiped yet, so there's really nothing for MM to do yet anyway.

NPE from migration manager -- Key: CASSANDRA-5815 URL: https://issues.apache.org/jira/browse/CASSANDRA-5815 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.12 Reporter: Vishy Kasar Assignee: Brandon Williams Priority: Minor

In one of our production clusters we see this error often. Looking through the source, Gossiper.instance.getEndpointStateForEndpoint(endpoint) is returning null for some endpoint. Do we need any config change on our end to resolve this? In any case, Cassandra should be updated to protect against this NPE.
ERROR [OptionalTasks:1] 2013-07-24 13:40:38,972 AbstractCassandraDaemon.java (line 132) Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.NullPointerException
at org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:134)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

It turned out that the reason for the NPE was that we bootstrapped a node with the same token as another node. Cassandra should not throw an NPE here but log a meaningful error message. -- This message was sent by Atlassian JIRA (v6.1#6144)
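The requested fix — log a meaningful error instead of throwing an NPE when the gossiper has no state for an endpoint — is a straightforward null guard. A minimal sketch of the pattern, with hypothetical names (this is not the actual MigrationManager code):

```python
# Hypothetical sketch of guarding a possibly-missing endpoint state.
endpoint_states = {"10.0.0.1": {"schema": "v1"}}  # the gossiper's current view

def maybe_schedule_pull(endpoint):
    # getEndpointStateForEndpoint may return null, e.g. after a
    # duplicate-token bootstrap removed the node from gossip.
    state = endpoint_states.get(endpoint)
    if state is None:
        # Log a meaningful message instead of dereferencing None (the NPE analogue).
        return "skipping schema pull: no gossip state for %s" % endpoint
    return "pulling schema %s from %s" % (state["schema"], endpoint)

assert "no gossip state" in maybe_schedule_pull("10.0.0.9")
assert maybe_schedule_pull("10.0.0.1").startswith("pulling")
```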
[jira] [Commented] (CASSANDRA-6053) system.peers table not updated after decommissioning nodes in C* 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788300#comment-13788300 ] Brandon Williams commented on CASSANDRA-6053: - If a user knows enough to disable loading the state and that fixes the problem, they can clear the peers table manually. system.peers table not updated after decommissioning nodes in C* 2.0 Key: CASSANDRA-6053 URL: https://issues.apache.org/jira/browse/CASSANDRA-6053 Project: Cassandra Issue Type: Bug Components: Core Environment: Datastax AMI running EC2 m1.xlarge instances Reporter: Guyon Moree Assignee: Brandon Williams Attachments: peers After decommissioning my cluster from 20 to 9 nodes using opscenter, I found all but one of the nodes had incorrect system.peers tables. This became a problem (afaik) when using the python-driver, since this queries the peers table to set up its connection pool. Resulting in very slow startup times, because of timeouts. The output of nodetool didn't seem to be affected. After removing the incorrect entries from the peers tables, the connection issues seem to have disappeared for us. Would like some feedback on if this was the right way to handle the issue or if I'm still left with a broken cluster. Attached is the output of nodetool status, which shows the correct 9 nodes. Below that the output of the system.peers tables on the individual nodes. -- This message was sent by Atlassian JIRA (v6.1#6144)
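The manual cleanup the reporter performed boils down to diffing system.peers against the live ring shown by nodetool status and deleting the difference. An illustrative sketch with made-up addresses (the DELETE shown in the comment is an example of what one stale entry's removal might look like, not a prescribed procedure):

```python
# Illustrative: find stale system.peers entries after decommissioning.
ring = {"10.0.0.%d" % i for i in range(1, 10)}   # the 9 live nodes per nodetool status
peers = ring | {"10.0.1.1", "10.0.1.2"}          # peers table still lists removed nodes
stale = sorted(peers - ring)
# Each stale entry would be removed manually, e.g.:
#   DELETE FROM system.peers WHERE peer = '10.0.1.1';
assert stale == ["10.0.1.1", "10.0.1.2"]
```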
[jira] [Commented] (CASSANDRA-2827) Thrift error
[ https://issues.apache.org/jira/browse/CASSANDRA-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788304#comment-13788304 ] Brandon Williams commented on CASSANDRA-2827: - See CASSANDRA-5529

Thrift error Key: CASSANDRA-2827 URL: https://issues.apache.org/jira/browse/CASSANDRA-2827 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.4 Environment: 2 nodes with 0.7.4 on linux Reporter: Olivier Smadja

This exception occurred on a non-seed node.

ERROR [pool-1-thread-9] 2011-06-25 17:41:37,723 CustomTThreadPoolServer.java (line 218) Thrift error occurred during processing of message.
org.apache.thrift.TException: Negative length: -2147418111
at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:388)
at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
at org.apache.cassandra.thrift.Cassandra$batch_mutate_args.read(Cassandra.java:15964)
at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3023)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)

-- This message was sent by Atlassian JIRA (v6.1#6144)
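The "Negative length: -2147418111" is a recognizable value: interpreted as big-endian bytes it is 0x80 0x01 0x00 0x01, which is TBinaryProtocol's strict-mode message header (VERSION_1 | CALL). That reading suggests the server parsed a protocol/framing-mismatched message header where a length field was expected — consistent with the pointer to CASSANDRA-5529. A quick check of the byte interpretation:

```python
import struct

# The "negative length" from the stack trace, reinterpreted as raw bytes.
bad_length = -2147418111
raw = struct.pack(">i", bad_length)  # big-endian signed 32-bit
assert raw == b"\x80\x01\x00\x01"

# 0x80010000 is TBinaryProtocol's strict-mode VERSION_1 marker and the low
# byte 0x01 is the CALL message type, so this "length" is really a message
# header read out of position.
assert (bad_length & 0xFFFFFFFF) == 0x80010001
```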
[jira] [Commented] (CASSANDRA-2848) Make the Client API support passing down timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788320#comment-13788320 ] Brandon Williams commented on CASSANDRA-2848: - The problem with that is, it removes request backpressure control from the server, leaving it vulnerable to bad client behavior. Make the Client API support passing down timeouts - Key: CASSANDRA-2848 URL: https://issues.apache.org/jira/browse/CASSANDRA-2848 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Priority: Minor Having a max server RPC timeout is good for worst case, but many applications that have middleware in front of Cassandra, might have higher timeout requirements. In a fail fast environment, if my application starting at say the front-end, only has 20ms to process a request, and it must connect to X services down the stack, by the time it hits Cassandra, we might only have 10ms. I propose we provide the ability to specify the timeout on each call we do optionally. -- This message was sent by Atlassian JIRA (v6.1#6144)
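The proposal amounts to deadline propagation: each tier passes down only its remaining budget rather than relying on a fixed server-wide RPC timeout. A hypothetical sketch of the client-side mechanics (names invented; this is not the Cassandra client API):

```python
import time

def remaining_ms(deadline):
    """Budget left before the absolute deadline, floored at zero."""
    return max(0.0, (deadline - time.monotonic()) * 1000.0)

def call_with_deadline(deadline, cost_ms):
    """Issue one downstream call bounded by whatever budget is left."""
    budget = remaining_ms(deadline)
    if budget <= 0:
        raise TimeoutError("deadline exceeded before issuing the call")
    # A real client would pass `budget` as the per-request timeout here.
    return min(budget, cost_ms)

deadline = time.monotonic() + 0.020      # front-end starts with a 20 ms budget
call_with_deadline(deadline, cost_ms=5)  # a middleware hop consumes some of it
assert remaining_ms(deadline) <= 20.0    # Cassandra would see only what's left
```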
[jira] [Created] (CASSANDRA-6154) Inserts are blocked in 2.1
Ryan McGuire created CASSANDRA-6154: --- Summary: Inserts are blocked in 2.1 Key: CASSANDRA-6154 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Priority: Critical

With cluster sizes > 1, inserts are blocked indefinitely:

{code}
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6154) Inserts are blocked in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McGuire updated CASSANDRA-6154: Description: With cluster sizes 1 inserts are blocked indefinitely: {code} 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. cqlsh CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh USE timeline; cqlsh:timeline CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statment never returns.. was: With cluster sizes 1 inserts are blocked indefinitely: {code} 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. 
cqlsh CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh USE timeline; cqlsh:timeline CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} Inserts are blocked in 2.1 -- Key: CASSANDRA-6154 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Priority: Critical With cluster sizes 1 inserts are blocked indefinitely: {code} 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. cqlsh CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh USE timeline; cqlsh:timeline CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statment never returns.. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6154) Inserts are blocked in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McGuire updated CASSANDRA-6154: Description: With cluster sizes 1 inserts are blocked indefinitely: {code} 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. cqlsh CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh USE timeline; cqlsh:timeline CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statement never returns.. was: With cluster sizes 1 inserts are blocked indefinitely: {code} 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. 
cqlsh CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh USE timeline; cqlsh:timeline CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statment never returns.. Inserts are blocked in 2.1 -- Key: CASSANDRA-6154 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Priority: Critical With cluster sizes 1 inserts are blocked indefinitely: {code} 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. cqlsh CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh USE timeline; cqlsh:timeline CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statement never returns.. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6154) Inserts are blocked in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McGuire updated CASSANDRA-6154: Description:

With cluster sizes > 1, inserts are blocked indefinitely:

{code}
$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
$ ccm populate -n 2
$ ccm start
$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}

The last INSERT statement never returns.

was:

With cluster sizes > 1, inserts are blocked indefinitely:

{code}
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}

The last INSERT statement never returns.

Inserts are blocked in 2.1 -- Key: CASSANDRA-6154 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Priority: Critical

With cluster sizes > 1, inserts are blocked indefinitely:

{code}
$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
$ ccm populate -n 2
$ ccm start
$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}

The last INSERT statement never returns.

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (CASSANDRA-6155) Big cluster upgrade test
Mark Dewey created CASSANDRA-6155: - Summary: Big cluster upgrade test Key: CASSANDRA-6155 URL: https://issues.apache.org/jira/browse/CASSANDRA-6155 Project: Cassandra Issue Type: Test Components: Tests Reporter: Mark Dewey

I am planning on writing a test that would:
# Launch a 20 node cluster
# Put 100 gigs of data on it using cassandra-stress
# Perform a rolling upgrade of the cluster
# Read the data using cassandra-stress

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6155) Big cluster upgrade test
[ https://issues.apache.org/jira/browse/CASSANDRA-6155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788347#comment-13788347 ] Mark Dewey commented on CASSANDRA-6155: --- [~jbellis], [~dmeyer] please chime in with any suggestions, like # of columns, width of columns, etc. Is there interest in having this upgrade test happen while reads and/or writes are happening on the cluster?

Big cluster upgrade test Key: CASSANDRA-6155 URL: https://issues.apache.org/jira/browse/CASSANDRA-6155 Project: Cassandra Issue Type: Test Components: Tests Reporter: Mark Dewey

I am planning on writing a test that would:
# Launch a 20 node cluster
# Put 100 gigs of data on it using cassandra-stress
# Perform a rolling upgrade of the cluster
# Read the data using cassandra-stress

-- This message was sent by Atlassian JIRA (v6.1#6144)
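The four test steps map onto a standard rolling-upgrade loop: take down, upgrade, and restart one node at a time so the rest of the cluster keeps serving. A stubbed sketch of that ordering (all functions are placeholders, not real ccm or cassandra-stress calls):

```python
def rolling_upgrade(nodes, upgrade_one):
    """Upgrade nodes one at a time, recording the order; at most one node is
    down at any moment, so a cluster of n keeps >= n-1 nodes live throughout."""
    log = []
    for node in nodes:
        log.append(("drain", node))   # stand-in for stopping/flushing the node
        upgrade_one(node)             # stand-in for swapping in the new version
        log.append(("restart", node)) # stand-in for bringing it back up
    return log

nodes = ["node%d" % i for i in range(1, 21)]  # the planned 20-node cluster
log = rolling_upgrade(nodes, upgrade_one=lambda n: None)
assert len(log) == 40 and log[0] == ("drain", "node1")
```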
[jira] [Assigned] (CASSANDRA-6153) Stress stopped calculating latency stats
[ https://issues.apache.org/jira/browse/CASSANDRA-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis reassigned CASSANDRA-6153: - Assignee: Mikhail Stepura

Stress stopped calculating latency stats Key: CASSANDRA-6153 URL: https://issues.apache.org/jira/browse/CASSANDRA-6153 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Ryan McGuire Assignee: Mikhail Stepura Fix For: 2.1

In trunk, cassandra-stress has stopped calculating all latency information. From trunk:

{code}
$ ccm node1 stress
Created keyspaces. Sleeping 1s for propagation.
total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
89995,8999,8999,0.0,0.0,0.0,10
304267,21427,21427,0.0,0.0,0.0,20
514791,21052,21052,0.0,0.0,0.0,30
727471,21268,21268,0.0,0.0,0.0,40
926467,19899,19899,0.0,0.0,0.0,50
100,7353,7353,0.0,0.0,0.0,54
Averages from the middle 80% of values:
interval_op_rate : 21249
interval_key_rate : 21249
latency median: 0.0
latency 95th percentile : 0.0
latency 99.9th percentile : 0.0
Total operation time : 00:00:54
END
{code}

From 2.0:

{code}
$ ccm node1 stress
Created keyspaces. Sleeping 1s for propagation.
total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
66720,6672,6672,0.2,25.6,201.6,10
289577,22285,22285,0.2,3.4,201.1,20
489105,19952,19952,0.2,1.8,201.2,30
660916,17181,17181,0.2,1.6,87.9,40
847452,18653,18653,0.2,1.6,108.8,50
100,15254,15254,0.2,1.6,108.9,59
Averages from the middle 80% of values:
interval_op_rate : 19517
interval_key_rate : 19517
latency median: 0.2
latency 95th percentile : 2.1
latency 99.9th percentile : 149.8
Total operation time : 00:00:59
END
{code}

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6153) Stress stopped calculating latency stats
[ https://issues.apache.org/jira/browse/CASSANDRA-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6153: -- Fix Version/s: 2.1 Stress stopped calculating latency stats Key: CASSANDRA-6153 URL: https://issues.apache.org/jira/browse/CASSANDRA-6153 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Ryan McGuire Assignee: Mikhail Stepura Fix For: 2.1 In trunk, cassandra-stress has stopped calculating all latency information: From trunk: {code} $ ccm node1 stress Created keyspaces. Sleeping 1s for propagation. total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time 89995,8999,8999,0.0,0.0,0.0,10 304267,21427,21427,0.0,0.0,0.0,20 514791,21052,21052,0.0,0.0,0.0,30 727471,21268,21268,0.0,0.0,0.0,40 926467,19899,19899,0.0,0.0,0.0,50 100,7353,7353,0.0,0.0,0.0,54 Averages from the middle 80% of values: interval_op_rate : 21249 interval_key_rate : 21249 latency median: 0.0 latency 95th percentile : 0.0 latency 99.9th percentile : 0.0 Total operation time : 00:00:54 END {code} From 2.0: {code} $ ccm node1 stress Created keyspaces. Sleeping 1s for propagation. total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time 66720,6672,6672,0.2,25.6,201.6,10 289577,22285,22285,0.2,3.4,201.1,20 489105,19952,19952,0.2,1.8,201.2,30 660916,17181,17181,0.2,1.6,87.9,40 847452,18653,18653,0.2,1.6,108.8,50 100,15254,15254,0.2,1.6,108.9,59 Averages from the middle 80% of values: interval_op_rate : 19517 interval_key_rate : 19517 latency median: 0.2 latency 95th percentile : 2.1 latency 99.9th percentile : 149.8 Total operation time : 00:00:59 END {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
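For reference, the figures stress should be printing are ordinary percentiles over per-operation latency samples, plus a trimmed mean over the middle 80% of interval values. A sketch of both computations (illustrative only, not the cassandra-stress source):

```python
def percentile(samples, pct):
    """Nearest-rank percentile over sorted samples."""
    s = sorted(samples)
    idx = min(len(s) - 1, int(round(pct / 100.0 * (len(s) - 1))))
    return s[idx]

def middle_80_mean(values):
    """Mean of values with the lowest and highest 10% dropped, like the
    'Averages from the middle 80% of values' line stress prints."""
    s = sorted(values)
    k = len(s) // 10
    middle = s[k:len(s) - k] if len(s) > 2 * k else s
    return sum(middle) / len(middle)

# Interval op rates and latencies loosely modeled on the 2.0 output above.
rates = [6672, 22285, 19952, 17181, 18653]
assert middle_80_mean(rates) > 0
assert percentile([0.2, 0.2, 1.6, 3.4, 201.1], 95) == 201.1
```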
git commit: Fix SP.sendToHintedEndpoints() javadoc
Updated Branches: refs/heads/cassandra-1.2 d396fd47d -> 9d31ac14d

Fix SP.sendToHintedEndpoints() javadoc

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9d31ac14 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9d31ac14 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9d31ac14 Branch: refs/heads/cassandra-1.2 Commit: 9d31ac14dfec10ede53cbd6fecfd9a08c39bfa45 Parents: d396fd4 Author: Aleksey Yeschenko alek...@apache.org Authored: Tue Oct 8 01:55:29 2013 +0800 Committer: Aleksey Yeschenko alek...@apache.org Committed: Tue Oct 8 01:55:29 2013 +0800

src/java/org/apache/cassandra/service/StorageProxy.java | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9d31ac14/src/java/org/apache/cassandra/service/StorageProxy.java

diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java b/src/java/org/apache/cassandra/service/StorageProxy.java
index 8a6e52e..cdb0bd6 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -467,7 +467,7 @@ public class StorageProxy implements StorageProxyMBean
      * | off | >=1 | --> DO NOT fire hints. And DO NOT wait for them to complete.
      * | off | ANY | --> DO NOT fire hints. And DO NOT wait for them to complete.
      *
-     * @throws TimeoutException if the hints cannot be written/enqueued
+     * @throws OverloadedException if the hints cannot be written/enqueued
      */
     public static void sendToHintedEndpoints(final RowMutation rm, Iterable<InetAddress> targets,
[1/2] git commit: Fix SP.sendToHintedEndpoints() javadoc
Updated Branches: refs/heads/cassandra-2.0 6a603046e -> 8e8db1f20

Fix SP.sendToHintedEndpoints() javadoc

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9d31ac14 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9d31ac14 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9d31ac14 Branch: refs/heads/cassandra-2.0 Commit: 9d31ac14dfec10ede53cbd6fecfd9a08c39bfa45 Parents: d396fd4 Author: Aleksey Yeschenko alek...@apache.org Authored: Tue Oct 8 01:55:29 2013 +0800 Committer: Aleksey Yeschenko alek...@apache.org Committed: Tue Oct 8 01:55:29 2013 +0800

src/java/org/apache/cassandra/service/StorageProxy.java | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9d31ac14/src/java/org/apache/cassandra/service/StorageProxy.java

diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java b/src/java/org/apache/cassandra/service/StorageProxy.java
index 8a6e52e..cdb0bd6 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -467,7 +467,7 @@ public class StorageProxy implements StorageProxyMBean
      * | off | >=1 | --> DO NOT fire hints. And DO NOT wait for them to complete.
      * | off | ANY | --> DO NOT fire hints. And DO NOT wait for them to complete.
      *
-     * @throws TimeoutException if the hints cannot be written/enqueued
+     * @throws OverloadedException if the hints cannot be written/enqueued
      */
     public static void sendToHintedEndpoints(final RowMutation rm, Iterable<InetAddress> targets,
[2/2] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8e8db1f2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8e8db1f2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8e8db1f2 Branch: refs/heads/cassandra-2.0 Commit: 8e8db1f20688eef9a89e112f2295d160c9c35075 Parents: 6a60304 9d31ac1 Author: Aleksey Yeschenko alek...@apache.org Authored: Tue Oct 8 01:56:55 2013 +0800 Committer: Aleksey Yeschenko alek...@apache.org Committed: Tue Oct 8 01:56:55 2013 +0800 -- src/java/org/apache/cassandra/service/StorageProxy.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8e8db1f2/src/java/org/apache/cassandra/service/StorageProxy.java --
[jira] [Updated] (CASSANDRA-6155) Big cluster upgrade test
[ https://issues.apache.org/jira/browse/CASSANDRA-6155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Dewey updated CASSANDRA-6155: -- Description: I am planning on writing a test that would: # Launch a 20 node cluster # Put 100 gigs of data on it (per node) using cassandra-stress # Perform a rolling upgrade of the cluster # Read the data using cassandra-stress was: I am planning on writing a test that would: # Launch a 20 node cluster # Put 100 gigs of data on it using cassandra-stress # Perform a rolling upgrade of the cluster # Read the data using cassandra-stress Big cluster upgrade test Key: CASSANDRA-6155 URL: https://issues.apache.org/jira/browse/CASSANDRA-6155 Project: Cassandra Issue Type: Test Components: Tests Reporter: Mark Dewey I am planning on writing a test that would: # Launch a 20 node cluster # Put 100 gigs of data on it (per node) using cassandra-stress # Perform a rolling upgrade of the cluster # Read the data using cassandra-stress -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
[ https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788362#comment-13788362 ] Alex Liu commented on CASSANDRA-6151: - I think it's better not implement this to keep paging algorithm simple. Let Pig/Hive to filter the result by using partition key clause at Hive/Pig side, CqlPagingRecordReader paging through the rows with only user defined where clauses on none-partition keys. Let's document this somewhere. The paging algorithm of CqlPagingRecordReader is based on token range. CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated Key: CASSANDRA-6151 URL: https://issues.apache.org/jira/browse/CASSANDRA-6151 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Russell Alexander Spitzer Assignee: Alex Liu Priority: Minor From http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig/19211546#19211546 The user was attempting to load a single partition using a where clause in a pig load statement. CQL Table {code} CREATE table data ( occurday text, seqnumber int, occurtimems bigint, unique bigint, fields maptext, text, primary key ((occurday, seqnumber), occurtimems, unique) ) {code} Pig Load statement Query {code} data = LOAD 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27' USING CqlStorage(); {code} This results in an exception when processed by the the CqlPagingRecordReader which attempts to page this query even though it contains at most one partition key. This leads to an invalid CQL statement. CqlPagingRecordReader Query {code} SELECT * FROM data WHERE token(occurday,seqnumber) ? AND token(occurday,seqnumber) = ? 
AND occurday='A Great Day' AND seqnumber=1 LIMIT 1000 ALLOW FILTERING {code} Exception {code} InvalidRequestException(why:occurday cannot be restricted by more than one relation if it includes an Equal) {code} I'm not sure it is worth the special case, but a modification to not use the paging record reader when the entire partition key is specified would solve this issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
[ https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788365#comment-13788365 ] Alex Liu commented on CASSANDRA-6151: - Or we can add a validation method for the user-defined where clauses. CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated Key: CASSANDRA-6151 URL: https://issues.apache.org/jira/browse/CASSANDRA-6151 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Russell Alexander Spitzer Assignee: Alex Liu Priority: Minor -- This message was sent by Atlassian JIRA (v6.1#6144)
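The two fixes discussed above (skipping the paging reader when the full partition key is given, or rejecting the where clause up front) both reduce to detecting partition-key restrictions in the user-supplied `where_clause`. A minimal sketch of the validation variant follows; the class name, method names, and the naive string split are illustrative assumptions only — a real implementation would reuse the CQL grammar rather than splitting on `AND`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the suggested validation: reject a user-supplied
// where_clause that restricts partition key columns, since CqlPagingRecordReader
// already restricts the partition key through token() ranges.
class WhereClauseValidator
{
    // Extract the column names restricted by a simple "a = 1 AND b = 2" clause.
    // Illustrative only: a real implementation would use the CQL parser.
    static List<String> restrictedColumns(String whereClause)
    {
        List<String> columns = new ArrayList<>();
        for (String predicate : whereClause.split("(?i)\\s+AND\\s+"))
        {
            int eq = predicate.indexOf('=');
            if (eq > 0)
                columns.add(predicate.substring(0, eq).trim().toLowerCase(Locale.US));
        }
        return columns;
    }

    static void validate(String whereClause, List<String> partitionKeyColumns)
    {
        for (String column : restrictedColumns(whereClause))
            if (partitionKeyColumns.contains(column))
                throw new IllegalArgumentException(
                    "where_clause may not restrict partition key column: " + column);
    }
}
```

With the table from the report, validating `"seqnumber=10 AND occurday='2013-10-01'"` against partition key columns `occurday, seqnumber` would fail fast instead of producing the invalid doubly-restricted paged query.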
[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788367#comment-13788367 ] Ryan McGuire commented on CASSANDRA-4338: - I started to run a benchmark for this but I found CASSANDRA-6153 and CASSANDRA-6154 standing in my way. [Here's the data|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.4338.CompressedSequentialWriter.jsonmetric=interval_op_rateoperation=stress-writesmoothing=4] for my test with [~krummas]' patch, but it's missing any sort of baseline because of those above bugs. Experiment with direct buffer in SequentialWriter - Key: CASSANDRA-4338 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Marcus Eriksson Priority: Minor Labels: performance Fix For: 2.1 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk-me.png, gc-trunk.png, gc-with-patch-me.png Using a direct buffer instead of a heap-based byte[] should let us avoid a copy into native memory when we flush the buffer. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (CASSANDRA-6156) Poor resilience and recovery for bootstrapping node - unable to fetch range
Alyssa Kwan created CASSANDRA-6156: -- Summary: Poor resilience and recovery for bootstrapping node - unable to fetch range Key: CASSANDRA-6156 URL: https://issues.apache.org/jira/browse/CASSANDRA-6156 Project: Cassandra Issue Type: Bug Components: Core Reporter: Alyssa Kwan Fix For: 1.2.8 We have an 8 node cluster on 1.2.8 using vnodes. One of our nodes failed and we are having lots of trouble bootstrapping it back. On each attempt, bootstrapping eventually fails with a RuntimeException Unable to fetch range. As far as we can tell, long GC pauses on the sender side cause heartbeat drops or delays, which leads the gossip controller to convict the connection and mark the sender dead. We've done significant GC tuning to minimize the duration of pauses and raised phi_convict to its max. It merely lets the bootstrap process take longer to fail. The inability to reliably add nodes significantly affects our ability to scale. We're not the only ones: http://stackoverflow.com/questions/19199349/cassandra-bootstrap-fails-with-unable-to-fetch-range What can we do in the immediate term to bring this node in? And what's the long term solution? One possible solution would be to allow bootstrapping to be an incremental process with individual transfers of vnode ownership instead of attempting to transfer the whole set of vnodes transactionally. (I assume that's what's happening now.) I don't know what would have to change on the gossip and token-aware client side to support this. Another solution would be to partition sstable files by vnode and allow transfer of those files directly with some sort of checkpointing of and incremental transfer of writes after the sstable is transferred. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-2848) Make the Client API support passing down timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788390#comment-13788390 ] sankalp kohli commented on CASSANDRA-2848: -- So what we can do is treat the RPC timeout specified on the server as a maximum. If the client passes a timeout lower than that, there is no need to keep the read request running for longer. This is very useful: when the cluster is on its knees, killing these extra read requests can help a lot. Make the Client API support passing down timeouts - Key: CASSANDRA-2848 URL: https://issues.apache.org/jira/browse/CASSANDRA-2848 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Priority: Minor Having a max server RPC timeout is good for the worst case, but many applications that have middleware in front of Cassandra might have stricter timeout requirements. In a fail-fast environment, if my application, starting at say the front-end, only has 20ms to process a request, and it must connect to X services down the stack, by the time it hits Cassandra we might only have 10ms. I propose we provide the ability to optionally specify the timeout on each call we do. -- This message was sent by Atlassian JIRA (v6.1#6144)
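The policy described in the comment is just a clamp: the server-configured RPC timeout stays the ceiling, and a client-supplied value can only lower it, never raise it. A sketch under that assumption (the names are illustrative, not an API proposed in the ticket):

```java
// Sketch: treat the server rpc_timeout as a maximum that a client-supplied
// timeout may tighten but never extend.
class TimeoutPolicy
{
    static long effectiveTimeoutMs(long serverRpcTimeoutMs, Long clientTimeoutMs)
    {
        if (clientTimeoutMs == null || clientTimeoutMs <= 0)  // no client override
            return serverRpcTimeoutMs;
        // The client can only shorten the deadline, never extend past the server max.
        return Math.min(serverRpcTimeoutMs, clientTimeoutMs);
    }
}
```

A replica could then drop a queued read whose effective deadline has already passed — exactly the load shedding that helps when the cluster is overloaded.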
[jira] [Commented] (CASSANDRA-6128) Add more data mappings for Pig
[ https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788402#comment-13788402 ] Alex Liu commented on CASSANDRA-6128: - Decimal has different precision than float/double; we will lose precision if we convert a decimal to a float/double. This is explained in http://stackoverflow.com/questions/5749615/losing-precision-converting-from-java-bigdecimal-to-double If we don't need to preserve the precision, we can use a double instead of a string. Add more data mappings for Pig -- Key: CASSANDRA-6128 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128 Project: Cassandra Issue Type: Bug Reporter: Alex Liu Assignee: Alex Liu Attachments: 6128-1.2-branch.txt We need to add more data mappings for {code} DecimalType InetAddressType LexicalUUIDType TimeUUIDType UUIDType {code} The existing implementation throws an exception for those data types. -- This message was sent by Atlassian JIRA (v6.1#6144)
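The precision loss is easy to demonstrate with the JDK alone, which is why mapping DecimalType to a chararray (string) rather than a Pig float/double is the lossless choice:

```java
import java.math.BigDecimal;

// A double carries only ~15-17 significant decimal digits, so converting a
// wider decimal through doubleValue() silently drops the tail.
class DecimalPrecisionDemo
{
    public static void main(String[] args)
    {
        BigDecimal wide = new BigDecimal("1.0000000000000000000000001"); // 25 fractional digits
        System.out.println(wide.doubleValue()); // prints 1.0 -- the trailing 1 is lost
        System.out.println(wide.toString());    // the string form preserves every digit
    }
}
```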
[jira] [Updated] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lyuben Todorov updated CASSANDRA-4809: -- Attachment: 4809__v2.patch Allow restoring specific column families from archived commitlog Key: CASSANDRA-4809 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809 Project: Cassandra Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Nick Bailey Assignee: Lyuben Todorov Labels: lhf Fix For: 2.0.2 Attachments: 4809.patch, 4809__v2.patch Currently you can only restore the entire contents of a commit log archive. It would be useful to specify the keyspaces/column families you want to restore from an archived commitlog. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6149) OOM in Cassandra 2.0.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788422#comment-13788422 ] Pavel Yaskevich commented on CASSANDRA-6149: +1 OOM in Cassandra 2.0.1 -- Key: CASSANDRA-6149 URL: https://issues.apache.org/jira/browse/CASSANDRA-6149 Project: Cassandra Issue Type: Bug Components: Core Environment: Windows 7 64 bit. Java 64-bit 1.7.0_25. Cassandra 2.0.1 Reporter: Kai Wang Assignee: Jonathan Ellis Fix For: 2.0.2 Attachments: 6149-debug.txt, 6149.txt I have a program to stress test Cassandra. What it does is remove/insert rows with a small set of row keys as fast as possible. Two CFs are involved. When I tested against C* 1.2.3 with default configurations, it ran for 24 hours and C* didn't have any issue. However, after I upgraded to C* 2.0.1, C* crashes on OOM within 1-2 minutes. I can consistently reproduce this. I built C* from source and found that the last good changeset is cfa097cdd5e28d7fe8204248e246a1fae226d2c0. As soon as I include the next changeset 1e0d9513b748fae4ec0737283da71c65e9272102, C* starts to crash. What's interesting is that although the change appears to have been reverted by fc1a7206fe15882fd64e7ba8eb68ba9dc320275f, C* built from fc1a7206fe15882fd64e7ba8eb68ba9dc320275f has the same problem - OOM within minutes. I didn't test against the official 2.0.0, but C* built from 03045ca22b11b0e5fc85c4fabd83ce6121b5709b seems OK. I assume that's what 2.0.0 is. I use default configurations in all cases. I didn't tune anything. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788416#comment-13788416 ] Carl Yeksigian commented on CASSANDRA-4809: --- Sorry, missed the question before. Overall, the proposed updates to the patch look good (I had forgotten about the low quality of the first patch -- done during a meetup). I'm wondering why the keyspace checks have been moved inside of the loop, but the control flow is still a return. Seems like either we can check it once before the loop and exit the method, or we continue instead of return if we want to check each mutation. Allow restoring specific column families from archived commitlog Key: CASSANDRA-4809 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809 Project: Cassandra Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Nick Bailey Assignee: Lyuben Todorov Labels: lhf Fix For: 2.0.2 Attachments: 4809.patch, 4809__v2.patch Currently you can only restore the entire contents of a commit log archive. It would be useful to specify the keyspaces/column families you want to restore from an archived commitlog. -- This message was sent by Atlassian JIRA (v6.1#6144)
[1/3] git commit: Fix SP.sendToHintedEndpoints() javadoc
Updated Branches: refs/heads/trunk e43b82ba6 -> b966e1ad2 Fix SP.sendToHintedEndpoints() javadoc Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9d31ac14 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9d31ac14 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9d31ac14 Branch: refs/heads/trunk Commit: 9d31ac14dfec10ede53cbd6fecfd9a08c39bfa45 Parents: d396fd4 Author: Aleksey Yeschenko alek...@apache.org Authored: Tue Oct 8 01:55:29 2013 +0800 Committer: Aleksey Yeschenko alek...@apache.org Committed: Tue Oct 8 01:55:29 2013 +0800 -- src/java/org/apache/cassandra/service/StorageProxy.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9d31ac14/src/java/org/apache/cassandra/service/StorageProxy.java -- diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java b/src/java/org/apache/cassandra/service/StorageProxy.java index 8a6e52e..cdb0bd6 100644 --- a/src/java/org/apache/cassandra/service/StorageProxy.java +++ b/src/java/org/apache/cassandra/service/StorageProxy.java @@ -467,7 +467,7 @@ public class StorageProxy implements StorageProxyMBean * | off| =1 | -- DO NOT fire hints. And DO NOT wait for them to complete. * | off| ANY | -- DO NOT fire hints. And DO NOT wait for them to complete. * - * @throws TimeoutException if the hints cannot be written/enqueued + * @throws OverloadedException if the hints cannot be written/enqueued */ public static void sendToHintedEndpoints(final RowMutation rm, Iterable<InetAddress> targets,
[2/3] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8e8db1f2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8e8db1f2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8e8db1f2 Branch: refs/heads/trunk Commit: 8e8db1f20688eef9a89e112f2295d160c9c35075 Parents: 6a60304 9d31ac1 Author: Aleksey Yeschenko alek...@apache.org Authored: Tue Oct 8 01:56:55 2013 +0800 Committer: Aleksey Yeschenko alek...@apache.org Committed: Tue Oct 8 01:56:55 2013 +0800 -- src/java/org/apache/cassandra/service/StorageProxy.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8e8db1f2/src/java/org/apache/cassandra/service/StorageProxy.java --
[3/3] git commit: Merge branch 'cassandra-2.0' into trunk
Merge branch 'cassandra-2.0' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b966e1ad Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b966e1ad Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b966e1ad Branch: refs/heads/trunk Commit: b966e1ad21a345838a2b50e1790e0257fab30c7f Parents: e43b82b 8e8db1f Author: Aleksey Yeschenko alek...@apache.org Authored: Tue Oct 8 02:39:59 2013 +0800 Committer: Aleksey Yeschenko alek...@apache.org Committed: Tue Oct 8 02:39:59 2013 +0800 -- src/java/org/apache/cassandra/service/StorageProxy.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/b966e1ad/src/java/org/apache/cassandra/service/StorageProxy.java --
[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788425#comment-13788425 ] Benedict commented on CASSANDRA-4718: - Disruptors are very difficult to use as a drop in replacement for the executor service, so I tried to knock up some queues that could provide similar performance without ripping apart the whole application. The resulting queues I benchmarked under high load, in isolation, against LinkedBlockingQueue, BlockingArrayQueue and the Disruptor, and plotted the average op costs in the op costs of various queues attachment*. As can be seen, these queues and the Disruptor are substantially faster under high load than LinkedBlockingQueue, however it can also be seen that: - The average op cost for LinkedBlockingQueue is still very low, in fact only around 300ns at worst - BlockingArrayQueue is considerably worse than LinkedBlockingQueue under all conditions These suggest both that the overhead attributed to LinkedBlockingQueue for a 1Mop workload (as run above) should be at most a few seconds of the overall cost (probably much less); and that BlockingArrayQueue is unlikely to make any cost incurred by LinkedBlockingQueue substantially better. This made me suspect the previous result might be attributable to random variance, but to be sure I ran a number of ccm -stress tests with the different queues, and plotted the results in stress op rate with various queues.ods, which show the following: 1) No meaningful difference between BAQ, LBQ and SlowQueue (though the latter has a clear ~1% slow down) 2) UltraSlow (~10x slow down, or 2000ns spinning each op) is approximately 5% slower 3) The faster queue actually slows down the process, by about 9% - more than the queue supposedly much slower than it! 
Anyway, I've been concurrently looking at where I might be able to improve performance independent of this, and have found the following: A) Raw performance of local reads is ~6-7x faster than through Stress B) Raw performance of local reads run asynchronously is ~4x faster C) Raw performance of local reads run asynchronously using the fast queue is ~4.7x faster D) Performance of local reads from the Thrift server-side methods is ~3x faster E) Performance of remote (i.e. local non-optimised) reads is ~1.5x faster In particular (C) is interesting, as it demonstrates the queue really is faster in use, but I've yet to absolutely determine why that translates into an overall decline in throughput. It looks as though it's possible it causes greater congestion in LockSupport.unpark(), but this is a new piece of information, derived from YourKit. As these sorts of methods are difficult to meter accurately I don't necessarily trust it, and haven't had a chance to figure out what I can do with the information. If it is accurate, and I can figure out how to reduce the overhead, we might get a modest speed boost, which will accumulate as we find other places to improve. As to the overall problem of improving throughput, it seems to me that there are two big avenues to explore: 1) the networking (software) overhead is large; 2) possibly the cost of managing thread liveness (e.g. park/unpark/scheduler costs); though the evidence for this is as yet inconclusive... given the op rate and other evidence it doesn't seem to be synchronization overhead. I'm still trying to pin this down. Once the costs here are nailed down as tight as they can go, I'm pretty confident we can get some noticeable improvements to the actual work being done, but since that currently accounts for only a fraction of the time spent (probably less than 20%), I'd rather wait until it was a higher percentage so any improvement is multiplied. 
* These can be replicated by running org.apache.cassandra.concurrent.test.bench.Benchmark on any of the linked branches on github. https://github.com/belliottsmith/cassandra/tree/4718-lbq [using LinkedBlockingQueue] https://github.com/belliottsmith/cassandra/tree/4718-baq [using BlockingArrayQueue] https://github.com/belliottsmith/cassandra/tree/4718-lpbq [using a new high performance queue] https://github.com/belliottsmith/cassandra/tree/4718-slow [using a LinkedBlockingQueue with 200ns spinning each op] https://github.com/belliottsmith/cassandra/tree/4718-ultraslow [using a LinkedBlockingQueue with 2000ns spinning each op] More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Jason Brown Priority: Minor Labels: performance Attachments: baq vs trunk.png, op costs of various queues.ods,
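For a rough sense of the isolated per-op measurement described above, a single-threaded toy harness can time queue round-trips. This is illustrative only: the linked Benchmark class is the real, contended version, absolute numbers mean little without producer/consumer contention, and the JDK's ArrayBlockingQueue used here merely stands in for a bounded queue (the BlockingArrayQueue benchmarked above is Jetty's class, not the JDK's):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy, uncontended timing of offer/poll pairs. Real queue costs are dominated
// by cross-thread contention, which this sketch deliberately ignores.
class QueueOpCost
{
    static long nanosPerOp(BlockingQueue<Integer> queue, int ops)
    {
        long start = System.nanoTime();
        for (int i = 0; i < ops; i++)
        {
            queue.offer(i);  // enqueue one element...
            queue.poll();    // ...and immediately dequeue it
        }
        return (System.nanoTime() - start) / (2L * ops);
    }

    public static void main(String[] args)
    {
        int ops = 1_000_000;
        nanosPerOp(new LinkedBlockingQueue<>(), ops); // warm up the JIT first
        System.out.println("LinkedBlockingQueue ns/op: " + nanosPerOp(new LinkedBlockingQueue<>(), ops));
        System.out.println("ArrayBlockingQueue  ns/op: " + nanosPerOp(new ArrayBlockingQueue<>(16), ops));
    }
}
```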
[jira] [Updated] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-4718: Attachment: stress op rate with various queues.ods More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Jason Brown Priority: Minor Labels: performance Attachments: baq vs trunk.png, op costs of various queues.ods, PerThreadQueue.java, stress op rate with various queues.ods Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.1#6144)
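The core idea of the ticket — a consumer that drains a batch of tasks per lock acquisition via drainTo() instead of one take() per task — can be sketched minimally as below. Class and method names are illustrative, not the eventual Cassandra implementation, and unlike a real ExecutorService this sketch has no graceful drain on shutdown:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Single-consumer sketch: block for one task, then drain up to maxBatch-1
// more under a single queue-lock acquisition, reducing producer/consumer
// contention compared to one dequeue per task.
class DrainingExecutor
{
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    DrainingExecutor(final int maxBatch)
    {
        worker = new Thread(() -> {
            List<Runnable> batch = new ArrayList<>(maxBatch);
            try
            {
                while (true)
                {
                    batch.add(queue.take());             // block until at least one task
                    queue.drainTo(batch, maxBatch - 1);  // then grab a batch in one shot
                    for (Runnable task : batch)
                        task.run();
                    batch.clear();
                }
            }
            catch (InterruptedException e) { /* shutdown requested */ }
        });
        worker.start();
    }

    void execute(Runnable task) { queue.add(task); }

    void shutdown() { worker.interrupt(); }
}
```

Because there is a single worker and drainTo preserves FIFO order, a task submitted last runs after all earlier ones complete — convenient for signaling completion.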
[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788426#comment-13788426 ] Jonathan Ellis commented on CASSANDRA-4718: --- bq. The faster queue actually slows down the process, by about 9% - more than the queue supposedly much slower than it So this actually confirms Ryan's original measurement of C*/BAQ [slow queue] faster than C*/LBQ [fast queue]? More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Jason Brown Priority: Minor Labels: performance Attachments: baq vs trunk.png, op costs of various queues.ods, PerThreadQueue.java, stress op rate with various queues.ods Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results
[ https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788437#comment-13788437 ] Li Zou commented on CASSANDRA-5932: --- This morning's trunk load has a slightly different symptom, and is even more serious than last Friday's load, as this time just commenting out the assert statement in the {{MessagingService.addCallback()}} will not help. I copy the {{/var/log/cassandra/system.log}} exception errors below. {noformat} ERROR [Thrift:12] 2013-10-07 14:42:39,396 Caller+0 at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134) - Exception in thread Thread[Thrift:12,5,main] java.lang.AssertionError: null at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:543) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:591) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:869) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:123) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:739) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:511) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:581) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:379) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at 
org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:363) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:126) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:267) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.thrift.CassandraServer.execute_prepared_cql3_query(CassandraServer.java:2061) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.thrift.Cassandra$Processor$execute_prepared_cql3_query.getResult(Cassandra.java:4502) ~[apache-cassandra-thrift-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.thrift.Cassandra$Processor$execute_prepared_cql3_query.getResult(Cassandra.java:4486) ~[apache-cassandra-thrift-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[libthrift-0.9.1.jar:0.9.1] at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[libthrift-0.9.1.jar:0.9.1] at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:194) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_25] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_25] at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25] {noformat} Speculative read performance data show unexpected results - Key: CASSANDRA-5932 URL: https://issues.apache.org/jira/browse/CASSANDRA-5932 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Assignee: Aleksey Yeschenko Fix For: 2.0.2 Attachments: 5932.6692c50412ef7d.compaction.png, 5932-6692c50412ef7d.png, 5932.6692c50412ef7d.rr0.png, 5932.6692c50412ef7d.rr1.png, 5932.ded39c7e1c2fa.logs.tar.gz, 5932.txt, 5933-128_and_200rc1.png, 5933-7a87fc11.png, 5933-logs.tar.gz, 
5933-randomized-dsnitch-replica.2.png, 5933-randomized-dsnitch-replica.3.png, 5933-randomized-dsnitch-replica.png, compaction-makes-slow.png, compaction-makes-slow-stats.png, eager-read-looks-promising.png, eager-read-looks-promising-stats.png, eager-read-not-consistent.png, eager-read-not-consistent-stats.png, node-down-increase-performance.png I've done a series of stress tests with eager retries enabled that show undesirable behavior. I'm grouping these behaviours into one ticket as they are most likely related. 1) Killing off a node in a 4 node cluster actually increases performance.
[1/2] git commit: Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128
Updated Branches: refs/heads/cassandra-2.0 8e8db1f20 -> c374aca19 Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3633aea4 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3633aea4 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3633aea4 Branch: refs/heads/cassandra-2.0 Commit: 3633aea42d7689fa0252c104f62b0646d0858624 Parents: d396fd4 Author: Brandon Williams brandonwilli...@apache.org Authored: Mon Oct 7 13:57:45 2013 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Mon Oct 7 13:57:45 2013 -0500 -- .../hadoop/pig/AbstractCassandraStorage.java| 30 +++- .../cassandra/hadoop/pig/CassandraStorage.java | 2 +- .../apache/cassandra/hadoop/pig/CqlStorage.java | 7 ++--- 3 files changed, 26 insertions(+), 13 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/3633aea4/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java -- diff --git a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java index ce92014..6ad4f9e 100644 --- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java +++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java @@ -110,7 +110,7 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store List<CompositeComponent> result = comparator.deconstruct(name); Tuple t = TupleFactory.getInstance().newTuple(result.size()); for (int i=0; i<result.size(); i++) -setTupleValue(t, i, result.get(i).comparator.compose(result.get(i).value)); +setTupleValue(t, i, cassandraToObj(result.get(i).comparator, result.get(i).value)); return t; } @@ -124,7 +124,7 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store if(comparator instanceof
AbstractCompositeType) setTupleValue(pair, 0, composeComposite((AbstractCompositeType)comparator,col.name())); else -setTupleValue(pair, 0, comparator.compose(col.name())); +setTupleValue(pair, 0, cassandraToObj(comparator, col.name())); // value if (col instanceof Column) @@ -134,10 +134,10 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store if (validators.get(col.name()) == null) { Map<MarshallerType, AbstractType> marshallers = getDefaultMarshallers(cfDef); -setTupleValue(pair, 1, marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value())); +setTupleValue(pair, 1, cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value())); } else -setTupleValue(pair, 1, validators.get(col.name()).compose(col.value())); +setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), col.value())); return pair; } else @@ -327,9 +327,12 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store return DataType.LONG; else if (type instanceof IntegerType || type instanceof Int32Type) // IntegerType will overflow at 2**31, but is kept for compatibility until pig has a BigInteger return DataType.INTEGER; -else if (type instanceof AsciiType) -return DataType.CHARARRAY; -else if (type instanceof UTF8Type) +else if (type instanceof AsciiType || +type instanceof UTF8Type || +type instanceof DecimalType || +type instanceof InetAddressType || +type instanceof LexicalUUIDType || +type instanceof UUIDType ) return DataType.CHARARRAY; else if (type instanceof FloatType) return DataType.FLOAT; @@ -772,5 +775,18 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store } return null; } + +protected Object cassandraToObj(AbstractType validator, ByteBuffer value) +{ +if (validator instanceof DecimalType || +validator instanceof InetAddressType || +validator instanceof LexicalUUIDType || +validator instanceof UUIDType) +{ +return validator.getString(value); +} +else +return
validator.compose(value); +} }
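The dispatch that this patch's cassandraToObj introduces can be sketched outside of Cassandra: validators whose Pig mapping is CHARARRAY are rendered via their string form, while everything else is composed into a native Java object. Below is a minimal self-contained sketch under a simplified, hypothetical `Validator` interface; the names are illustrative stand-ins, not Cassandra's actual `AbstractType` API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Simplified stand-ins for Cassandra's AbstractType validators. The real patch
// dispatches on concrete classes (DecimalType, UUIDType, ...); here a boolean
// flag plays that role.
public class CassandraToObjSketch {
    interface Validator {
        Object compose(ByteBuffer b);    // decode to a native Java object
        String getString(ByteBuffer b);  // decode to a textual rendering
        boolean pigNeedsString();        // true for types Pig maps to CHARARRAY
    }

    static final class UuidValidator implements Validator {
        public Object compose(ByteBuffer b) {
            return new UUID(b.getLong(0), b.getLong(8)); // 16-byte big-endian UUID
        }
        public String getString(ByteBuffer b) { return compose(b).toString(); }
        public boolean pigNeedsString() { return true; } // Pig has no UUID type
    }

    static final class Utf8Validator implements Validator {
        public Object compose(ByteBuffer b) {
            return StandardCharsets.UTF_8.decode(b.duplicate()).toString();
        }
        public String getString(ByteBuffer b) { return (String) compose(b); }
        public boolean pigNeedsString() { return false; } // already a String
    }

    // Same shape as the patch's cassandraToObj: string-mapped types use
    // getString, everything else composes normally.
    static Object cassandraToObj(Validator v, ByteBuffer value) {
        return v.pigNeedsString() ? v.getString(value) : v.compose(value);
    }

    public static void main(String[] args) {
        ByteBuffer raw = ByteBuffer.allocate(16); // 16 zero bytes
        // prints the all-zero UUID as a String: 00000000-0000-0000-0000-000000000000
        System.out.println(cassandraToObj(new UuidValidator(), raw));
    }
}
```

The design point is that the tuple-building call sites stay uniform: they all funnel through one conversion method instead of each deciding between compose and getString.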
[2/2] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0 Conflicts: src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c374aca1 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c374aca1 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c374aca1 Branch: refs/heads/cassandra-2.0 Commit: c374aca19ea39fbbc588a2309c669c422e0318cd Parents: 8e8db1f 3633aea Author: Brandon Williams brandonwilli...@apache.org Authored: Mon Oct 7 14:02:39 2013 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Mon Oct 7 14:02:39 2013 -0500 -- .../hadoop/pig/AbstractCassandraStorage.java| 30 +++- .../cassandra/hadoop/pig/CassandraStorage.java | 2 +- .../apache/cassandra/hadoop/pig/CqlStorage.java | 7 ++--- 3 files changed, 26 insertions(+), 13 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java -- diff --cc src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java index 1e207b3,6ad4f9e..c881734 --- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java +++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java @@@ -124,17 -124,31 +124,17 @@@ public abstract class AbstractCassandra if(comparator instanceof AbstractCompositeType) setTupleValue(pair, 0, composeComposite((AbstractCompositeType)comparator,col.name())); else - setTupleValue(pair, 0, comparator.compose(col.name())); + setTupleValue(pair, 0, cassandraToObj(comparator, col.name())); // value -if (col instanceof Column) +Map<ByteBuffer, AbstractType> validators = getValidatorMap(cfDef); +if (validators.get(col.name()) == null) { -// standard -Map<ByteBuffer, AbstractType> validators = getValidatorMap(cfDef); -if (validators.get(col.name()) == null) -{ -Map<MarshallerType, AbstractType> marshallers = 
getDefaultMarshallers(cfDef); -setTupleValue(pair, 1, cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value())); -} -else -setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), col.value())); -return pair; +Map<MarshallerType, AbstractType> marshallers = getDefaultMarshallers(cfDef); - setTupleValue(pair, 1, marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value())); ++setTupleValue(pair, 1, cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value())); } else - setTupleValue(pair, 1, validators.get(col.name()).compose(col.value())); -{ -// super -ArrayList<Tuple> subcols = new ArrayList<Tuple>(); -for (IColumn subcol : col.getSubColumns()) -subcols.add(columnToTuple(subcol, cfDef, parseType(cfDef.getSubcomparator_type()))); - -pair.set(1, new DefaultDataBag(subcols)); -} ++setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), col.value())); return pair; } http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java --
[jira] [Commented] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages
[ https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788446#comment-13788446 ] Sebastián Lacuesta commented on CASSANDRA-6131: --- Tried the patch with the 2.0.1 source tarball from the debian src file extracted from .dsc, patching file debian/init Hunk #4 FAILED at 53. Hunk #5 FAILED at 95. 2 out of 5 hunks FAILED -- saving rejects to file debian/init.rej content of debian/init.rej: --- debian/init +++ debian/init @@ -53,10 +29,6 @@ # Depend on lsb-base (>= 3.0-6) to ensure that this file is present. . /lib/lsb/init-functions -# If JNA is installed, add it to EXTRA_CLASSPATH -# -EXTRA_CLASSPATH=/usr/share/java/jna.jar:$EXTRA_CLASSPATH - # # Function that returns 0 if process is running, or nonzero if not. # @@ -95,7 +67,7 @@ [ -e `dirname $PIDFILE` ] || \ install -d -ocassandra -gcassandra -m750 `dirname $PIDFILE` -export EXTRA_CLASSPATH + start-stop-daemon -S -c cassandra -a /usr/sbin/cassandra -q -p $PIDFILE -t >/dev/null || return 1 JAVA_HOME on cassandra-env.sh is ignored on Debian packages --- Key: CASSANDRA-6131 URL: https://issues.apache.org/jira/browse/CASSANDRA-6131 Project: Cassandra Issue Type: Bug Components: Packaging Environment: I've just got upgraded to 2.0.1 package from the apache repositories using apt. I had the JAVA_HOME environment variable set in /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by setting it on /usr/sbin/cassandra script. I can't configure java 7 system wide, only for cassandra. Off-topic: Thanks for getting rid of the jsvc mess. Reporter: Sebastián Lacuesta Assignee: Eric Evans Labels: debian Fix For: 2.0.2 Attachments: 6131.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[1/2] git commit: Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128
Updated Branches: refs/heads/cassandra-1.2 9d31ac14d -> bdb7bb16f refs/heads/trunk b966e1ad2 -> 538039a70 Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bdb7bb16 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bdb7bb16 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bdb7bb16 Branch: refs/heads/cassandra-1.2 Commit: bdb7bb16facda0fbe266390bd3213f092d02c0dc Parents: 9d31ac1 Author: Brandon Williams brandonwilli...@apache.org Authored: Mon Oct 7 13:57:45 2013 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Mon Oct 7 14:03:08 2013 -0500 -- .../hadoop/pig/AbstractCassandraStorage.java| 30 +++- .../cassandra/hadoop/pig/CassandraStorage.java | 2 +- .../apache/cassandra/hadoop/pig/CqlStorage.java | 7 ++--- 3 files changed, 26 insertions(+), 13 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/bdb7bb16/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java -- diff --git a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java index ce92014..6ad4f9e 100644 --- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java +++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java @@ -110,7 +110,7 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store List<CompositeComponent> result = comparator.deconstruct(name); Tuple t = TupleFactory.getInstance().newTuple(result.size()); for (int i=0; i<result.size(); i++) -setTupleValue(t, i, result.get(i).comparator.compose(result.get(i).value)); +setTupleValue(t, i, cassandraToObj(result.get(i).comparator, result.get(i).value)); return t; } @@ -124,7 +124,7 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements 
Store if(comparator instanceof AbstractCompositeType) setTupleValue(pair, 0, composeComposite((AbstractCompositeType)comparator,col.name())); else -setTupleValue(pair, 0, comparator.compose(col.name())); +setTupleValue(pair, 0, cassandraToObj(comparator, col.name())); // value if (col instanceof Column) @@ -134,10 +134,10 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store if (validators.get(col.name()) == null) { Map<MarshallerType, AbstractType> marshallers = getDefaultMarshallers(cfDef); -setTupleValue(pair, 1, marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value())); +setTupleValue(pair, 1, cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value())); } else -setTupleValue(pair, 1, validators.get(col.name()).compose(col.value())); +setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), col.value())); return pair; } else @@ -327,9 +327,12 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store return DataType.LONG; else if (type instanceof IntegerType || type instanceof Int32Type) // IntegerType will overflow at 2**31, but is kept for compatibility until pig has a BigInteger return DataType.INTEGER; -else if (type instanceof AsciiType) -return DataType.CHARARRAY; -else if (type instanceof UTF8Type) +else if (type instanceof AsciiType || +type instanceof UTF8Type || +type instanceof DecimalType || +type instanceof InetAddressType || +type instanceof LexicalUUIDType || +type instanceof UUIDType ) return DataType.CHARARRAY; else if (type instanceof FloatType) return DataType.FLOAT; @@ -772,5 +775,18 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store } return null; } + +protected Object cassandraToObj(AbstractType validator, ByteBuffer value) +{ +if (validator instanceof DecimalType || +validator instanceof InetAddressType || +validator instanceof LexicalUUIDType || +validator instanceof UUIDType) +{ +return 
validator.getString(value); +} +else +return validator.compose(value); +} }
[2/2] git commit: Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128
Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/538039a7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/538039a7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/538039a7 Branch: refs/heads/trunk Commit: 538039a7001a4db0ff87dafbfe0be2877310b14f Parents: b966e1a Author: Brandon Williams brandonwilli...@apache.org Authored: Mon Oct 7 13:57:45 2013 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Mon Oct 7 14:06:29 2013 -0500 -- .../hadoop/pig/AbstractCassandraStorage.java| 31 +++- .../cassandra/hadoop/pig/CassandraStorage.java | 2 +- .../apache/cassandra/hadoop/pig/CqlStorage.java | 7 ++--- 3 files changed, 26 insertions(+), 14 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/538039a7/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java -- diff --git a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java index 1e207b3..0766adf 100644 --- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java +++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java @@ -110,7 +110,7 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store List<CompositeComponent> result = comparator.deconstruct(name); Tuple t = TupleFactory.getInstance().newTuple(result.size()); for (int i=0; i<result.size(); i++) -setTupleValue(t, i, result.get(i).comparator.compose(result.get(i).value)); +setTupleValue(t, i, cassandraToObj(result.get(i).comparator, result.get(i).value)); return t; } @@ -124,17 +124,16 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store if(comparator instanceof AbstractCompositeType) setTupleValue(pair, 0, 
composeComposite((AbstractCompositeType)comparator,col.name())); else -setTupleValue(pair, 0, comparator.compose(col.name())); +setTupleValue(pair, 0, cassandraToObj(comparator, col.name())); // value -Map<ByteBuffer, AbstractType> validators = getValidatorMap(cfDef); if (validators.get(col.name()) == null) { Map<MarshallerType, AbstractType> marshallers = getDefaultMarshallers(cfDef); -setTupleValue(pair, 1, marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value())); +setTupleValue(pair, 1, cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value())); } else -setTupleValue(pair, 1, validators.get(col.name()).compose(col.value())); +setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), col.value())); return pair; } @@ -313,9 +312,12 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store return DataType.LONG; else if (type instanceof IntegerType || type instanceof Int32Type) // IntegerType will overflow at 2**31, but is kept for compatibility until pig has a BigInteger return DataType.INTEGER; -else if (type instanceof AsciiType) -return DataType.CHARARRAY; -else if (type instanceof UTF8Type) +else if (type instanceof AsciiType || +type instanceof UTF8Type || +type instanceof DecimalType || +type instanceof InetAddressType || +type instanceof LexicalUUIDType || +type instanceof UUIDType ) return DataType.CHARARRAY; else if (type instanceof FloatType) return DataType.FLOAT; @@ -758,5 +760,18 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store } return null; } + +protected Object cassandraToObj(AbstractType validator, ByteBuffer value) +{ +if (validator instanceof DecimalType || +validator instanceof InetAddressType || +validator instanceof LexicalUUIDType || +validator instanceof UUIDType) +{ +return validator.getString(value); +} +else +return validator.compose(value); +} } 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/538039a7/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java -- diff --git a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
[jira] [Commented] (CASSANDRA-6128) Add more data mappings for Pig
[ https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788450#comment-13788450 ] Brandon Williams commented on CASSANDRA-6128: - Well, crap: PIG-2764 I guess we'll have to use a string for now, otherwise we box people into the corner of precision loss with no way out. At least with strings they can do something in a UDF, so +1 and committed. Add more data mappings for Pig -- Key: CASSANDRA-6128 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128 Project: Cassandra Issue Type: Bug Reporter: Alex Liu Assignee: Alex Liu Attachments: 6128-1.2-branch.txt We need to add more data mappings for {code} DecimalType InetAddressType LexicalUUIDType TimeUUIDType UUIDType {code} Existing implementation throws an exception for those data types -- This message was sent by Atlassian JIRA (v6.1#6144)
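The precision-loss concern behind choosing CHARARRAY is easy to demonstrate with the plain JDK: the string renderings of UUID and BigDecimal round-trip exactly, while forcing a high-precision decimal through Pig's DOUBLE would silently drop digits. A small illustrative check (no Pig dependency; the values are arbitrary examples):

```java
import java.math.BigDecimal;
import java.util.UUID;

public class StringRoundTrip {
    public static void main(String[] args) {
        // UUID -> String -> UUID is exact, so a UDF can always recover the value.
        UUID u = UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8");
        System.out.println(UUID.fromString(u.toString()).equals(u)); // true

        // BigDecimal survives a String round trip with full precision...
        BigDecimal d = new BigDecimal("12345678901234567890.123456789");
        System.out.println(new BigDecimal(d.toString()).compareTo(d) == 0); // true

        // ...while a detour through double (Pig's DOUBLE) loses digits:
        // 29 significant digits do not fit in a 64-bit float.
        System.out.println(new BigDecimal(d.doubleValue()).compareTo(d) == 0); // false
    }
}
```

This is why "use a string for now" keeps a way out: the lossless text form can be parsed back in a UDF, whereas a lossy numeric coercion cannot be undone.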
[jira] [Comment Edited] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages
[ https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788446#comment-13788446 ] Sebastián Lacuesta edited comment on CASSANDRA-6131 at 10/7/13 7:07 PM: Tried the patch with the 2.0.1 source tarball from the debian src file extracted from .dsc, {code} patching file debian/init Hunk #4 FAILED at 53. Hunk #5 FAILED at 95. 2 out of 5 hunks FAILED -- saving rejects to file debian/init.rej {code} content of debian/init.rej: {code:title=debian/init.rej|borderStyle=solid} --- debian/init +++ debian/init @@ -53,10 +29,6 @@ # Depend on lsb-base (>= 3.0-6) to ensure that this file is present. . /lib/lsb/init-functions -# If JNA is installed, add it to EXTRA_CLASSPATH -# -EXTRA_CLASSPATH=/usr/share/java/jna.jar:$EXTRA_CLASSPATH - # # Function that returns 0 if process is running, or nonzero if not. # @@ -95,7 +67,7 @@ [ -e `dirname $PIDFILE` ] || \ install -d -ocassandra -gcassandra -m750 `dirname $PIDFILE` -export EXTRA_CLASSPATH + start-stop-daemon -S -c cassandra -a /usr/sbin/cassandra -q -p $PIDFILE -t >/dev/null || return 1 {code} was (Author: sebastianlacuesta): Tried the patch with the 2.0.1 source tarball from the debian src file extracted from .dsc, patching file debian/init Hunk #4 FAILED at 53. Hunk #5 FAILED at 95. 2 out of 5 hunks FAILED -- saving rejects to file debian/init.rej content of debian/init.rej: --- debian/init +++ debian/init @@ -53,10 +29,6 @@ # Depend on lsb-base (>= 3.0-6) to ensure that this file is present. . /lib/lsb/init-functions -# If JNA is installed, add it to EXTRA_CLASSPATH -# -EXTRA_CLASSPATH=/usr/share/java/jna.jar:$EXTRA_CLASSPATH - # # Function that returns 0 if process is running, or nonzero if not. 
# @@ -95,7 +67,7 @@ [ -e `dirname $PIDFILE` ] || \ install -d -ocassandra -gcassandra -m750 `dirname $PIDFILE` -export EXTRA_CLASSPATH + start-stop-daemon -S -c cassandra -a /usr/sbin/cassandra -q -p $PIDFILE -t >/dev/null || return 1 JAVA_HOME on cassandra-env.sh is ignored on Debian packages --- Key: CASSANDRA-6131 URL: https://issues.apache.org/jira/browse/CASSANDRA-6131 Project: Cassandra Issue Type: Bug Components: Packaging Environment: I've just got upgraded to 2.0.1 package from the apache repositories using apt. I had the JAVA_HOME environment variable set in /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by setting it on /usr/sbin/cassandra script. I can't configure java 7 system wide, only for cassandra. Off-topic: Thanks for getting rid of the jsvc mess. Reporter: Sebastián Lacuesta Assignee: Eric Evans Labels: debian Fix For: 2.0.2 Attachments: 6131.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6154) Inserts are blocked in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788453#comment-13788453 ] Brandon Williams commented on CASSANDRA-6154: - Bisect points at CASSANDRA-6132 Inserts are blocked in 2.1 -- Key: CASSANDRA-6154 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Priority: Critical With cluster sizes > 1, inserts are blocked indefinitely: {code} $ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test $ ccm populate -n 2 $ ccm start $ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh> USE timeline; cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statement never returns. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6128) Add more data mappings for Pig
[ https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Liu updated CASSANDRA-6128: Description: We need to add more data mappings for {code} DecimalType InetAddressType {code} Existing implementation throws an exception for those data types was: We need to add more data mappings for {code} DecimalType InetAddressType LexicalUUIDType TimeUUIDType UUIDType {code} Existing implementation throws an exception for those data types Add more data mappings for Pig -- Key: CASSANDRA-6128 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128 Project: Cassandra Issue Type: Bug Reporter: Alex Liu Assignee: Alex Liu Fix For: 1.2.11, 2.0.2 Attachments: 6128-1.2-branch.txt We need to add more data mappings for {code} DecimalType InetAddressType {code} Existing implementation throws an exception for those data types -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Comment Edited] (CASSANDRA-6154) Inserts are blocked in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788453#comment-13788453 ] Brandon Williams edited comment on CASSANDRA-6154 at 10/7/13 7:12 PM: -- Bisect points at CASSANDRA-6132, specifically the ninja commit in 5440a0a6767544d6ea1ba34f5d2a3e223f260fb5 was (Author: brandon.williams): Bisect points at CASSANDRA-6132 Inserts are blocked in 2.1 -- Key: CASSANDRA-6154 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Priority: Critical With cluster sizes > 1, inserts are blocked indefinitely: {code} $ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test $ ccm populate -n 2 $ ccm start $ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh> USE timeline; cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statement never returns. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6128) Add more data mappings for Pig
[ https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Liu updated CASSANDRA-6128: Description: We need to add more data mappings for {code} DecimalType InetAddressType LexicalUUIDType TimeUUIDType UUIDType {code} Existing implementation throws an exception for those data types was: We need to add more data mappings for {code} DecimalType InetAddressType {code} Existing implementation throws an exception for those data types Add more data mappings for Pig -- Key: CASSANDRA-6128 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128 Project: Cassandra Issue Type: Bug Reporter: Alex Liu Assignee: Alex Liu Fix For: 1.2.11, 2.0.2 Attachments: 6128-1.2-branch.txt We need to add more data mappings for {code} DecimalType InetAddressType LexicalUUIDType TimeUUIDType UUIDType {code} Existing implementation throws an exception for those data types -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name
[ https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788456#comment-13788456 ] Jonathan Ellis commented on CASSANDRA-5202: --- Is there anything that we want to do as part of this ticket instead of 6060? CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name Key: CASSANDRA-5202 URL: https://issues.apache.org/jira/browse/CASSANDRA-5202 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.9 Environment: OS: Windows 7, Server: Cassandra 1.1.9 release drop Client: astyanax 1.56.21, JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27) Reporter: Marat Bedretdinov Assignee: Yuki Morishita Labels: test Fix For: 2.1 Attachments: 5202-1.1.txt, 5202-2.0.0.txt, astyanax-stress-driver.zip Attached is a driver that sequentially: 1. Drops keyspace 2. Creates keyspace 4. Creates 2 column families 5. Seeds 1M rows with 100 columns 6. Queries these 2 column families The above steps are repeated 1000 times. 
The following exception is observed at random (race - SEDA?): ERROR [ReadStage:55] 2013-01-29 19:24:52,676 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ReadStage:55,5,main] java.lang.AssertionError: DecoratedKey(-1, ) != DecoratedKey(62819832764241410631599989027761269388, 313a31) in C:\var\lib\cassandra\data\user_role_reverse_index\business_entity_role\user_role_reverse_index-business_entity_role-hf-1-Data.db at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:60) at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67) at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79) at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256) at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1367) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1229) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1164) at org.apache.cassandra.db.Table.getRow(Table.java:378) at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69) at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:822) at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1271) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) This exception appears in the server at the time of client submitting a query request (row slice) and not at the time data is seeded. The client times out and this data can no longer be queried as the same exception would always occur from there on. 
Also on iteration 201, it appears that dropping column families failed and as a result their recreation failed with unique column family name violation (see exception below). Note that the data files are actually gone, so it appears that the server runtime responsible for creating column family was out of sync with the piece that dropped them: Starting dropping column families Dropped column families Starting dropping keyspace Dropped keyspace Starting creating column families Created column families Starting seeding data Total rows inserted: 100 in 5105 ms Iteration: 200; Total running time for 1000 queries is 232; Average running time of 1000 queries is 0 ms Starting dropping column families Dropped column families Starting dropping keyspace Dropped keyspace Starting creating column families Created column families Starting seeding data Total rows inserted: 100 in 5361 ms Iteration: 201; Total running time for 1000 queries is 222; Average running time of 1000 queries is 0 ms Starting dropping column families Starting creating column families Exception in thread main com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=2468(2469), attempts=1]InvalidRequestException(why:Keyspace names must be case-insensitively unique (user_role_reverse_index conflicts with user_role_reverse_index)) at
[jira] [Comment Edited] (CASSANDRA-5932) Speculative read performance data show unexpected results
[ https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788437#comment-13788437 ] Li Zou edited comment on CASSANDRA-5932 at 10/7/13 7:17 PM: [~jbellis], this morning's trunk load has a slightly different symptom, and is even more serious than last Friday's load, as this time just commenting out the assert statement in the {{MessagingService.addCallback()}} will not help. I copy the {{/var/log/cassandra/system.log}} exception errors below. {noformat} ERROR [Thrift:12] 2013-10-07 14:42:39,396 Caller+0 at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134) - Exception in thread Thread[Thrift:12,5,main] java.lang.AssertionError: null at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:543) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:591) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:869) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:123) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:739) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:511) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:581) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:379) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at 
org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:363) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:126) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:267) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.thrift.CassandraServer.execute_prepared_cql3_query(CassandraServer.java:2061) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.thrift.Cassandra$Processor$execute_prepared_cql3_query.getResult(Cassandra.java:4502) ~[apache-cassandra-thrift-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.thrift.Cassandra$Processor$execute_prepared_cql3_query.getResult(Cassandra.java:4486) ~[apache-cassandra-thrift-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[libthrift-0.9.1.jar:0.9.1] at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[libthrift-0.9.1.jar:0.9.1] at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:194) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_25] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_25] at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25] {noformat} was (Author: lizou): This morning's trunk load has a slightly different symptom, and is even more serious than last Friday's load, as this time just commenting out the assert statement in the {{MessagingService.addCallback()}} will not help. I copy the {{/var/log/cassandra/system.log}} exception errors below. 
{noformat} ERROR [Thrift:12] 2013-10-07 14:42:39,396 Caller+0 at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134) - Exception in thread Thread[Thrift:12,5,main] java.lang.AssertionError: null at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:543) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:591) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:869) ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] at
[jira] [Commented] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name
[ https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788471#comment-13788471 ] Yuki Morishita commented on CASSANDRA-5202: --- Add CF ID to directory name if we still want to distinguish one KS/CF directory to another. Updating key cache key to use CF ID is another one, but I think that will be done through 6060. CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name Key: CASSANDRA-5202 URL: https://issues.apache.org/jira/browse/CASSANDRA-5202 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.9 Environment: OS: Windows 7, Server: Cassandra 1.1.9 release drop Client: astyanax 1.56.21, JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27) Reporter: Marat Bedretdinov Assignee: Yuki Morishita Labels: test Fix For: 2.1 Attachments: 5202-1.1.txt, 5202-2.0.0.txt, astyanax-stress-driver.zip Attached is a driver that sequentially: 1. Drops keyspace 2. Creates keyspace 4. Creates 2 column families 5. Seeds 1M rows with 100 columns 6. Queries these 2 column families The above steps are repeated 1000 times. 
The following exception is observed at random (race - SEDA?):
ERROR [ReadStage:55] 2013-01-29 19:24:52,676 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ReadStage:55,5,main]
java.lang.AssertionError: DecoratedKey(-1, ) != DecoratedKey(62819832764241410631599989027761269388, 313a31) in C:\var\lib\cassandra\data\user_role_reverse_index\business_entity_role\user_role_reverse_index-business_entity_role-hf-1-Data.db
	at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:60)
	at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
	at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1367)
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1229)
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1164)
	at org.apache.cassandra.db.Table.getRow(Table.java:378)
	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
	at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:822)
	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1271)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
This exception appears in the server at the time the client submits a query request (row slice), not at the time the data is seeded. The client times out, and this data can no longer be queried, as the same exception always occurs from then on.
Also, on iteration 201 it appears that dropping column families failed, and as a result their recreation failed with a unique column family name violation (see exception below). Note that the data files are actually gone, so it appears that the server runtime responsible for creating column families was out of sync with the piece that dropped them:
Starting dropping column families
Dropped column families
Starting dropping keyspace
Dropped keyspace
Starting creating column families
Created column families
Starting seeding data
Total rows inserted: 100 in 5105 ms
Iteration: 200; Total running time for 1000 queries is 232; Average running time of 1000 queries is 0 ms
Starting dropping column families
Dropped column families
Starting dropping keyspace
Dropped keyspace
Starting creating column families
Created column families
Starting seeding data
Total rows inserted: 100 in 5361 ms
Iteration: 201; Total running time for 1000 queries is 222; Average running time of 1000 queries is 0 ms
Starting dropping column families
Starting creating column families
Exception in thread "main" com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=2468(2469), attempts=1]InvalidRequestException(why:Keyspace names must be case-insensitively
git commit: Fix FileCacheService regressions patch by jbellis; reviewed by pyaskevich and tested by Kai Wang for CASSANDRA-6149
Updated Branches: refs/heads/cassandra-2.0 c374aca19 -> 01a57eea8

Fix FileCacheService regressions
patch by jbellis; reviewed by pyaskevich and tested by Kai Wang for CASSANDRA-6149

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/01a57eea
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/01a57eea
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/01a57eea

Branch: refs/heads/cassandra-2.0
Commit: 01a57eea841e51fb4a97329ab9fa0f59d0b826f6
Parents: c374aca
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Mon Oct 7 14:20:42 2013 -0500
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Mon Oct 7 14:20:42 2013 -0500

--
 CHANGES.txt                                     |  1 +
 .../compress/CompressedRandomAccessReader.java  |  5 ++
 .../cassandra/io/util/RandomAccessReader.java   |  2 +-
 .../apache/cassandra/io/util/SegmentedFile.java |  3 +-
 .../cassandra/service/FileCacheService.java     | 87 ++--
 5 files changed, 54 insertions(+), 44 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94fa927..ddd976e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.2
+ * Fix FileCacheService regressions (CASSANDRA-6149)
  * Never return WriteTimeout for CL.ANY (CASSANDRA-6032)
  * Fix race conditions in bulk loader (CASSANDRA-6129)
  * Add configurable metrics reporting (CASSANDRA-4430)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
--
diff --git a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
index b6cffa2..131a4d6 100644
--- a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
@@ -154,6 +154,11 @@ public class CompressedRandomAccessReader extends RandomAccessReader
         return checksumBytes.getInt(0);
     }
+    public int getTotalBufferSize()
+    {
+        return super.getTotalBufferSize() + compressed.capacity();
+    }
+
     @Override
     public long length()
     {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
--
diff --git a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
index 4ceb3c4..9a03480 100644
--- a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
@@ -152,7 +152,7 @@ public class RandomAccessReader extends RandomAccessFile implements FileDataInpu
         return filePath;
     }
-    public int getBufferSize()
+    public int getTotalBufferSize()
     {
         return buffer.length;
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/util/SegmentedFile.java
--
diff --git a/src/java/org/apache/cassandra/io/util/SegmentedFile.java b/src/java/org/apache/cassandra/io/util/SegmentedFile.java
index 6231fd7..d4da177 100644
--- a/src/java/org/apache/cassandra/io/util/SegmentedFile.java
+++ b/src/java/org/apache/cassandra/io/util/SegmentedFile.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.io.util;
 import java.io.DataInput;
 import java.io.DataOutput;
+import java.io.File;
 import java.io.IOException;
 import java.nio.MappedByteBuffer;
 import java.util.Iterator;
@@ -57,7 +58,7 @@ public abstract class SegmentedFile
     protected SegmentedFile(String path, long length, long onDiskLength)
     {
-        this.path = path;
+        this.path = new File(path).getAbsolutePath();
         this.length = length;
         this.onDiskLength = onDiskLength;
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/service/FileCacheService.java
--
diff --git a/src/java/org/apache/cassandra/service/FileCacheService.java b/src/java/org/apache/cassandra/service/FileCacheService.java
index e6bc3e5..c939a6f 100644
--- a/src/java/org/apache/cassandra/service/FileCacheService.java
+++ b/src/java/org/apache/cassandra/service/FileCacheService.java
@@ -22,11 +22,9 @@
 import java.util.concurrent.Callable;
 import java.util.concurrent.ConcurrentLinkedQueue;
 import
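The buffer-accounting part of this patch is easy to miss in the diff: the base reader's `getBufferSize()` is renamed to `getTotalBufferSize()`, and the compressed reader overrides it to add the capacity of its compressed-chunk scratch buffer, so the file cache counts all memory a pooled reader pins. A minimal sketch of that pattern, using hypothetical stand-in classes rather than Cassandra's actual `RandomAccessReader` hierarchy:

```java
import java.nio.ByteBuffer;

// Stand-in for the base reader: reports only its primary read buffer.
class Reader {
    protected final byte[] buffer;

    Reader(int bufferSize) { this.buffer = new byte[bufferSize]; }

    // Total memory this reader retains while parked in the file cache.
    public int getTotalBufferSize() { return buffer.length; }
}

// Stand-in for the compressed reader: it also holds a scratch buffer
// for compressed chunks, which must be counted or the cache's memory
// ceiling under-estimates its real footprint.
class CompressedReader extends Reader {
    private final ByteBuffer compressed;

    CompressedReader(int bufferSize, int compressedSize) {
        super(bufferSize);
        this.compressed = ByteBuffer.allocate(compressedSize);
    }

    @Override
    public int getTotalBufferSize() {
        return super.getTotalBufferSize() + compressed.capacity();
    }
}
```

The design point is that a size-bounded object cache must ask each entry for its *total* retained memory, not just the obvious field; the rename makes the broader contract explicit.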
[3/5] git commit: Fix FileCacheService regressions patch by jbellis; reviewed by pyaskevich and tested by Kai Wang for CASSANDRA-6149
Fix FileCacheService regressions
patch by jbellis; reviewed by pyaskevich and tested by Kai Wang for CASSANDRA-6149

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/01a57eea
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/01a57eea
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/01a57eea

Branch: refs/heads/trunk
Commit: 01a57eea841e51fb4a97329ab9fa0f59d0b826f6
Parents: c374aca
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Mon Oct 7 14:20:42 2013 -0500
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Mon Oct 7 14:20:42 2013 -0500

--
 CHANGES.txt                                     |  1 +
 .../compress/CompressedRandomAccessReader.java  |  5 ++
 .../cassandra/io/util/RandomAccessReader.java   |  2 +-
 .../apache/cassandra/io/util/SegmentedFile.java |  3 +-
 .../cassandra/service/FileCacheService.java     | 87 ++--
 5 files changed, 54 insertions(+), 44 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94fa927..ddd976e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.2
+ * Fix FileCacheService regressions (CASSANDRA-6149)
  * Never return WriteTimeout for CL.ANY (CASSANDRA-6032)
  * Fix race conditions in bulk loader (CASSANDRA-6129)
  * Add configurable metrics reporting (CASSANDRA-4430)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
--
diff --git a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
index b6cffa2..131a4d6 100644
--- a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
@@ -154,6 +154,11 @@ public class CompressedRandomAccessReader extends RandomAccessReader
         return checksumBytes.getInt(0);
     }
+    public int getTotalBufferSize()
+    {
+        return super.getTotalBufferSize() + compressed.capacity();
+    }
+
     @Override
     public long length()
     {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
--
diff --git a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
index 4ceb3c4..9a03480 100644
--- a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
@@ -152,7 +152,7 @@ public class RandomAccessReader extends RandomAccessFile implements FileDataInpu
         return filePath;
     }
-    public int getBufferSize()
+    public int getTotalBufferSize()
     {
         return buffer.length;
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/util/SegmentedFile.java
--
diff --git a/src/java/org/apache/cassandra/io/util/SegmentedFile.java b/src/java/org/apache/cassandra/io/util/SegmentedFile.java
index 6231fd7..d4da177 100644
--- a/src/java/org/apache/cassandra/io/util/SegmentedFile.java
+++ b/src/java/org/apache/cassandra/io/util/SegmentedFile.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.io.util;
 import java.io.DataInput;
 import java.io.DataOutput;
+import java.io.File;
 import java.io.IOException;
 import java.nio.MappedByteBuffer;
 import java.util.Iterator;
@@ -57,7 +58,7 @@ public abstract class SegmentedFile
     protected SegmentedFile(String path, long length, long onDiskLength)
     {
-        this.path = path;
+        this.path = new File(path).getAbsolutePath();
         this.length = length;
         this.onDiskLength = onDiskLength;
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/service/FileCacheService.java
--
diff --git a/src/java/org/apache/cassandra/service/FileCacheService.java b/src/java/org/apache/cassandra/service/FileCacheService.java
index e6bc3e5..c939a6f 100644
--- a/src/java/org/apache/cassandra/service/FileCacheService.java
+++ b/src/java/org/apache/cassandra/service/FileCacheService.java
@@ -22,11 +22,9 @@
 import java.util.concurrent.Callable;
 import java.util.concurrent.ConcurrentLinkedQueue;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.TimeUnit;
+import
[5/5] git commit: Merge remote-tracking branch 'origin/trunk' into trunk
Merge remote-tracking branch 'origin/trunk' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/558a9e57
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/558a9e57
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/558a9e57

Branch: refs/heads/trunk
Commit: 558a9e57bb2f443b69ef1acb31fd90aa8b373e5d
Parents: 6990f95 538039a
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Mon Oct 7 14:23:59 2013 -0500
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Mon Oct 7 14:23:59 2013 -0500

--
 .../org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java | 1 -
 1 file changed, 1 deletion(-)
--
[1/5] git commit: Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128
Updated Branches: refs/heads/trunk 538039a70 -> 558a9e57b

Add more data type mappings for pig.
Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3633aea4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3633aea4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3633aea4

Branch: refs/heads/trunk
Commit: 3633aea42d7689fa0252c104f62b0646d0858624
Parents: d396fd4
Author: Brandon Williams <brandonwilli...@apache.org>
Authored: Mon Oct 7 13:57:45 2013 -0500
Committer: Brandon Williams <brandonwilli...@apache.org>
Committed: Mon Oct 7 13:57:45 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java    | 30 +++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 ++---
 3 files changed, 26 insertions(+), 13 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3633aea4/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index ce92014..6ad4f9e 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@ -110,7 +110,7 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store
         List<CompositeComponent> result = comparator.deconstruct(name);
         Tuple t = TupleFactory.getInstance().newTuple(result.size());
         for (int i = 0; i < result.size(); i++)
-            setTupleValue(t, i, result.get(i).comparator.compose(result.get(i).value));
+            setTupleValue(t, i, cassandraToObj(result.get(i).comparator, result.get(i).value));
         return t;
     }
@@ -124,7 +124,7 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store
         if (comparator instanceof AbstractCompositeType)
             setTupleValue(pair, 0, composeComposite((AbstractCompositeType)comparator, col.name()));
         else
-            setTupleValue(pair, 0, comparator.compose(col.name()));
+            setTupleValue(pair, 0, cassandraToObj(comparator, col.name()));
         // value
         if (col instanceof Column)
@@ -134,10 +134,10 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store
             if (validators.get(col.name()) == null)
             {
                 Map<MarshallerType, AbstractType> marshallers = getDefaultMarshallers(cfDef);
-                setTupleValue(pair, 1, marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value()));
+                setTupleValue(pair, 1, cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
             }
             else
-                setTupleValue(pair, 1, validators.get(col.name()).compose(col.value()));
+                setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), col.value()));
             return pair;
         }
         else
@@ -327,9 +327,12 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store
             return DataType.LONG;
         else if (type instanceof IntegerType || type instanceof Int32Type) // IntegerType will overflow at 2**31, but is kept for compatibility until pig has a BigInteger
             return DataType.INTEGER;
-        else if (type instanceof AsciiType)
-            return DataType.CHARARRAY;
-        else if (type instanceof UTF8Type)
+        else if (type instanceof AsciiType ||
+                 type instanceof UTF8Type ||
+                 type instanceof DecimalType ||
+                 type instanceof InetAddressType ||
+                 type instanceof LexicalUUIDType ||
+                 type instanceof UUIDType)
             return DataType.CHARARRAY;
         else if (type instanceof FloatType)
             return DataType.FLOAT;
@@ -772,5 +775,18 @@ public abstract class AbstractCassandraStorage extends LoadFunc implements Store
         }
         return null;
     }
+
+    protected Object cassandraToObj(AbstractType validator, ByteBuffer value)
+    {
+        if (validator instanceof DecimalType ||
+            validator instanceof InetAddressType ||
+            validator instanceof LexicalUUIDType ||
+            validator instanceof UUIDType)
+        {
+            return validator.getString(value);
+        }
+        else
+            return validator.compose(value);
+    }
 }
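The mapping rule this patch introduces boils down to: types with a natural Pig `DataType` are composed into their Java objects, while `DecimalType`, `InetAddressType`, and the UUID types are rendered via `getString()` and exposed to Pig as `chararray`. A rough sketch of that dispatch, using plain Java value types in place of Cassandra's `AbstractType` hierarchy (the class and method names below are illustrative, not the real API):

```java
import java.math.BigDecimal;
import java.util.UUID;

// Hypothetical stand-in for the cassandraToObj() idea from CASSANDRA-6128:
// values whose Java representation has no native Pig mapping are
// stringified so Pig can treat them as chararray; everything else
// (ints, longs, strings, ...) passes through unchanged.
class PigCoercion {
    static Object cassandraToObj(Object composed) {
        if (composed instanceof BigDecimal
                || composed instanceof UUID
                || composed instanceof java.net.InetAddress)
            return composed.toString();  // Pig sees a chararray
        return composed;                 // already Pig-friendly
    }
}
```

The real patch makes the same decision one level earlier, on the validator type rather than the composed object, which avoids composing a value only to throw it away.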
[4/5] git commit: Merge branch 'cassandra-2.0' into trunk
Merge branch 'cassandra-2.0' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6990f95b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6990f95b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6990f95b

Branch: refs/heads/trunk
Commit: 6990f95b1dff4e4132ae43c037d0f117487ffb6e
Parents: b966e1a 01a57ee
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Mon Oct 7 14:20:48 2013 -0500
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Mon Oct 7 14:20:48 2013 -0500

--
 CHANGES.txt                                     |  1 +
 .../hadoop/pig/AbstractCassandraStorage.java    | 30 +--
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 +-
 .../compress/CompressedRandomAccessReader.java  |  5 ++
 .../cassandra/io/util/RandomAccessReader.java   |  2 +-
 .../apache/cassandra/io/util/SegmentedFile.java |  3 +-
 .../cassandra/service/FileCacheService.java     | 87 ++--
 8 files changed, 80 insertions(+), 57 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6990f95b/CHANGES.txt
--
diff --cc CHANGES.txt
index 4ec387c,ddd976e..2e6e06a
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,13 -1,5 +1,14 @@@
 +2.1
 + * Parallelize fetching rows for low-cardinality indexes (CASSANDRA-1337)
 + * change logging from log4j to logback (CASSANDRA-5883)
 + * switch to LZ4 compression for internode communication (CASSANDRA-5887)
 + * Stop using Thrift-generated Index* classes internally (CASSANDRA-5971)
 + * Remove 1.2 network compatibility code (CASSANDRA-5960)
 + * Remove leveled json manifest migration code (CASSANDRA-5996)
 +
  2.0.2
+  * Fix FileCacheService regressions (CASSANDRA-6149)
  * Never return WriteTimeout for CL.ANY (CASSANDRA-6032)
  * Fix race conditions in bulk loader (CASSANDRA-6129)
  * Add configurable metrics reporting (CASSANDRA-4430)
[2/5] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0

Conflicts:
	src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c374aca1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c374aca1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c374aca1

Branch: refs/heads/trunk
Commit: c374aca19ea39fbbc588a2309c669c422e0318cd
Parents: 8e8db1f 3633aea
Author: Brandon Williams <brandonwilli...@apache.org>
Authored: Mon Oct 7 14:02:39 2013 -0500
Committer: Brandon Williams <brandonwilli...@apache.org>
Committed: Mon Oct 7 14:02:39 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java    | 30 +++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 ++---
 3 files changed, 26 insertions(+), 13 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --cc src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index 1e207b3,6ad4f9e..c881734
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@@ -124,17 -124,31 +124,17 @@@ public abstract class AbstractCassandra
          if (comparator instanceof AbstractCompositeType)
              setTupleValue(pair, 0, composeComposite((AbstractCompositeType)comparator, col.name()));
          else
-             setTupleValue(pair, 0, comparator.compose(col.name()));
+             setTupleValue(pair, 0, cassandraToObj(comparator, col.name()));
          // value
 -        if (col instanceof Column)
 +        Map<ByteBuffer, AbstractType> validators = getValidatorMap(cfDef);
 +        if (validators.get(col.name()) == null)
          {
 -            // standard
 -            Map<ByteBuffer, AbstractType> validators = getValidatorMap(cfDef);
 -            if (validators.get(col.name()) == null)
 -            {
 -                Map<MarshallerType, AbstractType> marshallers = getDefaultMarshallers(cfDef);
 -                setTupleValue(pair, 1, cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
 -            }
 -            else
 -                setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), col.value()));
 -            return pair;
 +            Map<MarshallerType, AbstractType> marshallers = getDefaultMarshallers(cfDef);
-             setTupleValue(pair, 1, marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value()));
++            setTupleValue(pair, 1, cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
          }
          else
-             setTupleValue(pair, 1, validators.get(col.name()).compose(col.value()));
 -        {
 -            // super
 -            ArrayList<Tuple> subcols = new ArrayList<Tuple>();
 -            for (IColumn subcol : col.getSubColumns())
 -                subcols.add(columnToTuple(subcol, cfDef, parseType(cfDef.getSubcomparator_type())));
 -
 -            pair.set(1, new DefaultDataBag(subcols));
 -        }
++            setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), col.value()));
          return pair;
      }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
--
[jira] [Commented] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name
[ https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788476#comment-13788476 ]

Jonathan Ellis commented on CASSANDRA-5202:
---

bq. Add CF ID to directory name if we still want to distinguish one KS/CF directory from another.

All right, I'm down to narrow the scope here to that.

CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name

Key: CASSANDRA-5202
URL: https://issues.apache.org/jira/browse/CASSANDRA-5202
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.9
Environment: OS: Windows 7; Server: Cassandra 1.1.9 release drop; Client: astyanax 1.56.21; JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27)
Reporter: Marat Bedretdinov
Assignee: Yuki Morishita
Labels: test
Fix For: 2.1
Attachments: 5202-1.1.txt, 5202-2.0.0.txt, astyanax-stress-driver.zip

Attached is a driver that sequentially:
1. Drops keyspace
2. Creates keyspace
3. Creates 2 column families
4. Seeds 1M rows with 100 columns
5. Queries these 2 column families
The above steps are repeated 1000 times.
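The narrowed scope agreed on above (embed the CF ID in the on-disk directory name, so a re-created CF with the same name never picks up the old incarnation's SSTables) can be sketched as follows. This is an illustrative stand-in, not Cassandra's actual `Directories` API; the class and method names are hypothetical:

```java
import java.util.UUID;

// Hypothetical sketch of the CF-ID-in-directory-name idea from this thread:
// because the CF ID is globally and temporally unique, a dropped-and-recreated
// CF of the same name maps to a different directory, so stale data files from
// the earlier incarnation can never be reused.
class CfDirectories {
    static String directoryFor(String keyspace, String cfName, UUID cfId) {
        // e.g. <keyspace>/<cf>-<cf-id-without-dashes>
        return keyspace + "/" + cfName + "-" + cfId.toString().replace("-", "");
    }
}
```

With this scheme, the race in the stress driver above (drop racing with recreate) at worst leaves an orphaned directory to garbage-collect; it cannot resurrect old rows under the new CF.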
[jira] [Commented] (CASSANDRA-4785) Secondary Index Sporadically Doesn't Return Rows
[ https://issues.apache.org/jira/browse/CASSANDRA-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788480#comment-13788480 ]

Tom van den Berge commented on CASSANDRA-4785:
---

I'm seeing this problem too (Cassandra 1.2.3), but my CF has caching KEYS_ONLY. It only happens to specific rows in the CF, not all. Also, it only happens on one single node in my 2-node cluster (replication factor 2). Storing the indexed value again solves the problem for that particular row, but I've seen this problem happen several times now, even on the same rows -- also after having fixed it as just described. I'm not 100% sure, but I think the problem occurred again after having rebuilt the node.

Secondary Index Sporadically Doesn't Return Rows

Key: CASSANDRA-4785
URL: https://issues.apache.org/jira/browse/CASSANDRA-4785
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.5, 1.1.6
Environment: Ubuntu 10.04, Java 6 Sun, Cassandra 1.1.5 upgraded from 1.1.2 -> 1.1.3 -> 1.1.5
Reporter: Arya Goudarzi
Assignee: Sam Tunnicliffe
Attachments: entity_aliases.txt, repro.py

I have a ColumnFamily with caching = ALL. I have 2 secondary indexes on it. I have noticed that if I query using the secondary index in the WHERE clause, sometimes I get the results and sometimes I don't. Until 2 weeks ago, the caching option on this CF was set to NONE, so I suspect something happened in the secondary index caching scheme. Here are the things I tried:
1. I rebuilt indexes for that CF on all nodes;
2. I set the caching to KEYS_ONLY and rebuilt the index again;
3. I set the caching to NONE and rebuilt the index again;
None of the above helped. I suppose the caching still exists, as this behavior looks like a cache mismatch. I did a bit of research and found CASSANDRA-4197, which could be related. Please advise.

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results
[ https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788485#comment-13788485 ]

Jonathan Ellis commented on CASSANDRA-5932:
---

There's a ticket open for trunk over at CASSANDRA-6154.

Speculative read performance data show unexpected results

Key: CASSANDRA-5932
URL: https://issues.apache.org/jira/browse/CASSANDRA-5932
Project: Cassandra
Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Aleksey Yeschenko
Fix For: 2.0.2
Attachments: 5932.6692c50412ef7d.compaction.png, 5932-6692c50412ef7d.png, 5932.6692c50412ef7d.rr0.png, 5932.6692c50412ef7d.rr1.png, 5932.ded39c7e1c2fa.logs.tar.gz, 5932.txt, 5933-128_and_200rc1.png, 5933-7a87fc11.png, 5933-logs.tar.gz, 5933-randomized-dsnitch-replica.2.png, 5933-randomized-dsnitch-replica.3.png, 5933-randomized-dsnitch-replica.png, compaction-makes-slow.png, compaction-makes-slow-stats.png, eager-read-looks-promising.png, eager-read-looks-promising-stats.png, eager-read-not-consistent.png, eager-read-not-consistent-stats.png, node-down-increase-performance.png

I've done a series of stress tests with eager retries enabled that show undesirable behavior. I'm grouping these behaviors into one ticket as they are most likely related.
1) Killing off a node in a 4 node cluster actually increases performance.
2) Compactions make nodes slow, even after the compaction is done.
3) Eager Reads tend to lessen the *immediate* performance impact of a node going down, but not consistently.

My Environment:
1 stress machine: node0
4 C* nodes: node4, node5, node6, node7

My script:
node0 writes some data: stress -d node4 -F 3000 -n 3000 -i 5 -l 2 -K 20
node0 reads some data: stress -d node4 -n 3000 -o read -i 5 -K 20

h3. Examples:

h5. A node going down increases performance:
!node-down-increase-performance.png!
[Data for this test here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
At 450s, I kill -9 one of the nodes. There is a brief decrease in performance as the snitch adapts, but then it recovers... to even higher performance than before.

h5. Compactions make nodes permanently slow:
!compaction-makes-slow.png! !compaction-makes-slow-stats.png!
The green and orange lines represent trials with eager retry enabled; they never recover their op-rate from before the compaction as the red and blue lines do.
[Data for this test here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.compaction.2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]

h5. Speculative Read tends to lessen the *immediate* impact:
!eager-read-looks-promising.png! !eager-read-looks-promising-stats.png!
This graph looked the most promising to me: the two trials with eager retry, the green and orange lines, showed the smallest dip in performance at 450s.
[Data for this test here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.json&metric=interval_op_rate&operation=stress-read&smoothing=1]

h5. But not always:
!eager-read-not-consistent.png! !eager-read-not-consistent-stats.png!
This is a retrial with the same settings as above, yet the 95th-percentile eager retry (red line) did poorly this time at 450s.
[Data for this test here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.rc1.try2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6154) Inserts are blocked in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6154: -- Attachment: 6154.txt Looks like merge to trunk from 2.0 was syntactically correct but semantically broken. Attached. Inserts are blocked in 2.1 -- Key: CASSANDRA-6154 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Priority: Critical Attachments: 6154.txt With cluster sizes > 1 inserts are blocked indefinitely: {code} $ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test $ ccm populate -n 2 $ ccm start $ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh> USE timeline; cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statement never returns. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lyuben Todorov updated CASSANDRA-4809: -- Attachment: (was: 4809__v2.patch) Allow restoring specific column families from archived commitlog Key: CASSANDRA-4809 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809 Project: Cassandra Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Nick Bailey Assignee: Lyuben Todorov Labels: lhf Fix For: 2.0.2 Attachments: 4809.patch Currently you can only restore the entire contents of a commit log archive. It would be useful to specify the keyspaces/column families you want to restore from an archived commitlog. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788501#comment-13788501 ] Benedict commented on CASSANDRA-4718: - Not necessarily. I still think that was most likely variance: - I have BAQ at same speed as LBQ in application - a 2x slow down of LBQ - 0.01x slow down of application - a 10x slow down of LBQ - 0.05x slow down of application = the queue speed is currently only ~1% of application cost. It's possible the faster queue is causing greater contention at a sync point, but this wouldn't work in the opposite direction if the contention at the sync point is low. Either way, if this were true we'd see the artificially slow queues also improve stress performance. Ryan also ran some of my tests and found no difference. I wouldn't absolutely rule out the possibility his test was valid, though, as I did not swap out the queues in OutboundTcpConnection for these tests as, at the time, I was concerned about the calls to size() which are expensive for my test queues, and I wanted the queue swap to be on equal terms across the board. I realise now these are only called via JMX, so shouldn't stop me swapping them in. I've just tried a quick test of directly (in process) stressing through the MessagingService and found no measurable difference to putting BAQ in the OutboundTcpConnection, though if I swap out across the board it is about 25% slower, which itself is interesting as this is close to a full stress, minus thrift. More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Jason Brown Priority: Minor Labels: performance Attachments: baq vs trunk.png, op costs of various queues.ods, PerThreadQueue.java, stress op rate with various queues.ods Currently all our execution stages dequeue tasks one at a time.
This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.1#6144)
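The batched-dequeue idea described above can be sketched with plain java.util.concurrent types. This is a hypothetical illustration, not code from the ticket: the consumer blocks for the first task with take(), then grabs everything else already queued via drainTo(collection, int) in a single non-blocking call, so the queue's lock is contended once per batch rather than once per task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DrainExample {
    // Block for the first task, then drain up to maxBatch - 1 more tasks
    // that are already queued, and run the whole batch.
    static int drainAndRun(BlockingQueue<Runnable> queue, int maxBatch) throws InterruptedException {
        List<Runnable> batch = new ArrayList<>(maxBatch);
        batch.add(queue.take());            // blocks until at least one task exists
        queue.drainTo(batch, maxBatch - 1); // non-blocking bulk grab of the rest
        for (Runnable task : batch)
            task.run();
        return batch.size();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        final int[] executed = {0};
        for (int i = 0; i < 5; i++)
            queue.add(() -> { executed[0]++; });
        System.out.println("drained=" + drainAndRun(queue, 10) + " executed=" + executed[0]);
    }
}
```

As the ticket notes, no JDK ExecutorService drives its workers this way, which is why a custom executor would be needed to wire this into the read and write stages.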
[jira] [Updated] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lyuben Todorov updated CASSANDRA-4809: -- Attachment: 4809_v2.patch Removed two redundant System.getProperties(...) lines. Allow restoring specific column families from archived commitlog Key: CASSANDRA-4809 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809 Project: Cassandra Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Nick Bailey Assignee: Lyuben Todorov Labels: lhf Fix For: 2.0.2 Attachments: 4809.patch, 4809_v2.patch Currently you can only restore the entire contents of a commit log archive. It would be useful to specify the keyspaces/column families you want to restore from an archived commitlog. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-5818) Duplicated error messages on directory creation error at startup
[ https://issues.apache.org/jira/browse/CASSANDRA-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788514#comment-13788514 ] koray sariteke commented on CASSANDRA-5818: --- +1 Duplicated error messages on directory creation error at startup Key: CASSANDRA-5818 URL: https://issues.apache.org/jira/browse/CASSANDRA-5818 Project: Cassandra Issue Type: Bug Reporter: Michaël Figuière Assignee: koray sariteke Priority: Trivial Fix For: 2.1 Attachments: trunk-5818.patch When I start Cassandra without the appropriate OS access rights to the default Cassandra directories, I get a flood of {{ERROR}} messages at startup, whereas one per directory would be more appropriate. See bellow: {code} ERROR 13:37:39,792 Failed to create /var/lib/cassandra/data/system/schema_triggers directory ERROR 13:37:39,797 Failed to create /var/lib/cassandra/data/system/schema_triggers directory ERROR 13:37:39,798 Failed to create /var/lib/cassandra/data/system/schema_triggers directory ERROR 13:37:39,798 Failed to create /var/lib/cassandra/data/system/schema_triggers directory ERROR 13:37:39,799 Failed to create /var/lib/cassandra/data/system/schema_triggers directory ERROR 13:37:39,800 Failed to create /var/lib/cassandra/data/system/batchlog directory ERROR 13:37:39,801 Failed to create /var/lib/cassandra/data/system/batchlog directory ERROR 13:37:39,801 Failed to create /var/lib/cassandra/data/system/batchlog directory ERROR 13:37:39,802 Failed to create /var/lib/cassandra/data/system/batchlog directory ERROR 13:37:39,802 Failed to create /var/lib/cassandra/data/system/peer_events directory ERROR 13:37:39,803 Failed to create /var/lib/cassandra/data/system/peer_events directory ERROR 13:37:39,803 Failed to create /var/lib/cassandra/data/system/peer_events directory ERROR 13:37:39,804 Failed to create /var/lib/cassandra/data/system/compactions_in_progress directory ERROR 13:37:39,805 Failed to create 
/var/lib/cassandra/data/system/compactions_in_progress directory ERROR 13:37:39,805 Failed to create /var/lib/cassandra/data/system/compactions_in_progress directory ERROR 13:37:39,806 Failed to create /var/lib/cassandra/data/system/compactions_in_progress directory ERROR 13:37:39,807 Failed to create /var/lib/cassandra/data/system/compactions_in_progress directory ERROR 13:37:39,808 Failed to create /var/lib/cassandra/data/system/hints directory ERROR 13:37:39,809 Failed to create /var/lib/cassandra/data/system/hints directory ERROR 13:37:39,809 Failed to create /var/lib/cassandra/data/system/hints directory ERROR 13:37:39,811 Failed to create /var/lib/cassandra/data/system/hints directory ERROR 13:37:39,811 Failed to create /var/lib/cassandra/data/system/hints directory ERROR 13:37:39,812 Failed to create /var/lib/cassandra/data/system/schema_keyspaces directory ERROR 13:37:39,812 Failed to create /var/lib/cassandra/data/system/schema_keyspaces directory ERROR 13:37:39,813 Failed to create /var/lib/cassandra/data/system/schema_keyspaces directory ERROR 13:37:39,814 Failed to create /var/lib/cassandra/data/system/schema_keyspaces directory ERROR 13:37:39,814 Failed to create /var/lib/cassandra/data/system/schema_keyspaces directory ERROR 13:37:39,815 Failed to create /var/lib/cassandra/data/system/range_xfers directory ERROR 13:37:39,816 Failed to create /var/lib/cassandra/data/system/range_xfers directory ERROR 13:37:39,817 Failed to create /var/lib/cassandra/data/system/range_xfers directory ERROR 13:37:39,817 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,818 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,818 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,820 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,821 Failed to create 
/var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,821 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,822 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,822 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,823 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,824 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,824 Failed to create /var/lib/cassandra/data/system/schema_columnfamilies directory ERROR 13:37:39,825 Failed to create
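The fix this report asks for, one ERROR per directory instead of a flood of repeats, amounts to remembering which paths have already been reported. A minimal sketch of that behavior, with hypothetical names not taken from the attached patch:

```java
import java.util.HashSet;
import java.util.Set;

public class DirectoryErrorOnce {
    // Paths that have already produced an ERROR during this startup.
    private static final Set<String> alreadyReported = new HashSet<>();

    // Log a creation failure only the first time a given directory fails;
    // returns true if a message was emitted, false if it was suppressed.
    static boolean reportOnce(String dir) {
        if (!alreadyReported.add(dir))
            return false; // repeat failure for the same path: stay quiet
        System.err.println("ERROR Failed to create " + dir + " directory");
        return true;
    }

    public static void main(String[] args) {
        String[] attempts = {
            "/var/lib/cassandra/data/system/hints",
            "/var/lib/cassandra/data/system/hints",    // repeat: suppressed
            "/var/lib/cassandra/data/system/batchlog"
        };
        int logged = 0;
        for (String d : attempts)
            if (reportOnce(d))
                logged++;
        System.out.println("logged=" + logged);
    }
}
```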
[jira] [Commented] (CASSANDRA-6154) Inserts are blocked in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788517#comment-13788517 ] Brandon Williams commented on CASSANDRA-6154: - Not quite. noformat ERROR [GossipStage:1] 2013-10-07 20:09:15,849 Caller+0 at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134) - Exception in thread Thread[GossipStage:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:552) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:576) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) ~[main/:na] at org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:808) ~[main/:na] at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:849) ~[main/:na] at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:934) ~[main/:na] at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) ~[main/:na] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56) ~[main/:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_17] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_17] noformat Inserts are blocked in 2.1 -- Key: CASSANDRA-6154 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Assignee: Jonathan Ellis Priority: Critical Attachments: 6154.txt With cluster sizes 1 inserts are blocked indefinitely: {code} $ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test $ ccm populate -n 2 $ ccm start $ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. 
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh> USE timeline; cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statement never returns. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Comment Edited] (CASSANDRA-6154) Inserts are blocked in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788517#comment-13788517 ] Brandon Williams edited comment on CASSANDRA-6154 at 10/7/13 8:14 PM: -- Not quite. {noformat} ERROR [GossipStage:1] 2013-10-07 20:09:15,849 Caller+0 at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134) - Exception in thread Thread[GossipStage:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:552) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:576) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) ~[main/:na] at org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:808) ~[main/:na] at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:849) ~[main/:na] at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:934) ~[main/:na] at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) ~[main/:na] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56) ~[main/:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_17] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_17] {noformat} was (Author: brandon.williams): Not quite. 
noformat ERROR [GossipStage:1] 2013-10-07 20:09:15,849 Caller+0 at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134) - Exception in thread Thread[GossipStage:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:552) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:576) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) ~[main/:na] at org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:808) ~[main/:na] at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:849) ~[main/:na] at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:934) ~[main/:na] at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) ~[main/:na] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56) ~[main/:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_17] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_17] noformat Inserts are blocked in 2.1 -- Key: CASSANDRA-6154 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Assignee: Jonathan Ellis Priority: Critical Attachments: 6154.txt With cluster sizes 1 inserts are blocked indefinitely: {code} $ ccm create -v git:trunk test Fetching Cassandra updates... Current cluster is now: test $ ccm populate -n 2 $ ccm start $ ccm node1 cqlsh Connected to test at 127.0.0.1:9160. [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0] Use HELP for help. 
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh> USE timeline; cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event)); cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt'); {code} The last INSERT statement never returns. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-5916) gossip and tokenMetadata get hostId out of sync on failed replace_node with the same IP address
[ https://issues.apache.org/jira/browse/CASSANDRA-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788523#comment-13788523 ] Brandon Williams commented on CASSANDRA-5916: - First, thanks for testing, [~ravilr]! bq. does it make sense to allow the operator to specify replace_token with the token(s) along with the replace_address to recover That could work, but I find it a bit ugly and confusing, especially since replace_token alone is supposed to work right now, but does not. bq. I think remaining in shadow mode may not work optimally well for cases where the node being replaced was down for more than hint window. So, all the nodes would have stopped hinting, and after replace, it would require repair to be ran to get the new data fed during the replace. That is true regardless of shadow mode though, since hibernate is a dead state and the node doesn't go live to reset the hint timer until the replace has completed. gossip and tokenMetadata get hostId out of sync on failed replace_node with the same IP address --- Key: CASSANDRA-5916 URL: https://issues.apache.org/jira/browse/CASSANDRA-5916 Project: Cassandra Issue Type: Bug Reporter: Brandon Williams Assignee: Brandon Williams Fix For: 1.2.11 Attachments: 5916.txt If you try to replace_node an existing, live hostId, it will error out. However if you're using an existing IP to do this (as in, you chose the wrong uuid to replace on accident) then the newly generated hostId wipes out the old one in TMD, and when you do try to replace it replace_node will complain it does not exist. Examination of gossipinfo still shows the old hostId, however now you can't replace it either. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6102) CassandraStorage broken for bigints and ints
[ https://issues.apache.org/jira/browse/CASSANDRA-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788529#comment-13788529 ] Brandon Williams commented on CASSANDRA-6102: - This part: {noformat} +// Don't want to create another TBase class, so use CfDef.populate_io_cache_on_flush +// to store flag of compact storage cql table. +if (cql3Table && !(parseType(cfDef.comparator_type) instanceof AbstractCompositeType)) +cfDef.setPopulate_io_cache_on_flush(true); + +// Don't want to create another TBase class, so use CfDef.replicate_on_write +// to store flag of cql table. +if (cql3Table) +cfDef.setReplicate_on_write(true); {noformat} Feels like a hack that is going to bite us down the road when those options really do get removed. CassandraStorage broken for bigints and ints Key: CASSANDRA-6102 URL: https://issues.apache.org/jira/browse/CASSANDRA-6102 Project: Cassandra Issue Type: Bug Components: Hadoop Environment: Cassandra 1.2.9 & 1.2.10, Pig 0.11.1, OSX 10.8.x Reporter: Janne Jalkanen Assignee: Alex Liu Attachments: 6102-1.2-branch.txt I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle integer values. Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1. Single node for testing this. First a table: {noformat} CREATE TABLE testc ( key text PRIMARY KEY, ivalue int, svalue text, value bigint ) WITH COMPACT STORAGE; insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65); select * from testc; key | ivalue | svalue | value -----+--------+--------+------- foo | 10 | bar | 65 {noformat} For my Pig setup, I then use libraries from different C* versions to actually talk to my database (which stays on 1.2.10 all the time).
Cassandra 1.0.12 (using cassandra_storage.jar): {noformat} testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage(); dump testc (foo,(svalue,bar),(ivalue,10),(value,65),{}) {noformat} Cassandra 1.1.10: {noformat} testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage(); dump testc (foo,(svalue,bar),(ivalue,10),(value,65),{}) {noformat} Cassandra 1.2.10: {noformat} (testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage(); dump testc foo,{(ivalue, ),(svalue,bar),(value,A)}) {noformat} To me it appears that ints and bigints are interpreted as ascii values in cass 1.2.10. Did something change for CassandraStorage, is there a regression, or am I doing something wrong? Quick perusal of the JIRA didn't reveal anything that I could directly pin on this. Note that using compact storage does not seem to affect the issue, though it obviously changes the resulting pig format. In addition, trying to use Pygmalion {noformat} tf = foreach testc generate key, flatten(FromCassandraBag('ivalue,svalue,value',columns)) as (ivalue:int,svalue:chararray,lvalue:long); dump tf (foo, ,bar,A) {noformat} So no help there. Explicitly casting the values to (long) or (int) just results in a ClassCastException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[3/3] git commit: Merge branch 'cassandra-2.0' into trunk
Merge branch 'cassandra-2.0' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a5798165 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a5798165 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a5798165 Branch: refs/heads/trunk Commit: a57981650453470d6ee204329edf0dd5d008d18e Parents: 558a9e5 a2b1278 Author: Yuki Morishita yu...@apache.org Authored: Mon Oct 7 15:33:17 2013 -0500 Committer: Yuki Morishita yu...@apache.org Committed: Mon Oct 7 15:33:17 2013 -0500 -- CHANGES.txt | 1 + NEWS.txt| 3 + .../org/apache/cassandra/config/CFMetaData.java | 11 +++ .../org/apache/cassandra/config/KSMetaData.java | 1 + .../org/apache/cassandra/db/SystemKeyspace.java | 25 ++ .../CompactionHistoryTabularData.java | 84 .../db/compaction/CompactionManager.java| 14 .../db/compaction/CompactionManagerMBean.java | 4 + .../cassandra/db/compaction/CompactionTask.java | 52 ++-- .../org/apache/cassandra/tools/NodeCmd.java | 26 ++ .../org/apache/cassandra/tools/NodeProbe.java | 6 ++ .../apache/cassandra/tools/NodeToolHelp.yaml| 3 + 12 files changed, 204 insertions(+), 26 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/NEWS.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/src/java/org/apache/cassandra/config/CFMetaData.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/src/java/org/apache/cassandra/db/SystemKeyspace.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/src/java/org/apache/cassandra/db/compaction/CompactionManager.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/src/java/org/apache/cassandra/db/compaction/CompactionTask.java --
[2/3] git commit: Save compaction history to system keyspace
Save compaction history to system keyspace patch by lantao yan; reviewed by yukim for CASSANDRA-5078 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a2b12784 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a2b12784 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a2b12784 Branch: refs/heads/trunk Commit: a2b12784fe3785fe96d9c0e2d7e8c72bfc88ac7c Parents: 01a57ee Author: lantao yan yanlan...@hotmail.com Authored: Mon Oct 7 15:22:11 2013 -0500 Committer: Yuki Morishita yu...@apache.org Committed: Mon Oct 7 15:30:50 2013 -0500 -- CHANGES.txt | 1 + NEWS.txt| 3 + .../org/apache/cassandra/config/CFMetaData.java | 11 +++ .../org/apache/cassandra/config/KSMetaData.java | 1 + .../org/apache/cassandra/db/SystemKeyspace.java | 25 ++ .../CompactionHistoryTabularData.java | 84 .../db/compaction/CompactionManager.java| 14 .../db/compaction/CompactionManagerMBean.java | 4 + .../cassandra/db/compaction/CompactionTask.java | 52 ++-- .../org/apache/cassandra/tools/NodeCmd.java | 26 ++ .../org/apache/cassandra/tools/NodeProbe.java | 6 ++ .../apache/cassandra/tools/NodeToolHelp.yaml| 3 + 12 files changed, 204 insertions(+), 26 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index ddd976e..ee631a0 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -20,6 +20,7 @@ * Allow alter keyspace on system_traces (CASSANDRA-6016) * Disallow empty column names in cql (CASSANDRA-6136) * Use Java7 file-handling APIs and fix file moving on Windows (CASSANDRA-5383) + * Save compaction history to system keyspace (CASSANDRA-5078) Merged from 1.2: * Limit CQL prepared statement cache by size instead of count (CASSANDRA-6107) * Tracing should log write failure rather than raw exceptions (CASSANDRA-6133) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/NEWS.txt -- diff --git a/NEWS.txt b/NEWS.txt index 
6ed8449..37fbae7 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -23,6 +23,9 @@ New features (See blog post at TODO) - Configurable metrics reporting (see conf/metrics-reporter-config-sample.yaml) +- Compaction history and stats are now saved to system keyspace + (system.compaction_history table). You can access historiy via + new 'nodetool compactionhistory' command or CQL. Upgrading - http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/src/java/org/apache/cassandra/config/CFMetaData.java -- diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java index 8c4075c..bbea21e 100644 --- a/src/java/org/apache/cassandra/config/CFMetaData.java +++ b/src/java/org/apache/cassandra/config/CFMetaData.java @@ -268,6 +268,17 @@ public final class CFMetaData + PRIMARY KEY ((keyspace_name, columnfamily_name, generation)) + ) WITH COMMENT='historic sstable read rates'); +public static final CFMetaData CompactionHistoryCf = compile(CREATE TABLE + SystemKeyspace.COMPACTION_HISTORY_CF + ( + + id uuid, + + keyspace_name text, + + columnfamily_name text, + + compacted_at timestamp, + + bytes_in bigint, + + bytes_out bigint, + + rows_merged mapint, bigint, + + PRIMARY KEY (id) + + ) WITH COMMENT='show all compaction history' AND DEFAULT_TIME_TO_LIVE=604800); + public enum Caching { ALL, KEYS_ONLY, ROWS_ONLY, NONE; http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/src/java/org/apache/cassandra/config/KSMetaData.java -- diff --git a/src/java/org/apache/cassandra/config/KSMetaData.java b/src/java/org/apache/cassandra/config/KSMetaData.java index
[1/3] git commit: Save compaction history to system keyspace
Updated Branches: refs/heads/cassandra-2.0 01a57eea8 - a2b12784f refs/heads/trunk 558a9e57b - a57981650 Save compaction history to system keyspace patch by lantao yan; reviewed by yukim for CASSANDRA-5078 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a2b12784 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a2b12784 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a2b12784 Branch: refs/heads/cassandra-2.0 Commit: a2b12784fe3785fe96d9c0e2d7e8c72bfc88ac7c Parents: 01a57ee Author: lantao yan yanlan...@hotmail.com Authored: Mon Oct 7 15:22:11 2013 -0500 Committer: Yuki Morishita yu...@apache.org Committed: Mon Oct 7 15:30:50 2013 -0500 -- CHANGES.txt | 1 + NEWS.txt| 3 + .../org/apache/cassandra/config/CFMetaData.java | 11 +++ .../org/apache/cassandra/config/KSMetaData.java | 1 + .../org/apache/cassandra/db/SystemKeyspace.java | 25 ++ .../CompactionHistoryTabularData.java | 84 .../db/compaction/CompactionManager.java| 14 .../db/compaction/CompactionManagerMBean.java | 4 + .../cassandra/db/compaction/CompactionTask.java | 52 ++-- .../org/apache/cassandra/tools/NodeCmd.java | 26 ++ .../org/apache/cassandra/tools/NodeProbe.java | 6 ++ .../apache/cassandra/tools/NodeToolHelp.yaml| 3 + 12 files changed, 204 insertions(+), 26 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index ddd976e..ee631a0 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -20,6 +20,7 @@ * Allow alter keyspace on system_traces (CASSANDRA-6016) * Disallow empty column names in cql (CASSANDRA-6136) * Use Java7 file-handling APIs and fix file moving on Windows (CASSANDRA-5383) + * Save compaction history to system keyspace (CASSANDRA-5078) Merged from 1.2: * Limit CQL prepared statement cache by size instead of count (CASSANDRA-6107) * Tracing should log write failure rather than raw exceptions (CASSANDRA-6133) 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/NEWS.txt -- diff --git a/NEWS.txt b/NEWS.txt index 6ed8449..37fbae7 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -23,6 +23,9 @@ New features (See blog post at TODO) - Configurable metrics reporting (see conf/metrics-reporter-config-sample.yaml) +- Compaction history and stats are now saved to system keyspace + (system.compaction_history table). You can access historiy via + new 'nodetool compactionhistory' command or CQL. Upgrading - http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/src/java/org/apache/cassandra/config/CFMetaData.java -- diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java index 8c4075c..bbea21e 100644 --- a/src/java/org/apache/cassandra/config/CFMetaData.java +++ b/src/java/org/apache/cassandra/config/CFMetaData.java @@ -268,6 +268,17 @@ public final class CFMetaData + PRIMARY KEY ((keyspace_name, columnfamily_name, generation)) + ) WITH COMMENT='historic sstable read rates'); +public static final CFMetaData CompactionHistoryCf = compile(CREATE TABLE + SystemKeyspace.COMPACTION_HISTORY_CF + ( + + id uuid, + + keyspace_name text, + + columnfamily_name text, + + compacted_at timestamp, + + bytes_in bigint, + + bytes_out bigint, + + rows_merged mapint, bigint, + + PRIMARY KEY (id) + + ) WITH COMMENT='show all compaction history' AND DEFAULT_TIME_TO_LIVE=604800); + public enum Caching { ALL, KEYS_ONLY, ROWS_ONLY, NONE; http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/src/java/org/apache/cassandra/config/KSMetaData.java -- diff --git
[jira] [Commented] (CASSANDRA-6137) CQL3 SELECT IN CLAUSE inconsistent
[ https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788541#comment-13788541 ]

Constance Eustace commented on CASSANDRA-6137:
----------------------------------------------

As a workaround, we are currently selecting all the column keys rather than a subset of them with WHERE columnkey IN (columnkeylist). Reading the whole row this way isn't horribly worse than reading only the desired columns, but it's still not ideal.

CQL3 SELECT IN CLAUSE inconsistent
----------------------------------

Key: CASSANDRA-6137
URL: https://issues.apache.org/jira/browse/CASSANDRA-6137
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Ubuntu, AWS, Cassandra 2.0.1, single node
Reporter: Constance Eustace
Fix For: 2.0.1

We are encountering inconsistent results from CQL3 queries that restrict the column key with an IN clause in WHERE. This has been reproduced in cqlsh.

The row key is e_entid; the column key is p_prop.

This query returns roughly 21 rows, one for each of the 21 column keys matching p_prop:

cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
       FROM internal_submission.Entity_Job
       WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';

These three queries each return one row for the single column key requested in the IN clause:

SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
FROM internal_submission.Entity_Job
WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
  AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count');

SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
FROM internal_submission.Entity_Job
WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
  AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count');

SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
FROM internal_submission.Entity_Job
WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
  AND p_prop in ('urn:bby:pcm:job:ingest:content:fail:count');

This query returns ONLY ONE ROW (one column key), not the three I would expect from the three-column-key IN clause:

cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
       FROM internal_submission.Entity_Job
       WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
         AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');

This query does, however, return two rows for the requested two column keys:

cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
       FROM internal_submission.Entity_Job
       WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
         AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');

cqlsh> describe table internal_submission.entity_job;

CREATE TABLE entity_job (
  e_entid text,
  p_prop text,
  describes text,
  dndcondition text,
  e_entlinks text,
  e_entname text,
  e_enttype text,
  ingeststatus text,
  ingeststatusdetail text,
  p_flags text,
  p_propid text,
  p_proplinks text,
  p_storage text,
  p_subents text,
  p_val text,
  p_vallang text,
  p_vallinks text,
  p_valtype text,
  p_valunit text,
  p_vars text,
  partnerid text,
  referenceid text,
  size int,
  sourceip text,
  submitdate bigint,
  submitevent text,
  userid text,
  version text,
  PRIMARY KEY (e_entid, p_prop)
) WITH bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='NONE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes);
CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job (dndcondition);
CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job (ingeststatus);
CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job (ingeststatusdetail);
CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job (referenceid);
CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid);
CREATE INDEX internal_submission__JobVersionIDX ON entity_job (version);
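The workaround mentioned in the comment above can be sketched as follows. This is a sketch against the entity_job schema shown in this ticket; the trimmed p_prop/p_val projection is for readability only:

```sql
-- Affected form: restricting the column key with IN can silently drop
-- rows on the reported 2.0.1 single-node setup.
SELECT p_prop, p_val
FROM internal_submission.entity_job
WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
  AND p_prop IN ('urn:bby:pcm:job:ingest:content:complete:count',
                 'urn:bby:pcm:job:ingest:content:all:count',
                 'urn:bby:pcm:job:ingest:content:fail:count');

-- Workaround: fetch every column key for the partition (restricting only
-- the partition key) and filter to the wanted p_prop values client-side.
SELECT p_prop, p_val
FROM internal_submission.entity_job
WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';
```

The trade-off is reading the full partition (roughly 21 rows here) instead of three, which per the comment is tolerable for this row but not ideal.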
[jira] [Updated] (CASSANDRA-6137) CQL3 SELECT IN CLAUSE inconsistent
[ https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Constance Eustace updated CASSANDRA-6137:
-----------------------------------------

Description: We are encountering inconsistent results from CQL3 queries that restrict the column key with an IN clause in WHERE. This has been reproduced in cqlsh and the jdbc driver.

The row key is e_entid; the column key is p_prop.

This query returns roughly 21 rows, one for each of the 21 column keys matching p_prop:

cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
       FROM internal_submission.Entity_Job
       WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';

These three queries each return one row for the single column key requested in the IN clause:

SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
FROM internal_submission.Entity_Job
WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
  AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count');

SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
FROM internal_submission.Entity_Job
WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
  AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count');

SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
FROM internal_submission.Entity_Job
WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
  AND p_prop in ('urn:bby:pcm:job:ingest:content:fail:count');

This query returns ONLY ONE ROW (one column key), not the three I would expect from the three-column-key IN clause:

cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
       FROM internal_submission.Entity_Job
       WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
         AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');

This query does, however, return two rows for the requested two column keys:

cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
       FROM internal_submission.Entity_Job
       WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'
         AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');

The table definition (cqlsh> describe table internal_submission.entity_job;) and its secondary indexes are as shown in the comment above, with indexes on describes, dndcondition, ingeststatus, ingeststatusdetail, referenceid, userid, and version.

---

My suspicion is that the three-column-key IN clause is translated (improperly or not) into a two-column-key range, on the assumption that the third column key is present in that range, but it isn't...

(The previous description was identical except that it did not mention the jdbc driver or the suspicion above.)
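The suspicion above (an IN list being collapsed into a clustering-key range) suggests a reduced test case that doesn't need the full entity_job schema. The following is a hypothetical sketch; the keyspace, table, and values are invented for illustration and are not from the ticket:

```sql
-- Hypothetical minimal schema: one partition key, one clustering key.
CREATE TABLE ks.t (
  pk text,
  ck text,
  v  text,
  PRIMARY KEY (pk, ck)
);

INSERT INTO ks.t (pk, ck, v) VALUES ('row1', 'a', '1');
INSERT INTO ks.t (pk, ck, v) VALUES ('row1', 'b', '2');
INSERT INTO ks.t (pk, ck, v) VALUES ('row1', 'c', '3');

-- Expected: exactly the rows for 'a' and 'c'. If the IN list were instead
-- translated to a range over the first and last requested clustering keys,
-- 'b' would leak in; per this ticket the observed failure is the opposite,
-- with some requested keys missing from the result.
SELECT ck, v FROM ks.t
WHERE pk = 'row1' AND ck IN ('a', 'c');
```

Running such a reduced case against 2.0.1 would help confirm whether the bug depends on the wide entity_job schema and its secondary indexes, or reproduces on any multi-key IN over a clustering column.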
[jira] [Commented] (CASSANDRA-5916) gossip and tokenMetadata get hostId out of sync on failed replace_node with the same IP address
[ https://issues.apache.org/jira/browse/CASSANDRA-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788570#comment-13788570 ]

Ravi Prasad commented on CASSANDRA-5916:
----------------------------------------

That is true regardless of shadow mode, though, since hibernate is a dead state and the node doesn't go live (and so doesn't reset the hint timer) until the replace has completed. My understanding is that, due to the generation change of the replacing node, Gossiper.handleMajorStateChange marks the node as dead, since hibernate is one of the DEAD_STATES. So the other nodes mark the replacing node as dead before the token bootstrap starts, and hence should be storing hints for the replacing node from that point on.

gossip and tokenMetadata get hostId out of sync on failed replace_node with the same IP address
-----------------------------------------------------------------------------------------------

Key: CASSANDRA-5916
URL: https://issues.apache.org/jira/browse/CASSANDRA-5916
Project: Cassandra
Issue Type: Bug
Reporter: Brandon Williams
Assignee: Brandon Williams
Fix For: 1.2.11
Attachments: 5916.txt

If you try to replace_node an existing, live hostId, it will error out. However, if you use an existing IP to do this (as in, you chose the wrong uuid to replace by accident), the newly generated hostId wipes out the old one in tokenMetadata, and when you then try to replace it, replace_node will complain that it does not exist. Examination of gossipinfo still shows the old hostId, but now you can't replace it either.

--
This message was sent by Atlassian JIRA (v6.1#6144)