[jira] [Commented] (CASSANDRA-6977) attempting to create 10K column families fails with 100 node cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075947#comment-14075947 ]

Michael Nelson commented on CASSANDRA-6977:
---

This is a showstopper for a very large customer. They need the ability to create new keyspaces as they add new customers. Their use case is multi-tenancy: because of HIPAA and PCI compliance, each customer gets a separate keyspace, keeping the data separate.

attempting to create 10K column families fails with 100 node cluster

Key: CASSANDRA-6977
URL: https://issues.apache.org/jira/browse/CASSANDRA-6977
Project: Cassandra
Issue Type: Bug
Environment: 100 nodes, Ubuntu 12.04.3 LTS, AWS m1.large instances
Reporter: Daniel Meyer
Assignee: Russ Hatch
Priority: Minor
Attachments: 100_nodes_all_data.png, all_data_5_nodes.png, keyspace_create.py, logs.tar, tpstats.txt, visualvm_tracer_data.csv

During this test we attempt to create a total of 1K keyspaces with 10 column families each, bringing the total number of column families to 10K. With a 5 node cluster this operation completes; however, it fails with 100 nodes. Please see the two attached charts. In the 5 node case the time required to create each keyspace and its 10 column families increases linearly until the number of keyspaces reaches 1K. For a 100 node cluster there is a sudden increase in latency between 450 and 550 keyspaces. The test ends when the test script times out. After the timeout it is impossible to reconnect to the cluster with the DataStax Python driver because it cannot connect to the host:
{noformat}
cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers', {'10.199.5.98': OperationTimedOut()})
{noformat}
It was found that running the following stress command does work from the same machine the test script runs on.
{noformat}
cassandra-stress -d 10.199.5.98 -l 2 -e QUORUM -L3 -b -o INSERT
{noformat}
It should be noted that this test was initially done with DSE 4.0 and C* version 2.0.5.24, and in that case it was not possible to run stress against the cluster even locally on a node, because the host could not be found. Attached are system logs from one of the nodes, charts showing schema creation latency for the 5 and 100 node clusters, VisualVM tracer data for CPU, memory, num_threads and GC runs, tpstats output, and the test script. The test script was run on an m1.large AWS instance outside of the cluster under test.

--
This message was sent by Atlassian JIRA (v6.2#6252)
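The attached keyspace_create.py is not reproduced in the digest; a minimal sketch of the shape of such a test is below. The DDL strings, tenant naming, and the stubbed `execute` callback are illustrative assumptions, not the attached script — against a real cluster, `execute` would be `session.execute` from the DataStax Python driver, and the returned latencies are what the attached charts plot per keyspace.

```python
import time

def ddl_for_tenant(i, n_tables=10):
    """Generate illustrative CQL DDL for one tenant keyspace plus its tables."""
    ks = "tenant_%04d" % i
    stmts = ["CREATE KEYSPACE %s WITH replication = "
             "{'class': 'SimpleStrategy', 'replication_factor': 3}" % ks]
    for t in range(n_tables):
        stmts.append("CREATE TABLE %s.cf_%02d (id uuid PRIMARY KEY, data text)" % (ks, t))
    return stmts

def run_test(execute, n_keyspaces=1000):
    """Create each keyspace + tables, recording per-keyspace wall-clock latency."""
    latencies = []
    for i in range(n_keyspaces):
        start = time.monotonic()
        for stmt in ddl_for_tenant(i):
            execute(stmt)
        latencies.append(time.monotonic() - start)
    return latencies

# With a real cluster, pass session.execute; a no-op stand-in keeps the sketch
# self-contained here.
latencies = run_test(lambda stmt: None, n_keyspaces=5)
```

The 5-node vs 100-node charts correspond to plotting `latencies` against the keyspace index.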
[jira] [Commented] (CASSANDRA-7575) Custom 2i validation
[ https://issues.apache.org/jira/browse/CASSANDRA-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076044#comment-14076044 ]

Sergio Bossa commented on CASSANDRA-7575:
---

[~adelapena], following the review of your patch, I believe that while it works in practice, pulling the index searchers in SelectStatement#getRangeCommand and validating them that way is a bit odd. More specifically:
* SelectStatement#getRangeCommand may be called even if a 2i query is not present, so enforcing 2i validation there is a bit misleading and unexpected.
* SecondaryIndexSearcher#validate is called with the whole list of index expressions, which means each searcher implementation will have to go through the list, inspect each expression, and decide whether that specific expression was targeted at it and is wrong, or was just meant for another searcher.

I'd rather rework the patch in the following way:
* Add a SecondaryIndexManager#validateIndexSearchersForQuery method that works similarly to getIndexSearchersForQuery, but rather than just getting the index for each column, it also validates it against the proper column/expression by calling SecondaryIndexSearcher#validate(IndexExpression).
* Call SecondaryIndexManager#validateIndexSearchersForQuery from SelectStatement#RawStatement#validateSecondaryIndexSelections.

That should improve encapsulation and responsibility placement and provide better 2i APIs. Finally, I would add a few tests.

Custom 2i validation

Key: CASSANDRA-7575
URL: https://issues.apache.org/jira/browse/CASSANDRA-7575
Project: Cassandra
Issue Type: Improvement
Components: API
Reporter: Andrés de la Peña
Assignee: Andrés de la Peña
Priority: Minor
Labels: 2i, cql3, secondaryIndex, secondary_index, select
Fix For: 2.1.0, 3.0
Attachments: 2i_validation.patch

There are several projects using custom secondary indexes as an extension point to integrate C* with other systems such as Solr or Lucene.
The usual approach is to embed third-party indexing queries in CQL clauses. For example, [DSE Search|http://www.datastax.com/what-we-offer/products-services/datastax-enterprise] embeds Solr syntax this way:
{code}
SELECT title FROM solr WHERE solr_query='title:natio*';
{code}
[Stratio platform|https://github.com/Stratio/stratio-cassandra] embeds a custom JSON syntax for searching in Lucene indexes:
{code}
SELECT * FROM tweets WHERE lucene='{
    filter : { type: range, field: time, lower: 2014/04/25, upper: 2014/04/1 },
    query  : { type: phrase, field: body, values: [big, data] },
    sort   : { fields: [ {field: time, reverse: true} ] }
}';
{code}
Tuplejump [Stargate|http://tuplejump.github.io/stargate/] also uses Stratio's open-source JSON syntax:
{code}
SELECT name, company FROM PERSON WHERE stargate ='{
    filter : { type: range, field: company, lower: a, upper: p },
    sort   : { fields: [ {field: name, reverse: true} ] }
}';
{code}
These syntaxes are validated by the corresponding 2i implementation. This validation is done behind the StorageProxy command distribution, so, as far as I know, there is no way to give rich feedback about syntax errors to CQL users. I'm uploading a patch with some changes trying to improve this.
I propose adding an empty validation method to SecondaryIndexSearcher that can be overridden by custom 2i implementations:
{code}
public void validate(List<IndexExpression> clause) {}
{code}
And call it from SelectStatement#getRangeCommand:
{code}
ColumnFamilyStore cfs = Keyspace.open(keyspace()).getColumnFamilyStore(columnFamily());
for (SecondaryIndexSearcher searcher : cfs.indexManager.getIndexSearchersForQuery(expressions))
{
    try
    {
        searcher.validate(expressions);
    }
    catch (RuntimeException e)
    {
        String exceptionMessage = e.getMessage();
        if (exceptionMessage != null && !exceptionMessage.trim().isEmpty())
            throw new InvalidRequestException("Invalid index expression: " + e.getMessage());
        else
            throw new InvalidRequestException("Invalid index expression");
    }
}
{code}
In this way C* allows custom 2i implementations to give feedback about syntax errors. We are currently using these changes in a fork with no problems.

--
This message was sent by Atlassian JIRA (v6.2#6252)
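The routing the reviewer suggests — each index expression validated only by the searcher registered for its column, rather than every searcher scanning the whole list — can be sketched in Python. Class and method names below merely mirror the Java ones, and the JSON check inside LuceneSearcher is a made-up stand-in for a real 2i implementation's syntax validation, not Cassandra or Stratio code:

```python
class InvalidRequestException(Exception):
    """Stand-in for Cassandra's InvalidRequestException."""

class LuceneSearcher:
    def validate(self, expression):
        # Toy validation: this hypothetical searcher expects a JSON document.
        column, value = expression
        if not value.strip().startswith("{"):
            raise InvalidRequestException(
                "Invalid index expression: expected JSON for %r" % column)

class SecondaryIndexManager:
    """Routes each expression to the one searcher registered for its column."""
    def __init__(self):
        self.searchers_by_column = {}

    def register(self, column, searcher):
        self.searchers_by_column[column] = searcher

    def validate_index_searchers_for_query(self, expressions):
        for expr in expressions:
            column = expr[0]
            searcher = self.searchers_by_column.get(column)
            if searcher is not None:
                # A searcher only ever sees expressions aimed at it.
                searcher.validate(expr)

mgr = SecondaryIndexManager()
mgr.register("lucene", LuceneSearcher())
mgr.validate_index_searchers_for_query([("lucene", '{ "query": "..." }')])  # passes
```

Calling this from statement validation (rather than from getRangeCommand) is what lets the syntax error surface to the CQL client before the query is distributed.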
[jira] [Commented] (CASSANDRA-7593) Errors when upgrading through several versions to 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076068#comment-14076068 ]

Marcus Eriksson commented on CASSANDRA-7593:
---

No, they should not be empty. We are inserting a RangeTombstone with start='token' and end='token' (ie, delete the set for this row). In 2.0 we only make the end have an EOC (https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/cql3/Sets.java#L234) while in 2.1 both do: https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/cql3/Sets.java#L252 and https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/db/composites/AbstractComposite.java#L69

Errors when upgrading through several versions to 2.1

Key: CASSANDRA-7593
URL: https://issues.apache.org/jira/browse/CASSANDRA-7593
Project: Cassandra
Issue Type: Bug
Environment: java 1.7
Reporter: Russ Hatch
Assignee: Marcus Eriksson
Priority: Critical
Fix For: 2.1.0

I'm seeing two different errors cropping up in the dtest which upgrades a cluster through several versions.
This is the more common error:
{noformat}
ERROR [GossipStage:10] 2014-07-22 13:14:30,028 CassandraDaemon.java:168 - Exception in thread Thread[GossipStage:10,5,main]
java.lang.AssertionError: null
	at org.apache.cassandra.db.filter.SliceQueryFilter.shouldInclude(SliceQueryFilter.java:347) ~[main/:na]
	at org.apache.cassandra.db.filter.QueryFilter.shouldInclude(QueryFilter.java:249) ~[main/:na]
	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249) ~[main/:na]
	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60) ~[main/:na]
	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1873) ~[main/:na]
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1681) ~[main/:na]
	at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:345) ~[main/:na]
	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59) ~[main/:na]
	at org.apache.cassandra.cql3.statements.SelectStatement.readLocally(SelectStatement.java:293) ~[main/:na]
	at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:302) ~[main/:na]
	at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:60) ~[main/:na]
	at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:263) ~[main/:na]
	at org.apache.cassandra.db.SystemKeyspace.getPreferredIP(SystemKeyspace.java:514) ~[main/:na]
	at org.apache.cassandra.net.OutboundTcpConnectionPool.<init>(OutboundTcpConnectionPool.java:51) ~[main/:na]
	at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:522) ~[main/:na]
	at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:536) ~[main/:na]
	at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:689) ~[main/:na]
	at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:663) ~[main/:na]
	at org.apache.cassandra.service.EchoVerbHandler.doVerb(EchoVerbHandler.java:40) ~[main/:na]
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) ~[main/:na]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_60]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_60]
	at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60]
{noformat}
The same test sometimes fails with this exception instead:
{noformat}
ERROR [CompactionExecutor:4] 2014-07-22 16:18:21,008 CassandraDaemon.java:168 - Exception in thread Thread[CompactionExecutor:4,1,RMI Runtime]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7059d3e9 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@108f1504[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 95]
	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048) ~[na:1.7.0_60]
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) ~[na:1.7.0_60]
	at
[jira] [Assigned] (CASSANDRA-7596) Don't swap min/max column names when mutating level or repairedAt
[ https://issues.apache.org/jira/browse/CASSANDRA-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson reassigned CASSANDRA-7596:
---

Assignee: Marcus Eriksson

Don't swap min/max column names when mutating level or repairedAt

Key: CASSANDRA-7596
URL: https://issues.apache.org/jira/browse/CASSANDRA-7596
Project: Cassandra
Issue Type: Bug
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
Fix For: 2.1.0
Attachments: 0001-dont-swap.patch

Seems we swap min/max col names when mutating sstable metadata

--
This message was sent by Atlassian JIRA (v6.2#6252)
git commit: Don't swap max/min column names when mutating sstable metadata.
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1.0 6f15fe260 -> ee62ae104

Don't swap max/min column names when mutating sstable metadata.

Patch by marcuse; reviewed by benedict for CASSANDRA-7596.

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ee62ae10
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ee62ae10
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ee62ae10

Branch: refs/heads/cassandra-2.1.0
Commit: ee62ae104ee2c69d852b488f904b1854aa58aa2a
Parents: 6f15fe2
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:48:24 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:48:24 2014 +0200

----
 CHANGES.txt                                                     | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
----

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ee62ae10/CHANGES.txt
----
diff --git a/CHANGES.txt b/CHANGES.txt
index 0a1ba51..c6aaef9 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -14,6 +14,7 @@
  * Fix tracing of range slices and secondary index lookups that are local
    to the coordinator (CASSANDRA-7599)
  * Set -Dcassandra.storagedir for all tool shell scripts (CASSANDRA-7587)
+ * Don't swap max/min col names when mutating sstable metadata (CASSANDRA-7596)
 Merged from 2.0:
  * Fix ReversedType(DateType) mapping to native protocol (CASSANDRA-7576)
  * Always merge ranges owned by a single node (CASSANDRA-6930)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ee62ae10/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
----
diff --git a/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java b/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
index 900bd4e..a557b88 100644
--- a/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
+++ b/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
@@ -124,8 +124,8 @@ public class StatsMetadata extends MetadataComponent
                              compressionRatio,
                              estimatedTombstoneDropTime,
                              newLevel,
-                             maxColumnNames,
                              minColumnNames,
+                             maxColumnNames,
                              hasLegacyCounterShards,
                              repairedAt);
     }
@@ -141,8 +141,8 @@ public class StatsMetadata extends MetadataComponent
                              compressionRatio,
                              estimatedTombstoneDropTime,
                              sstableLevel,
-                             maxColumnNames,
                              minColumnNames,
+                             maxColumnNames,
                              hasLegacyCounterShards,
                              newRepairedAt);
     }
[1/2] git commit: Don't swap max/min column names when mutating sstable metadata.
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 3744d7792 -> 2236afb7a

Don't swap max/min column names when mutating sstable metadata.

Patch by marcuse; reviewed by benedict for CASSANDRA-7596.

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ee62ae10
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ee62ae10
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ee62ae10

Branch: refs/heads/cassandra-2.1
Commit: ee62ae104ee2c69d852b488f904b1854aa58aa2a
Parents: 6f15fe2
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:48:24 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:48:24 2014 +0200

----
 CHANGES.txt                                                     | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
----
[2/2] git commit: Merge branch 'cassandra-2.1.0' into cassandra-2.1
Merge branch 'cassandra-2.1.0' into cassandra-2.1

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2236afb7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2236afb7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2236afb7

Branch: refs/heads/cassandra-2.1
Commit: 2236afb7a06725f9ceb13bab8c2180eb0d6134f5
Parents: 3744d77 ee62ae1
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:49:06 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:49:06 2014 +0200

----
 CHANGES.txt                                                     | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
----

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2236afb7/CHANGES.txt
----
[2/3] git commit: Merge branch 'cassandra-2.1.0' into cassandra-2.1
Merge branch 'cassandra-2.1.0' into cassandra-2.1

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2236afb7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2236afb7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2236afb7

Branch: refs/heads/trunk
Commit: 2236afb7a06725f9ceb13bab8c2180eb0d6134f5
Parents: 3744d77 ee62ae1
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:49:06 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:49:06 2014 +0200

----
 CHANGES.txt                                                     | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
----

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2236afb7/CHANGES.txt
----
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0fd1a0bb
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0fd1a0bb
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0fd1a0bb

Branch: refs/heads/trunk
Commit: 0fd1a0bb47f66eaa29ce821aa4836c52b65e46e1
Parents: f3aa83b 2236afb
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:49:28 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:49:28 2014 +0200

----
 CHANGES.txt                                                     | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
----

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0fd1a0bb/CHANGES.txt
----
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076202#comment-14076202 ]

Jonathan Ellis commented on CASSANDRA-7582:
---

How do we know what's in the system ks when all we have is a cfid that doesn't match anything known?

More generally, I'm not sure how "stop on unknown cfid" is going to be a useful feature. It's definitely going to happen if you replay a commitlog after dropping a table, for instance, if we have an unclean shutdown in between. This is normal behavior and not a bug per se, so whacking users and not starting up is definitely antisocial. On the other hand, I can't picture a scenario where the user *can* take meaningful action based on failing startup here.

Put another way, ignoring the mutations is the Right Thing to do in every scenario I can think of. So I propose we just log it at info and ignore.

2.1 multi-dc upgrade errors

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
Fix For: 2.1.0

Multi-dc upgrade [was working from 2.0 -> 2.1 fairly recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/], but is currently failing. Running upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test I get the following errors when starting 2.1 upgraded from 2.0:
{code}
ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay failed due to replaying a mutation for a missing table. This error can be ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on the command line
ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception encountered during startup
java.lang.RuntimeException: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [main/:na]
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457) [main/:na]
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) [main/:na]
Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
	at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[main/:na]
	at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[main/:na]
	at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353) ~[main/:na]
	at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333) ~[main/:na]
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365) ~[main/:na]
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98) ~[main/:na]
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) ~[main/:na]
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) ~[main/:na]
{code}

--
This message was sent by Atlassian JIRA (v6.2#6252)
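The log-and-ignore behavior proposed in the comment above can be sketched as follows. Function and variable names (`replay_mutations`, `known_cf_ids`) are illustrative, not Cassandra's replay code — the point is that mutations for an unknown table id are counted and skipped with an info-level log instead of aborting startup:

```python
import logging

log = logging.getLogger("commitlog")

def replay_mutations(mutations, known_cf_ids):
    """Apply mutations for known tables; log and skip the rest.

    mutations: iterable of (cf_id, mutation) pairs from the commit log.
    known_cf_ids: set of table ids present in the current schema.
    """
    applied, skipped = [], 0
    for cf_id, mutation in mutations:
        if cf_id not in known_cf_ids:
            # Normal after a drop + unclean shutdown: not a user-actionable error.
            log.info("Skipping commit log mutation for unknown cfId=%s "
                     "(table likely dropped)", cf_id)
            skipped += 1
            continue
        applied.append(mutation)
    return applied, skipped
```

Under this shape the -Dcassandra.commitlog.stop_on_missing_tables switch becomes unnecessary, which matches the follow-up comment suggesting it should go.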
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076203#comment-14076203 ]

Aleksey Yeschenko commented on CASSANDRA-7582:
---

Indeed, there is no obvious way to recover from it that I can think of. +1 on logging it and going on. -Dcassandra.commitlog.stop_on_missing_tables should also go.

2.1 multi-dc upgrade errors

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
Fix For: 2.1.0

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076297#comment-14076297 ]

Jonathan Ellis commented on CASSANDRA-7056:
---

bq. I'd also vote for making UNLOGGED the default (implicit) BATCH behavior, now that the LOGGED batches would cost even more than they do now.

UNLOGGED is still a misfeature, so I don't see how the cost of RAMP affects our choice of default. (And for the record, I think RAMP should definitely be the default; it matches users' assumptions so much better.) I guess we could add UN_ISOLATED to request logged-without-ramp, though.

Add RAMP transactions

Key: CASSANDRA-7056
URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
Project: Cassandra
Issue Type: Wish
Components: Core
Reporter: Tupshin Harper
Priority: Minor

We should take a look at [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/] transactions, and figure out if they can be used to provide more efficient LWT (or LWT-like) operations.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076300#comment-14076300 ]

Jeremiah Jordan commented on CASSANDRA-7056:
---

bq. UNLOGGED is still a misfeature

UNLOGGED is not always a misfeature. If I was doing batch writes to a single partition, I would make them unlogged. No point in having the overhead of a logged batch for that. But I would not make UNLOGGED the default.

Add RAMP transactions

Key: CASSANDRA-7056
URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
Project: Cassandra
Issue Type: Wish
Components: Core
Reporter: Tupshin Harper
Priority: Minor

We should take a look at [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/] transactions, and figure out if they can be used to provide more efficient LWT (or LWT-like) operations.

--
This message was sent by Atlassian JIRA (v6.2#6252)
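The trade-off in the comment above can be illustrated with a small client-side sketch (a hypothetical helper, not driver or server API): a logged batch buys atomicity across partitions at the cost of a batchlog write first, so a batch whose statements all target one partition is already applied atomically and can safely be UNLOGGED.

```python
def batch_keyword(partition_keys):
    """Pick the CQL batch opener from the set of partitions a batch touches.

    partition_keys: one entry per statement in the batch (illustrative
    stand-in; a real client would derive these from the bound statements).
    """
    if len(set(partition_keys)) <= 1:
        # Single partition: the write is already atomic, skip the batchlog.
        return "BEGIN UNLOGGED BATCH"
    # Multiple partitions: keep the batchlog for cross-partition atomicity.
    return "BEGIN BATCH"

assert batch_keyword(["user:1", "user:1"]) == "BEGIN UNLOGGED BATCH"
assert batch_keyword(["user:1", "user:2"]) == "BEGIN BATCH"
```

This is exactly the "not always a misfeature" case: unlogged is a sensible optimization for single-partition batches, but a risky default for everything else.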
[jira] [Commented] (CASSANDRA-7576) DateType columns not properly converted to TimestampType when in ReversedType columns.
[ https://issues.apache.org/jira/browse/CASSANDRA-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076313#comment-14076313 ]

Karl Rieb commented on CASSANDRA-7576:
---

bq. Karl Rieb I know it wasn't a big deal to you, but anyway - when cherry-picking the patch back to 2.1.0, I did correct the name to yours in 'patch by' (:

Thanks [~iamaleksey]!

DateType columns not properly converted to TimestampType when in ReversedType columns.

Key: CASSANDRA-7576
URL: https://issues.apache.org/jira/browse/CASSANDRA-7576
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Karl Rieb
Assignee: Karl Rieb
Fix For: 2.0.10, 2.1.0
Attachments: DataType_CASSANDRA_7576.patch
Original Estimate: 0.25h
Remaining Estimate: 0.25h

The {{org.apache.cassandra.transport.DataType.fromType(AbstractType)}} method has a bug that prevents sending the correct protocol ID for reversed {{DateType}} columns. This results in clients receiving protocol ID {{0}}, which maps to a {{CUSTOM}} type, for timestamp columns that are clustered in reverse order. Some clients can handle this properly because they recognize the {{org.apache.cassandra.db.marshal.DateType}} marshaling type; however, the native DataStax java-driver does not. It will produce errors like the one below when trying to prepare queries against such tables:
{noformat}
com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for value 2 of CQL type 'org.apache.cassandra.db.marshal.DateType', expecting class java.nio.ByteBuffer but class java.util.Date provided
	at com.datastax.driver.core.BoundStatement.bind(BoundStatement.java:190)
	at com.datastax.driver.core.DefaultPreparedStatement.bind(DefaultPreparedStatement.java:103)
{noformat}
On the Cassandra side, there is a check for {{DateType}} columns that is supposed to convert these columns to TimestampType. However, the check is skipped when the column is also reversed. Specifically:
{code:title=DataType.java|borderStyle=solid}
public static Pair<DataType, Object> fromType(AbstractType type)
{
    // For CQL3 clients, ReversedType is an implementation detail and they
    // shouldn't have to care about it.
    if (type instanceof ReversedType)
        type = ((ReversedType)type).baseType;
    // For compatibility sake, we still return DateType as the timestamp type in resultSet metadata (#5723)
    else if (type instanceof DateType)
        type = TimestampType.instance;
    // ...
{code}
The *else if* should be changed to just an *if*, like so:
{code:title=DataType.java|borderStyle=solid}
public static Pair<DataType, Object> fromType(AbstractType type)
{
    // For CQL3 clients, ReversedType is an implementation detail and they
    // shouldn't have to care about it.
    if (type instanceof ReversedType)
        type = ((ReversedType)type).baseType;
    // For compatibility sake, we still return DateType as the timestamp type in resultSet metadata (#5723)
    if (type instanceof DateType)
        type = TimestampType.instance;
    // ...
{code}
This bug is preventing us from upgrading our 1.2.11 cluster to 2.0.9, because our clients keep throwing exceptions trying to read or write data to tables with reversed timestamp columns. The issue can be reproduced by creating a CQL table in Cassandra 1.2.11 that clusters on a timestamp in reverse, then upgrading the node to 2.0.9. When querying the metadata for the table, the node will return protocol ID 0 (CUSTOM) instead of protocol ID 11 (TIMESTAMP).

--
This message was sent by Atlassian JIRA (v6.2#6252)
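The one-character nature of the fix stands out in a minimal Python rendering of the control flow described above. The class names mirror the Java ones but the logic is stripped down to the type-unwrapping step; this is an illustration, not Cassandra code:

```python
class DateType: pass
class TimestampType: pass

class ReversedType:
    """Wrapper marking a reversed clustering order around a base type."""
    def __init__(self, base_type):
        self.base_type = base_type

def from_type_buggy(t):
    # Unwrap ReversedType, *else* convert DateType: a reversed DateType is
    # unwrapped but then falls through unconverted -- the reported bug.
    if isinstance(t, ReversedType):
        t = t.base_type
    elif isinstance(t, DateType):
        t = TimestampType()
    return t

def from_type_fixed(t):
    if isinstance(t, ReversedType):
        t = t.base_type
    if isinstance(t, DateType):  # independent `if`: also runs after unwrapping
        t = TimestampType()
    return t

# A reversed DateType column: the buggy path leaves a raw DateType (reported
# to clients as CUSTOM), the fixed path converts it (TIMESTAMP).
assert isinstance(from_type_buggy(ReversedType(DateType())), DateType)
assert isinstance(from_type_fixed(ReversedType(DateType())), TimestampType)
```

For a non-reversed DateType both versions behave identically, which is why the bug only surfaced on tables clustered on a timestamp in reverse.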
[jira] [Updated] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
[ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] graham sanderson updated CASSANDRA-7546: Attachment: 7546.20_5.txt I've add 7646.20_5.txt which is the same as 7546_20.4.txt but with a minor change that allows it to function correctly to close to the full 32 bits of time range vs 31 bits. # Any thoughts on metrics? I'm thinking a simple CF (and rolled up KS) metric which simply counts number of highly contented rows over time. Note, we do know when a row was partially contented, but I don't know that we can assign a meaningful value between 0 1. Note, we could do a ratio of good vs bad rows on flush, but I think the raw count is more interesting # Note, I plan to move the static {} block at the top to a test case for sanity checking - it doesn't belong mixed in the code... Once we're all set I'll submit an actual patch for 2.0.x and 2.1.x - should we patch this in 1.1/1.2 also? # Any other thoughts? I'd like to start testing this (but don't want to do so if it you want to make major changes). I'll test on top of 2.0.10 in beta with our code and cassandra stress (hopefully some scenarios you have in 2.1 both with a node down for hinting and not), and maybe after that with the tracking/metric on but the synchronized off in production just to check that it exactly detects our hint storms and nothing else in production (we have no application tables that should be heavily contented on the partition level). 
I'll make and test a patch on 2.1 also; however, I'll have to finish testing on 2.0.x before I can upgrade a (fast h/w) cluster to 2.1. AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory - Key: CASSANDRA-7546 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546 Project: Cassandra Issue Type: Bug Components: Core Reporter: graham sanderson Assignee: graham sanderson Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 7546.20_4.txt, 7546.20_5.txt, 7546.20_alt.txt, suggestion1.txt, suggestion1_21.txt In order to preserve atomicity, this code attempts to read, clone/update, then CAS the state of the partition. Under heavy contention for updating a single partition this can cause some fairly staggering memory growth (the more cores on your machine, the worse it gets). Whilst many usage patterns don't do highly concurrent updates to the same partition, hinting today does, and in this case wild (order(s) of magnitude more than expected) memory allocation rates can be seen (especially when the updates being hinted are small updates to different partitions, which can happen very fast on their own) - see CASSANDRA-7545. It would be best to eliminate/reduce/limit the spinning memory allocation whilst not slowing down the very common uncontended case. -- This message was sent by Atlassian JIRA (v6.2#6252)
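The read/clone/CAS pattern described above can be illustrated with a hypothetical stand-in (this is not Cassandra's actual AtomicSortedColumns code): every failed compareAndSet discards a freshly allocated clone, so garbage produced scales with contention, not with useful work.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the addAllWithSizeDelta spin loop:
// read the current state, clone and update it, then CAS it back.
// Each CAS failure throws the just-built clone away and retries.
public class SpinLoopSketch
{
    static final AtomicReference<TreeMap<String, String>> state =
            new AtomicReference<>(new TreeMap<>());
    static final AtomicInteger wastedClones = new AtomicInteger();

    static void add(String name, String value)
    {
        while (true)
        {
            TreeMap<String, String> current = state.get();
            TreeMap<String, String> updated = new TreeMap<>(current); // fresh allocation per attempt
            updated.put(name, value);
            if (state.compareAndSet(current, updated))
                return;
            wastedClones.incrementAndGet(); // clone lost to contention, now garbage
        }
    }

    public static void main(String[] args) throws InterruptedException
    {
        Thread[] threads = new Thread[8];
        for (int t = 0; t < threads.length; t++)
        {
            final int id = t;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++)
                    add("col-" + id + "-" + i, "v");
            });
            threads[t].start();
        }
        for (Thread th : threads)
            th.join();
        // All 8000 inserts land, but under contention far more than 8000
        // clones may have been allocated along the way.
        System.out.println("entries=" + state.get().size()
                           + " wastedClones=" + wastedClones.get());
    }
}
```

The more threads (cores) racing on the same reference, the higher the ratio of wasted clones to successful updates, which is the memory-growth effect the ticket describes for heavily hinted partitions.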
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076399#comment-14076399 ] Jonathan Ellis commented on CASSANDRA-7582: --- This was introduced by CASSANDRA-7125 for 2.1.1 and is not in the 2.1.0 branch. Is this actually a problem with rc4 [~enigmacurry]? 2.1 multi-dc upgrade errors --- Key: CASSANDRA-7582 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582 Project: Cassandra Issue Type: Bug Components: Core Reporter: Ryan McGuire Assignee: Benedict Priority: Critical Fix For: 2.1.0 Multi-dc upgrade [was working from 2.0 - 2.1 fairly recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/], but is currently failing. Running upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test I get the following errors when starting 2.1 upgraded from 2.0: {code} ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay failed due to replaying a mutation for a missing table. 
This error can be ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on the command line ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception encountered during startup java.lang.RuntimeException: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004 at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [main/:na] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457) [main/:na] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) [main/:na] Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004 at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[main/:na] at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[main/:na] at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353) ~[main/:na] at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333) ~[main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365) ~[main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98) ~[main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) ~[main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) ~[main/:na] {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
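The failure mode in the trace above can be sketched as follows; the names and structure are illustrative stand-ins, not Cassandra's real replay code. Replay resolves each mutation's cfId against the known schema, and an unknown id aborts startup unless the operator opts into skipping (mirroring the -Dcassandra.commitlog.stop_on_missing_tables=false hint in the log).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of commit-log replay hitting a mutation whose
// column-family id no longer exists in the schema.
public class ReplaySketch
{
    static class UnknownColumnFamilyException extends RuntimeException
    {
        UnknownColumnFamilyException(UUID cfId) { super("Couldn't find cfId=" + cfId); }
    }

    // Stand-in for the node's schema: cfId -> table name.
    static final Map<UUID, String> schema = new HashMap<>();

    // Resolve a replayed mutation's cfId; unknown ids either abort
    // (stopOnMissing=true, as in the error above) or are skipped.
    static String resolve(UUID cfId, boolean stopOnMissing)
    {
        String table = schema.get(cfId);
        if (table == null)
        {
            if (stopOnMissing)
                throw new UnknownColumnFamilyException(cfId); // aborts startup
            return null; // skip this mutation and keep replaying
        }
        return table;
    }

    public static void main(String[] args)
    {
        UUID known = UUID.randomUUID();
        schema.put(known, "ks.table");
        System.out.println(resolve(known, true));
        System.out.println(resolve(UUID.randomUUID(), false));
        try
        {
            resolve(UUID.randomUUID(), true);
        }
        catch (UnknownColumnFamilyException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```

In the upgrade scenario reported here, the schema migration left a cfId in the commit log with no matching table definition, so the strict path fires and the daemon exits.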
[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076404#comment-14076404 ] Jonathan Ellis commented on CASSANDRA-7056: --- FTR, we transform single-partition batches to UNLOGGED automagically, since you are right; there is no point in the logging overhead there. Add RAMP transactions - Key: CASSANDRA-7056 URL: https://issues.apache.org/jira/browse/CASSANDRA-7056 Project: Cassandra Issue Type: Wish Components: Core Reporter: Tupshin Harper Priority: Minor We should take a look at [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/] transactions, and figure out if they can be used to provide more efficient LWT (or LWT-like) operations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7593) Errors when upgrading through several versions to 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076411#comment-14076411 ] Jonathan Ellis commented on CASSANDRA-7593: --- bq. Could we assume EOC.START for RT.min in 2.1 when deserializing old sstables? That sounds like the right fix to me. Errors when upgrading through several versions to 2.1 - Key: CASSANDRA-7593 URL: https://issues.apache.org/jira/browse/CASSANDRA-7593 Project: Cassandra Issue Type: Bug Environment: java 1.7 Reporter: Russ Hatch Assignee: Marcus Eriksson Priority: Critical Fix For: 2.1.0 I'm seeing two different errors cropping up in the dtest which upgrades a cluster through several versions. This is the more common error: {noformat} ERROR [GossipStage:10] 2014-07-22 13:14:30,028 CassandraDaemon.java:168 - Exception in thread Thread[GossipStage:10,5,main] java.lang.AssertionError: null at org.apache.cassandra.db.filter.SliceQueryFilter.shouldInclude(SliceQueryFilter.java:347) ~[main/:na] at org.apache.cassandra.db.filter.QueryFilter.shouldInclude(QueryFilter.java:249) ~[main/:na] at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249) ~[main/:na] at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60) ~[main/:na] at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1873) ~[main/:na] at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1681) ~[main/:na] at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:345) ~[main/:na] at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement.readLocally(SelectStatement.java:293) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:302) ~[main/:na] at 
org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:60) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:263) ~[main/:na] at org.apache.cassandra.db.SystemKeyspace.getPreferredIP(SystemKeyspace.java:514) ~[main/:na] at org.apache.cassandra.net.OutboundTcpConnectionPool.init(OutboundTcpConnectionPool.java:51) ~[main/:na] at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:522) ~[main/:na] at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:536) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:689) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:663) ~[main/:na] at org.apache.cassandra.service.EchoVerbHandler.doVerb(EchoVerbHandler.java:40) ~[main/:na] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) ~[main/:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_60] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_60] at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60] {noformat} The same test sometimes fails with this exception instead: {noformat} ERROR [CompactionExecutor:4] 2014-07-22 16:18:21,008 CassandraDaemon.java:168 - Exception in thread Thread[CompactionExecutor:4,1,RMI Runtime] java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7059d3e9 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@108f1504[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 95] at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048) ~[na:1.7.0_60] at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) ~[na:1.7.0_60] at 
java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325) ~[na:1.7.0_60] at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530) ~[na:1.7.0_60] at java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:619) ~[na:1.7.0_60] at org.apache.cassandra.io.sstable.SSTableReader.scheduleTidy(SSTableReader.java:628) ~[main/:na] at
[jira] [Created] (CASSANDRA-7631) Allow Stress to write directly to SSTables
Russell Alexander Spitzer created CASSANDRA-7631: Summary: Allow Stress to write directly to SSTables Key: CASSANDRA-7631 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Russell Alexander Spitzer One common difficulty with benchmarking machines is the amount of time it takes to initially load data. For machines with a large amount of RAM this becomes especially onerous, because a very large amount of data needs to be placed on the machine before the page cache can be circumvented. To remedy this I suggest we add a top-level flag to cassandra-stress which would cause the tool to write directly to sstables rather than actually performing CQL inserts. Internally this would use CQLSSTableWriter to write directly to sstables while skipping any keys which are not owned by the node stress is running on. The same stress command run on each node in the cluster would then write unique sstables containing only data which that node is responsible for. Following this, no further network IO would be required to distribute the data, as it would all already be correctly in place. -- This message was sent by Atlassian JIRA (v6.2#6252)
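The "skip keys not owned by this node" filter could be sketched as below. This is a simplification under stated assumptions: a real implementation would use the cluster's partitioner (e.g. Murmur3Partitioner) and the actual token ring metadata, while CRC32 here is just a stand-in hash and the range is a single non-wrapping interval.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical ownership filter for a direct-to-sstable stress mode:
// hash the partition key to a token and keep the row only if the token
// falls inside the local node's range.
public class OwnershipFilterSketch
{
    // Stand-in tokenizer; a real version would delegate to the partitioner.
    static long token(String partitionKey)
    {
        CRC32 crc = new CRC32();
        crc.update(partitionKey.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Half-open range (rangeStart, rangeEnd]; a wrap-around range
    // would need an extra case.
    static boolean ownedByLocalNode(String key, long rangeStart, long rangeEnd)
    {
        long t = token(key);
        return t > rangeStart && t <= rangeEnd;
    }

    public static void main(String[] args)
    {
        long t = token("customer-42");
        System.out.println(ownedByLocalNode("customer-42", t - 1, t)); // token inside range: kept
        System.out.println(ownedByLocalNode("customer-42", t, t + 1)); // token at range start: skipped
    }
}
```

Running the same deterministic stress workload on every node with such a filter means each node emits only its own shard of the data, which is what makes the subsequent no-network-IO claim work.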
[jira] [Commented] (CASSANDRA-7629) tracing no longer logs when the request completed
[ https://issues.apache.org/jira/browse/CASSANDRA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076416#comment-14076416 ] Jonathan Ellis commented on CASSANDRA-7629: --- I think you're mis-remembering. The only "request complete" in 2.0 is a debug log entry, which is still there in 2.1: {code}
public void stopSession()
{
    TraceState state = this.state.get();
    if (state == null) // inline isTracing to avoid implicit two calls to state.get()
    {
        logger.debug("request complete");
    }
{code} tracing no longer logs when the request completed - Key: CASSANDRA-7629 URL: https://issues.apache.org/jira/browse/CASSANDRA-7629 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Fix For: 2.1.1 In 2.0 and before, there is a Request complete entry in tracing, which no longer appears in 2.1. This makes it difficult to reason about latency/performance problems in a trace. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7629) tracing no longer logs when the request completed
[ https://issues.apache.org/jira/browse/CASSANDRA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7629: -- Priority: Minor (was: Major) Fix Version/s: (was: 2.1.0) 2.1.1 tracing no longer logs when the request completed - Key: CASSANDRA-7629 URL: https://issues.apache.org/jira/browse/CASSANDRA-7629 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Priority: Minor Fix For: 2.1.1 In 2.0 and before, there is a Request complete entry in tracing, which no longer appears in 2.1. This makes it difficult to reason about latency/performance problems in a trace. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076415#comment-14076415 ] Russell Alexander Spitzer commented on CASSANDRA-7631: -- I think we can implement this by writing a new client, SSTableClient, which would create the directory structure and, instead of executing CQL statements, add rows to a CQLSSTableWriter. Allow Stress to write directly to SSTables -- Key: CASSANDRA-7631 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Russell Alexander Spitzer One common difficulty with benchmarking machines is the amount of time it takes to initially load data. For machines with a large amount of RAM this becomes especially onerous, because a very large amount of data needs to be placed on the machine before the page cache can be circumvented. To remedy this I suggest we add a top-level flag to cassandra-stress which would cause the tool to write directly to sstables rather than actually performing CQL inserts. Internally this would use CQLSSTableWriter to write directly to sstables while skipping any keys which are not owned by the node stress is running on. The same stress command run on each node in the cluster would then write unique sstables containing only data which that node is responsible for. Following this, no further network IO would be required to distribute the data, as it would all already be correctly in place. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Alexander Spitzer reassigned CASSANDRA-7631: Assignee: Russell Alexander Spitzer Allow Stress to write directly to SSTables -- Key: CASSANDRA-7631 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Russell Alexander Spitzer Assignee: Russell Alexander Spitzer One common difficulty with benchmarking machines is the amount of time it takes to initially load data. For machines with a large amount of RAM this becomes especially onerous, because a very large amount of data needs to be placed on the machine before the page cache can be circumvented. To remedy this I suggest we add a top-level flag to cassandra-stress which would cause the tool to write directly to sstables rather than actually performing CQL inserts. Internally this would use CQLSSTableWriter to write directly to sstables while skipping any keys which are not owned by the node stress is running on. The same stress command run on each node in the cluster would then write unique sstables containing only data which that node is responsible for. Following this, no further network IO would be required to distribute the data, as it would all already be correctly in place. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7575) Custom 2i validation
[ https://issues.apache.org/jira/browse/CASSANDRA-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7575: -- Fix Version/s: (was: 2.1.0) (was: 3.0) 2.1.1 Custom 2i validation Key: CASSANDRA-7575 URL: https://issues.apache.org/jira/browse/CASSANDRA-7575 Project: Cassandra Issue Type: Improvement Components: API Reporter: Andrés de la Peña Assignee: Andrés de la Peña Priority: Minor Labels: 2i, cql3, secondaryIndex, secondary_index, select Fix For: 2.1.1 Attachments: 2i_validation.patch There are several projects using custom secondary indexes as an extension point to integrate C* with other systems such as Solr or Lucene. The usual approach is to embed third-party indexing queries in CQL clauses. For example, [DSE Search|http://www.datastax.com/what-we-offer/products-services/datastax-enterprise] embeds Solr syntax this way: {code} SELECT title FROM solr WHERE solr_query='title:natio*'; {code} [Stratio platform|https://github.com/Stratio/stratio-cassandra] embeds custom JSON syntax for searching in Lucene indexes: {code} SELECT * FROM tweets WHERE lucene='{ filter : { type: "range", field: "time", lower: "2014/04/25", upper: "2014/04/1" }, query : { type: "phrase", field: "body", values: ["big", "data"] }, sort : { fields: [ {field: "time", reverse: true} ] } }'; {code} Tuplejump [Stargate|http://tuplejump.github.io/stargate/] also uses Stratio's open-source JSON syntax: {code} SELECT name,company FROM PERSON WHERE stargate ='{ filter: { type: "range", field: "company", lower: "a", upper: "p" }, sort: { fields: [{field: "name", reverse: true}] } }'; {code} These syntaxes are validated by the corresponding 2i implementation. This validation is done behind the StorageProxy command distribution, so, as far as I know, there is no way to give rich feedback about syntax errors to CQL users. I'm uploading a patch with some changes trying to improve this.
I propose adding an empty validation method to SecondaryIndexSearcher that can be overridden by custom 2i implementations: {code}
public void validate(List<IndexExpression> clause) {}
{code} And call it from SelectStatement#getRangeCommand: {code}
ColumnFamilyStore cfs = Keyspace.open(keyspace()).getColumnFamilyStore(columnFamily());
for (SecondaryIndexSearcher searcher : cfs.indexManager.getIndexSearchersForQuery(expressions))
{
    try
    {
        searcher.validate(expressions);
    }
    catch (RuntimeException e)
    {
        String exceptionMessage = e.getMessage();
        if (exceptionMessage != null && !exceptionMessage.trim().isEmpty())
            throw new InvalidRequestException("Invalid index expression: " + e.getMessage());
        else
            throw new InvalidRequestException("Invalid index expression");
    }
}
{code} In this way C* allows custom 2i implementations to give feedback about syntax errors. We are currently using these changes in a fork with no problems. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-7632) NPE in AutoSavingCache$Writer.deleteOldCacheFiles
Vishy Kasar created CASSANDRA-7632: -- Summary: NPE in AutoSavingCache$Writer.deleteOldCacheFiles Key: CASSANDRA-7632 URL: https://issues.apache.org/jira/browse/CASSANDRA-7632 Project: Cassandra Issue Type: Bug Components: Core Reporter: Vishy Kasar Priority: Minor Observed this NPE in one of our production clusters (2.0.9). It does not seem to be causing harm, but it would be good to resolve. ERROR [CompactionExecutor:1188] 2014-07-27 21:57:08,225 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1188,1,main] clusterName=clouddb_p03 java.lang.NullPointerException at org.apache.cassandra.cache.AutoSavingCache$Writer.deleteOldCacheFiles(AutoSavingCache.java:265) at org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:195) at org.apache.cassandra.db.compaction.CompactionManager$10.run(CompactionManager.java:862) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) -- This message was sent by Atlassian JIRA (v6.2#6252)
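The ticket does not identify the root cause, but a common source of NPEs in directory-cleanup code like this is that java.io.File.listFiles() returns null (not an empty array) when the path is missing, not a directory, or unreadable. A defensive sketch, under that assumption and with invented names:

```java
import java.io.File;

// Defensive sketch of old-cache-file cleanup. File.listFiles() returns
// null rather than an empty array when the directory cannot be read,
// so the result must be null-checked before iterating.
public class CacheCleanupSketch
{
    static int deleteOldCacheFiles(File savedCachesDir, String prefix)
    {
        File[] files = savedCachesDir.listFiles(
                (dir, name) -> name.startsWith(prefix));
        if (files == null)
            return 0; // directory missing/unreadable: nothing to delete
        int deleted = 0;
        for (File f : files)
            if (f.delete())
                deleted++;
        return deleted;
    }

    public static void main(String[] args)
    {
        // A nonexistent directory exercises the null branch instead of an NPE.
        System.out.println(deleteOldCacheFiles(
                new File("/nonexistent-saved-caches"), "KeyCache"));
    }
}
```

Iterating the listFiles() result without the null check reproduces exactly this kind of sporadic NPE, since the failure depends on the state of the saved-caches directory at flush time.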
[jira] [Commented] (CASSANDRA-7628) Tools java driver needs to be updated
[ https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076433#comment-14076433 ] Brandon Williams commented on CASSANDRA-7628: - Looks like upgrading to 2.1 is going to require code changes. Tools java driver needs to be updated - Key: CASSANDRA-7628 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Brandon Williams Priority: Minor Fix For: 2.1.0 When you run stress currently you get a bunch of harmless stacktraces like: {noformat} ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions() will return null java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230) [cassandra-driver-core-2.0.1.jar:na] at 
com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) [cassandra-driver-core-2.0.1.jar:na] at org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90) [stress/:na] at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177) [stress/:na] at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159) [stress/:na] at org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) [stress/:na] {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (CASSANDRA-7628) Tools java driver needs to be updated
[ https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-7628: --- Assignee: Benedict Tools java driver needs to be updated - Key: CASSANDRA-7628 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Brandon Williams Assignee: Benedict Priority: Minor Fix For: 2.1.0 When you run stress currently you get a bunch of harmless stacktraces like: {noformat} ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions() will return null java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170) [cassandra-driver-core-2.0.1.jar:na] at 
com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) [cassandra-driver-core-2.0.1.jar:na] at org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90) [stress/:na] at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177) [stress/:na] at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159) [stress/:na] at org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) [stress/:na] {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7629) tracing no longer logs when the request completed
[ https://issues.apache.org/jira/browse/CASSANDRA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076435#comment-14076435 ] Brandon Williams commented on CASSANDRA-7629: - I'm really not though, see my paste in CASSANDRA-7567 tracing no longer logs when the request completed - Key: CASSANDRA-7629 URL: https://issues.apache.org/jira/browse/CASSANDRA-7629 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Priority: Minor Fix For: 2.1.1 In 2.0 and before, there is a Request complete entry in tracing, which no longer appears in 2.1. This makes it difficult to reason about latency/performance problems in a trace. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7629) tracing no longer logs when the request completed
[ https://issues.apache.org/jira/browse/CASSANDRA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076436#comment-14076436 ] Brandon Williams commented on CASSANDRA-7629: - Aha, it's cqlsh that adds that: {noformat} pylib/cqlshlib/tracing.py:rows.append(['Request complete', finished_at, coordinator, duration]) {noformat} tracing no longer logs when the request completed - Key: CASSANDRA-7629 URL: https://issues.apache.org/jira/browse/CASSANDRA-7629 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Priority: Minor Fix For: 2.1.1 In 2.0 and before, there is a Request complete entry in tracing, which no longer appears in 2.1. This makes it difficult to reason about latency/performance problems in a trace. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (CASSANDRA-7632) NPE in AutoSavingCache$Writer.deleteOldCacheFiles
[ https://issues.apache.org/jira/browse/CASSANDRA-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-7632: --- Assignee: Marcus Eriksson NPE in AutoSavingCache$Writer.deleteOldCacheFiles - Key: CASSANDRA-7632 URL: https://issues.apache.org/jira/browse/CASSANDRA-7632 Project: Cassandra Issue Type: Bug Components: Core Reporter: Vishy Kasar Assignee: Marcus Eriksson Priority: Minor Fix For: 2.0.10 Observed this NPE in one of our production clusters (2.0.9). It does not seem to be causing harm, but it would be good to resolve. ERROR [CompactionExecutor:1188] 2014-07-27 21:57:08,225 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1188,1,main] clusterName=clouddb_p03 java.lang.NullPointerException at org.apache.cassandra.cache.AutoSavingCache$Writer.deleteOldCacheFiles(AutoSavingCache.java:265) at org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:195) at org.apache.cassandra.db.compaction.CompactionManager$10.run(CompactionManager.java:862) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7632) NPE in AutoSavingCache$Writer.deleteOldCacheFiles
[ https://issues.apache.org/jira/browse/CASSANDRA-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-7632: Fix Version/s: 2.0.10 NPE in AutoSavingCache$Writer.deleteOldCacheFiles - Key: CASSANDRA-7632 URL: https://issues.apache.org/jira/browse/CASSANDRA-7632 Project: Cassandra Issue Type: Bug Components: Core Reporter: Vishy Kasar Assignee: Marcus Eriksson Priority: Minor Fix For: 2.0.10 Observed this NPE in one of our production clusters (2.0.9). It does not seem to be causing harm, but it would be good to resolve. ERROR [CompactionExecutor:1188] 2014-07-27 21:57:08,225 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1188,1,main] clusterName=clouddb_p03 java.lang.NullPointerException at org.apache.cassandra.cache.AutoSavingCache$Writer.deleteOldCacheFiles(AutoSavingCache.java:265) at org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:195) at org.apache.cassandra.db.compaction.CompactionManager$10.run(CompactionManager.java:862) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7628) Tools java driver needs to be updated
[ https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076447#comment-14076447 ] Benedict commented on CASSANDRA-7628: - bq. going to require code changes. Could you elaborate? This appears to be a Java Driver bug. Tools java driver needs to be updated - Key: CASSANDRA-7628 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Brandon Williams Assignee: Benedict Priority: Minor Fix For: 2.1.0 When you run stress currently you get a bunch of harmless stacktraces like: {noformat} ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions() will return null java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) ~[cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230) [cassandra-driver-core-2.0.1.jar:na] at 
com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) [cassandra-driver-core-2.0.1.jar:na] at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) [cassandra-driver-core-2.0.1.jar:na] at org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90) [stress/:na] at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177) [stress/:na] at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159) [stress/:na] at org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) [stress/:na] {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7628) Tools java driver needs to be updated
[ https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-7628:
--------------------------------------
    Fix Version/s:     (was: 2.1.0)
                   2.1.1

(This only affects stress. Pushing to 2.1.1.)
[jira] [Comment Edited] (CASSANDRA-7628) Tools java driver needs to be updated
[ https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076450#comment-14076450 ]

Jonathan Ellis edited comment on CASSANDRA-7628 at 7/28/14 5:43 PM:
--------------------------------------------------------------------

(This only affects stress and hadoop. Pushing to 2.1.1.)

was (Author: jbellis):
(This only affects stress. Pushing to 2.1.1.)
[jira] [Commented] (CASSANDRA-7628) Tools java driver needs to be updated
[ https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076461#comment-14076461 ]

Brandon Williams commented on CASSANDRA-7628:
---------------------------------------------

Well, I tried it and got compile errors in stress.

https://github.com/datastax/java-driver/blob/2.1/driver-core/Upgrade_guide_to_2.1.rst
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076459#comment-14076459 ]

Matt Kennedy commented on CASSANDRA-7631:
-----------------------------------------

Having a mechanism like this is extremely important for testing large-scale clusters. We don't necessarily want or need to test a large-scale ingest each time, so the sooner we can go from spinning up 100 nodes to running a mixed workload, the better. If one invocation of stress can tell 100 stressd processes to write local SSTables according to the user-defined yaml, that should be massively more efficient than running a write job.

Allow Stress to write directly to SSTables
------------------------------------------

                Key: CASSANDRA-7631
                URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
            Project: Cassandra
         Issue Type: Improvement
         Components: Tools
           Reporter: Russell Alexander Spitzer
           Assignee: Russell Alexander Spitzer

One common difficulty with benchmarking machines is the amount of time it takes to initially load data. For machines with a large amount of RAM this becomes especially onerous, because a very large amount of data needs to be placed on the machine before the page cache can be circumvented. To remedy this I suggest we add a top-level flag to cassandra-stress which would cause the tool to write directly to sstables rather than actually performing CQL inserts. Internally this would use CQLSSTableWriter to write directly to sstables while skipping any keys which are not owned by the node stress is running on. The same stress command run on each node in the cluster would then write unique sstables containing only data which that node is responsible for. Following this, no further network IO would be required to distribute data, as it would all already be correctly in place.
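The node-local filtering the ticket describes (every node runs the same stress command, but each keeps only the keys whose token it owns, so the resulting sstables are disjoint and need no network IO to place) can be sketched as below. This is a toy illustration: the hash, ring layout, and function names are simplified stand-ins for Cassandra's Murmur3Partitioner and CQLSSTableWriter, not the real APIs.

```python
import bisect
import hashlib

def token(key: bytes) -> int:
    """Hypothetical stand-in for a partitioner: map a partition key to a token."""
    return int.from_bytes(hashlib.md5(key).digest()[:8], "big", signed=True)

class Ring:
    """A toy token ring: each node owns keys up to (and including) its token;
    tokens past the largest wrap around to the first node."""
    def __init__(self, node_tokens):  # node_tokens: {node_name: token}
        self.sorted = sorted((t, n) for n, t in node_tokens.items())

    def owner(self, key: bytes) -> str:
        t = token(key)
        idx = bisect.bisect_left(self.sorted, (t, ""))
        if idx == len(self.sorted):  # wrap around the ring
            idx = 0
        return self.sorted[idx][1]

def keys_for_local_sstable(ring, local_node, keys):
    """Keep only the keys this node is responsible for. Every node running
    the same command over the same key stream keeps a disjoint subset."""
    return [k for k in keys if ring.owner(k) == local_node]
```

Because `owner` is a pure function of the key, the per-node subsets always partition the full key stream, which is the property that lets each node write its sstables independently.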
[jira] [Updated] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McGuire updated CASSANDRA-7582:
------------------------------------
    Since Version: 2.1.1  (was: 2.1 rc3)

2.1 multi-dc upgrade errors
---------------------------

                Key: CASSANDRA-7582
                URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
            Project: Cassandra
         Issue Type: Bug
         Components: Core
           Reporter: Ryan McGuire
           Assignee: Benedict
           Priority: Critical
            Fix For: 2.1.0

Multi-dc upgrade [was working from 2.0 - 2.1 fairly recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/], but is currently failing. Running upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test I get the following errors when starting 2.1 upgraded from 2.0:

{code}
ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay failed due to replaying a mutation for a missing table.
This error can be ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on the command line
ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception encountered during startup
java.lang.RuntimeException: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [main/:na]
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457) [main/:na]
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) [main/:na]
Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[main/:na]
        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[main/:na]
        at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353) ~[main/:na]
        at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333) ~[main/:na]
        at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365) ~[main/:na]
        at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98) ~[main/:na]
        at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) ~[main/:na]
        at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) ~[main/:na]
{code}
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076465#comment-14076465 ]

Ryan McGuire commented on CASSANDRA-7582:
-----------------------------------------

I think this was version tagged incorrectly. I'm seeing CASSANDRA-7593 on rc4 instead of this one.
[jira] [Updated] (CASSANDRA-7593) Errors when upgrading through several versions to 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McGuire updated CASSANDRA-7593:
------------------------------------
    Reproduced In: 2.1 rc4

Errors when upgrading through several versions to 2.1
-----------------------------------------------------

                Key: CASSANDRA-7593
                URL: https://issues.apache.org/jira/browse/CASSANDRA-7593
            Project: Cassandra
         Issue Type: Bug
        Environment: java 1.7
           Reporter: Russ Hatch
           Assignee: Marcus Eriksson
           Priority: Critical
            Fix For: 2.1.0

I'm seeing two different errors cropping up in the dtest which upgrades a cluster through several versions. This is the more common error:

{noformat}
ERROR [GossipStage:10] 2014-07-22 13:14:30,028 CassandraDaemon.java:168 - Exception in thread Thread[GossipStage:10,5,main]
java.lang.AssertionError: null
        at org.apache.cassandra.db.filter.SliceQueryFilter.shouldInclude(SliceQueryFilter.java:347) ~[main/:na]
        at org.apache.cassandra.db.filter.QueryFilter.shouldInclude(QueryFilter.java:249) ~[main/:na]
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249) ~[main/:na]
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60) ~[main/:na]
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1873) ~[main/:na]
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1681) ~[main/:na]
        at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:345) ~[main/:na]
        at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59) ~[main/:na]
        at org.apache.cassandra.cql3.statements.SelectStatement.readLocally(SelectStatement.java:293) ~[main/:na]
        at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:302) ~[main/:na]
        at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:60) ~[main/:na]
        at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:263) ~[main/:na]
        at org.apache.cassandra.db.SystemKeyspace.getPreferredIP(SystemKeyspace.java:514) ~[main/:na]
        at org.apache.cassandra.net.OutboundTcpConnectionPool.init(OutboundTcpConnectionPool.java:51) ~[main/:na]
        at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:522) ~[main/:na]
        at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:536) ~[main/:na]
        at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:689) ~[main/:na]
        at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:663) ~[main/:na]
        at org.apache.cassandra.service.EchoVerbHandler.doVerb(EchoVerbHandler.java:40) ~[main/:na]
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) ~[main/:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_60]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_60]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60]
{noformat}

The same test sometimes fails with this exception instead:

{noformat}
ERROR [CompactionExecutor:4] 2014-07-22 16:18:21,008 CassandraDaemon.java:168 - Exception in thread Thread[CompactionExecutor:4,1,RMI Runtime]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7059d3e9 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@108f1504[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 95]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048) ~[na:1.7.0_60]
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) ~[na:1.7.0_60]
        at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325) ~[na:1.7.0_60]
        at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530) ~[na:1.7.0_60]
        at java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:619) ~[na:1.7.0_60]
        at org.apache.cassandra.io.sstable.SSTableReader.scheduleTidy(SSTableReader.java:628) ~[main/:na]
        at org.apache.cassandra.io.sstable.SSTableReader.tidy(SSTableReader.java:609) ~[main/:na]
        at
[jira] [Commented] (CASSANDRA-7628) Tools java driver needs to be updated
[ https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076466#comment-14076466 ]

Benedict commented on CASSANDRA-7628:
-------------------------------------

Ah, right. Version namespace clash with the java driver confused me.
[jira] [Updated] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McGuire updated CASSANDRA-7582:
------------------------------------
    Fix Version/s:     (was: 2.1.0)
                   2.1.1
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076469#comment-14076469 ]

Jonathan Ellis commented on CASSANDRA-7582:
-------------------------------------------

Thanks, Ryan.

Benedict, I'm starting to think 7125 was misguided. If the CL errors out, there just isn't much you can do about it except pass the flag and try again, so why not cut out the extra step?
[jira] [Commented] (CASSANDRA-7593) Errors when upgrading through several versions to 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076472#comment-14076472 ] Benedict commented on CASSANDRA-7593: - This fix doesn't address the situation where we legitimately get an sstable with only range tombstones where the lowerbound has fewer components to the upper bounds (let's say we have a flood of DELETE from T where pk=K and a X and (a,b) (Y, Z)) Also, whilst we _may_ separately want to insert an EOC.START (since we now populate it, so presumably it should be populated, though it would be good to understand why this is now the case to document here for posterity), according to the commenting in the grabbing of min/max column names, we only care about clustering columns, so with or without the extra EOC we should not be fetching the whole of this BoundedComposite into min/max - we should only be fetching up to the number of clustering columns (0). Or we should be fetching the column name (and potentially any further components for sets/maps/etc.) from CellName as well. Errors when upgrading through several versions to 2.1 - Key: CASSANDRA-7593 URL: https://issues.apache.org/jira/browse/CASSANDRA-7593 Project: Cassandra Issue Type: Bug Environment: java 1.7 Reporter: Russ Hatch Assignee: Marcus Eriksson Priority: Critical Fix For: 2.1.0 I'm seeing two different errors cropping up in the dtest which upgrades a cluster through several versions. 
This is the more common error: {noformat} ERROR [GossipStage:10] 2014-07-22 13:14:30,028 CassandraDaemon.java:168 - Exception in thread Thread[GossipStage:10,5,main] java.lang.AssertionError: null at org.apache.cassandra.db.filter.SliceQueryFilter.shouldInclude(SliceQueryFilter.java:347) ~[main/:na] at org.apache.cassandra.db.filter.QueryFilter.shouldInclude(QueryFilter.java:249) ~[main/:na] at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249) ~[main/:na] at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60) ~[main/:na] at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1873) ~[main/:na] at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1681) ~[main/:na] at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:345) ~[main/:na] at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement.readLocally(SelectStatement.java:293) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:302) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:60) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:263) ~[main/:na] at org.apache.cassandra.db.SystemKeyspace.getPreferredIP(SystemKeyspace.java:514) ~[main/:na] at org.apache.cassandra.net.OutboundTcpConnectionPool.init(OutboundTcpConnectionPool.java:51) ~[main/:na] at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:522) ~[main/:na] at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:536) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:689) ~[main/:na] at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:663) ~[main/:na] at 
org.apache.cassandra.service.EchoVerbHandler.doVerb(EchoVerbHandler.java:40) ~[main/:na] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) ~[main/:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_60] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_60] at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60] {noformat} The same test sometimes fails with this exception instead: {noformat} ERROR [CompactionExecutor:4] 2014-07-22 16:18:21,008 CassandraDaemon.java:168 - Exception in thread Thread[CompactionExecutor:4,1,RMI Runtime] java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7059d3e9 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@108f1504[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 95] at
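Benedict's point above - that only the clustering components of a composite bound should feed the per-sstable min/max metadata consulted by SliceQueryFilter.shouldInclude - can be illustrated with a toy sketch. This is not Cassandra's actual code; the function name and list-based representation are purely illustrative.

```python
# Illustrative sketch (not Cassandra's implementation): when recording
# per-sstable min/max "column names", keep only the components that
# correspond to clustering columns, discarding any extra components a
# composite bound may carry.
def clamp_to_clustering(bound_components, clustering_count):
    """Truncate a composite bound to the number of clustering columns."""
    return bound_components[:clustering_count]

# A table with no clustering columns (clustering_count = 0): a range-tombstone
# bound with extra components should contribute nothing to min/max.
print(clamp_to_clustering(["x", "y"], 0))  # []
print(clamp_to_clustering(["x", "y"], 1))  # ['x']
```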
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076509#comment-14076509 ] Benedict commented on CASSANDRA-7582: - I'm -1000 on encountering an error and silently swallowing it on something as core to correctness as the commit log - failing at least gives the user a big red flag that they may want to seek expert help. I think there are two distinct problems here: the 'unexpected' errors, which should almost certainly involve the user seeking help from an expert to diagnose (or perhaps filing a JIRA, since they possibly indicate a bug), and the unknown table exceptions. The latter are debatably more OK to ignore, but I would much rather we simply retain information about dropped tables, much as we do for truncated tables, so that we can suppress those known to have been dropped (with knowledge of exactly _when_ they were dropped, so that if we see CL records past that time we can still fail and ask the user to at least file a bug report). Consider the following (pretty plausible) scenario: * User turns on CL saving * User creates table X and populates it with some data (let's say it's a fairly static dataset) * User uses the database for a period, mostly changing other tables * At time T, user drops table X, recreates it (instead of, e.g., truncating it - which is separately also subtly dangerous in this scenario), and repopulates it with subtly different, but business-critical, data * Some time after T, user has to restore the cluster, restores the schema from prior to T by mistake (let's say the team member performing the restore doesn't realise the table was recreated since then), then performs a PIT restore The user now has no idea they have stale business data in their tables. 
Now, assuming we have saved the ids of all dropped tables, we could report to the user that they are likely restoring data from a future schema, and they could then decide whether this was safe; in this case they would be able to restore a newer schema (assuming they had saved it) and a major business error would have been averted. In general this fail-fast is likely to result in an increase in JIRA filings, possibly for relatively benign bugs, but on the whole I would prefer that to leaving subtle bugs in the CL. We've already caught at least one as a result of this, and we've had long-standing bugs with respect to drain - still affecting 2.0 - that would have been caught a long time ago with better reporting. 2.1 multi-dc upgrade errors --- Key: CASSANDRA-7582 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582 Project: Cassandra Issue Type: Bug Components: Core Reporter: Ryan McGuire Assignee: Benedict Priority: Critical Fix For: 2.1.1 Multi-dc upgrade [was working from 2.0 - 2.1 fairly recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/], but is currently failing. Running upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test I get the following errors when starting 2.1 upgraded from 2.0: {code} ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay failed due to replaying a mutation for a missing table. 
This error can be ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on the command line ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception encountered during startup java.lang.RuntimeException: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004 at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [main/:na] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457) [main/:na] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) [main/:na] Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004 at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[main/:na] at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[main/:na] at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353) ~[main/:na] at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333) ~[main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365) ~[main/:na] at
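The suppression scheme Benedict proposes - remember when each table was dropped, replay-skip only mutations that predate the drop, and fail loudly on anything else - can be sketched as follows. This is a hypothetical illustration; the names (`dropped_at`, `replay_decision`) and the timestamp representation are invented here, not Cassandra's API.

```python
# Hypothetical sketch of dropped-table suppression during commit-log replay:
# a mutation for a legitimately dropped table is skipped only if it predates
# the recorded drop time; anything after that time is suspicious and fails.
dropped_at = {}  # cfId -> timestamp of the DROP (illustrative structure)

def on_drop(cf_id, drop_time):
    """Record when a table was dropped, analogous to how truncations are tracked."""
    dropped_at[cf_id] = drop_time

def replay_decision(cf_id, mutation_time, known_tables):
    if cf_id in known_tables:
        return "replay"   # live table: replay normally
    if cf_id in dropped_at and mutation_time <= dropped_at[cf_id]:
        return "skip"     # stale data for a table we know was dropped
    return "fail"         # unknown cfId, or CL records past the drop time

on_drop("cf-x", 100)
print(replay_decision("cf-x", 50, set()))     # skip
print(replay_decision("cf-x", 150, set()))    # fail
print(replay_decision("cf-y", 10, {"cf-y"}))  # replay
```

The "fail" branch is what would have flagged the stale-restore scenario above: CL records newer than the recorded drop time indicate the operator is restoring against an older schema than the data.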
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076516#comment-14076516 ] Benedict commented on CASSANDRA-7582: - Hmm. Separately this scenario also points out that TRUNCATE is even more broken than I thought - since it doesn't get logged to the CL, if you restore a schema prior to a TRUNCATE you will simply get the old data supplemented with the new data. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076540#comment-14076540 ] Jonathan Ellis commented on CASSANDRA-7582: --- I see two actual classes of CL errors: # Table is dropped and we are replaying stale data that should also have been dropped. Blocking startup is the Wrong Solution. # Hardware problem caused a checksum mismatch. Blocking startup is the Wrong Solution. Granted that blocking startup can help prevent user errors during PIT recovery, that's an entirely hypothetical situation today; PIT is only nominally usable. (Fork the JVM every time a CL segment finishes? Yeah.) So let's not optimize for that at the expense of scenarios we see frequently. I think we should roll back 7125 until we can do it right. Doing it right probably means remembering old cfids in 2.1.x; then we can get paranoid about seeing them in the CL for 3.0. (Getting paranoid in the same version as we start remembering is bad for obvious reasons.)
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076586#comment-14076586 ] Benedict commented on CASSANDRA-7582: - 3. We've busted something. This is the main class of error I'm trying to catch with this behaviour. I would prefer to know earlier if 2.1 is broken, instead of corrupting the user's CL in some way without realising it. We've had several bugs in the 2.1 release cycle that would have been caught earlier had we had this feature enabled, and I would be surprised if we don't catch some more as a result of it once 2.1 gets released into the wild. There are still bugs in 2.0 that we've fixed in 2.1 that we would certainly have caught earlier. Enforcing correctness from other avenues is a strong secondary concern. This isn't a matter of optimisation; we're talking about providing an unsafe PIT feature (and we've already got a ticket filed for removing forking), and, more importantly, risking an unsafe regular _replay_. I disagree that a hardware problem causing a checksum mismatch shouldn't block startup - in this case you may have alternative copies of the data that are not corrupted, or you can choose to analyse the logs yourself to establish what is happening. If you don't care, you set the don't-care flag; but without the failure you maybe don't even know there are records that haven't been replayed (possibly whole files).
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076721#comment-14076721 ] Jonathan Ellis commented on CASSANDRA-7582: --- But that's still not a common *production* scenario. So we're still optimizing bassackwards. How about this? Leave the checks in, but backwards: they're disabled *unless* there's a flag. Then we set the flag in utest and dtest.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076746#comment-14076746 ] Benedict commented on CASSANDRA-7582: - What isn't a common production scenario? Commit log bugs? We know there are some still in 2.0. There are potentially some in 2.1 too, and we probably won't spot them without something like this to help users know they encountered them and report them. Optimizing != Correctness. I am very negative on disabling this.
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076786#comment-14076786 ] T Jake Luciani commented on CASSANDRA-7631: --- So you aren't interested in stressing writes, you only care about reads? Allow Stress to write directly to SSTables -- Key: CASSANDRA-7631 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Russell Alexander Spitzer Assignee: Russell Alexander Spitzer One common difficulty with benchmarking machines is the amount of time it takes to initially load data. For machines with a large amount of RAM this becomes especially onerous, because a very large amount of data needs to be placed on the machine before the page cache can be circumvented. To remedy this I suggest we add a top-level flag to cassandra-stress which would cause the tool to write directly to sstables rather than actually performing CQL inserts. Internally this would use CQLSSTableWriter to write directly to sstables while skipping any keys which are not owned by the node stress is running on. The same stress command run on each node in the cluster would then write unique sstables containing only data which that node is responsible for. Following this, no further network IO would be required to distribute data, as it would all already be correctly in place. -- This message was sent by Atlassian JIRA (v6.2#6252)
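The ownership filter the ticket describes - every node runs the same stress command but writes only the keys it owns, so the resulting sstables need no redistribution - can be sketched with a toy hash ring. This is only an illustration of the idea: the MD5-based `token` function and the fixed range below are stand-ins, not Cassandra's Murmur3Partitioner or token metadata.

```python
# Toy sketch of per-node key filtering for direct-to-sstable loading.
# Each node keeps only keys whose token falls in its (toy) ownership range;
# in Cassandra proper the kept keys would go to a CQLSSTableWriter.
import hashlib

def token(key, ring_size=100):
    """Map a key to a position on a toy 0..ring_size-1 ring (not Murmur3)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % ring_size

def owned_by_local(key, local_range):
    lo, hi = local_range  # half-open [lo, hi) slice of the toy ring
    return lo <= token(key) < hi

keys = [f"user{i}" for i in range(1000)]
local_keys = [k for k in keys if owned_by_local(k, (0, 25))]
# Roughly a quarter of the keys land in this node's quarter of the ring;
# the rest would be written by the other nodes running the same command.
print(len(local_keys) < len(keys))  # True
```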
[jira] [Commented] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076793#comment-14076793 ] Joshua McKenzie commented on CASSANDRA-7523: I ended up going outside the scope of strictly C* changes on this while testing - here's a snapshot of what I have thus far: * [cql-internal python driver changes|https://github.com/josh-mckenzie/cql-internal/compare/7523] * [cqlshlib and cqlsh changes|https://github.com/josh-mckenzie/cassandra/compare/7523_cqlshlib] * [Java type addition|https://github.com/josh-mckenzie/cassandra/compare/7523_java_types_only] * [Combined commit, new cql-internal archive|https://github.com/josh-mckenzie/cassandra/compare/7523_combined] A few points I could use some feedback on: # Is it reasonable to consider the new Date and Time types DATETIME with regards to PEP249? # What kind of conversion enforcement do we want on SimpleDate and Time types? I'm thinking reduction only, with a warning, on both, and promotion to Timestamp with a Date object. # I don't like changing the ui-time_format cqlshrc option underneath people, but if we add a time type and time_format points to timestamp... I still have some testing to implement (cqlsh, unit, potentially python driver if we merge these changes in) but wanted to get this out there for feedback, since this is a new area of the code-base for me. add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Fix For: 2.0.10 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076797#comment-14076797 ] Jonathan Ellis commented on CASSANDRA-7582: --- I'm skeptical. Looking at the 2.0 changelog, we've fixed CASSANDRA-6652 and CASSANDRA-6714 since 2.0.0 final, and this wouldn't have helped catch those. So I'm not saying that ignoring errors is a Good Thing, but when there are more false positives than true positives, users will learn to ignore it anyway and we're not actually helping anyone. At the very least, this is demonstrably broken in 2.1.1, given this ticket right here. So I see two reasonable courses of action: # remember old cfids in 2.1.x, so we can get paranoid about seeing them in the CL for 3.0 # use the checks as a kind of assert that we enable for tests but not (without opt-in) for production I'm open to alternatives, but leaving things the way they are now is not one of them.
[jira] [Updated] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7523: -- Reviewer: Tyler Hobbs [~thobbs] to review
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076801#comment-14076801 ] Matt Kennedy commented on CASSANDRA-7631: - In many cases, we primarily care about mixed workloads, but those need a populated cluster to run on. So yes, writes are important, but mostly in the context of concurrent reads also happening.
[jira] [Commented] (CASSANDRA-7594) Disruptor Thrift server worker thread pool not adjustable
[ https://issues.apache.org/jira/browse/CASSANDRA-7594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076805#comment-14076805 ] Pavel Yaskevich commented on CASSANDRA-7594: [~rbranson] That was on my list for a while now but nobody seemed to care so I de-prioritized it, thanks for reporting! I'm currently OOO but will try to look into it ASAP. Disruptor Thrift server worker thread pool not adjustable - Key: CASSANDRA-7594 URL: https://issues.apache.org/jira/browse/CASSANDRA-7594 Project: Cassandra Issue Type: Bug Reporter: Rick Branson Assignee: Pavel Yaskevich For the THsHaDisruptorServer, there may not be enough threads to run blocking StorageProxy methods. The current number of worker threads is hardcoded at 2 per selector, so 2 * numAvailableProcessors(), or 64 threads on a 16-core hyperthreaded machine. StorageProxy methods block these threads, so this puts an upper bound on the throughput if hsha is enabled. If operations take 10ms on average, the node can only handle a maximum of 6,400 operations per second. This is a regression from hsha on 1.2.x, where the thread pool was tunable using rpc_min_threads and rpc_max_threads. -- This message was sent by Atlassian JIRA (v6.2#6252)
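The throughput ceiling stated in the CASSANDRA-7594 report follows from simple arithmetic, sketched below. The thread-pool sizing (2 workers per selector, one selector per available processor) is taken from the ticket; the 10 ms average latency is the report's illustrative figure.

```python
# Back-of-the-envelope check of the numbers in the report: a 16-core
# hyperthreaded machine exposes 32 available processors; 2 worker threads
# per selector gives 64 threads. If each blocking StorageProxy call averages
# 10 ms, each thread can complete at most 100 operations per second.
available_processors = 16 * 2        # 16 physical cores, hyperthreaded
worker_threads = 2 * available_processors
ops_per_thread_per_sec = 1 / 0.010   # 10 ms average per operation
max_throughput = worker_threads * ops_per_thread_per_sec
print(worker_threads)        # 64
print(int(max_throughput))   # 6400
```

This is why the hardcoded pool is a regression from the 1.2.x hsha server, where rpc_min_threads and rpc_max_threads let operators raise the thread count past this bound.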
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076820#comment-14076820 ] Benedict commented on CASSANDRA-7582: - I am less adamant about CfId checks than I am about failing on commit log checksum/mutation replay failures. I could just about live with (2), but naturally we will get better coverage by enabling this for all users. We don't know what bugs we might catch with it. So, I would prefer one of: 1) Start remembering old cfids in 2.1.1 along with this feature, so we can start complaining immediately; or 2) For now simply assert on non-CfId errors (i.e. make that opt-in rather than opt-out), introduce CfId recording at some point and make it opt-out at some point after 2.1 multi-dc upgrade errors --- Key: CASSANDRA-7582 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582 Project: Cassandra Issue Type: Bug Components: Core Reporter: Ryan McGuire Assignee: Benedict Priority: Critical Fix For: 2.1.1 Multi-dc upgrade [was working from 2.0 - 2.1 fairly recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/], but is currently failing. Running upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test I get the following errors when starting 2.1 upgraded from 2.0: {code} ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay failed due to replaying a mutation for a missing table.
This error can be ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on the command line ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception encountered during startup java.lang.RuntimeException: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004 at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [main/:na] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457) [main/:na] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) [main/:na] Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004 at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[main/:na] at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[main/:na] at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353) ~[main/:na] at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333) ~[main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365) ~[main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98) ~[main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) ~[main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) ~[main/:na] {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076843#comment-14076843 ] T Jake Luciani commented on CASSANDRA-7631: --- So you want a way to quickly get a bunch of data on the cluster, then run a mixed workload using traditional CQL reads/writes?
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076854#comment-14076854 ] Matt Kennedy commented on CASSANDRA-7631: - Yes, ideally formatted using your new user-defined schema stuff. I don't mean to speak for Russ, but we fleshed out this idea jointly.
[jira] [Created] (CASSANDRA-7633) Speculating retry for LOCAL_QUORUM send requests to other DC
sankalp kohli created CASSANDRA-7633: Summary: Speculating retry for LOCAL_QUORUM send requests to other DC Key: CASSANDRA-7633 URL: https://issues.apache.org/jira/browse/CASSANDRA-7633 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Minor C* can potentially send an extra request to another DC for LOCAL_QUORUM which does not get counted. This is wasted effort and we should not send this request.
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076906#comment-14076906 ] Russell Alexander Spitzer commented on CASSANDRA-7631: -- +1 Basically: put a TB on the cluster as fast as possible, then run a mixed user-defined workload.
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076976#comment-14076976 ] Brandon Williams commented on CASSANDRA-7631: - I'll just note that stress itself is probably the wrong place for this, it'll likely need to be a new utility that uses SSTableSimpleUnsortedWriter.
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077005#comment-14077005 ] Benedict commented on CASSANDRA-7631: - Stress seems like a perfectly reasonable place to put this, really. It also means we know the data generated is compatible with the stress workload, which is important. It's even possible to have stress output one single file per node in one pass, but that would require some (small-ish) amount of work.
[jira] [Commented] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
[ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077018#comment-14077018 ] Benedict commented on CASSANDRA-7546: - My biggest concern with metrics is that what we expose as a metric will probably change when we change tack to a lock-free lazy-update design, since it will be more expensive to maintain. Certainly tracking the amount of 'wasted' work will be meaningless then, although possibly we could track the raw occurrences of failure to make a change atomically without interference (which in the lazy case would be failure to acquire exclusivity to merge your changes in). I'm currently on holiday but will try to review your patch shortly. AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory - Key: CASSANDRA-7546 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546 Project: Cassandra Issue Type: Bug Components: Core Reporter: graham sanderson Assignee: graham sanderson Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 7546.20_4.txt, 7546.20_5.txt, 7546.20_alt.txt, suggestion1.txt, suggestion1_21.txt In order to preserve atomicity, this code attempts to read, clone/update, then CAS the state of the partition. Under heavy contention for updating a single partition this can cause some fairly staggering memory growth (the more cores on your machine, the worse it gets).
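The read/clone/CAS retry pattern described in the issue can be sketched as follows. This is an illustrative Python analogue of the Java AtomicReference loop, not the actual AtomicSortedColumns code; the point is that every failed CAS throws away a freshly built copy, so allocation rate grows with contention.

```python
import threading

class AtomicRef:
    # Minimal compare-and-set reference, standing in for Java's AtomicReference.
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        return self._value

    def compare_and_set(self, expect, update):
        with self._lock:
            if self._value is expect:
                self._value = update
                return True
            return False

wasted_copies = 0  # analogue of the 'wasted work' a metric might track

def add_column(ref: AtomicRef, column):
    # Read, clone/update, then CAS; on CAS failure the clone is garbage
    # and the loop spins, allocating again.
    global wasted_copies
    while True:
        current = ref.get()
        updated = current + [column]      # fresh allocation every attempt
        if ref.compare_and_set(current, updated):
            return
        wasted_copies += 1                # allocation wasted; retry
```

Under heavy contention on one partition, many threads repeatedly lose this race, which is the "staggering memory growth" the issue describes.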
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077020#comment-14077020 ] Russell Alexander Spitzer commented on CASSANDRA-7631: -- https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java wraps SSTableSimpleUnsortedWriter, so I think we are ok there. The main reason I would like this as part of stress is that we already have all the data generation code baked in for arbitrary schemas, thanks [~tjake]! This way we could prepare for a test that uses a large amount of data and a mixed workload much faster.
[jira] [Comment Edited] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077020#comment-14077020 ] Russell Alexander Spitzer edited comment on CASSANDRA-7631 at 7/28/14 10:32 PM: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java wraps SSTableSimpleUnsortedWriter, so I think we are ok there. The main reason I would like this as part of stress is that we already have all the data generation code written for arbitrary schemas, thanks [~tjake]! This way we could prepare much faster for a test that writes a large amount of data and then runs a mixed workload.
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077050#comment-14077050 ] Brandon Williams commented on CASSANDRA-7631: - bq. Stress seems like a perfectly reasonable place to put this, really. It also means we know the data generated is compatible with the stress workload, which is important. I agree with your latter point, but we could still reuse the code in a separate utility. It just seems like stress has enough options as it is, and introducing an sstable writer would make a lot of them nonsensical (like consistency level, replication, etc.) I'd somewhat prefer having a clear delineation, util-wise, between going over the network and writing to disk.
[jira] [Commented] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
[ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077057#comment-14077057 ] graham sanderson commented on CASSANDRA-7546: - Ok, thank you... yeah, my only reason for recording something in the actual codebase was to indicate to the user that they have ultra-heavy partition contention that might be detrimental to performance, and that they should perhaps review their schema. Given that this may not be the case at all in 3.0 (i.e. it may be gracefully handled in all cases), I'll try it out locally with a WARN statement instead. I'll probably do it at memtable flush anyway, which has more useful context (e.g. the CF in question), and would be less spam-y (i.e. one warn with the number of contended partitions, though perhaps the contended key(s) are interesting at a lower log level)... whether we include such logging in the final patch I don't know.
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077063#comment-14077063 ] Benedict commented on CASSANDRA-7631: - Well, this sort of fits in with an extension I would like to make, which is in-process stressing (i.e. to avoid going over the network, and if feasible optionally avoid going through the native protocol), for which many of those options would also be meaningless. I don't see why we couldn't provide a separate shell script that makes some of the options easier, but I think this makes most sense living directly in stress itself; we can either ignore, or complain, if unrelated options are set.
[jira] [Comment Edited] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077063#comment-14077063 ] Benedict edited comment on CASSANDRA-7631 at 7/28/14 10:49 PM: --- Well, this sort of fits in with an extension I would like to make, which is in-process stressing (i.e. to avoid going over the network), for which many of those options would also be meaningless. I don't see why we couldn't provide a separate shell script that makes some of the options easier, but I think this makes most sense living directly in stress itself; we can either ignore, or complain, if unrelated options are set.
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077076#comment-14077076 ] Brandon Williams commented on CASSANDRA-7631: - bq. we can either ignore, or complain, if unrelated options are set. As someone who has screwed up the stress options more than once, consider this my vote for 'complain' :)
[jira] [Updated] (CASSANDRA-7601) Data loss after nodetool taketoken
[ https://issues.apache.org/jira/browse/CASSANDRA-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-7601: --- Labels: qa-resolved (was: ) Data loss after nodetool taketoken -- Key: CASSANDRA-7601 URL: https://issues.apache.org/jira/browse/CASSANDRA-7601 Project: Cassandra Issue Type: Bug Components: Core, Tests Environment: Mac OSX Mavericks. Ubuntu 14.04 Reporter: Philip Thompson Assignee: Brandon Williams Priority: Minor Labels: qa-resolved Fix For: 2.0.10, 2.1.0 Attachments: 7601-2.0.txt, 7601-2.1.txt, consistent_bootstrap_test.py, taketoken.tar.gz The dtest consistent_bootstrap_test.py:TestBootstrapConsistency.consistent_reads_after_relocate_test is failing on HEAD of the git branches 2.1 and 2.1.0. The test performs the following actions: - Create a cluster of 3 nodes - Create a keyspace with RF 2 - Take node 3 down - Write 980 rows to node 2 with CL ONE - Flush node 2 - Bring node 3 back up - Run nodetool taketoken on node 3 to transfer 80% of node 1's tokens to node 3 - Check for data loss When the check for data loss is performed, only ~725 rows can be read via CL ALL. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7601) Data loss after nodetool taketoken
[ https://issues.apache.org/jira/browse/CASSANDRA-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077083#comment-14077083 ] Philip Thompson commented on CASSANDRA-7601: LGTM from a test perspective now that the relevant tests have been removed. The original problem is (clearly) solved with the removal of shuffle/taketoken.
[jira] [Created] (CASSANDRA-7634) cqlsh error tracing CAS
dan jatnieks created CASSANDRA-7634: --- Summary: cqlsh error tracing CAS Key: CASSANDRA-7634 URL: https://issues.apache.org/jira/browse/CASSANDRA-7634 Project: Cassandra Issue Type: Bug Components: Tools Reporter: dan jatnieks Priority: Minor On branch cassandra-2.1.0 Getting message {{'NoneType' object has no attribute 'microseconds'}} from cqlsh while tracing a CAS statement. {noformat} Connected to devc-large at 146.148.39.53:9042. [cqlsh 5.0.1 | Cassandra 2.1.0-rc4-SNAPSHOT | CQL spec 3.2.0 | Native protocol v3] Use HELP for help. cqlsh> use test2; cqlsh:test2> update cas set c2 = 2 where c1 = 1 if c3 = 1; [applied] ----------- True cqlsh:test2> tracing on; Now tracing requests. cqlsh:test2> update cas set c2 = 2 where c1 = 1 if c3 = 1; [applied] ----------- True 'NoneType' object has no attribute 'microseconds' cqlsh:test2> {noformat} Tracing {{select *}} from the same table works as expected, but tracing the conditional update results in the error. More details: {noformat} cqlsh:test2> desc keyspace CREATE KEYSPACE test2 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true; CREATE TABLE test2.cas ( c1 int PRIMARY KEY, c2 int, c3 int ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{keys:ALL, rows_per_partition:NONE}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; cqlsh:test2> tracing on; Tracing is already enabled. Use TRACING OFF to disable.
cqlsh:test2> select * from cas;

 c1 | c2 | c3
----+----+----
  1 |  2 |  1

(1 rows)

Tracing session: 8f0c8340-16ae-11e4-8ca3-fb429d8fb4a7

 activity | timestamp | source | source_elapsed
----------+-----------+--------+---------------
 Execute CQL3 query | 2014-07-28 16:26:10.804000 | 10.240.139.181 | 0
 Parsing select * from cas LIMIT 1; [SharedPool-Worker-1] | 2014-07-28 16:26:10.804000 | 10.240.139.181 | 78
 Preparing statement [SharedPool-Worker-1] | 2014-07-28 16:26:10.804000 | 10.240.139.181 | 282
 Determining replicas to query [SharedPool-Worker-1] | 2014-07-28 16:26:10.804000 | 10.240.139.181 | 501
 Enqueuing request to /10.240.189.138 [SharedPool-Worker-1] | 2014-07-28 16:26:10.804000 | 10.240.139.181 | 918
 Sending message to /10.240.189.138 [WRITE-/10.240.189.138] | 2014-07-28 16:26:10.805000 | 10.240.139.181 | 1095
 Message received from /10.240.139.181 [Thread-27] | 2014-07-28 16:26:10.805000 | 10.240.189.138 | 28
 Executing seq scan across 0 sstables for [min(-9223372036854775808), max(-4611686018427387904)] [SharedPool-Worker-1] | 2014-07-28 16:26:10.805000 | 10.240.189.138 | 384
 Scanned 0 rows and matched 0 [SharedPool-Worker-1] | 2014-07-28 16:26:10.805000 | 10.240.189.138 | 481
 Enqueuing response to /10.240.139.181 [SharedPool-Worker-1] | 2014-07-28 16:26:10.805000 | 10.240.189.138 | 570
 Sending message to /10.240.139.181 [WRITE-/10.240.139.181] | 2014-07-28 16:26:10.806000 | 10.240.189.138 | 735
 Message received from /10.240.189.138 [Thread-30] | 2014-07-28 16:26:10.807000 | 10.240.139.181 | 3264
 Processing response from /10.240.189.138 [SharedPool-Worker-2] | 2014-07-28 16:26:10.807000 | 10.240.139.181 |
{noformat}
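The stack trace isn't shown, but the message pattern suggests code reading a `.microseconds` attribute (of a `timedelta`) while the trace's duration is still null for the CAS session. A hypothetical defensive guard, purely as a sketch (this is not the actual cqlsh code, and `format_elapsed` is an invented name):

```python
from datetime import timedelta

def format_elapsed(duration):
    """Format a trace duration for display; `duration` is a timedelta,
    or None when system_traces hasn't been fully written yet."""
    if duration is None:
        return '--'  # incomplete trace: show a placeholder instead of crashing
    total_us = (duration.days * 86400 + duration.seconds) * 1000000 + duration.microseconds
    return '%d us' % total_us

print(format_elapsed(None))                          # --
print(format_elapsed(timedelta(microseconds=3264)))  # 3264 us
```

The select * trace works because its session row is complete by the time cqlsh reads it; the conditional update apparently hits the window where the duration column is still null.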
[jira] [Commented] (CASSANDRA-7409) Allow multiple overlapping sstables in L1
[ https://issues.apache.org/jira/browse/CASSANDRA-7409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077156#comment-14077156 ] Carl Yeksigian commented on CASSANDRA-7409:
---
I have a first cut of this working now at https://github.com/carlyeks/cassandra/tree/overlapping
This adds a new compaction strategy called 'Overlapping', which operates mostly the same as 'Leveled' when max_overlapping_level is configured to 0, except that L0 does not do any STCS. When max_overlapping_level is set to non-zero, it will compact without selecting non-overlapping sstables, and will not include any sstables from an upper level. I also added a new nodetool command to list the sstables in each level, for both leveled and overlapping.
I haven't benchmarked this strategy against regular leveled yet; that's what I'll work on next for this ticket.

Allow multiple overlapping sstables in L1
-
Key: CASSANDRA-7409
URL: https://issues.apache.org/jira/browse/CASSANDRA-7409
Project: Cassandra
Issue Type: Improvement
Reporter: Carl Yeksigian
Assignee: Carl Yeksigian

Currently, when a normal L0 compaction takes place (not STCS), we take up to MAX_COMPACTING_L0 L0 sstables and all of the overlapping L1 sstables and compact them together. If we didn't have to deal with the overlapping L1 tables, we could compact a higher number of L0 sstables together into a set of non-overlapping L1 sstables.
This could be done by delaying the invariant that L1 has no overlapping sstables. Going from L1 to L2, we would be compacting fewer sstables together which overlap.
When reading, we will not have the same one-sstable-per-level (except L0) guarantee, but this can be bounded (once we have too many sets of sstables, either compact them back into the same level, or compact them up to the next level). This could be generalized to allow any level to be the maximum for this overlapping strategy.
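The "bounded number of overlapping sets" idea can be made concrete with a small sketch (illustrative only, not the linked branch's code): count how many non-overlapping "runs" of sstables a level currently holds, which is the quantity the strategy would cap before forcing a compaction.

```python
def overlaps(a, b):
    """Key-range overlap test for (first_key, last_key) pairs."""
    return a[0] <= b[1] and b[0] <= a[1]

def count_overlap_groups(sstables):
    """Partition a level's sstables into the fewest runs such that each
    run is internally non-overlapping; the run count is what strict LCS
    pins at 1 and the proposed strategy would merely bound.
    sstables: list of (first_key, last_key) ranges."""
    groups = []  # each group is a list of mutually non-overlapping ranges
    for rng in sorted(sstables):
        for g in groups:
            if not any(overlaps(rng, other) for other in g):
                g.append(rng)
                break
        else:
            groups.append([rng])
    return len(groups)

print(count_overlap_groups([(0, 10), (20, 30)]))          # 1: strict-LCS shape
print(count_overlap_groups([(0, 10), (5, 15), (8, 20)]))  # 3 overlapping runs
```

On read, each overlapping run contributes one potential sstable to consult, which is why the ticket notes the per-level read bound grows from one sstable to the (bounded) run count.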
[jira] [Commented] (CASSANDRA-7518) The In-Memory option
[ https://issues.apache.org/jira/browse/CASSANDRA-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077227#comment-14077227 ] Hanson commented on CASSANDRA-7518:
---
I do not have access to the DataStax JIRA. Another post of mine on Stack Overflow: http://stackoverflow.com/questions/24719276/cassandra-in-memory-option
It mentions that the DataStax white paper asserts the amount of memory usable by a single node will increase in an upcoming version, probably via JNA, but gives no solid timeline.

The In-Memory option
-
Key: CASSANDRA-7518
URL: https://issues.apache.org/jira/browse/CASSANDRA-7518
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Hanson
Fix For: 2.1.0

There is an In-Memory option introduced in the commercial version of Cassandra by DataStax Enterprise 4.0: http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/inMemory.html
But it is limited to 1 GB per in-memory table. It would be great if the In-Memory option could be made available in the community version of Cassandra, and extended to support much larger in-memory tables, such as 64 GB.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077317#comment-14077317 ] Jonathan Ellis commented on CASSANDRA-7582:
---
I suppose we can compromise on enabling the check as soon as we remember the cfids, even though that leaves a hole where we can false-positive on upgrade. How sure are you that the MalformedCommitLogException checks aren't going to false-positive on power failure? On first inspection, all of them except the serializedSize check look like they will be prone to that.

2.1 multi-dc upgrade errors
---
Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
Fix For: 2.1.1

Multi-dc upgrade [was working from 2.0 - 2.1 fairly recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/], but is currently failing. Running upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test I get the following errors when starting 2.1 upgraded from 2.0:
{code}
ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay failed due to replaying a mutation for a missing table.
This error can be ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on the command line
ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception encountered during startup
java.lang.RuntimeException: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [main/:na]
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457) [main/:na]
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) [main/:na]
Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
	at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[main/:na]
	at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[main/:na]
	at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353) ~[main/:na]
	at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333) ~[main/:na]
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365) ~[main/:na]
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98) ~[main/:na]
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) ~[main/:na]
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) ~[main/:na]
{code}
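The policy that the stop_on_missing_tables flag toggles can be sketched in a few lines (illustrative only, not the CommitLogReplayer code): either abort replay on a mutation for an unknown cfId, or skip it and keep going.

```python
def replay(mutations, known_cfids, stop_on_missing=True):
    """Replay commit-log mutations against a set of known table ids.
    mutations: list of (cfid, mutation) pairs.
    stop_on_missing=True mirrors the default (fail startup); False
    mirrors -Dcassandra.commitlog.stop_on_missing_tables=false."""
    applied, skipped = [], 0
    for cfid, mutation in mutations:
        if cfid not in known_cfids:
            if stop_on_missing:
                raise RuntimeError("Couldn't find cfId=%s" % cfid)
            skipped += 1  # drop the mutation for the missing table
            continue
        applied.append(mutation)
    return applied, skipped

print(replay([('a', 'm1'), ('b', 'm2')], {'a'}, stop_on_missing=False))
```

The upgrade bug is that the cfId in the log legitimately exists but isn't remembered yet at replay time, so the strict policy false-positives; hence the compromise discussed above of enabling the check only once cfids are remembered.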
[jira] [Resolved] (CASSANDRA-7634) cqlsh error tracing CAS
[ https://issues.apache.org/jira/browse/CASSANDRA-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-7634.
---
Resolution: Duplicate

cqlsh error tracing CAS
---
Key: CASSANDRA-7634
URL: https://issues.apache.org/jira/browse/CASSANDRA-7634
Project: Cassandra
Issue Type: Bug
Components: Tools
Reporter: dan jatnieks
Priority: Minor
Labels: cqlsh

(Issue description and trace output identical to the CASSANDRA-7634 report above.)
[jira] [Created] (CASSANDRA-7635) Make hinted_handoff_throttle_delay_in_ms configurable via nodetool
Matt Stump created CASSANDRA-7635:
-
Summary: Make hinted_handoff_throttle_delay_in_ms configurable via nodetool
Key: CASSANDRA-7635
URL: https://issues.apache.org/jira/browse/CASSANDRA-7635
Project: Cassandra
Issue Type: Improvement
Reporter: Matt Stump
Priority: Minor

Transfer of stored hints can peg the CPU of the node sending the hints. We have a throttle, hinted_handoff_throttle_delay_in_ms, but changing it requires a restart. It would be helpful if this were configurable via nodetool, to avoid the restart.
[jira] [Assigned] (CASSANDRA-7635) Make hinted_handoff_throttle_delay_in_ms configurable via nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis reassigned CASSANDRA-7635:
-
Assignee: Lyuben Todorov
[jira] [Updated] (CASSANDRA-7635) Make hinted_handoff_throttle_delay_in_ms configurable via nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7635:
--
Component/s: Tools
Fix Version/s: 2.0.10
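What a nodetool-exposed setter for CASSANDRA-7635 would change at runtime can be sketched as a delay that is re-read on every use (illustrative only; the class and method names are hypothetical, not Cassandra code):

```python
import threading
import time

class HintThrottle:
    """A sleep-based throttle whose delay can be changed while the
    process runs, the way a nodetool/JMX setter would adjust
    hinted_handoff_throttle_delay_in_ms without a restart."""

    def __init__(self, delay_ms):
        self._delay_ms = delay_ms
        self._lock = threading.Lock()

    def set_delay_ms(self, delay_ms):
        # The operation a nodetool-exposed setter would invoke.
        with self._lock:
            self._delay_ms = delay_ms

    def get_delay_ms(self):
        with self._lock:
            return self._delay_ms

    def throttle(self):
        # Called between hint deliveries; reads the delay fresh each
        # time, so a runtime change takes effect on the next hint.
        time.sleep(self.get_delay_ms() / 1000.0)
```

The essential design point is simply that the delivery loop must read the current value each iteration rather than capture it at startup, which is what makes the restart unnecessary.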
[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077359#comment-14077359 ] T Jake Luciani commented on CASSANDRA-7631:
---
bq. As someone who has screwed up the stress options more than once, consider this my vote for 'complain'
I also find the stress options/help unintuitive. I'd like to see if we can use airline to address this under a different ticket.

Allow Stress to write directly to SSTables
--
Key: CASSANDRA-7631
URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
Project: Cassandra
Issue Type: Improvement
Components: Tools
Reporter: Russell Alexander Spitzer
Assignee: Russell Alexander Spitzer

One common difficulty with benchmarking machines is the amount of time it takes to initially load data. For machines with a large amount of RAM this becomes especially onerous, because a very large amount of data needs to be placed on the machine before the page cache can be circumvented.
To remedy this, I suggest we add a top-level flag to cassandra-stress which would cause the tool to write directly to sstables rather than actually performing CQL inserts. Internally this would use CQLSSTableWriter to write directly to sstables while skipping any keys which are not owned by the node stress is running on. The same stress command run on each node in the cluster would then write unique sstables containing only data which that node is responsible for. Following this, no further network IO would be required to distribute data, as it would all already be correctly in place.
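The "skip keys not owned by this node" step above reduces to a token-range membership test. A minimal sketch (illustrative only; `token_fn` stands in for the real Murmur3 partitioner, and replication is ignored):

```python
def in_owned_range(token, ranges):
    """ranges: list of (start, end] token ranges owned by this node;
    a range with start > end wraps around the ring."""
    for start, end in ranges:
        if start < end:
            if start < token <= end:
                return True
        else:  # wrapping range
            if token > start or token <= end:
                return True
    return False

def local_keys(keys, token_fn, owned):
    """Keep only the keys this node is responsible for, so each node's
    stress run produces sstables containing just its own data."""
    return [k for k in keys if in_owned_range(token_fn(k), owned)]

# With an identity "partitioner" and a 0..99 ring:
print(local_keys([5, 55, 95], lambda k: k, [(0, 50)]))   # [5]
print(local_keys([5, 55, 95], lambda k: k, [(90, 10)]))  # [5, 95]
```

Running the same key sequence through this filter on every node partitions the full dataset across the cluster with no overlap, which is why no post-load streaming would be needed.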