[jira] [Commented] (CASSANDRA-6977) attempting to create 10K column families fails with 100 node cluster

2014-07-28 Thread Michael Nelson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075947#comment-14075947
 ] 

Michael Nelson commented on CASSANDRA-6977:
---

This is a showstopper for a very large customer. They need the ability to 
create new keyspaces as they add new customers. Their use case is 
multi-tenancy: because of HIPAA and PCI requirements, each customer gets its 
own keyspace, keeping the data separate. 

 attempting to create 10K column families fails with 100 node cluster
 

 Key: CASSANDRA-6977
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6977
 Project: Cassandra
  Issue Type: Bug
 Environment: 100 nodes, Ubuntu 12.04.3 LTS, AWS m1.large instances
Reporter: Daniel Meyer
Assignee: Russ Hatch
Priority: Minor
 Attachments: 100_nodes_all_data.png, all_data_5_nodes.png, 
 keyspace_create.py, logs.tar, tpstats.txt, visualvm_tracer_data.csv


 During this test we are attempting to create a total of 1K keyspaces with 10 
 column families each to bring the total column families to 10K.  With a 5 
 node cluster this operation can be completed; however, it fails with 100 
 nodes.  Please see the two charts.  For the 5 node case the time required to 
 create each keyspace and subsequent 10 column families increases linearly 
 until the number of keyspaces is 1K.  For a 100 node cluster there is a 
 sudden increase in latency between 450 keyspaces and 550 keyspaces.  The test 
 ends when the test script times out.  After the test script times out it is 
 impossible to reconnect to the cluster with the datastax python driver 
 because it cannot connect to the host:
 cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers', 
 {'10.199.5.98': OperationTimedOut()})
 It was found that running the following stress command does work from the 
 same machine the test script runs on.
 cassandra-stress -d 10.199.5.98 -l 2 -e QUORUM -L3 -b -o INSERT
 It should be noted that this test was initially done with DSE 4.0 and C* 
 version 2.0.5.24, and in that case it was not possible to run stress against 
 the cluster even locally on a node, because the host could not be found.
 Attached are system logs from one of the nodes, charts showing schema 
 creation latency for the 5 and 100 node clusters, VisualVM tracer data for 
 CPU, memory, num_threads and GC runs, tpstats output, and the test script.
 The test script was run on an m1.large AWS instance outside of the cluster 
 under test.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7575) Custom 2i validation

2014-07-28 Thread Sergio Bossa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076044#comment-14076044
 ] 

Sergio Bossa commented on CASSANDRA-7575:
-

[~adelapena], following the review of your patch, I believe that while it works 
in practice, pulling the index searchers in SelectStatement#getRangeCommand and 
validating them there is a bit odd. More specifically:
* SelectStatement#getRangeCommand may be called even when no 2i query is 
present, so enforcing 2i validation there is misleading and unexpected.
* SecondaryIndexSearcher#validate is called with the whole list of index 
expressions, which means each searcher implementation has to walk the list and 
decide, for each expression, whether it targets that searcher (and is invalid) 
or belongs to another searcher.

I'd rather rework the patch in the following way:
* Add a SecondaryIndexManager#validateIndexSearchersForQuery method that works 
similarly to getIndexSearchersForQuery, but rather than just getting the index 
by each column, it also validates it against the proper column/expression by 
calling SecondaryIndexSearcher#validate(IndexExpression).
* Call SecondaryIndexManager#validateIndexSearchersForQuery from 
SelectStatement#RawStatement#validateSecondaryIndexSelections

That should improve encapsulation and responsibility placement and provide 
better 2i APIs.

Finally, I would add a few tests.
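As a rough sketch of the proposed shape (these are simplified, hypothetical stand-ins for illustration only, not Cassandra's real SecondaryIndexManager/SecondaryIndexSearcher classes): the manager resolves the searcher that owns each expression's column and validates in the same pass, so SelectStatement never has to touch searchers directly.

```java
import java.util.*;

// NOTE: IndexExpression, SecondaryIndexSearcher and SecondaryIndexManager are
// simplified stand-ins for illustration, not Cassandra's real classes.
class IndexExpression {
    final String column;
    final String value;
    IndexExpression(String column, String value) { this.column = column; this.value = value; }
}

class SecondaryIndexSearcher {
    // Custom 2i implementations would override this to reject malformed expressions.
    void validate(IndexExpression expression) {}
}

class SecondaryIndexManager {
    private final Map<String, SecondaryIndexSearcher> searchersByColumn = new HashMap<>();

    void register(String column, SecondaryIndexSearcher searcher) {
        searchersByColumn.put(column, searcher);
    }

    // Mirrors getIndexSearchersForQuery, but validates each expression against
    // the searcher that owns its column, so callers only need this one entry point.
    void validateIndexSearchersForQuery(List<IndexExpression> clause) {
        for (IndexExpression expression : clause) {
            SecondaryIndexSearcher searcher = searchersByColumn.get(expression.column);
            if (searcher != null)
                searcher.validate(expression);
        }
    }
}

public class ValidationSketch {
    public static void main(String[] args) {
        SecondaryIndexManager manager = new SecondaryIndexManager();
        manager.register("lucene", new SecondaryIndexSearcher() {
            @Override
            void validate(IndexExpression e) {
                if (!e.value.startsWith("{"))
                    throw new IllegalArgumentException("Invalid index expression: " + e.value);
            }
        });
        try {
            manager.validateIndexSearchersForQuery(
                Collections.singletonList(new IndexExpression("lucene", "not-json")));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "Invalid index expression: not-json"
        }
    }
}
```

Each searcher then sees only the expressions it owns, which addresses the second review point above.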

 Custom 2i validation
 

 Key: CASSANDRA-7575
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7575
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Andrés de la Peña
Assignee: Andrés de la Peña
Priority: Minor
  Labels: 2i, cql3, secondaryIndex, secondary_index, select
 Fix For: 2.1.0, 3.0

 Attachments: 2i_validation.patch


 There are several projects using custom secondary indexes as an extension 
 point to integrate C* with other systems such as Solr or Lucene. The usual 
 approach is to embed third party indexing queries in CQL clauses. 
 For example, [DSE 
 Search|http://www.datastax.com/what-we-offer/products-services/datastax-enterprise]
  embeds Solr syntax this way:
 {code}
 SELECT title FROM solr WHERE solr_query='title:natio*';
 {code}
 [Stratio platform|https://github.com/Stratio/stratio-cassandra] embeds custom 
 JSON syntax for searching in Lucene indexes:
 {code}
 SELECT * FROM tweets WHERE lucene='{
     filter : {
         type: "range",
         field: "time",
         lower: "2014/04/25",
         upper: "2014/04/1"
     },
     query : {
         type: "phrase",
         field: "body",
         values: ["big", "data"]
     },
     sort : {fields: [ {field: "time", reverse: true} ] }
 }';
 {code}
 Tuplejump [Stargate|http://tuplejump.github.io/stargate/] also uses 
 Stratio's open-source JSON syntax:
 {code}
 SELECT name, company FROM PERSON WHERE stargate = '{
     filter: {
         type: "range",
         field: "company",
         lower: "a",
         upper: "p"
     },
     sort: {
         fields: [{field: "name", reverse: true}]
     }
 }';
 {code}
 These syntaxes are validated by the corresponding 2i implementation, but this 
 validation happens behind StorageProxy's command distribution, so, as far as I 
 know, there is no way to give rich feedback about syntax errors to CQL users.
 I'm uploading a patch with some changes trying to improve this. I propose 
 adding an empty validation method to SecondaryIndexSearcher that can be 
 overridden by custom 2i implementations:
 {code}
 public void validate(List<IndexExpression> clause) {}
 {code}
 And call it from SelectStatement#getRangeCommand:
 {code}
 ColumnFamilyStore cfs = 
 Keyspace.open(keyspace()).getColumnFamilyStore(columnFamily());
 for (SecondaryIndexSearcher searcher : 
 cfs.indexManager.getIndexSearchersForQuery(expressions))
 {
     try
     {
         searcher.validate(expressions);
     }
     catch (RuntimeException e)
     {
         String exceptionMessage = e.getMessage();
         if (exceptionMessage != null && !exceptionMessage.trim().isEmpty())
             throw new InvalidRequestException(
                 "Invalid index expression: " + e.getMessage());
         else
             throw new InvalidRequestException("Invalid index expression");
     }
 }
 {code}
 In this way C* allows custom 2i implementations to give feedback about syntax 
 errors.
 We are currently using these changes in a fork with no problems.





[jira] [Commented] (CASSANDRA-7593) Errors when upgrading through several versions to 2.1

2014-07-28 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076068#comment-14076068
 ] 

Marcus Eriksson commented on CASSANDRA-7593:


no, they should not be empty

We are inserting a RangeTombstone with start='token' and end='token' (i.e., 
delete the set for this row).

In 2.0 we only make the end have an EOC 
(https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/cql3/Sets.java#L234)
 while in 2.1 both do: 
https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/cql3/Sets.java#L252
 + 
https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/db/composites/AbstractComposite.java#L69
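For readers unfamiliar with EOC markers, here is a toy model of the difference (illustrative only; the real encoding lives in the Sets.java and AbstractComposite lines linked above). A composite bound is modeled as its component bytes plus a trailing end-of-component byte.

```java
import java.util.Arrays;

// Toy model of a composite clustering bound: component bytes plus a trailing
// end-of-component (EOC) byte (-1 = before, 0 = none, 1 = after). This is NOT
// Cassandra's real serialization; it only shows the 2.0 vs 2.1 difference.
public class EocSketch {
    static byte[] bound(String component, byte eoc) {
        byte[] name = component.getBytes();
        byte[] out = Arrays.copyOf(name, name.length + 1);
        out[name.length] = eoc; // EOC is appended after the components
        return out;
    }

    public static void main(String[] args) {
        // 2.0 behavior for "delete the whole set": only the end bound gets an EOC.
        byte[] start20 = bound("token", (byte) 0);
        byte[] end20   = bound("token", (byte) 1);

        // 2.1 behavior: both bounds carry an EOC (-1 on the start, 1 on the end).
        byte[] start21 = bound("token", (byte) -1);
        byte[] end21   = bound("token", (byte) 1);

        // The start bounds therefore serialize differently across versions,
        // which is the kind of mismatch a mixed-version upgrade can trip over.
        System.out.println(Arrays.equals(start20, start21)); // prints "false"
        System.out.println(Arrays.equals(end20, end21));     // prints "true"
    }
}
```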

 Errors when upgrading through several versions to 2.1
 -

 Key: CASSANDRA-7593
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7593
 Project: Cassandra
  Issue Type: Bug
 Environment: java 1.7
Reporter: Russ Hatch
Assignee: Marcus Eriksson
Priority: Critical
 Fix For: 2.1.0


 I'm seeing two different errors cropping up in the dtest which upgrades a 
 cluster through several versions.
 This is the more common error:
 {noformat}
 ERROR [GossipStage:10] 2014-07-22 13:14:30,028 CassandraDaemon.java:168 - 
 Exception in thread Thread[GossipStage:10,5,main]
 java.lang.AssertionError: null
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.shouldInclude(SliceQueryFilter.java:347)
  ~[main/:na]
 at 
 org.apache.cassandra.db.filter.QueryFilter.shouldInclude(QueryFilter.java:249)
  ~[main/:na]
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249)
  ~[main/:na]
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1873)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1681)
  ~[main/:na]
 at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:345) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.readLocally(SelectStatement.java:293)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:302)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:60)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:263)
  ~[main/:na]
 at 
 org.apache.cassandra.db.SystemKeyspace.getPreferredIP(SystemKeyspace.java:514)
  ~[main/:na]
 at 
 org.apache.cassandra.net.OutboundTcpConnectionPool.init(OutboundTcpConnectionPool.java:51)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:522)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:536)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:689)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:663)
  ~[main/:na]
 at 
 org.apache.cassandra.service.EchoVerbHandler.doVerb(EchoVerbHandler.java:40) 
 ~[main/:na]
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) 
 ~[main/:na]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_60]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60]
 {noformat}
 The same test sometimes fails with this exception instead:
 {noformat}
 ERROR [CompactionExecutor:4] 2014-07-22 16:18:21,008 CassandraDaemon.java:168 
 - Exception in thread Thread[CompactionExecutor:4,1,RMI Runtime]
 java.util.concurrent.RejectedExecutionException: Task 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7059d3e9 
 rejected from 
 org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@108f1504[Terminated,
  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 95]
 at 
 java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) 
 ~[na:1.7.0_60]
 at 
 

[jira] [Assigned] (CASSANDRA-7596) Don't swap min/max column names when mutating level or repairedAt

2014-07-28 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson reassigned CASSANDRA-7596:
--

Assignee: Marcus Eriksson

 Don't swap min/max column names when mutating level or repairedAt
 -

 Key: CASSANDRA-7596
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7596
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 2.1.0

 Attachments: 0001-dont-swap.patch


 Seems we swap min/max col names when mutating sstable metadata





git commit: Don't swap max/min column names when mutating sstable metadata.

2014-07-28 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1.0 6f15fe260 -> ee62ae104


Don't swap max/min column names when mutating sstable metadata.

Patch by marcuse; reviewed by benedict for CASSANDRA-7596.


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ee62ae10
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ee62ae10
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ee62ae10

Branch: refs/heads/cassandra-2.1.0
Commit: ee62ae104ee2c69d852b488f904b1854aa58aa2a
Parents: 6f15fe2
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:48:24 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:48:24 2014 +0200

--
 CHANGES.txt  | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java  | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ee62ae10/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0a1ba51..c6aaef9 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -14,6 +14,7 @@
  * Fix tracing of range slices and secondary index lookups that are local
to the coordinator (CASSANDRA-7599)
  * Set -Dcassandra.storagedir for all tool shell scripts (CASSANDRA-7587)
+ * Don't swap max/min col names when mutating sstable metadata (CASSANDRA-7596)
 Merged from 2.0:
  * Fix ReversedType(DateType) mapping to native protocol (CASSANDRA-7576)
  * Always merge ranges owned by a single node (CASSANDRA-6930)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ee62ae10/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
--
diff --git 
a/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java 
b/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
index 900bd4e..a557b88 100644
--- a/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
+++ b/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
@@ -124,8 +124,8 @@ public class StatsMetadata extends MetadataComponent
  compressionRatio,
  estimatedTombstoneDropTime,
  newLevel,
- maxColumnNames,
  minColumnNames,
+ maxColumnNames,
  hasLegacyCounterShards,
  repairedAt);
 }
@@ -141,8 +141,8 @@ public class StatsMetadata extends MetadataComponent
  compressionRatio,
  estimatedTombstoneDropTime,
  sstableLevel,
- maxColumnNames,
  minColumnNames,
+ maxColumnNames,
  hasLegacyCounterShards,
  newRepairedAt);
 }



[1/2] git commit: Don't swap max/min column names when mutating sstable metadata.

2014-07-28 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 3744d7792 -> 2236afb7a


Don't swap max/min column names when mutating sstable metadata.

Patch by marcuse; reviewed by benedict for CASSANDRA-7596.


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ee62ae10
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ee62ae10
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ee62ae10

Branch: refs/heads/cassandra-2.1
Commit: ee62ae104ee2c69d852b488f904b1854aa58aa2a
Parents: 6f15fe2
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:48:24 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:48:24 2014 +0200

--
 CHANGES.txt  | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java  | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ee62ae10/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0a1ba51..c6aaef9 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -14,6 +14,7 @@
  * Fix tracing of range slices and secondary index lookups that are local
to the coordinator (CASSANDRA-7599)
  * Set -Dcassandra.storagedir for all tool shell scripts (CASSANDRA-7587)
+ * Don't swap max/min col names when mutating sstable metadata (CASSANDRA-7596)
 Merged from 2.0:
  * Fix ReversedType(DateType) mapping to native protocol (CASSANDRA-7576)
  * Always merge ranges owned by a single node (CASSANDRA-6930)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ee62ae10/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
--
diff --git 
a/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java 
b/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
index 900bd4e..a557b88 100644
--- a/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
+++ b/src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java
@@ -124,8 +124,8 @@ public class StatsMetadata extends MetadataComponent
  compressionRatio,
  estimatedTombstoneDropTime,
  newLevel,
- maxColumnNames,
  minColumnNames,
+ maxColumnNames,
  hasLegacyCounterShards,
  repairedAt);
 }
@@ -141,8 +141,8 @@ public class StatsMetadata extends MetadataComponent
  compressionRatio,
  estimatedTombstoneDropTime,
  sstableLevel,
- maxColumnNames,
  minColumnNames,
+ maxColumnNames,
  hasLegacyCounterShards,
  newRepairedAt);
 }



[2/2] git commit: Merge branch 'cassandra-2.1.0' into cassandra-2.1

2014-07-28 Thread marcuse
Merge branch 'cassandra-2.1.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2236afb7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2236afb7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2236afb7

Branch: refs/heads/cassandra-2.1
Commit: 2236afb7a06725f9ceb13bab8c2180eb0d6134f5
Parents: 3744d77 ee62ae1
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:49:06 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:49:06 2014 +0200

--
 CHANGES.txt  | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java  | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2236afb7/CHANGES.txt
--



[2/3] git commit: Merge branch 'cassandra-2.1.0' into cassandra-2.1

2014-07-28 Thread marcuse
Merge branch 'cassandra-2.1.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2236afb7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2236afb7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2236afb7

Branch: refs/heads/trunk
Commit: 2236afb7a06725f9ceb13bab8c2180eb0d6134f5
Parents: 3744d77 ee62ae1
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:49:06 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:49:06 2014 +0200

--
 CHANGES.txt  | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java  | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2236afb7/CHANGES.txt
--



[3/3] git commit: Merge branch 'cassandra-2.1' into trunk

2014-07-28 Thread marcuse
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0fd1a0bb
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0fd1a0bb
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0fd1a0bb

Branch: refs/heads/trunk
Commit: 0fd1a0bb47f66eaa29ce821aa4836c52b65e46e1
Parents: f3aa83b 2236afb
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Jul 28 12:49:28 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Jul 28 12:49:28 2014 +0200

--
 CHANGES.txt  | 1 +
 .../org/apache/cassandra/io/sstable/metadata/StatsMetadata.java  | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/0fd1a0bb/CHANGES.txt
--



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076202#comment-14076202
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

How do we know what's in the system ks when all we have is a cfid that doesn't 
match anything known?

More generally, I'm not sure how "stop on unknown cfid" is going to be a 
useful feature.  It's definitely going to happen if you replay a commitlog 
after dropping a table, for instance, if we have an unclean shutdown in 
between.  This is normal behavior and not a bug per se, so whacking users and 
refusing to start up is definitely antisocial.

On the other hand I can't picture a scenario where the user *can* take 
meaningful action based on failing startup here.  Put another way, ignoring the 
mutations is the Right Thing to do in every scenario I can think of.

So I propose we just log it at info and ignore.
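A hedged sketch of what "log at info and ignore" could look like in a simplified replay loop (the mutation format, UnknownColumnFamilyException stand-in, and method names here are illustrative, not the real commitlog code):

```java
import java.util.*;
import java.util.logging.Logger;

// Simplified stand-in for commitlog replay: when a mutation references a cfId
// that no longer exists in the schema, log at INFO and skip it rather than
// failing startup.
public class ReplaySketch {
    static class UnknownColumnFamilyException extends Exception {
        UnknownColumnFamilyException(UUID cfId) { super("Couldn't find cfId=" + cfId); }
    }

    static final Logger log = Logger.getLogger("CommitLogReplayer");

    // Returns the number of mutations applied; unknown cfIds are ignored.
    static int replay(List<UUID> mutationCfIds, Set<UUID> knownCfIds) {
        int applied = 0;
        for (UUID cfId : mutationCfIds) {
            try {
                if (!knownCfIds.contains(cfId))
                    throw new UnknownColumnFamilyException(cfId);
                applied++; // the real code would apply the mutation here
            } catch (UnknownColumnFamilyException e) {
                // Normal after an unclean shutdown following a DROP TABLE: the
                // table is gone, so dropping the mutation is the right thing.
                log.info("Skipping commitlog mutation for dropped table: " + e.getMessage());
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        UUID live = UUID.randomUUID(), dropped = UUID.randomUUID();
        int applied = replay(Arrays.asList(live, dropped, live), Collections.singleton(live));
        System.out.println(applied); // prints "2": the dropped table's mutation was skipped
    }
}
```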

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.0


 Multi-dc upgrade [was working from 2.0 -> 2.1 fairly 
 recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
  but is currently failing.
 Running 
 upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test
  I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay 
 failed due to replaying a mutation for a missing table. This error can be 
 ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on 
 the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.RuntimeException: 
 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
 cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't 
 find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) 
 ~[main/:na]
 {code}





[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076203#comment-14076203
 ] 

Aleksey Yeschenko commented on CASSANDRA-7582:
--

Indeed, there is no obvious way to recover from it that I can think of. +1 on 
logging it and going on.

-Dcassandra.commitlog.stop_on_missing_tables should also go.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.0


 Multi-dc upgrade [was working from 2.0 -> 2.1 fairly 
 recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
  but is currently failing.
 Running 
 upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test
  I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay 
 failed due to replaying a mutation for a missing table. This error can be 
 ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on 
 the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.RuntimeException: 
 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
 cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't 
 find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) 
 ~[main/:na]
 {code}





[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076297#comment-14076297
 ] 

Jonathan Ellis commented on CASSANDRA-7056:
---

bq. I'd also vote for making UNLOGGED the default (implicit) BATCH behavior, 
now that the LOGGED batches would cost even more than they do now.

UNLOGGED is still a misfeature, so I don't see how the cost of RAMP affects our 
choice of default.  (And for the record I think RAMP should definitely be the 
default; it matches users' assumptions so much better.)

I guess we could add UN_ISOLATED to request logged-without-ramp though.

 Add RAMP transactions
 -

 Key: CASSANDRA-7056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Tupshin Harper
Priority: Minor

 We should take a look at 
 [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
  transactions, and figure out if they can be used to provide more efficient 
 LWT (or LWT-like) operations.





[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions

2014-07-28 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076300#comment-14076300
 ] 

Jeremiah Jordan commented on CASSANDRA-7056:


bq. UNLOGGED is still a misfeature

UNLOGGED is not always a misfeature.  If I was doing batch writes to a single 
partition, I would make them unlogged.  No point in having the overhead of a 
logged batch for that.  But I would not make UNLOGGED the default.
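For example, a single-partition unlogged batch of the kind described (hypothetical table and column names; assuming a schema with PRIMARY KEY (user_id, seq)) might look like:

```sql
-- Every statement targets partition user_id = 42, so the write is already
-- atomic per-partition; the batchlog would add overhead without adding safety.
BEGIN UNLOGGED BATCH
  INSERT INTO events (user_id, seq, body) VALUES (42, 1, 'login');
  INSERT INTO events (user_id, seq, body) VALUES (42, 2, 'click');
APPLY BATCH;
```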

 Add RAMP transactions
 -

 Key: CASSANDRA-7056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Tupshin Harper
Priority: Minor

 We should take a look at 
 [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
  transactions, and figure out if they can be used to provide more efficient 
 LWT (or LWT-like) operations.





[jira] [Commented] (CASSANDRA-7576) DateType columns not properly converted to TimestampType when in ReversedType columns.

2014-07-28 Thread Karl Rieb (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076313#comment-14076313
 ] 

Karl Rieb commented on CASSANDRA-7576:
--

bq. Karl Rieb I know it wasn't a big deal to you, but anyway - when 
cherry-picking the patch back to 2.1.0, I did correct the name to yours in 
'patch by' (:

Thanks [~iamaleksey]!

 DateType columns not properly converted to TimestampType when in ReversedType 
 columns.
 --

 Key: CASSANDRA-7576
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7576
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Karl Rieb
Assignee: Karl Rieb
 Fix For: 2.0.10, 2.1.0

 Attachments: DataType_CASSANDRA_7576.patch

   Original Estimate: 0.25h
  Remaining Estimate: 0.25h

 The {{org.apache.cassandra.transport.DataType.fromType(AbstractType)}} method 
 has a bug that prevents sending the correct Protocol ID for reversed 
 {{DateType}} columns.   This results in clients receiving Protocol ID {{0}}, 
 which maps to a {{CUSTOM}} type, for timestamp columns that are clustered in 
 reverse order.  
 Some clients can handle this properly since they recognize the 
 {{org.apache.cassandra.db.marshal.DateType}} marshaling type; however, the 
 DataStax Java driver does not.  It will produce errors like the one 
 below when trying to prepare queries against such tables:
 {noformat}
 com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type 
 for value 2 of CQL type 'org.apache.cassandra.db.marshal.DateType', expecting 
 class java.nio.ByteBuffer but class java.util.Date provided
   at com.datastax.driver.core.BoundStatement.bind(BoundStatement.java:190)
   at 
 com.datastax.driver.core.DefaultPreparedStatement.bind(DefaultPreparedStatement.java:103)
 {noformat}
 On the Cassandra side, there is a check for {{DateType}} columns that is 
 supposed to convert these columns to TimestampType.  However, the check is 
 skipped when the column is also reversed.  Specifically:
 {code:title=DataType.java|borderStyle=solid}
 public static Pair<DataType, Object> fromType(AbstractType type)
 {
     // For CQL3 clients, ReversedType is an implementation detail and they
     // shouldn't have to care about it.
     if (type instanceof ReversedType)
         type = ((ReversedType)type).baseType;
     // For compatibility sake, we still return DateType as the timestamp type
     // in resultSet metadata (#5723)
     else if (type instanceof DateType)
         type = TimestampType.instance;
     // ...
 {code}
 The *else if* should be changed to just an *if*, like so:
 {code:title=DataType.java|borderStyle=solid}
 public static Pair<DataType, Object> fromType(AbstractType type)
 {
     // For CQL3 clients, ReversedType is an implementation detail and they
     // shouldn't have to care about it.
     if (type instanceof ReversedType)
         type = ((ReversedType)type).baseType;
     // For compatibility sake, we still return DateType as the timestamp type
     // in resultSet metadata (#5723)
     if (type instanceof DateType)
         type = TimestampType.instance;
     // ...
 {code}
 This bug is preventing us from upgrading our 1.2.11 cluster to 2.0.9 because 
 our clients keep throwing exceptions trying to read or write data to tables 
 with reversed timestamp columns. This issue can be reproduced by creating a 
 CQL table in Cassandra 1.2.11 that clusters on a timestamp in reverse, then 
 upgrading the node to 2.0.9.  When querying the metadata for the table, the 
 node will return Protocol ID 0 (CUSTOM) instead of Protocol ID 11 (TIMESTAMP).
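The control-flow difference can be demonstrated in isolation. Below is a minimal, self-contained sketch (string stand-ins for the real marshal classes, purely illustrative) of why the {{else if}} chain leaves a reversed {{DateType}} unconverted while two independent {{if}} checks convert it:

```java
public class ElseIfDemo {
    // Hypothetical stand-ins for the real marshal types, for illustration only.
    static final String REVERSED_DATE = "ReversedType(DateType)";
    static final String DATE = "DateType";
    static final String TIMESTAMP = "TimestampType";

    // Mirrors the buggy path: unwrapping ReversedType and the DateType
    // conversion are chained with else-if, so only one of them can ever run.
    static String buggy(String type) {
        if (type.startsWith("ReversedType("))
            type = type.substring("ReversedType(".length(), type.length() - 1);
        else if (type.equals(DATE))
            type = TIMESTAMP; // skipped whenever the type was reversed
        return type;
    }

    // Mirrors the proposed fix: two independent if checks.
    static String fixed(String type) {
        if (type.startsWith("ReversedType("))
            type = type.substring("ReversedType(".length(), type.length() - 1);
        if (type.equals(DATE))
            type = TIMESTAMP; // also runs after unwrapping a reversed type
        return type;
    }

    public static void main(String[] args) {
        System.out.println(buggy(REVERSED_DATE)); // DateType -> serialized as CUSTOM
        System.out.println(fixed(REVERSED_DATE)); // TimestampType
        System.out.println(fixed(DATE));          // TimestampType
    }
}
```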



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory

2014-07-28 Thread graham sanderson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

graham sanderson updated CASSANDRA-7546:


Attachment: 7546.20_5.txt

I've added 7546.20_5.txt, which is the same as 7546.20_4.txt but with a minor 
change that allows it to function correctly across close to the full 32 bits of 
time range vs 31 bits.

# Any thoughts on metrics? 
I'm thinking a simple CF (and rolled-up KS) metric which simply counts the number 
of highly contended rows over time. Note, we do know when a row was partially 
contended, but I don't know that we can assign a meaningful value between 0 and 
1. Note, we could do a ratio of good vs bad rows on flush, but I think the raw 
count is more interesting.
# Note, I plan to move the static {} block at the top to a test case for sanity 
checking - it doesn't belong mixed in with the code... Once we're all set I'll 
submit an actual patch for 2.0.x and 2.1.x - should we patch this in 1.1/1.2 
also?
# Any other thoughts? I'd like to start testing this (but don't want to do so 
if you want to make major changes). I'll test on top of 2.0.10 in beta with 
our code and cassandra stress (hopefully some scenarios you have in 2.1, both 
with a node down for hinting and not), and maybe after that with the 
tracking/metric on but the synchronized off in production, just to check that it 
exactly detects our hint storms and nothing else in production (we have no 
application tables that should be heavily contended at the partition level). 
I'll make and test a patch on 2.1 also; however, I'll have to finish testing on 
2.0.x before I can upgrade a (fast h/w) cluster to 2.1.

 AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
 -

 Key: CASSANDRA-7546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: graham sanderson
Assignee: graham sanderson
 Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 
 7546.20_4.txt, 7546.20_5.txt, 7546.20_alt.txt, suggestion1.txt, 
 suggestion1_21.txt


 In order to preserve atomicity, this code attempts to read, clone/update, 
 then CAS the state of the partition.
 Under heavy contention for updating a single partition this can cause some 
 fairly staggering memory growth (the more cores on your machine, the worse it 
 gets).
 Whilst many usage patterns don't do highly concurrent updates to the same 
 partition, hinting today does, and in this case wild (order(s) of magnitude 
 more than expected) memory allocation rates can be seen (especially when the 
 updates being hinted are small updates to different partitions, which can 
 happen very fast on their own) - see CASSANDRA-7545
 It would be best to eliminate/reduce/limit the spinning memory allocation 
 whilst not slowing down the very common un-contended case.
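The pathology described above can be reproduced with a plain {{AtomicReference}}. The sketch below is a simplified stand-in (not the real AtomicSortedColumns code): every attempt allocates a fresh copy of the state, and every failed CAS throws that copy away and spins again, so allocation scales with contention:

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

public class SpinCloneDemo {
    // Stand-in for the partition state held behind the CAS (simplified).
    static final AtomicReference<int[]> state = new AtomicReference<>(new int[0]);
    static final AtomicLong wastedClones = new AtomicLong();

    // The read -> clone/update -> CAS pattern from the description.
    static void append(int value) {
        while (true) {
            int[] current = state.get();
            int[] updated = Arrays.copyOf(current, current.length + 1); // fresh allocation per attempt
            updated[current.length] = value;
            if (state.compareAndSet(current, updated))
                return;
            wastedClones.incrementAndGet(); // clone discarded; retry allocates again
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> { for (int i = 0; i < 10_000; i++) append(i); });
            threads[t].start();
        }
        for (Thread t : threads) t.join();
        // Atomicity is preserved (all 80,000 entries land), but every contended
        // retry burned a discarded clone of the whole state.
        System.out.println("entries=" + state.get().length + " wasted=" + wastedClones.get());
    }
}
```

The copies here are tiny arrays; in the real case each retry rebuilds the partition's column container, which is why contended partitions show order-of-magnitude allocation blowups.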





[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076399#comment-14076399
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

This was introduced by CASSANDRA-7125 for 2.1.1 and is not in the 2.1.0 branch. 
 Is this actually a problem with rc4 [~enigmacurry]?

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.0


 Multi-dc upgrade [was working from 2.0 -> 2.1 fairly 
 recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
  but is currently failing.
 Running 
 upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test
  I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay 
 failed due to replaying a mutation for a missing table. This error can be 
 ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on 
 the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.RuntimeException: 
 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
 cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't 
 find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) 
 ~[main/:na]
 {code}





[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076404#comment-14076404
 ] 

Jonathan Ellis commented on CASSANDRA-7056:
---

FTR, we transform single-partition batches to UNLOGGED automagically, since you 
are right; there is no point in the logging overhead there.

 Add RAMP transactions
 -

 Key: CASSANDRA-7056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Tupshin Harper
Priority: Minor

 We should take a look at 
 [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
  transactions, and figure out if they can be used to provide more efficient 
 LWT (or LWT-like) operations.





[jira] [Commented] (CASSANDRA-7593) Errors when upgrading through several versions to 2.1

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076411#comment-14076411
 ] 

Jonathan Ellis commented on CASSANDRA-7593:
---

bq. Could we assume EOC.START for RT.min in 2.1 when deserializing old sstables?

That sounds like the right fix to me.

 Errors when upgrading through several versions to 2.1
 -

 Key: CASSANDRA-7593
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7593
 Project: Cassandra
  Issue Type: Bug
 Environment: java 1.7
Reporter: Russ Hatch
Assignee: Marcus Eriksson
Priority: Critical
 Fix For: 2.1.0


 I'm seeing two different errors cropping up in the dtest which upgrades a 
 cluster through several versions.
 This is the more common error:
 {noformat}
 ERROR [GossipStage:10] 2014-07-22 13:14:30,028 CassandraDaemon.java:168 - 
 Exception in thread Thread[GossipStage:10,5,main]
 java.lang.AssertionError: null
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.shouldInclude(SliceQueryFilter.java:347)
  ~[main/:na]
 at 
 org.apache.cassandra.db.filter.QueryFilter.shouldInclude(QueryFilter.java:249)
  ~[main/:na]
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249)
  ~[main/:na]
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1873)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1681)
  ~[main/:na]
 at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:345) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.readLocally(SelectStatement.java:293)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:302)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:60)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:263)
  ~[main/:na]
 at 
 org.apache.cassandra.db.SystemKeyspace.getPreferredIP(SystemKeyspace.java:514)
  ~[main/:na]
 at 
 org.apache.cassandra.net.OutboundTcpConnectionPool.init(OutboundTcpConnectionPool.java:51)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:522)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:536)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:689)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:663)
  ~[main/:na]
 at 
 org.apache.cassandra.service.EchoVerbHandler.doVerb(EchoVerbHandler.java:40) 
 ~[main/:na]
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) 
 ~[main/:na]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_60]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60]
 {noformat}
 The same test sometimes fails with this exception instead:
 {noformat}
 ERROR [CompactionExecutor:4] 2014-07-22 16:18:21,008 CassandraDaemon.java:168 
 - Exception in thread Thread[CompactionExecutor:4,1,RMI Runtime]
 java.util.concurrent.RejectedExecutionException: Task 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7059d3e9 
 rejected from 
 org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@108f1504[Terminated,
  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 95]
 at 
 java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) 
 ~[na:1.7.0_60]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:619)
  ~[na:1.7.0_60]
 at 
 org.apache.cassandra.io.sstable.SSTableReader.scheduleTidy(SSTableReader.java:628)
  ~[main/:na]
 at 
 

[jira] [Created] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Russell Alexander Spitzer (JIRA)
Russell Alexander Spitzer created CASSANDRA-7631:


 Summary: Allow Stress to write directly to SSTables
 Key: CASSANDRA-7631
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Russell Alexander Spitzer


One common difficulty with benchmarking machines is the amount of time it takes 
to initially load data. For machines with a large amount of RAM this becomes 
especially onerous, because a very large amount of data needs to be placed on 
the machine before the page cache can be circumvented. 

To remedy this I suggest we add a top-level flag to Cassandra-Stress which 
would cause the tool to write directly to sstables rather than actually 
performing CQL inserts. Internally this would use CQLSSTableWriter to write 
directly to sstables while skipping any keys which are not owned by the node 
stress is running on. The same stress command run on each node in the cluster 
would then write unique sstables containing only data which that node is 
responsible for. Following this, no further network IO would be required to 
distribute data, as it would all already be correctly in place.
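The per-node filtering step can be sketched without stress itself. The toy token function and even modulo split below are assumptions for illustration (real ownership comes from the partitioner and the ring's token ranges, and the actual writing would go through CQLSSTableWriter), but they show how the same command run on every node covers the full key set exactly once:

```java
import java.util.ArrayList;
import java.util.List;

public class OwnedKeyFilter {
    // Toy token function standing in for a real partitioner hash (assumption).
    static long token(String key) {
        long h = 1125899906842597L;
        for (int i = 0; i < key.length(); i++) h = 31 * h + key.charAt(i);
        return h;
    }

    // True when the token lands in this node's slice of the ring; here the
    // ring is split evenly among nodeCount nodes by token modulo (an
    // assumption -- real ownership also depends on replication settings).
    static boolean ownedBy(int nodeIndex, int nodeCount, String key) {
        return Math.floorMod(token(key), nodeCount) == nodeIndex;
    }

    public static void main(String[] args) {
        int nodeCount = 3;
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 12; i++) keys.add("key" + i);
        // Each node, running the same command, keeps only its own keys, so the
        // union of all nodes' sstables covers every key exactly once.
        int covered = 0;
        for (int node = 0; node < nodeCount; node++)
            for (String k : keys)
                if (ownedBy(node, nodeCount, k)) covered++;
        System.out.println("keys=" + keys.size() + " covered=" + covered); // keys=12 covered=12
    }
}
```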





[jira] [Commented] (CASSANDRA-7629) tracing no longer logs when the request completed

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076416#comment-14076416
 ] 

Jonathan Ellis commented on CASSANDRA-7629:
---

I think you're mis-remembering.  The only "request complete" in 2.0 is a debug 
log entry, which is still there in 2.1:

{code}
public void stopSession()
{
    TraceState state = this.state.get();
    if (state == null) // inline isTracing to avoid implicit two calls to state.get()
    {
        logger.debug("request complete");
    }
{code}

 tracing no longer logs when the request completed
 -

 Key: CASSANDRA-7629
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7629
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
 Fix For: 2.1.1


 In 2.0 and before, there is a Request complete entry in tracing, which no 
 longer appears in 2.1.  This makes it difficult to reason about 
 latency/performance problems in a trace.





[jira] [Updated] (CASSANDRA-7629) tracing no longer logs when the request completed

2014-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-7629:
--

 Priority: Minor  (was: Major)
Fix Version/s: (was: 2.1.0)
   2.1.1

 tracing no longer logs when the request completed
 -

 Key: CASSANDRA-7629
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7629
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Priority: Minor
 Fix For: 2.1.1


 In 2.0 and before, there is a Request complete entry in tracing, which no 
 longer appears in 2.1.  This makes it difficult to reason about 
 latency/performance problems in a trace.





[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076415#comment-14076415
 ] 

Russell Alexander Spitzer commented on CASSANDRA-7631:
--

I think we can implement this by writing a new client, SSTableClient, which 
would create the directory structure and, instead of executing CQL statements, 
add rows via a CQLSSTableWriter. 


 Allow Stress to write directly to SSTables
 --

 Key: CASSANDRA-7631
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Russell Alexander Spitzer

 One common difficulty with benchmarking machines is the amount of time it 
 takes to initially load data. For machines with a large amount of RAM this 
 becomes especially onerous, because a very large amount of data needs to be 
 placed on the machine before the page cache can be circumvented. 
 To remedy this I suggest we add a top-level flag to Cassandra-Stress which 
 would cause the tool to write directly to sstables rather than actually 
 performing CQL inserts. Internally this would use CQLSSTableWriter to write 
 directly to sstables while skipping any keys which are not owned by the node 
 stress is running on. The same stress command run on each node in the cluster 
 would then write unique sstables containing only data which that node is 
 responsible for. Following this, no further network IO would be required to 
 distribute data, as it would all already be correctly in place.





[jira] [Assigned] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer reassigned CASSANDRA-7631:


Assignee: Russell Alexander Spitzer

 Allow Stress to write directly to SSTables
 --

 Key: CASSANDRA-7631
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Russell Alexander Spitzer
Assignee: Russell Alexander Spitzer

 One common difficulty with benchmarking machines is the amount of time it 
 takes to initially load data. For machines with a large amount of RAM this 
 becomes especially onerous, because a very large amount of data needs to be 
 placed on the machine before the page cache can be circumvented. 
 To remedy this I suggest we add a top-level flag to Cassandra-Stress which 
 would cause the tool to write directly to sstables rather than actually 
 performing CQL inserts. Internally this would use CQLSSTableWriter to write 
 directly to sstables while skipping any keys which are not owned by the node 
 stress is running on. The same stress command run on each node in the cluster 
 would then write unique sstables containing only data which that node is 
 responsible for. Following this, no further network IO would be required to 
 distribute data, as it would all already be correctly in place.





[jira] [Updated] (CASSANDRA-7575) Custom 2i validation

2014-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-7575:
--

Fix Version/s: (was: 2.1.0)
   (was: 3.0)
   2.1.1

 Custom 2i validation
 

 Key: CASSANDRA-7575
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7575
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Andrés de la Peña
Assignee: Andrés de la Peña
Priority: Minor
  Labels: 2i, cql3, secondaryIndex, secondary_index, select
 Fix For: 2.1.1

 Attachments: 2i_validation.patch


 There are several projects using custom secondary indexes as an extension 
 point to integrate C* with other systems such as Solr or Lucene. The usual 
 approach is to embed third party indexing queries in CQL clauses. 
 For example, [DSE 
 Search|http://www.datastax.com/what-we-offer/products-services/datastax-enterprise]
  embeds Solr syntax this way:
 {code}
 SELECT title FROM solr WHERE solr_query='title:natio*';
 {code}
 [Stratio platform|https://github.com/Stratio/stratio-cassandra] embeds custom 
 JSON syntax for searching in Lucene indexes:
 {code}
 SELECT * FROM tweets WHERE lucene='{
     filter : {
         type: "range",
         field: "time",
         lower: "2014/04/25",
         upper: "2014/04/1"
     },
     query : {
         type: "phrase", 
         field: "body", 
         values: ["big", "data"]
     },
     sort : {fields: [ {field: "time", reverse: true} ] }
 }';
 {code}
 Tuplejump [Stargate|http://tuplejump.github.io/stargate/] also uses the 
 Stratio's open source JSON syntax:
 {code}
 SELECT name,company FROM PERSON WHERE stargate ='{
     filter: {
         type: "range",
         field: "company",
         lower: "a",
         upper: "p"
     },
     sort: {
         fields: [{field: "name", reverse: true}]
     }
 }';
 {code}
 These syntaxes are validated by the corresponding 2i implementation. This 
 validation is done behind the StorageProxy command distribution, so, as far as 
 I know, there is no way to give rich feedback about syntax errors to CQL users.
 I'm uploading a patch with some changes trying to improve this. I propose 
 adding an empty validation method to SecondaryIndexSearcher that can be 
 overridden by custom 2i implementations:
 {code}
 public void validate(List<IndexExpression> clause) {}
 {code}
 And call it from SelectStatement#getRangeCommand:
 {code}
 ColumnFamilyStore cfs = Keyspace.open(keyspace()).getColumnFamilyStore(columnFamily());
 for (SecondaryIndexSearcher searcher : cfs.indexManager.getIndexSearchersForQuery(expressions))
 {
     try
     {
         searcher.validate(expressions);
     }
     catch (RuntimeException e)
     {
         String exceptionMessage = e.getMessage();
         if (exceptionMessage != null && !exceptionMessage.trim().isEmpty())
             throw new InvalidRequestException(
                     "Invalid index expression: " + e.getMessage());
         else
             throw new InvalidRequestException(
                     "Invalid index expression");
     }
 }
 {code}
 In this way C* allows custom 2i implementations to give feedback about syntax 
 errors.
 We are currently using these changes in a fork with no problems.
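The error-wrapping pattern in the patch can be exercised standalone. Everything below is a toy stand-in (a local InvalidRequest class and a trivial key:value clause check, not Cassandra's InvalidRequestException or the SecondaryIndexSearcher API): the searcher-side message is surfaced when present, and a generic error is raised otherwise.

```java
public class ValidationDemo {
    // Hypothetical stand-in for the request-level exception type.
    static class InvalidRequest extends RuntimeException {
        InvalidRequest(String msg) { super(msg); }
    }

    // Toy searcher-side validator: rejects clauses that are not key:value pairs.
    static void validate(String clause) {
        if (!clause.contains(":"))
            throw new RuntimeException("expected key:value, got '" + clause + "'");
    }

    // Mirrors the wrapping logic from the patch: keep the validator's message
    // when it has one, otherwise fall back to a generic error.
    static void checkClause(String clause) {
        try {
            validate(clause);
        } catch (RuntimeException e) {
            String m = e.getMessage();
            if (m != null && !m.trim().isEmpty())
                throw new InvalidRequest("Invalid index expression: " + m);
            throw new InvalidRequest("Invalid index expression");
        }
    }

    public static void main(String[] args) {
        checkClause("field:value"); // valid clause passes silently
        try {
            checkClause("nonsense");
        } catch (InvalidRequest e) {
            // prints: Invalid index expression: expected key:value, got 'nonsense'
            System.out.println(e.getMessage());
        }
    }
}
```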





[jira] [Created] (CASSANDRA-7632) NPE in AutoSavingCache$Writer.deleteOldCacheFiles

2014-07-28 Thread Vishy Kasar (JIRA)
Vishy Kasar created CASSANDRA-7632:
--

 Summary: NPE in AutoSavingCache$Writer.deleteOldCacheFiles
 Key: CASSANDRA-7632
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7632
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Vishy Kasar
Priority: Minor


Observed this NPE in one of our production clusters (2.0.9). It does not seem to be 
causing harm, but it would be good to resolve.

ERROR [CompactionExecutor:1188] 2014-07-27 21:57:08,225 CassandraDaemon.java 
(line 199) Exception in thread Thread[CompactionExecutor:1188,1,main] 
clusterName=clouddb_p03 
java.lang.NullPointerException 
at 
org.apache.cassandra.cache.AutoSavingCache$Writer.deleteOldCacheFiles(AutoSavingCache.java:265)
 
at 
org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:195)
 
at 
org.apache.cassandra.db.compaction.CompactionManager$10.run(CompactionManager.java:862)
 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) 
at java.util.concurrent.FutureTask.run(FutureTask.java:166) 
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) 
at java.lang.Thread.run(Thread.java:722)





[jira] [Commented] (CASSANDRA-7628) Tools java driver needs to be updated

2014-07-28 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076433#comment-14076433
 ] 

Brandon Williams commented on CASSANDRA-7628:
-

Looks like upgrading to 2.1 is going to require code changes.

 Tools java driver needs to be updated
 -

 Key: CASSANDRA-7628
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Brandon Williams
Priority: Minor
 Fix For: 2.1.0


 When you run stress currently you get a bunch of harmless stacktraces like:
 {noformat}
 ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: 
 Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions()
  will return null
 java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a 
 column defined in this metadata
 at 
 com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159)
  [stress/:na]
 at 
 org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) 
 [stress/:na]
 {noformat}





[jira] [Assigned] (CASSANDRA-7628) Tools java driver needs to be updated

2014-07-28 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-7628:
---

Assignee: Benedict

 Tools java driver needs to be updated
 -

 Key: CASSANDRA-7628
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Brandon Williams
Assignee: Benedict
Priority: Minor
 Fix For: 2.1.0


 When you run stress currently you get a bunch of harmless stacktraces like:
 {noformat}
 ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: 
 Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions()
  will return null
 java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a 
 column defined in this metadata
 at 
 com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159)
  [stress/:na]
 at 
 org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) 
 [stress/:na]
 {noformat}





[jira] [Commented] (CASSANDRA-7629) tracing no longer logs when the request completed

2014-07-28 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076435#comment-14076435
 ] 

Brandon Williams commented on CASSANDRA-7629:
-

I'm really not, though; see my paste in CASSANDRA-7567

 tracing no longer logs when the request completed
 -

 Key: CASSANDRA-7629
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7629
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Priority: Minor
 Fix For: 2.1.1


 In 2.0 and before, there is a Request complete entry in tracing, which no 
 longer appears in 2.1.  This makes it difficult to reason about 
 latency/performance problems in a trace.





[jira] [Commented] (CASSANDRA-7629) tracing no longer logs when the request completed

2014-07-28 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076436#comment-14076436
 ] 

Brandon Williams commented on CASSANDRA-7629:
-

Aha, it's cqlsh that adds that:

{noformat}
pylib/cqlshlib/tracing.py:    rows.append(['Request complete', finished_at, coordinator, duration])
{noformat}

 tracing no longer logs when the request completed
 -

 Key: CASSANDRA-7629
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7629
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Priority: Minor
 Fix For: 2.1.1


 In 2.0 and before, there is a Request complete entry in tracing, which no 
 longer appears in 2.1.  This makes it difficult to reason about 
 latency/performance problems in a trace.





[jira] [Assigned] (CASSANDRA-7632) NPE in AutoSavingCache$Writer.deleteOldCacheFiles

2014-07-28 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-7632:
---

Assignee: Marcus Eriksson

 NPE in AutoSavingCache$Writer.deleteOldCacheFiles
 -

 Key: CASSANDRA-7632
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7632
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Vishy Kasar
Assignee: Marcus Eriksson
Priority: Minor
 Fix For: 2.0.10


 Observed this NPE in one of our production clusters (2.0.9). It does not seem 
 to be causing harm, but it would be good to resolve. 
 ERROR [CompactionExecutor:1188] 2014-07-27 21:57:08,225 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:1188,1,main] 
 clusterName=clouddb_p03 
 java.lang.NullPointerException 
 at 
 org.apache.cassandra.cache.AutoSavingCache$Writer.deleteOldCacheFiles(AutoSavingCache.java:265)
  
 at 
 org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:195)
  
 at 
 org.apache.cassandra.db.compaction.CompactionManager$10.run(CompactionManager.java:862)
  
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) 
 at java.util.concurrent.FutureTask.run(FutureTask.java:166) 
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  
 at java.lang.Thread.run(Thread.java:722)





[jira] [Updated] (CASSANDRA-7632) NPE in AutoSavingCache$Writer.deleteOldCacheFiles

2014-07-28 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-7632:


Fix Version/s: 2.0.10

 NPE in AutoSavingCache$Writer.deleteOldCacheFiles
 -

 Key: CASSANDRA-7632
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7632
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Vishy Kasar
Assignee: Marcus Eriksson
Priority: Minor
 Fix For: 2.0.10


 Observed this NPE in one of our production clusters (2.0.9). It does not seem 
 to be causing harm, but it would be good to resolve. 
 ERROR [CompactionExecutor:1188] 2014-07-27 21:57:08,225 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:1188,1,main] 
 clusterName=clouddb_p03 
 java.lang.NullPointerException 
 at 
 org.apache.cassandra.cache.AutoSavingCache$Writer.deleteOldCacheFiles(AutoSavingCache.java:265)
  
 at 
 org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:195)
  
 at 
 org.apache.cassandra.db.compaction.CompactionManager$10.run(CompactionManager.java:862)
  
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) 
 at java.util.concurrent.FutureTask.run(FutureTask.java:166) 
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  
 at java.lang.Thread.run(Thread.java:722)





[jira] [Commented] (CASSANDRA-7628) Tools java driver needs to be updated

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076447#comment-14076447
 ] 

Benedict commented on CASSANDRA-7628:
-

bq. going to require code changes.

Could you elaborate? This appears to be a Java Driver bug.

 Tools java driver needs to be updated
 -

 Key: CASSANDRA-7628
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Brandon Williams
Assignee: Benedict
Priority: Minor
 Fix For: 2.1.0


 When you run stress currently you get a bunch of harmless stacktraces like:
 {noformat}
 ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: 
 Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions()
  will return null
 java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a 
 column defined in this metadata
 at 
 com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159)
  [stress/:na]
 at 
 org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) 
 [stress/:na]
 {noformat}





[jira] [Updated] (CASSANDRA-7628) Tools java driver needs to be updated

2014-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-7628:
--

Fix Version/s: (was: 2.1.0)
   2.1.1

(This only affects stress.  Pushing to 2.1.1.)

 Tools java driver needs to be updated
 -

 Key: CASSANDRA-7628
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Brandon Williams
Assignee: Benedict
Priority: Minor
 Fix For: 2.1.1


 When you run stress currently you get a bunch of harmless stacktraces like:
 {noformat}
 ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: 
 Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions()
  will return null
 java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a 
 column defined in this metadata
 at 
 com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159)
  [stress/:na]
 at 
 org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) 
 [stress/:na]
 {noformat}





[jira] [Comment Edited] (CASSANDRA-7628) Tools java driver needs to be updated

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076450#comment-14076450
 ] 

Jonathan Ellis edited comment on CASSANDRA-7628 at 7/28/14 5:43 PM:


(This only affects stress and hadoop.  Pushing to 2.1.1.)


was (Author: jbellis):
(This only affects stress.  Pushing to 2.1.1.)

 Tools java driver needs to be updated
 -

 Key: CASSANDRA-7628
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Brandon Williams
Assignee: Benedict
Priority: Minor
 Fix For: 2.1.1


 When you run stress currently you get a bunch of harmless stacktraces like:
 {noformat}
 ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: 
 Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions()
  will return null
 java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a 
 column defined in this metadata
 at 
 com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159)
  [stress/:na]
 at 
 org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) 
 [stress/:na]
 {noformat}





[jira] [Commented] (CASSANDRA-7628) Tools java driver needs to be updated

2014-07-28 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076461#comment-14076461
 ] 

Brandon Williams commented on CASSANDRA-7628:
-

Well, I tried it and got compile errors in stress.  
https://github.com/datastax/java-driver/blob/2.1/driver-core/Upgrade_guide_to_2.1.rst

 Tools java driver needs to be updated
 -

 Key: CASSANDRA-7628
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Brandon Williams
Assignee: Benedict
Priority: Minor
 Fix For: 2.1.1


 When you run stress currently you get a bunch of harmless stacktraces like:
 {noformat}
 ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: 
 Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions()
  will return null
 java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a 
 column defined in this metadata
 at 
 com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159)
  [stress/:na]
 at 
 org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) 
 [stress/:na]
 {noformat}





[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076459#comment-14076459
 ] 

Matt Kennedy commented on CASSANDRA-7631:
-

Having a mechanism like this is extremely important for testing large-scale 
clusters. We don't necessarily want or need to test a large-scale ingest each 
time, so the sooner we can go from spinning up 100 nodes to running a mixed 
workload, the better. If one invocation of stress can tell 100 stressd 
processes to write local SSTables according to the user-defined yaml, that 
should be massively more efficient than running a write job.

 Allow Stress to write directly to SSTables
 --

 Key: CASSANDRA-7631
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Russell Alexander Spitzer
Assignee: Russell Alexander Spitzer

 One common difficulty with benchmarking machines is the amount of time it 
 takes to initially load data. For machines with a large amount of RAM this 
 becomes especially onerous, because a very large amount of data needs to be 
 placed on the machine before the page cache can be circumvented. 
 To remedy this I suggest we add a top-level flag to cassandra-stress which 
 would cause the tool to write directly to sstables rather than actually 
 performing CQL inserts. Internally this would use CQLSSTableWriter to write 
 directly to sstables while skipping any keys which are not owned by the node 
 stress is running on. The same stress command run on each node in the cluster 
 would then write unique sstables containing only data which that node is 
 responsible for. After that, no further network IO would be required to 
 distribute data, as it would all already be correctly in place.
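The key-skipping step described above can be sketched as follows. This is a simplified illustration, not cassandra-stress code: real Cassandra uses the Murmur3 partitioner and (with vnodes) many token ranges per node, and all helper names here are hypothetical.

```python
import hashlib

TOKEN_SPACE = 2 ** 64  # simplified stand-in for the partitioner's token range


def token(key: str) -> int:
    # Illustrative hash; Cassandra itself uses Murmur3Partitioner tokens.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], 'big')


def owned_ranges(node_index: int, num_nodes: int):
    """Evenly split the token space, one contiguous range per node
    (ignoring vnodes and replication for simplicity)."""
    width = TOKEN_SPACE // num_nodes
    start = node_index * width
    end = TOKEN_SPACE if node_index == num_nodes - 1 else start + width
    return [(start, end)]


def keys_for_node(keys, node_index, num_nodes):
    """Keep only the keys this node owns; everything else is skipped,
    so each node's locally written sstables contain no foreign data."""
    ranges = owned_ranges(node_index, num_nodes)
    return [k for k in keys
            if any(lo <= token(k) < hi for lo, hi in ranges)]
```

Running the same filter on every node partitions the full key set: each key is written by exactly one node, so no redistribution is needed afterwards.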





[jira] [Updated] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-7582:


Since Version: 2.1.1  (was: 2.1 rc3)

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.0


 Multi-dc upgrade [was working from 2.0 - 2.1 fairly 
 recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
  but is currently failing.
 Running 
 upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test
  I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay 
 failed due to replaying a mutation for a missing table. This error can be 
 ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on 
 the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.RuntimeException: 
 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
 cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't 
 find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) 
 ~[main/:na]
 {code}





[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076465#comment-14076465
 ] 

Ryan McGuire commented on CASSANDRA-7582:
-

I think this was version tagged incorrectly. I'm seeing CASSANDRA-7593 on rc4 
instead of this one.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.0


 Multi-dc upgrade [was working from 2.0 - 2.1 fairly 
 recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
  but is currently failing.
 Running 
 upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test
  I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay 
 failed due to replaying a mutation for a missing table. This error can be 
 ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on 
 the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.RuntimeException: 
 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
 cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't 
 find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) 
 ~[main/:na]
 {code}





[jira] [Updated] (CASSANDRA-7593) Errors when upgrading through several versions to 2.1

2014-07-28 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-7593:


Reproduced In: 2.1 rc4

 Errors when upgrading through several versions to 2.1
 -

 Key: CASSANDRA-7593
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7593
 Project: Cassandra
  Issue Type: Bug
 Environment: java 1.7
Reporter: Russ Hatch
Assignee: Marcus Eriksson
Priority: Critical
 Fix For: 2.1.0


 I'm seeing two different errors cropping up in the dtest which upgrades a 
 cluster through several versions.
 This is the more common error:
 {noformat}
 ERROR [GossipStage:10] 2014-07-22 13:14:30,028 CassandraDaemon.java:168 - 
 Exception in thread Thread[GossipStage:10,5,main]
 java.lang.AssertionError: null
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.shouldInclude(SliceQueryFilter.java:347)
  ~[main/:na]
 at 
 org.apache.cassandra.db.filter.QueryFilter.shouldInclude(QueryFilter.java:249)
  ~[main/:na]
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249)
  ~[main/:na]
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1873)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1681)
  ~[main/:na]
 at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:345) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.readLocally(SelectStatement.java:293)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:302)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:60)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:263)
  ~[main/:na]
 at 
 org.apache.cassandra.db.SystemKeyspace.getPreferredIP(SystemKeyspace.java:514)
  ~[main/:na]
 at 
 org.apache.cassandra.net.OutboundTcpConnectionPool.init(OutboundTcpConnectionPool.java:51)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:522)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:536)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:689)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:663)
  ~[main/:na]
 at 
 org.apache.cassandra.service.EchoVerbHandler.doVerb(EchoVerbHandler.java:40) 
 ~[main/:na]
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) 
 ~[main/:na]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_60]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60]
 {noformat}
 The same test sometimes fails with this exception instead:
 {noformat}
 ERROR [CompactionExecutor:4] 2014-07-22 16:18:21,008 CassandraDaemon.java:168 
 - Exception in thread Thread[CompactionExecutor:4,1,RMI Runtime]
 java.util.concurrent.RejectedExecutionException: Task 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7059d3e9 
 rejected from 
 org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@108f1504[Terminated,
  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 95]
 at 
 java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) 
 ~[na:1.7.0_60]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:619)
  ~[na:1.7.0_60]
 at 
 org.apache.cassandra.io.sstable.SSTableReader.scheduleTidy(SSTableReader.java:628)
  ~[main/:na]
 at 
 org.apache.cassandra.io.sstable.SSTableReader.tidy(SSTableReader.java:609) 
 ~[main/:na]
 at 
 

[jira] [Commented] (CASSANDRA-7628) Tools java driver needs to be updated

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076466#comment-14076466
 ] 

Benedict commented on CASSANDRA-7628:
-

Ah, right. The version namespace clash with the Java driver confused me.

 Tools java driver needs to be updated
 -

 Key: CASSANDRA-7628
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7628
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Brandon Williams
Assignee: Benedict
Priority: Minor
 Fix For: 2.1.1


 When you run stress currently you get a bunch of harmless stacktraces like:
 {noformat}
 ERROR 21:11:51 Error parsing schema options for table system_traces.sessions: 
 Cluster.getMetadata().getKeyspace(system_traces).getTable(sessions).getOptions()
  will return null
 java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a 
 column defined in this metadata
 at 
 com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
  ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ArrayBackedRow.isNull(ArrayBackedRow.java:56) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata$Options.init(TableMetadata.java:529) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.TableMetadata.build(TableMetadata.java:119) 
 ~[cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:131) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:92) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:293)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:230)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170)
  [cassandra-driver-core-2.0.1.jar:na]
 at 
 com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1029) 
 [cassandra-driver-core-2.0.1.jar:na]
 at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:270) 
 [cassandra-driver-core-2.0.1.jar:na]
 at 
 org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:90)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:177)
  [stress/:na]
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:159)
  [stress/:na]
 at 
 org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:264) 
 [stress/:na]
 {noformat}





[jira] [Updated] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-7582:


Fix Version/s: (was: 2.1.0)
   2.1.1

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.1


 Multi-dc upgrade [was working from 2.0 -> 2.1 fairly 
 recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
  but is currently failing.
 Running 
 upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test
  I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay 
 failed due to replaying a mutation for a missing table. This error can be 
 ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on 
 the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.RuntimeException: 
 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
 cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't 
 find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) 
 ~[main/:na]
 {code}





[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076469#comment-14076469
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

Thanks, Ryan.

Benedict, I'm starting to think 7125 was misguided.  If the CL errors out there 
just isn't much you can do about it except pass the flag and try again, so why 
not cut out the extra step?






[jira] [Commented] (CASSANDRA-7593) Errors when upgrading through several versions to 2.1

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076472#comment-14076472
 ] 

Benedict commented on CASSANDRA-7593:
-

This fix doesn't address the situation where we legitimately get an sstable 
with only range tombstones whose lower bound has fewer components than the 
upper bound (let's say we have a flood of DELETE FROM T WHERE pk = K AND a > X 
AND (a, b) < (Y, Z)).

Also, whilst we _may_ separately want to insert an EOC.START (since we now 
populate it, so presumably it should be populated, though it would be good to 
understand why this is now the case and document it here for posterity), 
according to the comments around grabbing the min/max column names we only 
care about clustering columns. So with or without the extra EOC we should not 
be fetching the whole of this BoundedComposite into min/max; we should only 
fetch up to the number of clustering columns (0), or we should fetch the 
column name (and potentially any further components for sets/maps/etc.) from 
CellName as well.
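A minimal Python sketch of the bound shapes involved (hypothetical names, not 
Cassandra's actual composite types): the lower bound of such a range tombstone 
carries one clustering component while the upper bound carries two, and the 
min/max column-name tracking should clip both to the table's clustering column 
count rather than absorb the whole composite:

```python
# Hypothetical sketch of clipping composite bounds to the clustering prefix.
# A tombstone from: DELETE FROM t WHERE pk = k AND a > x AND (a, b) < (y, z)
# has a 1-component lower bound and a 2-component upper bound.

def clip_bound(bound, num_clustering_columns):
    """Keep only the leading clustering components of a composite bound,
    which is all min/max column-name tracking should care about."""
    return bound[:num_clustering_columns]

lower = ("x",)       # 1 component: a > x
upper = ("y", "z")   # 2 components: (a, b) < (y, z)

# With 0 clustering columns (the case discussed in the ticket), neither
# bound should contribute anything to the min/max column names:
assert clip_bound(lower, 0) == ()
assert clip_bound(upper, 0) == ()
```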

 Errors when upgrading through several versions to 2.1
 -

 Key: CASSANDRA-7593
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7593
 Project: Cassandra
  Issue Type: Bug
 Environment: java 1.7
Reporter: Russ Hatch
Assignee: Marcus Eriksson
Priority: Critical
 Fix For: 2.1.0


 I'm seeing two different errors cropping up in the dtest which upgrades a 
 cluster through several versions.
 This is the more common error:
 {noformat}
 ERROR [GossipStage:10] 2014-07-22 13:14:30,028 CassandraDaemon.java:168 - 
 Exception in thread Thread[GossipStage:10,5,main]
 java.lang.AssertionError: null
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.shouldInclude(SliceQueryFilter.java:347)
  ~[main/:na]
 at 
 org.apache.cassandra.db.filter.QueryFilter.shouldInclude(QueryFilter.java:249)
  ~[main/:na]
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249)
  ~[main/:na]
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1873)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1681)
  ~[main/:na]
 at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:345) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.readLocally(SelectStatement.java:293)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:302)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:60)
  ~[main/:na]
 at 
 org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:263)
  ~[main/:na]
 at 
 org.apache.cassandra.db.SystemKeyspace.getPreferredIP(SystemKeyspace.java:514)
  ~[main/:na]
 at 
 org.apache.cassandra.net.OutboundTcpConnectionPool.init(OutboundTcpConnectionPool.java:51)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:522)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:536)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:689)
  ~[main/:na]
 at 
 org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:663)
  ~[main/:na]
 at 
 org.apache.cassandra.service.EchoVerbHandler.doVerb(EchoVerbHandler.java:40) 
 ~[main/:na]
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) 
 ~[main/:na]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_60]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_60]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60]
 {noformat}
 The same test sometimes fails with this exception instead:
 {noformat}
 ERROR [CompactionExecutor:4] 2014-07-22 16:18:21,008 CassandraDaemon.java:168 
 - Exception in thread Thread[CompactionExecutor:4,1,RMI Runtime]
 java.util.concurrent.RejectedExecutionException: Task 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7059d3e9 
 rejected from 
 org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@108f1504[Terminated,
  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 95]
 at 

[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076509#comment-14076509
 ] 

Benedict commented on CASSANDRA-7582:
-

I'm -1000 on encountering an error and silently swallowing it on something as 
core to correctness as the commit log - this at least gives the user a big red 
flag they may want to seek expert help. I think there are two distinct problems 
here - there are the 'unexpected' errors which should almost certainly involve 
the user seeking help from an expert to diagnose (or perhaps JIRA, since it 
possibly means a bug), and the unknown table exceptions. The latter are 
debatably more ok to ignore, but I would much rather we simply retain 
information about dropped tables, much as we do truncated tables, so that we 
can suppress those known to have been dropped (with knowledge of exactly _when_ 
they were dropped, so if we see CL records past that time we can still fail and 
ask the user to at least file a bug report). 

Consider the following (pretty plausible) scenario:

* User turns on CL saving
* User creates table X, populates it with some data (let's say it's a fairly 
static dataset) 
* User uses the database for a period, mostly changing other tables
* At time T, user drops table X and recreates it (instead of, e.g., truncating 
it, which is separately also subtly dangerous in this scenario), then 
repopulates it with subtly different, but business-critical, data
* Some time after T, user has to restore the cluster, and restores the schema 
from prior to T by mistake (let's say the team member restoring doesn't realise 
the table was recreated since then), then performs a PIT restore

The user now has no idea they have stale business data in their tables. Now, 
assuming we have saved the ids of all dropped tables we could report to the 
user that they are likely restoring data from a future schema, and they could 
then decide if this was safe or not; in this case they would be able to restore 
a newer schema (assuming they had saved it) and a major business error would 
have been averted.
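The bookkeeping proposed above can be sketched as follows (assumed names, not 
Cassandra's actual API): remember each dropped table's id and drop time, so 
commit-log replay can silently skip mutations attributable to the drop but 
fail loudly on any record written after it:

```python
# Hypothetical sketch: a registry of dropped table ids with drop timestamps,
# consulted during commit-log replay.

dropped_tables = {}  # cf_id -> drop timestamp

def record_drop(cf_id, drop_timestamp):
    dropped_tables[cf_id] = drop_timestamp

def replay_decision(cf_id, mutation_timestamp, known_tables):
    if cf_id in known_tables:
        return "replay"
    drop_ts = dropped_tables.get(cf_id)
    if drop_ts is None:
        return "fail"   # truly unknown table: big red flag for the user
    if mutation_timestamp <= drop_ts:
        return "skip"   # safely attributable to the known drop
    return "fail"       # record written *after* the drop: likely a bug

record_drop("a1b676f3", 1000)
assert replay_decision("a1b676f3", 500, set()) == "skip"
assert replay_decision("a1b676f3", 2000, set()) == "fail"
assert replay_decision("deadbeef", 500, {"deadbeef"}) == "replay"
```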

In general this fail-fast is likely to result in an increase in JIRA filing, 
and possibly for relatively benign bugs, but on the whole I would prefer that 
scenario than leaving subtle bugs in the CL. We've already caught at least one 
as a result of this, and we've had long standing bugs with respect to drain 
that still affect 2.0 that would have been caught a long time ago with better 
reporting.




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076516#comment-14076516
 ] 

Benedict commented on CASSANDRA-7582:
-

Hmm. Separately this scenario also points out that TRUNCATE is even more broken 
than I thought - since it doesn't get logged to the CL, if you restore a schema 
prior to a TRUNCATE you will simply get the old data supplemented with the new 
data.






[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076540#comment-14076540
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

I see two actual classes of CL errors:

# Table is dropped and we are replaying stale data that should also have been 
dropped.  Blocking startup is the Wrong Solution.
# Hardware problem caused a checksum mismatch.  Blocking startup is the Wrong 
Solution.

Granted that blocking startup can help prevent user errors during PIT 
recovery, that's an entirely hypothetical situation today; PIT is only 
nominally usable.  (Fork the JVM every time a CL segment finishes?  Yeah.)  So 
let's not optimize for that at the expense of scenarios we see frequently.

I think we should roll back 7125 until we can do it right.  Doing it right 
probably means, remembering old cfids in 2.1.x, then we can get paranoid about 
seeing them in the CL for 3.0.  (Getting paranoid in the same version as we 
start remembering is bad for obvious reasons.)






[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076586#comment-14076586
 ] 

Benedict commented on CASSANDRA-7582:
-

3. We've busted something

This is the main type I'm trying to catch with this behaviour. I would prefer 
to know earlier if 2.1 is broken instead of corrupting the user's CL in some 
way without realising it. We've had several bugs in the 2.1 release cycle that 
would have been caught earlier had we had this feature enabled, and I would be 
surprised if, as a result, we don't see some more once it gets released into 
the wild. There are still bugs in 2.0 that we've fixed in 2.1 that we would 
certainly have caught earlier.

Enforcing correctness from other avenues is a strong secondary concern. This 
isn't a point of optimisation; we're talking about providing an unsafe PIT 
feature (and we've already got a ticket filed for removing forking), and, more 
importantly, risking an unsafe regular _replay_. I disagree that a hardware 
problem causing a checksum mismatch shouldn't block startup - in this case you 
may have alternative copies of the data that are not corrupted, or can choose 
to analyse the logs yourself to establish what is happening. If you don't 
care, you set the don't-care flag; but without the failure you may not even 
know there are records that haven't been replayed (possibly whole files).







[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076721#comment-14076721
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

But that's still not a common *production* scenario.  So we're still optimizing 
bassackwards.

How about this?  Leave the checks in, but backwards: they're disabled, *unless* 
there's a flag.  Then we set the flag in utest and dtest.
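The proposed inversion can be sketched like this (the flag name is 
hypothetical, chosen only for illustration): strict commit-log checks are off 
by default and only enabled when a flag is set, as it would be in the unit 
tests and dtests:

```python
# Hypothetical sketch: flag-gated strict commit-log replay checks.
import os

def strict_replay_enabled():
    # In Cassandra proper this would be a JVM system property; an
    # environment variable stands in for it here.
    return os.environ.get("CASSANDRA_COMMITLOG_STRICT", "false") == "true"

def on_unknown_table(cf_id):
    if strict_replay_enabled():
        raise RuntimeError("Couldn't find cfId=%s" % cf_id)
    # Default behaviour: log and skip the mutation.
    return "skipped"

# Off by default, production keeps starting:
assert on_unknown_table("a1b676f3") == "skipped"

# Test harnesses set the flag and get the loud failure:
os.environ["CASSANDRA_COMMITLOG_STRICT"] = "true"
try:
    on_unknown_table("a1b676f3")
    raised = False
except RuntimeError:
    raised = True
assert raised
```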






[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076746#comment-14076746
 ] 

Benedict commented on CASSANDRA-7582:
-

What isn't a common production scenario? Commit log bugs? We know there are 
some still in 2.0. There are potentially some in 2.1 too, and we probably 
won't spot them without something like this to let users know they encountered 
them and report them. Optimizing != Correctness.

I am very negative on disabling this.






[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076786#comment-14076786
 ] 

T Jake Luciani commented on CASSANDRA-7631:
---

So you aren't interested in stressing writes, you only care about reads?



 Allow Stress to write directly to SSTables
 --

 Key: CASSANDRA-7631
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Russell Alexander Spitzer
Assignee: Russell Alexander Spitzer

 One common difficulty with benchmarking machines is the amount of time it 
 takes to initially load data. For machines with a large amount of RAM this 
 becomes especially onerous, because a very large amount of data needs to be 
 placed on the machine before the page cache can be circumvented. 
 To remedy this I suggest we add a top-level flag to cassandra-stress which 
 would cause the tool to write directly to sstables rather than actually 
 performing CQL inserts. Internally this would use CQLSSTableWriter to write 
 directly to sstables while skipping any keys which are not owned by the node 
 stress is running on. The same stress command run on each node in the cluster 
 would then write unique sstables containing only data which that node is 
 responsible for. Following this, no further network IO would be required to 
 distribute data, as it would all already be correctly in place.
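
As a rough illustration of the per-node filtering the description proposes, the sketch below partitions generated keys by token ownership. The partitioner, function names (`owns_token`, `rows_for_node`), and ranges are simplified stand-ins for illustration only, not Cassandra APIs (Cassandra uses Murmur3, not MD5):

```python
import hashlib

def token(key: str) -> int:
    # Stand-in for a partitioner: map a key onto a 64-bit token ring.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

def owns_token(t: int, ranges) -> bool:
    # A node owns a token if it falls in one of its half-open ranges.
    return any(lo <= t < hi for lo, hi in ranges)

def rows_for_node(keys, owned_ranges):
    # Every node iterates the same generated keys; each keeps only its slice.
    return [k for k in keys if owns_token(token(k), owned_ranges)]

keys = [f"user{i}" for i in range(1000)]
half = 2 ** 63
node_a = rows_for_node(keys, [(0, half)])          # first node's ownership
node_b = rows_for_node(keys, [(half, 2 ** 64)])    # second node's ownership
# Disjoint ownership ranges cover every generated key exactly once, so no
# network transfer is needed afterwards: each node wrote only its own data.
assert len(node_a) + len(node_b) == len(keys)
```

Run with the real cluster's token ranges in place of the two hard-coded halves, each node's sstables would land already correctly placed.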





[jira] [Commented] (CASSANDRA-7523) add date and time types

2014-07-28 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076793#comment-14076793
 ] 

Joshua McKenzie commented on CASSANDRA-7523:


I ended up going outside the scope of strictly C* changes on this while testing - 
here's a snapshot of what I have thus far:
* [cql-internal python driver 
changes|https://github.com/josh-mckenzie/cql-internal/compare/7523]
* [cqlshlib and cqlsh 
changes|https://github.com/josh-mckenzie/cassandra/compare/7523_cqlshlib]
* [Java type 
addition|https://github.com/josh-mckenzie/cassandra/compare/7523_java_types_only]
* [Combined commit, new cql-internal 
archive|https://github.com/josh-mckenzie/cassandra/compare/7523_combined]

A few points I could use some feedback on:
# Is it reasonable to consider the new Date and Time types DATETIME w/regards 
to PEP249?
# What kind of conversion enforcement do we want on SimpleDate and Time types?  
I'm thinking reduction only w/warning on both, promotion to Timestamp w/Date 
object.
# I don't like changing the ui-time_format cqlshrc option underneath people 
but if we add a time type and time_format points to timestamp...

I still have some testing to implement (cqlsh, unit, potentially python driver 
if we merge these changes in) but wanted to get this out there to get feedback 
since this is a new area of the code-base for me.

 add date and time types
 ---

 Key: CASSANDRA-7523
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Joshua McKenzie
Priority: Minor
 Fix For: 2.0.10


 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html
 (we already have timestamp; interval is out of scope for now, and see 
 CASSANDRA-6350 for discussion on timestamp-with-time-zone.  but date/time 
 should be pretty easy to add.)





[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076797#comment-14076797
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

I'm skeptical.  Looking at the 2.0 changelog, we've fixed CASSANDRA-6652 and 
CASSANDRA-6714 since 2.0.0 final, and this wouldn't have helped catch those.

So, I'm not saying that ignoring errors is a Good Thing, but when there are more 
false positives than true positives, users will learn to ignore it anyway and 
we're not actually helping anyone.

At the very least, this is demonstrably broken in 2.1.1 given this ticket right 
here.  So I see two reasonable courses of action:

# remembering old cfids in 2.1.x, then we can get paranoid about seeing them in 
the CL for 3.0.
# using the checks as a kind of assert that we enable for tests but not 
(without opt-in) for production 

I'm open to alternatives, but leaving things the way they are now is not one of 
them.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.1


 Multi-dc upgrade [was working from 2.0 - 2.1 fairly 
 recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
  but is currently failing.





[jira] [Updated] (CASSANDRA-7523) add date and time types

2014-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-7523:
--

Reviewer: Tyler Hobbs

[~thobbs] to review



[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076801#comment-14076801
 ] 

Matt Kennedy commented on CASSANDRA-7631:
-

In many cases, we primarily care about mixed workloads, but those need a 
populated cluster to run on. So yes, writes are important, but mostly in the 
context of concurrent reads also happening. 



[jira] [Commented] (CASSANDRA-7594) Disruptor Thrift server worker thread pool not adjustable

2014-07-28 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076805#comment-14076805
 ] 

Pavel Yaskevich commented on CASSANDRA-7594:


[~rbranson] That was on my list for a while now but nobody seemed to care so I 
de-prioritized it, thanks for reporting! I'm currently OOO but will try to look 
into it ASAP.

 Disruptor Thrift server worker thread pool not adjustable
 -

 Key: CASSANDRA-7594
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7594
 Project: Cassandra
  Issue Type: Bug
Reporter: Rick Branson
Assignee: Pavel Yaskevich

 For the THsHaDisruptorServer, there may not be enough threads to run blocking 
 StorageProxy methods. The current number of worker threads is hardcoded at 2 
 per selector, so 2 * numAvailableProcessors(), or 64 threads on a 16-core 
 hyperthreaded machine. StorageProxy methods block these threads, so this puts 
 an upper bound on the throughput if hsha is enabled. If operations take 10ms 
 on average, the node can only handle a maximum of 6,400 operations per 
 second. This is a regression from hsha on 1.2.x, where the thread pool was 
 tunable using rpc_min_threads and rpc_max_threads.
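
The throughput ceiling quoted above follows from simple arithmetic; a minimal sketch, assuming the 16-core hyperthreaded box and 10 ms average operation time from the report:

```python
# Hardcoded pool: 2 worker threads per selector, i.e. 2 * numAvailableProcessors().
hw_threads = 32                       # 16 physical cores, hyperthreaded
workers = 2 * hw_threads              # 64 blocking worker threads
avg_op_seconds = 0.010                # assumed 10 ms per blocking StorageProxy call

# Each worker completes at most 1/avg_op_seconds ops per second while blocked
# end-to-end, so the pool caps node throughput at:
max_ops_per_second = workers / avg_op_seconds
print(max_ops_per_second)             # 6400.0
```

With the 1.2.x-style `rpc_min_threads`/`rpc_max_threads` tuning, `workers` would be operator-controlled and this bound would move accordingly.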





[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076820#comment-14076820
 ] 

Benedict commented on CASSANDRA-7582:
-

I am less adamant about CfId checks than I am about failing on commit log 
checksum/mutation replay failures. I could just about live with (2), but 
naturally we will get better coverage by enabling this for all users. We don't 
know what bugs we might catch with it. So, I would prefer one of:

1) Start remembering old cfids in 2.1.1 along with this feature, so we can 
start complaining immediately; or
2) For now simply assert on non-CfId errors (i.e. make that opt-in rather than 
opt-out), introduce CfId recording at some point and make it opt-out at some 
point after



[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076843#comment-14076843
 ] 

T Jake Luciani commented on CASSANDRA-7631:
---

So you want a way to quickly get a bunch of data on the cluster, then run a 
mixed workload using traditional cql reads/writes?





[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076854#comment-14076854
 ] 

Matt Kennedy commented on CASSANDRA-7631:
-

Yes, ideally formatted using your new user-defined schema stuff. I don't mean 
to speak for Russ, but we fleshed out this idea jointly.



[jira] [Created] (CASSANDRA-7633) Speculating retry for LOCAL_QUORUM send requests to other DC

2014-07-28 Thread sankalp kohli (JIRA)
sankalp kohli created CASSANDRA-7633:


 Summary: Speculating retry for LOCAL_QUORUM send requests to other 
DC
 Key: CASSANDRA-7633
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7633
 Project: Cassandra
  Issue Type: Improvement
Reporter: sankalp kohli
Priority: Minor


C* can potentially send an extra request to the other DC for LOCAL_QUORUM which does 
not get counted toward the quorum. 
This is wasted effort and we should not send this request. 





[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076906#comment-14076906
 ] 

Russell Alexander Spitzer commented on CASSANDRA-7631:
--

+1 Basically

Put a TB on the cluster as fast as possible, 
Then run a mixed user-defined workload



[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076976#comment-14076976
 ] 

Brandon Williams commented on CASSANDRA-7631:
-

I'll just note that stress itself is probably the wrong place for this; it'll 
likely need to be a new utility that uses SSTableSimpleUnsortedWriter.



[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077005#comment-14077005
 ] 

Benedict commented on CASSANDRA-7631:
-

Stress seems like a perfectly reasonable place to put this, really. It also 
means we know the data generated is compatible with the stress workload, which 
is important. It's even possible to have stress output one single file per node 
in one pass, but that would require some (small-ish) amount of work.



[jira] [Commented] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077018#comment-14077018
 ] 

Benedict commented on CASSANDRA-7546:
-

My biggest concern with metrics is that what we expose as a metric will 
probably change when we change tack to a lock-free lazy-update design, since it 
will be more expensive to maintain. Certainly tracking the amount of 'wasted' 
work will be meaningless then, although possibly we could track the raw 
occurrences of failure to make a change atomically without interference (which 
in the lazy case would be failure to acquire exclusivity to merge your changes 
in).

I'm currently on holiday but will try to review your patch shortly.

 AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
 -

 Key: CASSANDRA-7546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: graham sanderson
Assignee: graham sanderson
 Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 
 7546.20_4.txt, 7546.20_5.txt, 7546.20_alt.txt, suggestion1.txt, 
 suggestion1_21.txt


 In order to preserve atomicity, this code attempts to read, clone/update, 
 then CAS the state of the partition.
 Under heavy contention for updating a single partition this can cause some 
 fairly staggering memory growth (the more cores on your machine, the worse it 
 gets).
 Whilst many usage patterns don't do highly concurrent updates to the same 
 partition, hinting today does, and in this case wild (order(s) of magnitude 
 more than expected) memory allocation rates can be seen (especially when the 
 updates being hinted are small updates to different partitions, which can 
 happen very fast on their own) - see CASSANDRA-7545
 It would be best to eliminate/reduce/limit the spinning memory allocation 
 whilst not slowing down the very common un-contended case.
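
A toy model of the read/clone/CAS loop being described. `AtomicRef` here is a simplified stand-in for Java's `AtomicReference`, and the dict stands in for the partition's column container; the point is that every failed compare-and-set discards a freshly allocated copy, so allocation scales with contention:

```python
import threading

class AtomicRef:
    """Minimal compare-and-set cell (illustrative stand-in, lock-based)."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()
    def get(self):
        return self._value
    def compare_and_set(self, expect, update):
        with self._lock:
            if self._value is expect:     # identity check, like a CAS on a pointer
                self._value = update
                return True
            return False

def add_column(ref, col, discarded):
    # read -> clone/update -> CAS, retrying from a fresh snapshot on failure
    while True:
        current = ref.get()
        updated = dict(current)           # the clone: allocation repeated per spin
        updated[col] = True
        if ref.compare_and_set(current, updated):
            return
        discarded.append(updated)         # copy wasted; loop and reallocate

discarded = []
ref = AtomicRef({})
threads = [threading.Thread(target=add_column, args=(ref, f"c{i}", discarded))
           for i in range(50)]
for t in threads: t.start()
for t in threads: t.join()
assert len(ref.get()) == 50               # every update eventually lands
```

Under contention, `discarded` grows with each losing racer; the un-contended path allocates exactly one clone per update, which is the case the ticket wants to keep fast.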





[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077020#comment-14077020
 ] 

Russell Alexander Spitzer commented on CASSANDRA-7631:
--

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java
 wraps SSTableSimpleUnsortedWriter, so I think we are ok there. The main reason 
I would like this as part of stress is that we already have all the data 
generation code baked in for arbitrary schemas, thanks [~tjake]! This way we 
could prepare for a test that uses a large amount of data and a mixed workload 
much faster. 



[jira] [Comment Edited] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077020#comment-14077020
 ] 

Russell Alexander Spitzer edited comment on CASSANDRA-7631 at 7/28/14 10:32 PM:


https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java
 wraps SSTableSimpleUnsorted Writer so I think we are ok there. The main reason 
I would like this as part of stress is that we already have all the data 
generation code written in for arbitrary schemas, Thanks [~tjake]! This way we 
could prepare for a test that writes a large amount of data and then runs a 
mixed workload much faster. 


was (Author: rspitzer):
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java
 wraps SSTableSimpleUnsorted Writer so I think we are ok there. The main reason 
I would like this as part of stress is that we already have all the data 
generation code backed in for arbitrary schemas, Thanks [~tjake]! This way we 
could prepare for a test that uses a large amount of data and a mixed workload 
much faster. 



[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077050#comment-14077050
 ] 

Brandon Williams commented on CASSANDRA-7631:
-

bq. Stress seems like a perfectly reasonable place to put this, really. It also 
means we know the data generated is compatible with the stress workload, which 
is important.

I agree with your latter point, but we could still reuse the code in a separate 
utility. It just seems like stress has enough options as it is, and 
introducing an sstable writer would make a lot of them nonsensical (like 
consistency level, replication, etc.). I'd somewhat prefer having a clear 
delineation, util-wise, between going over the network and writing to disk.

 Allow Stress to write directly to SSTables
 --

 Key: CASSANDRA-7631
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Russell Alexander Spitzer
Assignee: Russell Alexander Spitzer

 One common difficulty with benchmarking machines is the amount of time it 
 takes to initially load data. For machines with a large amount of ram this 
 becomes especially onerous because a very large amount of data needs to be 
 placed on the machine before page-cache can be circumvented. 
 To remedy this I suggest we add a top level flag to Cassandra-Stress which 
 would cause the tool to write directly to sstables rather than actually 
 performing CQL inserts. Internally this would use CQLSSTableWriter to write 
 directly to sstables while skipping any keys which are not owned by the node 
 stress is running on. The same stress command run on each node in the cluster 
 would then write unique sstables only containing data which that node is 
 responsible for. Following this no further network IO would be required to 
 distribute data as it would all already be correctly in place.





[jira] [Commented] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory

2014-07-28 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077057#comment-14077057
 ] 

graham sanderson commented on CASSANDRA-7546:
-

Ok, thank you... yeah, my only reason for recording something in the actual 
codebase was to indicate to the user that they have ultra-heavy partition 
contention that might be detrimental to performance, and that they should perhaps 
review their schema. Given that this may not be the case at all in 3.0 (i.e. it 
may be gracefully handled in all cases), I'll try out locally with a WARN 
statement instead. I'll probably do it at memtable flush anyway which has more 
useful context (e.g. the CF in question), and would be less spam-y (i.e. one 
warn with the number of contended partitions, though perhaps the contended 
key(s) are interesting at a lower log level)... whether we include such logging 
in the final patch I don't know.

 AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
 -

 Key: CASSANDRA-7546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: graham sanderson
Assignee: graham sanderson
 Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 
 7546.20_4.txt, 7546.20_5.txt, 7546.20_alt.txt, suggestion1.txt, 
 suggestion1_21.txt


 In order to preserve atomicity, this code attempts to read, clone/update, 
 then CAS the state of the partition.
 Under heavy contention for updating a single partition this can cause some 
 fairly staggering memory growth (the more cores on your machine the worse it 
 gets).
 Whilst many usage patterns don't do highly concurrent updates to the same 
 partition, hinting today, does, and in this case wild (order(s) of magnitude 
 more than expected) memory allocation rates can be seen (especially when the 
 updates being hinted are small updates to different partitions which can 
 happen very fast on their own) - see CASSANDRA-7545
 It would be best to eliminate/reduce/limit the spinning memory allocation 
 whilst not slowing down the very common un-contended case.
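The read/clone/CAS pattern the description refers to can be shown with a single-threaded Python stand-in for Java's AtomicReference. The `interfere` hook and the wasted-clone counter are illustrative devices, not Cassandra code:

```python
import copy

class AtomicRef:
    """Minimal compare-and-set cell, standing in for Java's AtomicReference."""
    def __init__(self, value):
        self._value = value
    def get(self):
        return self._value
    def compare_and_set(self, expected, new):
        if self._value is expected:
            self._value = new
            return True
        return False

def add_with_cas(ref: AtomicRef, column, value, interfere=None):
    """Read, clone+update, then CAS. Every failed CAS throws away a clone,
    which is where the contended-partition memory churn comes from."""
    wasted_clones = 0
    while True:
        current = ref.get()
        updated = copy.copy(current)   # an allocation on every attempt
        updated[column] = value
        if interfere:
            interfere()                 # simulate a concurrent writer winning
            interfere = None
        if ref.compare_and_set(current, updated):
            return wasted_clones
        wasted_clones += 1
```

Under heavy contention most attempts lose the race, so the clones (and with many cores, the allocation rate) grow with the number of concurrent writers.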





[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077063#comment-14077063
 ] 

Benedict commented on CASSANDRA-7631:
-

Well, this sort of fits in with an extension I would like to make, which is 
in-process stressing (i.e. to avoid going over the network, and if feasible 
optionally avoid going through the native protocol), for which many of those 
options would also be meaningless.

I don't see why we couldn't provide a separate shell script that makes some of 
the options easier, but I think this makes most sense living directly in stress 
itself; we can either ignore, or complain, if unrelated options are set.
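The 'complain' behaviour could be as simple as a pre-flight check. The flag names below are hypothetical, not stress's real option names:

```python
# Options that only make sense when going over the network; writing
# sstables locally makes them contradictory rather than merely unused.
NETWORK_ONLY_OPTIONS = {"consistency_level", "replication_factor", "nodes"}

def validate(options: dict) -> None:
    """Raise instead of silently ignoring contradictory flags."""
    if options.get("write_sstables"):
        bad = NETWORK_ONLY_OPTIONS & set(options)
        if bad:
            raise ValueError(
                "options %s have no effect when writing sstables directly"
                % sorted(bad))
```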






[jira] [Comment Edited] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077063#comment-14077063
 ] 

Benedict edited comment on CASSANDRA-7631 at 7/28/14 10:49 PM:
---

Well, this sort of fits in with an extension I would like to make, which is 
in-process stressing (i.e. to avoid going over the network), for which many of 
those options would also be meaningless.

I don't see why we couldn't provide a separate shell script that makes some of 
the options easier, but I think this makes most sense living directly in stress 
itself; we can either ignore, or complain, if unrelated options are set.


was (Author: benedict):
Well, this sort of fits in with an extension I would like to make, which is 
in-process stressing (i.e. to avoid going over the network, and if feasible 
optionally avoid going through the native protocol), for which many of those 
options would also be meaningless.

I don't see why we couldn't provide a separate shell script that makes some of 
the options easier, but I think this makes most sense living directly in stress 
itself; we can either ignore, or complain, if unrelated options are set.






[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077076#comment-14077076
 ] 

Brandon Williams commented on CASSANDRA-7631:
-

bq. we can either ignore, or complain, if unrelated options are set.

As someone who has screwed up the stress options more than once, consider this 
my vote for 'complain' :)






[jira] [Updated] (CASSANDRA-7601) Data loss after nodetool taketoken

2014-07-28 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7601:
---

Labels: qa-resolved  (was: )

 Data loss after nodetool taketoken
 --

 Key: CASSANDRA-7601
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7601
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tests
 Environment: Mac OSX Mavericks. Ubuntu 14.04
Reporter: Philip Thompson
Assignee: Brandon Williams
Priority: Minor
  Labels: qa-resolved
 Fix For: 2.0.10, 2.1.0

 Attachments: 7601-2.0.txt, 7601-2.1.txt, 
 consistent_bootstrap_test.py, taketoken.tar.gz


 The dtest 
 consistent_bootstrap_test.py:TestBootstrapConsistency.consistent_reads_after_relocate_test
  is failing on HEAD of the git branches 2.1 and 2.1.0.
 The test performs the following actions:
 - Create a cluster of 3 nodes
 - Create a keyspace with RF 2
 - Take node 3 down
 - Write 980 rows to node 2 with CL ONE
 - Flush node 2
 - Bring node 3 back up
 - Run nodetool taketoken on node 3 to transfer 80% of node 1's tokens to node 
 3
 - Check for data loss
 When the check for data loss is performed, only ~725 rows can be read via CL 
 ALL.





[jira] [Commented] (CASSANDRA-7601) Data loss after nodetool taketoken

2014-07-28 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077083#comment-14077083
 ] 

Philip Thompson commented on CASSANDRA-7601:


Lgtm from a test perspective now that the relevant tests have been removed. The 
original problem is (clearly) solved with the removal of shuffle/taketoken.






[jira] [Created] (CASSANDRA-7634) cqlsh error tracing CAS

2014-07-28 Thread dan jatnieks (JIRA)
dan jatnieks created CASSANDRA-7634:
---

 Summary: cqlsh error tracing CAS
 Key: CASSANDRA-7634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7634
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: dan jatnieks
Priority: Minor


On branch cassandra-2.1.0

Getting message {{'NoneType' object has no attribute 'microseconds'}} from 
cqlsh while tracing a CAS statement.

{noformat}
Connected to devc-large at 146.148.39.53:9042.
[cqlsh 5.0.1 | Cassandra 2.1.0-rc4-SNAPSHOT | CQL spec 3.2.0 | Native protocol 
v3]
Use HELP for help.
cqlsh> use test2;
cqlsh:test2> update cas set c2 = 2 where c1 = 1 if c3 = 1;

 [applied]
-----------
      True

cqlsh:test2> tracing on;
Now tracing requests.
cqlsh:test2> update cas set c2 = 2 where c1 = 1 if c3 = 1;

 [applied]
-----------
      True

'NoneType' object has no attribute 'microseconds'
cqlsh:test2>
{noformat}

Tracing {{select *}} from the same table works as expected, but tracing the 
conditional update results in the error.
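The error message suggests cqlsh is calling `.microseconds` on a duration that is `None`, i.e. the trace row for the CAS path never got a completion time. A guarded formatter, sketched under that assumption (this is not the actual cqlsh code):

```python
from datetime import timedelta

def format_duration(duration):
    """cqlsh-style elapsed formatting; `duration` is a timedelta or None.
    Calling .microseconds on None is exactly the reported failure."""
    if duration is None:
        return "--"          # trace not (yet) complete
    total_us = (duration.days * 86_400_000_000
                + duration.seconds * 1_000_000
                + duration.microseconds)
    return "%d microseconds" % total_us
```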

More details:
{noformat}
cqlsh:test2> desc keyspace

CREATE KEYSPACE test2 WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '3'}  AND durable_writes = true;


CREATE TABLE test2.cas (
c1 int PRIMARY KEY,
c2 int,
c3 int
) WITH bloom_filter_fp_chance = 0.01
AND caching = '{keys:ALL, rows_per_partition:NONE}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32'}
AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';

cqlsh:test2> tracing on;
Tracing is already enabled. Use TRACING OFF to disable.
cqlsh:test2> select * from cas;

 c1 | c2 | c3
----+----+----
  1 |  2 |  1

(1 rows)


Tracing session: 8f0c8340-16ae-11e4-8ca3-fb429d8fb4a7

 activity                                                         | timestamp                  | source         | source_elapsed
------------------------------------------------------------------+----------------------------+----------------+----------------
 Execute CQL3 query                                               | 2014-07-28 16:26:10.804000 | 10.240.139.181 |              0
 Parsing select * from cas LIMIT 1; [SharedPool-Worker-1]         | 2014-07-28 16:26:10.804000 | 10.240.139.181 |             78
 Preparing statement [SharedPool-Worker-1]                        | 2014-07-28 16:26:10.804000 | 10.240.139.181 |            282
 Determining replicas to query [SharedPool-Worker-1]              | 2014-07-28 16:26:10.804000 | 10.240.139.181 |            501
 Enqueuing request to /10.240.189.138 [SharedPool-Worker-1]       | 2014-07-28 16:26:10.804000 | 10.240.139.181 |            918
 Sending message to /10.240.189.138 [WRITE-/10.240.189.138]       | 2014-07-28 16:26:10.805000 | 10.240.139.181 |           1095
 Message received from /10.240.139.181 [Thread-27]                | 2014-07-28 16:26:10.805000 | 10.240.189.138 |             28
 Executing seq scan across 0 sstables for [min(-9223372036854775808), max(-4611686018427387904)] [SharedPool-Worker-1] | 2014-07-28 16:26:10.805000 | 10.240.189.138 | 384
 Scanned 0 rows and matched 0 [SharedPool-Worker-1]               | 2014-07-28 16:26:10.805000 | 10.240.189.138 |            481
 Enqueuing response to /10.240.139.181 [SharedPool-Worker-1]      | 2014-07-28 16:26:10.805000 | 10.240.189.138 |            570
 Sending message to /10.240.139.181 [WRITE-/10.240.139.181]       | 2014-07-28 16:26:10.806000 | 10.240.189.138 |            735
 Message received from /10.240.189.138 [Thread-30]                | 2014-07-28 16:26:10.807000 | 10.240.139.181 |           3264
 Processing response from /10.240.189.138 [SharedPool-Worker-2]   | 2014-07-28 16:26:10.807000 | 10.240.139.181 |
{noformat}

[jira] [Commented] (CASSANDRA-7409) Allow multiple overlapping sstables in L1

2014-07-28 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077156#comment-14077156
 ] 

Carl Yeksigian commented on CASSANDRA-7409:
---

I have a first cut of this working now at 
https://github.com/carlyeks/cassandra/tree/overlapping

This adds a new compaction strategy called 'Overlapping', which operates mostly 
the same as 'Leveled' when max_overlapping_level is configured to 0, except L0 
does not do any STCS. When max_overlapping_level is set to non-zero, it will 
compact without selecting non-overlapping sstables, and will not include any 
sstables from an upper level.

I also added a new nodetool command to list the sstables in each level for both 
leveled and overlapping. 

I haven't benchmarked this strategy yet to compare with regular leveled; that's 
going to be what I work on next for this.

 Allow multiple overlapping sstables in L1
 -

 Key: CASSANDRA-7409
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7409
 Project: Cassandra
  Issue Type: Improvement
Reporter: Carl Yeksigian
Assignee: Carl Yeksigian

 Currently, when a normal L0 compaction takes place (not STCS), we take up to 
 MAX_COMPACTING_L0 L0 sstables and all of the overlapping L1 sstables and 
 compact them together. If we didn't have to deal with the overlapping L1 
 tables, we could compact a higher number of L0 sstables together into a set 
 of non-overlapping L1 sstables.
 This could be done by delaying the invariant that L1 has no overlapping 
 sstables. Going from L1 to L2, we would be compacting fewer sstables together 
 which overlap.
 When reading, we will not have the same one sstable per level (except L0) 
 guarantee, but this can be bounded (once we have too many sets of sstables, 
 either compact them back into the same level, or compact them up to the next 
 level).
 This could be generalized to allow any level to be the maximum for this 
 overlapping strategy.
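The overlap bookkeeping this strategy relies on reduces to interval intersection on each sstable's (first token, last token) range, sketched here with toy integer ranges rather than real partitioner tokens:

```python
def overlaps(a, b):
    """Two sstables overlap when their (first_token, last_token) ranges intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

def l1_overlapping(l0_tables, l1_tables):
    """All L1 sstables a classic leveled compaction would have to drag into an
    L0 compaction; allowing overlap in L1 lets the strategy skip them."""
    return [t for t in l1_tables if any(overlaps(t, s) for s in l0_tables)]
```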





[jira] [Commented] (CASSANDRA-7518) The In-Memory option

2014-07-28 Thread Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077227#comment-14077227
 ] 

Hanson commented on CASSANDRA-7518:
---

I do not have access to DataStax JIRA.

Another post of mine on Stack Overflow:
http://stackoverflow.com/questions/24719276/cassandra-in-memory-option

It mentions that the DataStax white paper asserts an upcoming version will 
increase the amount of memory available to a single node, probably via JNA, but 
it gives no solid timeline.


 The In-Memory option
 

 Key: CASSANDRA-7518
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7518
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Hanson
 Fix For: 2.1.0


 There is an In-Memory option introduced in the commercial version of 
 Cassandra by DataStax Enterprise 4.0:
 http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/inMemory.html
 But with 1GB size limited for an in-memory table.
 It would be great if the In-Memory option can be available to the community 
 version of Cassandra, and extend to a large size of in-memory table, such as 
 64GB.





[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077317#comment-14077317
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

I suppose we can compromise on enabling the check as soon as we remember the 
cfids, even though that leaves a hole where we can false-positive on upgrade.

How sure are you that the MalformedCommitLogException checks aren't going to 
false-positive on power failure?  On first inspection, all of those except the 
serializedSize check look like they will be prone to that.
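A tail check that does not false-positive on power failure pairs a length prefix with a per-record checksum, so a torn final record simply ends replay rather than raising. This is a generic sketch of the pattern, not Cassandra's actual commitlog format:

```python
import struct, zlib

def append_record(log: bytearray, payload: bytes) -> None:
    # Each record: 4-byte length, 4-byte CRC32 of the payload, then the payload.
    log += struct.pack(">II", len(payload), zlib.crc32(payload)) + payload

def replay(log: bytes):
    """Return the intact records; stop quietly at a torn/corrupt tail,
    which is what a power failure legitimately produces."""
    pos, out = 0, []
    while pos + 8 <= len(log):
        size, crc = struct.unpack_from(">II", log, pos)
        payload = log[pos + 8:pos + 8 + size]
        if len(payload) < size or zlib.crc32(payload) != crc:
            break  # torn write: not an error, just the end of the usable log
        out.append(payload)
        pos += 8 + size
    return out
```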

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.1


 Multi-dc upgrade [was working from 2.0 - 2.1 fairly 
 recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
  but is currently failing.
 Running 
 upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test
  I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay 
 failed due to replaying a mutation for a missing table. This error can be 
 ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on 
 the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.RuntimeException: 
 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
 cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't 
 find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) 
 ~[main/:na]
 {code}





[jira] [Resolved] (CASSANDRA-7634) cqlsh error tracing CAS

2014-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-7634.
---

Resolution: Duplicate


[jira] [Created] (CASSANDRA-7635) Make hinted_handoff_throttle_delay_in_ms configurable via nodetool

2014-07-28 Thread Matt Stump (JIRA)
Matt Stump created CASSANDRA-7635:
-

 Summary: Make hinted_handoff_throttle_delay_in_ms configurable via 
nodetool
 Key: CASSANDRA-7635
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7635
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matt Stump
Priority: Minor


Transfer of stored hints can peg the CPU of the node performing the sending of 
the hints. We have a throttle hinted_handoff_throttle_delay_in_ms, but it 
requires a restart. It would be helpful if this were configurable via nodetool 
to avoid the reboot.
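The nodetool route would presumably expose a live setter (via JMX in Cassandra's case). The shape of such a runtime-tunable throttle, with illustrative names rather than the real StorageService MBean:

```python
import threading

class HintThrottle:
    """A delay that can be retuned at runtime, so operators don't need a
    restart to slow hint delivery on a CPU-pegged sender."""
    def __init__(self, delay_ms: int):
        self._lock = threading.Lock()
        self._delay_ms = delay_ms

    def get_delay_ms(self) -> int:
        with self._lock:
            return self._delay_ms

    def set_delay_ms(self, delay_ms: int) -> None:  # what nodetool would call
        if delay_ms < 0:
            raise ValueError("delay must be non-negative")
        with self._lock:
            self._delay_ms = delay_ms
```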







[jira] [Assigned] (CASSANDRA-7635) Make hinted_handoff_throttle_delay_in_ms configurable via nodetool

2014-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-7635:
-

Assignee: Lyuben Todorov






[jira] [Updated] (CASSANDRA-7635) Make hinted_handoff_throttle_delay_in_ms configurable via nodetool

2014-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-7635:
--

  Component/s: Tools
Fix Version/s: 2.0.10






[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077359#comment-14077359
 ] 

T Jake Luciani commented on CASSANDRA-7631:
---

bq. As someone who has screwed up the stress options more than once, consider 
this my vote for 'complain' 

I also find the stress options/help unintuitive. I'd like to see if we can use 
airline to address this under a different ticket.



