[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-10-07 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787971#comment-13787971
 ] 

Sylvain Lebresne commented on CASSANDRA-6107:
-

CASSANDRA-5981 is indeed about limiting the size at the protocol level. However, 
it's a global frame limitation. In particular, this is the hard limit for 
queries together with their values, and for that reason the current hard-coded 
limit is relatively high (256MB). We can bikeshed on the exact default to use, 
and CASSANDRA-5981 will probably allow the user to tune that limit, but in any 
case it will definitely have to be higher than 1MB. The other detail is 
that the limit imposed by CASSANDRA-5981 is on the sent bytes, not the in-memory 
size of the query, but that probably doesn't matter much.
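
For illustration, a frame-size guard of the kind CASSANDRA-5981 discusses could look like the sketch below. Only the 256MB figure comes from the comment above; the class and method names are hypothetical, not Cassandra's actual code.

```java
// Hypothetical sketch of a per-frame size check at the protocol layer.
// The 256MB budget is the hard-coded limit mentioned in the discussion;
// everything else here is illustrative.
public class FrameSizeGuard {
    static final long MAX_FRAME_SIZE = 256L * 1024 * 1024;

    public static void checkBodyLength(long bodyLength) {
        if (bodyLength > MAX_FRAME_SIZE)
            throw new IllegalArgumentException(
                "Request frame of " + bodyLength + " bytes exceeds the "
                + MAX_FRAME_SIZE + " byte limit");
    }
}
```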

Anyway, provided that a prepared statement doesn't include values, it wouldn't 
be absurd to have a specific, lower limit on their size. Though my own 
preference would be to just have a global limit on the 
preparedStatements cache map (though it could make sense to reject statements 
that blow the entire limit on their own, so as to make sure we respect it). Too 
many hard-coded limitations make me nervous.
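
The cache-level idea above (a global size budget on the prepared statement map, with oversized statements rejected outright) might look roughly like the following. Everything here, including the byte accounting, is an illustrative sketch, not Cassandra's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative LRU cache with a total memory budget: statements are evicted
// least-recently-used first once the budget is exceeded, and any single
// statement larger than the whole budget is rejected outright.
public class BoundedStatementCache {
    private final long maxBytes;
    private long currentBytes = 0;
    private final LinkedHashMap<Integer, String> cache =
        new LinkedHashMap<>(16, 0.75f, true); // access-order => LRU iteration

    public BoundedStatementCache(long maxBytes) { this.maxBytes = maxBytes; }

    // Crude size estimate: 2 bytes per char of the CQL string.
    private static long sizeOf(String cql) { return 2L * cql.length(); }

    public synchronized boolean put(int id, String cql) {
        long size = sizeOf(cql);
        if (size > maxBytes)
            return false; // reject statements that blow the limit on their own
        while (currentBytes + size > maxBytes && !cache.isEmpty()) {
            Map.Entry<Integer, String> eldest = cache.entrySet().iterator().next();
            currentBytes -= sizeOf(eldest.getValue());
            cache.remove(eldest.getKey());
        }
        currentBytes += size;
        cache.put(id, cql);
        return true;
    }

    public synchronized String get(int id) { return cache.get(id); }
}
```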

 CQL3 Batch statement memory leak
 

 Key: CASSANDRA-6107
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6107
 Project: Cassandra
  Issue Type: Bug
  Components: API, Core
 Environment: - CASS version: 1.2.8 or 2.0.1, same issue seen in both
 - Running on OSX MacbookPro
 - Sun JVM 1.7
 - Single local cassandra node
 - both CMS and G1 GC used
 - we are using the cass-JDBC driver to submit our batches
Reporter: Constance Eustace
Assignee: Lyuben Todorov
Priority: Minor
 Fix For: 1.2.11

 Attachments: 6107.patch, 6107_v2.patch, 6107_v3.patch, 6107-v4.txt, 
 Screen Shot 2013-10-03 at 17.59.37.png


 We are doing large-volume insert/update tests on Cassandra via CQL3. 
 Using a 4GB heap, after roughly 750,000 updates creating/updating 75,000 row 
 keys, we run out of heap, it never dissipates, and we begin getting this 
 infamous error which many people seem to be encountering:
 WARN [ScheduledTasks:1] 2013-09-26 16:17:10,752 GCInspector.java (line 142) 
 Heap is 0.9383457210434385 full.  You may need to reduce memtable and/or 
 cache sizes.  Cassandra will now flush up to the two largest memtables to 
 free up memory.  Adjust flush_largest_memtables_at threshold in 
 cassandra.yaml if you don't want Cassandra to do this automatically
  INFO [ScheduledTasks:1] 2013-09-26 16:17:10,753 StorageService.java (line 
 3614) Unable to reduce heap usage since there are no dirty column families
 8 and 12 GB heaps appear to delay the problem by roughly proportionate 
 amounts: 75,000 - 100,000 row keys per 4GB. Each run of 50,000 row-key 
 creations sees the heap grow and never shrink again. 
 We have attempted, to no effect:
 - removing all secondary indexes to see if that alleviates overuse of bloom 
 filters 
 - adjusted parameters for compaction throughput
 - adjusted memtable flush thresholds and other parameters 
 By examining heap dumps, it seems apparent that the problem is perpetual 
 retention of CQL3 BATCH statements. We have even tried dropping the keyspaces 
 after the updates, and the CQL3 statements are still visible in the heap dump, 
 even after many, many CMS GC runs. G1 also showed this issue.
 The 750,000 statements are broken into batches of roughly 200 statements.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6055) 'Bad Request: Invalid null value for partition key part' on SELECT .. WHERE key IN (val,NULL)

2013-10-07 Thread Roman Skvazh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787994#comment-13787994
 ] 

Roman Skvazh commented on CASSANDRA-6055:
-

I agree with Sylvain Lebresne. CQL is not SQL, and there is no need for this 
hack (ignoring NULL values in queries).

 'Bad Request: Invalid null value for partition key part' on SELECT .. WHERE 
 key IN (val,NULL)
 -

 Key: CASSANDRA-6055
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6055
 Project: Cassandra
  Issue Type: Bug
 Environment: cqlsh, pdo_cassandra
Reporter: Sergey Nagaytsev
Priority: Minor
  Labels: cql3

 Query: SELECT * FROM user WHERE key IN(uuid,NULL);
 Table:
 CREATE COLUMNFAMILY user (
  KEY uuid PRIMARY KEY,
  name text,
  note text,
  avatar text,
  email text,
  phone text,
  login text,
  pw text,
  st text
 );
 Logs: Nothing, last message hours ago.
 This query is valid in SQL and is therefore generated by a DB abstraction 
 library. Fixing this on the application side multiplies the work.





[jira] [Commented] (CASSANDRA-4988) Fix concurrent addition of collection columns

2013-10-07 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788002#comment-13788002
 ] 

Sylvain Lebresne commented on CASSANDRA-4988:
-

Each separate collection already has its own entry in schema_columns, and 
we have all the information there, so I don't think we need a new table here. 
The information is already redundant. It's just that, because the comparator 
object needs to know the collections (to correctly implement 
AbstractType.compareCollectionMembers), we currently include the collection 
names in the comparator's serialized form. We should stop doing that, since we 
already have all the information we need.

In fact, in 2.0 the 'comparator' field in schema_columnfamilies is entirely 
redundant: all the information it contains can be reconstructed from 
schema_columns. So probably the right solution is to stop saving that field at 
all, and to reconstruct it from schema_columns instead. That would also solve 
the other problem I've discussed above, the concurrent modification of 
comparator components.

Of course, we'd need to be careful with backward compatibility if we do so.

 Fix concurrent addition of collection columns
 -

 Key: CASSANDRA-4988
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4988
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 2.0.2


 It is currently not safe to update the schema by adding multiple collection 
 columns to the same table. The reason is that with collections, the 
 comparator embeds a map of name->comparator for each collection column 
 (since different maps can have different key types, for example). And when 
 serialized on disk in the schema table, the comparator is serialized as a 
 string with that map as one column. So if new collection columns are added 
 concurrently, the additions may not be merged correctly.
 One option to fix this would be to stop serializing the name->comparator map 
 of ColumnToCollectionType in toString(), and do one of:
 # reconstruct that map from the information stored in schema_columns. The 
 downside I can see is that, code-wise, this may not be super clean to do.
 # change ColumnToCollectionType so that instead of having its own 
 name->comparator map, it just stores a pointer to the CFMetaData that contains 
 it, and when it needs to find the exact comparator for a collection column, it 
 would use CFMetaData.column_metadata directly. The downside is that creating 
 a dependency from a comparator to a CFMetaData feels a bit backwards.
 Not sure what's the best solution of the two, honestly.
 While probably more anecdotal, we also now allow changing the type of the 
 comparator in some cases (for example, updating to BytesType is always 
 allowed), and doing so concurrently on multiple components of a composite 
 comparator is also not safe, for a similar reason. I'm not sure how to fix 
 that one.





[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788015#comment-13788015
 ] 

Lyuben Todorov commented on CASSANDRA-4809:
---

The patch works as intended. To specify a particular ks/cf to be restored, we 
need to add parameters when starting Cassandra.

{code}
./cassandra -Dcassandra.readKeyspaceCommitlog=exampleKS 
-Dcassandra.readColumnFamilyCommitlog=exampleCF
{code}

Maybe it would be a better option to add restore_keyspace and 
restore_columnfamily config options to commitlog_archiving.properties. If the 
original setup is preferable I'll rebase the patch (it doesn't apply cleanly 
because of other changes on the branch).

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Comment Edited] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788015#comment-13788015
 ] 

Lyuben Todorov edited comment on CASSANDRA-4809 at 10/7/13 9:54 AM:


The patch works as intended. To specify a particular ks/cf to be restored, we 
need to add parameters when starting Cassandra.

{code}
./cassandra -Dcassandra.readKeyspaceCommitlog=exampleKS 
-Dcassandra.readColumnFamilyCommitlog=exampleCF
{code}

Maybe it would be a better option to add restore_keyspace and 
restore_columnfamily config options to commitlog_archiving.properties. If the 
original setup is preferable I'll rebase the patch (it doesn't apply cleanly 
because of other changes on the branch).

The last thing I want to change: the patch assumes that users will want to 
replay only part of the commitlog, but we should still allow them to replay the 
entire log if the ks/cf parameters aren't supplied.


was (Author: lyubent):
The patch works as intended. To specify a particular ks/cf to be restored, we 
need to add parameters when starting Cassandra.

{code}
./cassandra -Dcassandra.readKeyspaceCommitlog=exampleKS 
-Dcassandra.readColumnFamilyCommitlog=exampleCF
{code}

Maybe it would be a better option to add restore_keyspace and 
restore_columnfamily config options to commitlog_archiving.properties. If the 
original setup is preferable I'll rebase the patch (it doesn't apply cleanly 
because of other changes on the branch).

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Comment Edited] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788015#comment-13788015
 ] 

Lyuben Todorov edited comment on CASSANDRA-4809 at 10/7/13 9:58 AM:


The patch works as intended. To specify a particular ks/cf to be restored, we 
need to add parameters when starting Cassandra.

{code}
./cassandra -Dcassandra.readKeyspaceCommitlog=exampleKS 
-Dcassandra.readColumnFamilyCommitlog=exampleCF
{code}

Maybe it would be a better option to add restore_keyspace and 
restore_columnfamily config options to commitlog_archiving.properties. If the 
original setup is preferable I'll rebase the patch (it doesn't apply cleanly 
because of other changes on the branch).


was (Author: lyubent):
The patch works as intended. To specify a particular ks/cf to be restored, we 
need to add parameters when starting Cassandra.

{code}
./cassandra -Dcassandra.readKeyspaceCommitlog=exampleKS 
-Dcassandra.readColumnFamilyCommitlog=exampleCF
{code}

Maybe it would be a better option to add restore_keyspace and 
restore_columnfamily config options to commitlog_archiving.properties. If the 
original setup is preferable I'll rebase the patch (it doesn't apply cleanly 
because of other changes on the branch).

The last thing I want to change: the patch assumes that users will want to 
replay only part of the commitlog, but we should still allow them to replay the 
entire log if the ks/cf parameters aren't supplied.

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Assigned] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

2013-10-07 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown reassigned CASSANDRA-4718:
--

Assignee: Jason Brown

 More-efficient ExecutorService for improved throughput
 --

 Key: CASSANDRA-4718
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
Priority: Minor
  Labels: performance
 Attachments: baq vs trunk.png, op costs of various queues.ods, 
 PerThreadQueue.java


 Currently all our execution stages dequeue tasks one at a time.  This can 
 result in contention between producers and consumers (although we do our best 
 to minimize this by using LinkedBlockingQueue).
 One approach to mitigating this would be to make consumer threads do more 
 work in bulk instead of just one task per dequeue.  (Producer threads tend 
 to be single-task oriented by nature, so I don't see an equivalent 
 opportunity there.)
 BlockingQueue has a drainTo(collection, int) method that would be perfect for 
 this.  However, no ExecutorService in the jdk supports using drainTo, nor 
 could I google one.
 What I would like to do here is create just such a beast and wire it into (at 
 least) the write and read stages.  (Other possible candidates for such an 
 optimization, such as the CommitLog and OutboundTCPConnection, are not 
 ExecutorService-based and will need to be one-offs.)
 AbstractExecutorService may be useful.  The implementations of 
 ICommitLogExecutorService may also be useful. (Despite the name these are not 
 actual ExecutorServices, although they share the most important properties of 
 one.)





[jira] [Commented] (CASSANDRA-4988) Fix concurrent addition of collection columns

2013-10-07 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788144#comment-13788144
 ] 

Aleksey Yeschenko commented on CASSANDRA-4988:
--

IMO, this is what we should do (get rid of the comparator column in 
schema_columnfamilies). And get rid of the rest of the redundancies:

- column_aliases, key_aliases, value_alias columns
- CASSANDRA-4603

And after that implement CASSANDRA-6038 in 3.0, with all the garbage cleaned up.

 Fix concurrent addition of collection columns
 -

 Key: CASSANDRA-4988
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4988
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 2.0.2


 It is currently not safe to update the schema by adding multiple collection 
 columns to the same table. The reason is that with collections, the 
 comparator embeds a map of name->comparator for each collection column 
 (since different maps can have different key types, for example). And when 
 serialized on disk in the schema table, the comparator is serialized as a 
 string with that map as one column. So if new collection columns are added 
 concurrently, the additions may not be merged correctly.
 One option to fix this would be to stop serializing the name->comparator map 
 of ColumnToCollectionType in toString(), and do one of:
 # reconstruct that map from the information stored in schema_columns. The 
 downside I can see is that, code-wise, this may not be super clean to do.
 # change ColumnToCollectionType so that instead of having its own 
 name->comparator map, it just stores a pointer to the CFMetaData that contains 
 it, and when it needs to find the exact comparator for a collection column, it 
 would use CFMetaData.column_metadata directly. The downside is that creating 
 a dependency from a comparator to a CFMetaData feels a bit backwards.
 Not sure what's the best solution of the two, honestly.
 While probably more anecdotal, we also now allow changing the type of the 
 comparator in some cases (for example, updating to BytesType is always 
 allowed), and doing so concurrently on multiple components of a composite 
 comparator is also not safe, for a similar reason. I'm not sure how to fix 
 that one.





[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Mike Bulman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788157#comment-13788157
 ] 

Mike Bulman commented on CASSANDRA-4809:


Fwiw, it would be great if multiple keyspaces and/or column families could be 
specified.  Requiring a restart of c* for each CF is... tedious.  If I wanted to 
restore multiple column families with the current patch, is there a way to know 
that the restore is done?  (e.g., once thrift is available?)

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Comment Edited] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788188#comment-13788188
 ] 

Lyuben Todorov edited comment on CASSANDRA-4809 at 10/7/13 2:35 PM:


bq.  is there a way to know that the restore is done?
Once the replays complete you should get a message: *INFO 17:31:09,085 Log 
replay complete, X replayed mutations* and this does occur before thrift starts.

As for multiple keyspaces/CFs, we could do what we did in 
[4191|https://issues.apache.org/jira/browse/CASSANDRA-4191], where a list of 
ks1.cf1 ks2.cf1 is supplied.


was (Author: lyubent):
bq.  is there a way to know that the restore is done?
Once the replays complete you should get a message: *INFO 17:31:09,085 Log 
replay complete, 0 replayed mutations*

As for multiple keyspaces/CFs, we could do what we did in 
[4191|https://issues.apache.org/jira/browse/CASSANDRA-4191], where a list of 
ks1.cf1 ks2.cf1 is supplied.

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788188#comment-13788188
 ] 

Lyuben Todorov commented on CASSANDRA-4809:
---

bq.  is there a way to know that the restore is done?
Once the replays complete you should get a message: *INFO 17:31:09,085 Log 
replay complete, 0 replayed mutations*

As for multiple keyspaces/CFs, we could do what we did in 
[4191|https://issues.apache.org/jira/browse/CASSANDRA-4191], where a list of 
ks1.cf1 ks2.cf1 is supplied.
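
A sketch of how such a space-separated "ks1.cf1 ks2.cf1" list might be parsed into a replay filter; the class and method names are hypothetical, not the committed code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative parser for a CASSANDRA-4191-style list of ks.cf pairs.
public class ReplayFilter {
    // Maps keyspace -> set of column families to replay; an empty map
    // means no filter was supplied, i.e. replay the entire log.
    public static Map<String, Set<String>> parse(String arg) {
        Map<String, Set<String>> filter = new HashMap<>();
        if (arg == null || arg.trim().isEmpty())
            return filter;
        for (String pair : arg.trim().split("\\s+")) {
            String[] parts = pair.split("\\.", 2);
            if (parts.length != 2)
                throw new IllegalArgumentException("expected ks.cf, got: " + pair);
            filter.computeIfAbsent(parts[0], k -> new HashSet<>()).add(parts[1]);
        }
        return filter;
    }
}
```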

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788199#comment-13788199
 ] 

Jonathan Ellis commented on CASSANDRA-4809:
---

The -D solution is a bit clunky, but putting restore properties in the archive 
config file feels backwards too.  And -D is maybe a bit less likely to be left 
enabled for multiple restarts by mistake, so let's go with that.

List approach is fine.

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Mike Bulman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788210#comment-13788210
 ] 

Mike Bulman commented on CASSANDRA-4809:


bq. and this does occur before thrift starts.

Cool.  Waiting for thrift to be available wfm

bq. List approach is fine.

+1

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Commented] (CASSANDRA-6128) Add more data mappings for Pig

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788217#comment-13788217
 ] 

Brandon Williams commented on CASSANDRA-6128:
-

Shouldn't DecimalType map to a float or double instead of a string?
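
The trade-off behind this question can be shown with plain BigDecimal (this is not Cassandra or Pig code): a DecimalType value is an arbitrary-precision decimal, and mapping it to double can silently lose precision, while a string preserves the exact value at the cost of numeric semantics in Pig scripts.

```java
import java.math.BigDecimal;

// Illustration only: check whether a decimal value survives a round trip
// through double without losing precision.
public class DecimalMappingDemo {
    public static boolean survivesDoubleRoundTrip(BigDecimal value) {
        // new BigDecimal(double) captures the double's exact binary value,
        // so any difference from the original means precision was lost.
        return new BigDecimal(value.doubleValue()).compareTo(value) == 0;
    }
}
```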

 Add more data mappings for Pig
 --

 Key: CASSANDRA-6128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128
 Project: Cassandra
  Issue Type: Bug
Reporter: Alex Liu
Assignee: Alex Liu
 Attachments: 6128-1.2-branch.txt


 We need to add more data mappings for:
 {code}
  DecimalType
  InetAddressType
  LexicalUUIDType
  TimeUUIDType
  UUIDType
 {code}
 The existing implementation throws an exception for these data types.





[jira] [Resolved] (CASSANDRA-5905) Cassandra crashes on restart with IndexOutOfBoundsException: index (2) must be less than size (2)

2013-10-07 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-5905.
---

   Resolution: Duplicate
Fix Version/s: (was: 1.2.11)

As near as I can tell, this is another manifestation of the same problem as 
CASSANDRA-5202: since we reuse CFIDs for different incarnations of CFs with 
the same name, the commitlog will try to apply mutations from an earlier 
definition to whatever the current one is.


 Cassandra crashes on restart with IndexOutOfBoundsException: index (2) must 
 be less than size (2)
 ---

 Key: CASSANDRA-5905
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5905
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu 12.04 x64
 3 nodes, 1GB each
Reporter: David Semeria

 All 3 nodes crash on restart with same error:
 INFO 13:31:05,272 Finished reading 
 /var/lib/cassandra/commitlog/CommitLog-2-1376754824649.log
  INFO 13:31:05,272 Replaying 
 /var/lib/cassandra/commitlog/CommitLog-2-1376754824650.log
  INFO 13:31:08,267 Finished reading 
 /var/lib/cassandra/commitlog/CommitLog-2-1376754824650.log
 java.lang.IndexOutOfBoundsException: index (2) must be less than size (2)
   at 
 com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305)
   at 
 com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284)
   at 
 com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:81)
   at 
 org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:94)
   at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:76)
   at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
   at java.util.TreeMap.compare(TreeMap.java:1188)
   at java.util.TreeMap.put(TreeMap.java:531)
   at 
 org.apache.cassandra.db.TreeMapBackedSortedColumns.addColumn(TreeMapBackedSortedColumns.java:102)
   at 
 org.apache.cassandra.db.TreeMapBackedSortedColumns.addColumn(TreeMapBackedSortedColumns.java:88)
   at 
 org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:114)
   at 
 org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:109)
   at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:101)
   at 
 org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:376)
   at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:203)
   at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
   at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146)
   at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:281)
   at 
 org.apache.cassandra.service.CassandraDaemon.init(CassandraDaemon.java:358)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at 
 org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:212)
 Cannot load daemon
 Service exit with a return value of 3





[jira] [Commented] (CASSANDRA-5815) NPE from migration manager

2013-10-07 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788228#comment-13788228
 ] 

Chris Burroughs commented on CASSANDRA-5815:


I'm seeing an NPE in MigrationManager in 1.2.9, at what I think is the same 
spot (line numbers have changed slightly since July).  This occurs on at least 
one node every time (about 10 attempts) I try to bootstrap into a 2-DC 
production cluster using GPFS with reconnecting.

{noformat}
ERROR [OptionalTasks:1] 2013-10-07 08:06:05,658 CassandraDaemon.java (line 194) 
Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.NullPointerException
at 
org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:130)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{noformat}

I added a log message to confirm that Gossiper really, truly thinks the 
endpoint is not there (off of the 1.2.10 tag, if that matters).  I'm suspicious 
of this being a timing problem in the reconnect dance, but I'm not sure how to 
prove or disprove that.

{noformat}
logger.warn("[csb] Trying to get endpoint state for {} ; exists {}",
            new Object[] {endpoint, Gossiper.instance.isKnownEndpoint(endpoint)});

 INFO [GossipTasks:1] 2013-10-07 11:19:10,565 Gossiper.java (line 803) 
InetAddress /208.49.103.36 is now DOWN
 INFO [GossipTasks:1] 2013-10-07 11:19:13,572 Gossiper.java (line 608) 
FatClient /208.49.103.36 has been silent for 3ms, removing from gossip
 INFO [HANDSHAKE-/208.49.103.36] 2013-10-07 11:19:13,863 
OutboundTcpConnection.java (line 399) Handshaking version with /208.49.103.36
 INFO [HANDSHAKE-/208.49.103.36] 2013-10-07 11:19:15,275 
OutboundTcpConnection.java (line 399) Handshaking version with /208.49.103.36
 WARN [OptionalTasks:1] 2013-10-07 11:19:36,696 MigrationManager.java (line 
130) [csb] Trying to get endpoint state for /208.49.103.36 ; exists false
ERROR [OptionalTasks:1] 2013-10-07 11:19:36,696 CassandraDaemon.java (line 193) 
Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.NullPointerException
at 
org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:131)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{noformat}
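The NPE above fires because the gossiper has already dropped the endpoint by the time the scheduled task runs. A minimal guard would look something like the following; this is a Python sketch of the Java logic, and the function and parameter names are assumptions, not Cassandra's actual MigrationManager API:

```python
def maybe_schedule_schema_pull(endpoint, get_endpoint_state, submit_pull):
    # Guarded version of the check that NPEs above: look up the gossip
    # state first and bail out quietly if the endpoint is gone.
    state = get_endpoint_state(endpoint)
    if state is None:
        # The fat client was removed from gossip between scheduling
        # and execution of the task; nothing left to do.
        return False
    submit_pull(endpoint, state)
    return True
```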

 NPE from migration manager
 --

 Key: CASSANDRA-5815
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5815
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.12
Reporter: Vishy Kasar
Assignee: Brandon Williams
Priority: Minor

 In one of our production clusters we see this error often. Looking through 
 the source, Gossiper.instance.getEndpointStateForEndpoint(endpoint) is 
 returning null for some endpoint. Do we need any config change on our end to 
 resolve this? In any case, cassandra should be updated to protect against 
 this NPE.
 ERROR [OptionalTasks:1] 2013-07-24 13:40:38,972 AbstractCassandraDaemon.java 
 (line 132) Exception in thread Thread[OptionalTasks:1,5,main] 
 java.lang.NullPointerException 
 at 
 org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:134)
  
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) 
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) 
 at java.util.concurrent.FutureTask.run(FutureTask.java:138) 
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
  
 at 
 

[jira] [Assigned] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated

2013-10-07 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-6151:
---

Assignee: Alex Liu

 CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
 

 Key: CASSANDRA-6151
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6151
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Russell Alexander Spitzer
Assignee: Alex Liu
Priority: Minor

 From 
 http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig/19211546#19211546
 The user was attempting to load a single partition using a where clause in a 
 pig load statement. 
 CQL Table
 {code}
 CREATE table data (
   occurday  text,
   seqnumber int,
   occurtimems bigint,
   unique bigint,
   fields map<text, text>,
   primary key ((occurday, seqnumber), occurtimems, unique)
 )
 {code}
 Pig Load statement Query
 {code}
 data = LOAD 
 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27'
  USING CqlStorage();
 {code}
 This results in an exception when processed by the CqlPagingRecordReader 
 which attempts to page this query even though it contains at most one 
 partition key. This leads to an invalid CQL statement. 
 CqlPagingRecordReader Query
 {code}
 SELECT * FROM data WHERE token(occurday,seqnumber) > ? AND
 token(occurday,seqnumber) <= ? AND occurday='A Great Day' 
 AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
 {code}
 Exception
 {code}
  InvalidRequestException(why:occurday cannot be restricted by more than one 
 relation if it includes an Equal)
 {code}
 I'm not sure it is worth the special case but, a modification to not use the 
 paging record reader when the entire partition key is specified would solve 
 this issue. 
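The fix suggested at the end of the description hinges on detecting when the entire partition key is pinned by equality relations. A sketch of that check (illustrative only; the real fix would inspect the parsed CQL clause, and this simple regex over the decoded where_clause is an assumption that ignores quoting and would also match `<=`/`>=`):

```python
import re

def where_clause_covers_partition_key(where_clause, partition_key_columns):
    # Collect column names that appear on the left of an '=' relation;
    # if every partition-key column is pinned, at most one partition is
    # addressed and the paging record reader can be skipped.
    pinned = set(re.findall(r"(\w+)\s*=", where_clause))
    return all(col in pinned for col in partition_key_columns)
```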



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788231#comment-13788231
 ] 

Brandon Williams commented on CASSANDRA-6151:
-

To be fair, I'm not sure why you'd use M/R for a single partition, but I'll let 
Alex decide what to do based on difficulty here.






[jira] [Commented] (CASSANDRA-6137) CQL3 SELECT IN CLAUSE inconsistent

2013-10-07 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788238#comment-13788238
 ] 

Constance Eustace commented on CASSANDRA-6137:
--

The error is reappearing.

Triggering nodetool repairs, compactions, etc. didn't do anything.

Perhaps this is yet another caching error, this time in the key caches. We're 
going to try some restarts of the node or schemes to clean the caches to see if 
this is an actual internal data representation problem.
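One way to narrow down whether this is a query-path bug or bad stored data is to compare the IN query against the union of the equivalent single-key queries, which per the report should match. A driver-agnostic sketch (`run_query` is an assumed helper that executes the statement and returns the set of column keys in the result):

```python
def in_clause_matches_union(run_query, row_key, column_keys):
    # The IN query should return exactly the union of the per-key
    # queries; a mismatch points at the query path rather than the
    # underlying data.
    union = set()
    for key in column_keys:
        union |= run_query(row_key, [key])
    return run_query(row_key, list(column_keys)) == union
```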

 CQL3 SELECT IN CLAUSE inconsistent
 --

 Key: CASSANDRA-6137
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6137
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu AWS Cassandra 2.0.1 SINGLE NODE
Reporter: Constance Eustace
 Fix For: 2.0.1


 We are encountering inconsistent results from CQL3 queries with column keys 
 using IN clause in WHERE. This has been reproduced in cqlsh.
 Rowkey is e_entid
 Column key is p_prop
 This returns roughly 21 rows for 21 column keys that match p_prop.
 cqlsh> SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';
 These three queries each return one row for the requested single column key 
 in the IN clause:
 SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:complete:count');
 SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:all:count');
 SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:fail:count');
 This query returns ONLY ONE ROW (one column key), not three as I would expect 
 from the three-column-key IN clause:
 cqlsh> SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
 This query does return two rows however for the requested two column keys:
 cqlsh> SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in (  
   
 'urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
 cqlsh> describe table internal_submission.entity_job;
 CREATE TABLE entity_job (
   e_entid text,
   p_prop text,
   describes text,
   dndcondition text,
   e_entlinks text,
   e_entname text,
   e_enttype text,
   ingeststatus text,
   ingeststatusdetail text,
   p_flags text,
   p_propid text,
   p_proplinks text,
   p_storage text,
   p_subents text,
   p_val text,
   p_vallang text,
   p_vallinks text,
   p_valtype text,
   p_valunit text,
   p_vars text,
   partnerid text,
   referenceid text,
   size int,
   sourceip text,
   submitdate bigint,
   submitevent text,
   userid text,
   version text,
   PRIMARY KEY (e_entid, p_prop)
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='NONE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'LZ4Compressor'};
 CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes);
 CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job 
 (dndcondition);
 CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job 
 (ingeststatus);
 CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job 
 (ingeststatusdetail);
 CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job 
 (referenceid);
 CREATE INDEX 

[jira] [Commented] (CASSANDRA-5815) NPE from migration manager

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788237#comment-13788237
 ] 

Brandon Williams commented on CASSANDRA-5815:
-

It looks the same to me.  The good news is that the error is purely cosmetic at 
this point: there's nothing left to do if the gossiper has removed the node (not 
to mention it's a fat client).

 NPE from migration manager
 --

 Key: CASSANDRA-5815
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5815
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.12
Reporter: Vishy Kasar
Assignee: Brandon Williams
Priority: Minor

 In one of our production clusters we see this error often. Looking through 
 the source, Gossiper.instance.getEndpointStateForEndpoint(endpoint) is 
 returning null for some endpoint. Do we need any config change on our end to 
 resolve this? In any case, cassandra should be updated to protect against 
 this NPE.
 ERROR [OptionalTasks:1] 2013-07-24 13:40:38,972 AbstractCassandraDaemon.java 
 (line 132) Exception in thread Thread[OptionalTasks:1,5,main] 
 java.lang.NullPointerException 
 at 
 org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:134)
  
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) 
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) 
 at java.util.concurrent.FutureTask.run(FutureTask.java:138) 
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
  
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
  
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  
 at java.lang.Thread.run(Thread.java:662)
 It turned out that the reason for NPE was we bootstrapped a node with the 
 same token as another node. Cassandra should not throw an NPE here but log a 
 meaningful error message. 





[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

2013-10-07 Thread darion yaphets (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788243#comment-13788243
 ] 

darion yaphets commented on CASSANDRA-4718:
---

LMAX Disruptor's RingBuffer may be a good fit for a lock-free component, but the 
ring would probably need a larger size so that entries are not overwritten by 
new ones before they are consumed, which in turn means using more memory...

 More-efficient ExecutorService for improved throughput
 --

 Key: CASSANDRA-4718
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
Priority: Minor
  Labels: performance
 Attachments: baq vs trunk.png, op costs of various queues.ods, 
 PerThreadQueue.java


 Currently all our execution stages dequeue tasks one at a time.  This can 
 result in contention between producers and consumers (although we do our best 
 to minimize this by using LinkedBlockingQueue).
 One approach to mitigating this would be to make consumer threads do more 
 work in bulk instead of just one task per dequeue.  (Producer threads tend 
 to be single-task oriented by nature, so I don't see an equivalent 
 opportunity there.)
 BlockingQueue has a drainTo(collection, int) method that would be perfect for 
 this.  However, no ExecutorService in the jdk supports using drainTo, nor 
 could I google one.
 What I would like to do here is create just such a beast and wire it into (at 
 least) the write and read stages.  (Other possible candidates for such an 
 optimization, such as the CommitLog and OutboundTCPConnection, are not 
 ExecutorService-based and will need to be one-offs.)
 AbstractExecutorService may be useful.  The implementations of 
 ICommitLogExecutorService may also be useful. (Despite the name these are not 
 actual ExecutorServices, although they share the most important properties of 
 one.)
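The drainTo idea above can be approximated in a few lines: block for one task, then opportunistically drain a bounded batch without blocking, so consumers contend on the queue once per batch instead of once per task. A sketch of the pattern (Python's queue has no drainTo, so `get_nowait` stands in for it, and the None sentinel for shutdown is an assumption):

```python
import queue

def drain_consumer(q, handle, batch_size=64):
    # Block for the first task, then drain up to batch_size - 1 more
    # without blocking, amortizing queue contention across the batch.
    while True:
        batch = [q.get()]
        try:
            while len(batch) < batch_size:
                batch.append(q.get_nowait())
        except queue.Empty:
            pass
        for task in batch:
            if task is None:  # sentinel: stop consuming
                return
            handle(task)
```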





[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated

2013-10-07 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788245#comment-13788245
 ] 

Russell Alexander Spitzer commented on CASSANDRA-6151:
--

[~brandon.williams], I agree, but we should at least have a better error 
message then.






[jira] [Updated] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-10-07 Thread J.B. Langston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

J.B. Langston updated CASSANDRA-5911:
-

Attachment: 6528_140171_knwmuqxe9bjv5re_system.log

Attached system.log showing commitlog replay.  This was produced by running 
stress against a single-node cassandra cluster, then running drain and 
restarting.

 Commit logs are not removed after nodetool flush or nodetool drain
 --

 Key: CASSANDRA-5911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: J.B. Langston
Assignee: Vijay
Priority: Minor
 Fix For: 2.0.2

 Attachments: 6528_140171_knwmuqxe9bjv5re_system.log


 Commit logs are not removed after nodetool flush or nodetool drain. This can 
 lead to unnecessary commit log replay during startup.  I've reproduced this 
 on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
 Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
 which can make startup take a long time (on the order of 20-30 min).
 Reproduction follows:
 {code}
 jblangston:bin jblangston$ ./cassandra > /dev/null
 jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool flush
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool drain
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ pkill java
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ ./cassandra -f | grep Replaying
  INFO 10:03:42,915 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
  INFO 10:03:42,922 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log
  INFO 10:03:43,912 Replaying 
 

[jira] [Commented] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-10-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788250#comment-13788250
 ] 

Jonathan Ellis commented on CASSANDRA-5911:
---

{noformat}
DEBUG [main] 2013-08-21 10:39:36,311 CommitLogReplayer.java (line 150) Reading 
mutation at 0
DEBUG [main] 2013-08-21 10:39:36,328 CommitLogReplayer.java (line 150) Reading 
mutation at 336
DEBUG [main] 2013-08-21 10:39:36,328 CommitLogReplayer.java (line 150) Reading 
mutation at 672
{noformat}

This is what concerns me; I would expect it to start scanning w/ the most 
recent flush point, which should be the same as the end of the commitlog after 
drain.
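The expectation above amounts to a filter on replay positions: everything at or before the flushed position should be skipped, so after a clean drain nothing remains. A simplified sketch (the integer offsets mirror the "Reading mutation at N" lines; Cassandra's actual ReplayPosition is a (segment, offset) pair, so this single-segment view is an assumption):

```python
def mutations_to_replay(mutations, flushed_position):
    # mutations: list of (offset, payload) pairs from one segment.
    # Only offsets strictly after the flush point need replaying;
    # after a clean drain this list should be empty.
    return [(pos, m) for pos, m in mutations if pos > flushed_position]
```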


[jira] [Commented] (CASSANDRA-6124) Ability to specify a DC to consume from when using ColumnFamilyInputFormat externally

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788254#comment-13788254
 ] 

Brandon Williams commented on CASSANDRA-6124:
-

Worth noting this is another case where a CL.LOCALONE would be useful (cc 
[~jjordan])

 Ability to specify a DC to consume from when using ColumnFamilyInputFormat 
 externally
 -

 Key: CASSANDRA-6124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6124
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Patricio Echague
Priority: Minor
  Labels: hadoop
 Fix For: 1.2.11

 Attachments: CASSANDRA-6124.diff, CASSANDRA-6124-v2.diff


 Our production environment looks like this:
 - 6 cassandra nodes (online DC)
 - 3 cassandra nodes (offline DC)
 - Hadoop cluster.
 we are interested in connecting to the offline DC from hadoop (not colocated 
 with cassandra offline dc)
 I've tested this patch and it seems to work with our 1.2.5 deployment.
 Kindly review.





[jira] [Commented] (CASSANDRA-6124) Ability to specify a DC to consume from when using ColumnFamilyInputFormat externally

2013-10-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788259#comment-13788259
 ] 

Jonathan Ellis commented on CASSANDRA-6124:
---

I'd favor LOCAL_ONE as a slightly more generally useful solution.






[jira] [Commented] (CASSANDRA-6124) Ability to specify a DC to consume from when using ColumnFamilyInputFormat externally

2013-10-07 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788265#comment-13788265
 ] 

Jeremiah Jordan commented on CASSANDRA-6124:


+1 to LOCAL_ONE, I still think it is kind of silly, but I have been thinking of 
more and more fairly valid use cases for it...






[jira] [Commented] (CASSANDRA-5815) NPE from migration manager

2013-10-07 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788284#comment-13788284
 ] 

Chris Burroughs commented on CASSANDRA-5815:


Whoops, missed the important part for the case I am seeing but might not be 
part of the original (bootstrapping with the same token would presumably fail 
anyway).  The situation I am seeing post NPE is:
 * Bootstrapping node expects streams from NPE-node
 * NPE-node says it has no outstanding streams

And thus bootstrap never completes.

 NPE from migration manager
 --

 Key: CASSANDRA-5815
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5815
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.12
Reporter: Vishy Kasar
Assignee: Brandon Williams
Priority: Minor

 In one of our production clusters we see this error often. Looking through 
 the source, Gossiper.instance.getEndpointStateForEndpoint(endpoint) is 
 returning null for some endpoint. Do we need any config change on our end to 
 resolve this? In any case, Cassandra should be updated to protect against 
 this NPE.
 ERROR [OptionalTasks:1] 2013-07-24 13:40:38,972 AbstractCassandraDaemon.java 
 (line 132) Exception in thread Thread[OptionalTasks:1,5,main] 
 java.lang.NullPointerException 
 at 
 org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:134)
  
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) 
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) 
 at java.util.concurrent.FutureTask.run(FutureTask.java:138) 
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
  
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
  
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  
 at java.lang.Thread.run(Thread.java:662)
 It turned out that the reason for NPE was we bootstrapped a node with the 
 same token as another node. Cassandra should not throw an NPE here but log a 
 meaningful error message. 
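The defensive fix being asked for can be sketched language-agnostically. Below is a minimal Python sketch (hypothetical names, not Cassandra's actual Java code) of replacing the NPE with a meaningful log message when gossip has no state for an endpoint:

```python
def schedule_schema_pull(endpoint, endpoint_states, log):
    """Model of MigrationManager's check: look up gossip state before using it."""
    state = endpoint_states.get(endpoint)  # like Gossiper.getEndpointStateForEndpoint
    if state is None:
        # Instead of dereferencing null and raising, record an actionable message.
        log.append("No gossip state for %s; was a node bootstrapped with a "
                   "duplicate token?" % endpoint)
        return False
    return True
```

With a guard like this, the duplicate-token scenario produces a log line rather than an unhandled NullPointerException on the OptionalTasks thread.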



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6153) Stress stopped calculating latency stats

2013-10-07 Thread Ryan McGuire (JIRA)
Ryan McGuire created CASSANDRA-6153:
---

 Summary: Stress stopped calculating latency stats
 Key: CASSANDRA-6153
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6153
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire


In trunk, cassandra-stress has stopped calculating all latency information:

From trunk:
{code}
$ ccm node1 stress
Created keyspaces. Sleeping 1s for propagation.
total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
89995,8999,8999,0.0,0.0,0.0,10
304267,21427,21427,0.0,0.0,0.0,20
514791,21052,21052,0.0,0.0,0.0,30
727471,21268,21268,0.0,0.0,0.0,40
926467,19899,19899,0.0,0.0,0.0,50
100,7353,7353,0.0,0.0,0.0,54


Averages from the middle 80% of values:
interval_op_rate  : 21249
interval_key_rate : 21249
latency median: 0.0
latency 95th percentile   : 0.0
latency 99.9th percentile : 0.0
Total operation time  : 00:00:54
END
{code}

From 2.0:
{code}
$ ccm node1 stress
Created keyspaces. Sleeping 1s for propagation.
total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
66720,6672,6672,0.2,25.6,201.6,10
289577,22285,22285,0.2,3.4,201.1,20
489105,19952,19952,0.2,1.8,201.2,30
660916,17181,17181,0.2,1.6,87.9,40
847452,18653,18653,0.2,1.6,108.8,50
100,15254,15254,0.2,1.6,108.9,59


Averages from the middle 80% of values:
interval_op_rate  : 19517
interval_key_rate : 19517
latency median: 0.2
latency 95th percentile   : 2.1
latency 99.9th percentile : 149.8
Total operation time  : 00:00:59
END
{code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6137) CQL3 SELECT IN CLAUSE inconsistent

2013-10-07 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788294#comment-13788294
 ] 

Constance Eustace commented on CASSANDRA-6137:
--

nodetool invalidatekeycache did nothing, and we have already turned off 
rowcaching for another bug...

A wider audit of 15,000 rows in other tables that have quite complex column key 
names produced no errors.

It must be the :count at the end of these column keys...
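For reference, the expected semantics of an IN restriction on the clustering column can be modeled trivially; an illustrative Python sketch (not Cassandra code) of what the queries below should return:

```python
def select_in(rows, e_entid, p_props):
    """Model of: SELECT ... WHERE e_entid = ? AND p_prop IN (?, ...).
    Should yield one row per matching clustering key, regardless of how
    the IN list is ordered or combined."""
    wanted = set(p_props)
    return [r for r in rows
            if r["e_entid"] == e_entid and r["p_prop"] in wanted]
```

Under this model the three-key IN clause in the report should return three rows, which is what makes the single-row result below a bug.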

 CQL3 SELECT IN CLAUSE inconsistent
 --

 Key: CASSANDRA-6137
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6137
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu AWS Cassandra 2.0.1 SINGLE NODE
Reporter: Constance Eustace
 Fix For: 2.0.1


 We are encountering inconsistent results from CQL3 queries with column keys 
 using IN clause in WHERE. This has been reproduced in cqlsh.
 Rowkey is e_entid
 Column key is p_prop
 This returns roughly 21 rows for 21 column keys that match p_prop.
 cqlsh> SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';
 These three queries each return one row for the requested single column key 
 in the IN clause:
 SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:complete:count');
 SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:all:count');
 SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:fail:count');
 This query returns ONLY ONE ROW (one column key), not three as I would expect 
 from the three-column-key IN clause:
 cqlsh> SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
 This query does return two rows however for the requested two column keys:
 cqlsh> SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in (  
   
 'urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
 cqlsh> describe table internal_submission.entity_job;
 CREATE TABLE entity_job (
   e_entid text,
   p_prop text,
   describes text,
   dndcondition text,
   e_entlinks text,
   e_entname text,
   e_enttype text,
   ingeststatus text,
   ingeststatusdetail text,
   p_flags text,
   p_propid text,
   p_proplinks text,
   p_storage text,
   p_subents text,
   p_val text,
   p_vallang text,
   p_vallinks text,
   p_valtype text,
   p_valunit text,
   p_vars text,
   partnerid text,
   referenceid text,
   size int,
   sourceip text,
   submitdate bigint,
   submitevent text,
   userid text,
   version text,
   PRIMARY KEY (e_entid, p_prop)
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='NONE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'LZ4Compressor'};
 CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes);
 CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job 
 (dndcondition);
 CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job 
 (ingeststatus);
 CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job 
 (ingeststatusdetail);
 CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job 
 (referenceid);
 CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid);
 CREATE 

[jira] [Commented] (CASSANDRA-5815) NPE from migration manager

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788296#comment-13788296
 ] 

Brandon Williams commented on CASSANDRA-5815:
-

[~cburroughs] I think your problem is something else, since the bootstrapping 
node has not only been marked down, but has been down long enough to get 
removed (which is the race between the gossiper and MM causing this NPE). I 
will note for myself, though, that the fat client removal should also wait 
until the node has been marked down before beginning the 30s countdown to 
removal.

If the node has connected but the gossiper doesn't know about it, they haven't 
gossiped yet, so there's really nothing for MM to do yet anyway.

 NPE from migration manager
 --

 Key: CASSANDRA-5815
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5815
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.12
Reporter: Vishy Kasar
Assignee: Brandon Williams
Priority: Minor

 In one of our production clusters we see this error often. Looking through 
 the source, Gossiper.instance.getEndpointStateForEndpoint(endpoint) is 
 returning null for some endpoint. Do we need any config change on our end to 
 resolve this? In any case, Cassandra should be updated to protect against 
 this NPE.
 ERROR [OptionalTasks:1] 2013-07-24 13:40:38,972 AbstractCassandraDaemon.java 
 (line 132) Exception in thread Thread[OptionalTasks:1,5,main] 
 java.lang.NullPointerException 
 at 
 org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:134)
  
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) 
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) 
 at java.util.concurrent.FutureTask.run(FutureTask.java:138) 
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
  
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
  
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  
 at java.lang.Thread.run(Thread.java:662)
 It turned out that the reason for NPE was we bootstrapped a node with the 
 same token as another node. Cassandra should not throw an NPE here but log a 
 meaningful error message. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6053) system.peers table not updated after decommissioning nodes in C* 2.0

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788300#comment-13788300
 ] 

Brandon Williams commented on CASSANDRA-6053:
-

If a user knows enough to disable loading the state and that fixes the problem, 
they can clear the peers table manually.

 system.peers table not updated after decommissioning nodes in C* 2.0
 

 Key: CASSANDRA-6053
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6053
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Datastax AMI running EC2 m1.xlarge instances
Reporter: Guyon Moree
Assignee: Brandon Williams
 Attachments: peers


 After decommissioning my cluster from 20 to 9 nodes using opscenter, I found 
 all but one of the nodes had incorrect system.peers tables.
 This became a problem (afaik) when using the python-driver, since this 
 queries the peers table to set up its connection pool. Resulting in very slow 
 startup times, because of timeouts.
 The output of nodetool didn't seem to be affected. After removing the 
 incorrect entries from the peers tables, the connection issues seem to have 
 disappeared for us. 
 I would like some feedback on whether this was the right way to handle the 
 issue, or if I'm still left with a broken cluster.
 Attached is the output of nodetool status, which shows the correct 9 nodes. 
 Below that the output of the system.peers tables on the individual nodes.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-2827) Thrift error

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788304#comment-13788304
 ] 

Brandon Williams commented on CASSANDRA-2827:
-

See CASSANDRA-5529

 Thrift error
 

 Key: CASSANDRA-2827
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2827
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.4
 Environment: 2 nodes with 0.7.4 on linux
Reporter: Olivier Smadja

 This exception occurred on a non-seed node.
 ERROR [pool-1-thread-9] 2011-06-25 17:41:37,723 CustomTThreadPoolServer.java 
 (line 218) Thrift error occurred during processing of message.
 org.apache.thrift.TException: Negative length: -2147418111
   at 
 org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:388)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
   at 
 org.apache.cassandra.thrift.Cassandra$batch_mutate_args.read(Cassandra.java:15964)
   at 
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3023)
   at 
 org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
   at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
   at java.lang.Thread.run(Thread.java:619)
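The specific value -2147418111 is a strong hint of a framed/unframed transport mismatch rather than data corruption: it is exactly Thrift's TBinaryProtocol strict-write version word 0x80010001 misread as a big-endian message length. A small Python check (illustrative, not part of Cassandra):

```python
import struct

VERSION_1 = 0x80010001  # TBinaryProtocol strict-write version marker

# A client speaking the unframed binary protocol starts its message with the
# version word; a server expecting a framed transport reads those same four
# bytes as a signed big-endian length, producing the negative value from the log.
raw = struct.pack(">I", VERSION_1)
(misread_length,) = struct.unpack(">i", raw)
assert misread_length == -2147418111
```

So the likely cause is a client and server disagreeing on framed vs. unframed transport, consistent with the ticket Brandon links above.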



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-2848) Make the Client API support passing down timeouts

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788320#comment-13788320
 ] 

Brandon Williams commented on CASSANDRA-2848:
-

The problem with that is that it removes request backpressure control from the 
server, leaving it vulnerable to badly behaved clients.

 Make the Client API support passing down timeouts
 -

 Key: CASSANDRA-2848
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2848
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Priority: Minor

 Having a max server RPC timeout is good for the worst case, but many 
 applications that have middleware in front of Cassandra may have tighter 
 timeout requirements. In a fail-fast environment, if my application, starting 
 at say the front-end, only has 20ms to process a request and must connect to 
 X services down the stack, by the time it hits Cassandra we might only have 
 10ms left. I propose we optionally provide the ability to specify the timeout 
 on each call.
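The proposal amounts to propagating a deadline and clamping it at each hop. A hedged sketch of the clamping rule (Python, hypothetical helper, not an actual Cassandra API):

```python
def effective_timeout_ms(deadline_ms, now_ms, server_max_ms):
    """Timeout to use for this call: whatever remains of the caller's
    deadline, never exceeding the server's configured hard maximum."""
    remaining = deadline_ms - now_ms
    return max(0, min(remaining, server_max_ms))
```

For example, with a 20 ms front-end budget of which 10 ms remain, the call would be issued with a 10 ms timeout even if the server's default RPC timeout is far larger.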



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6154) Inserts are blocked in 2.1

2013-10-07 Thread Ryan McGuire (JIRA)
Ryan McGuire created CASSANDRA-6154:
---

 Summary: Inserts are blocked in 2.1
 Key: CASSANDRA-6154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Priority: Critical


With cluster sizes > 1 inserts are blocked indefinitely:

{code}
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6154) Inserts are blocked in 2.1

2013-10-07 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-6154:


Description: 
With cluster sizes > 1 inserts are blocked indefinitely:

{code}
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}

The last INSERT statement never returns.

  was:
With cluster sizes > 1 inserts are blocked indefinitely:

{code}
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}


 Inserts are blocked in 2.1
 --

 Key: CASSANDRA-6154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Priority: Critical

 With cluster sizes > 1 inserts are blocked indefinitely:
 {code}
 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
 Fetching Cassandra updates...
 Current cluster is now: test
 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
 Connected to test at 127.0.0.1:9160.
 [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
 Use HELP for help.
 cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
 cqlsh> USE timeline;
 cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
 cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
 {code}
 The last INSERT statement never returns.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6154) Inserts are blocked in 2.1

2013-10-07 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-6154:


Description: 
With cluster sizes > 1 inserts are blocked indefinitely:

{code}
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}

The last INSERT statement never returns.

  was:
With cluster sizes > 1 inserts are blocked indefinitely:

{code}
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}

The last INSERT statement never returns.


 Inserts are blocked in 2.1
 --

 Key: CASSANDRA-6154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Priority: Critical

 With cluster sizes > 1 inserts are blocked indefinitely:
 {code}
 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
 Fetching Cassandra updates...
 Current cluster is now: test
 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
 01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
 Connected to test at 127.0.0.1:9160.
 [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
 Use HELP for help.
 cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
 cqlsh> USE timeline;
 cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
 cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
 {code}
 The last INSERT statement never returns.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6154) Inserts are blocked in 2.1

2013-10-07 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-6154:


Description: 
With cluster sizes > 1 inserts are blocked indefinitely:

{code}
$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
$ ccm populate -n 2
$ ccm start
$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}

The last INSERT statement never returns.

  was:
With cluster sizes > 1 inserts are blocked indefinitely:

{code}
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm create -v git:trunk test
Fetching Cassandra updates...
Current cluster is now: test
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm populate -n 2
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm start
01:38 PM:~/git/datastax/uniform-sample-data/timeline[master*]$ ccm node1 cqlsh
Connected to test at 127.0.0.1:9160.
[cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE timeline;
cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
{code}

The last INSERT statement never returns.


 Inserts are blocked in 2.1
 --

 Key: CASSANDRA-6154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Priority: Critical

 With cluster sizes > 1 inserts are blocked indefinitely:
 {code}
 $ ccm create -v git:trunk test
 Fetching Cassandra updates...
 Current cluster is now: test
 $ ccm populate -n 2
 $ ccm start
 $ ccm node1 cqlsh
 Connected to test at 127.0.0.1:9160.
 [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.37.0]
 Use HELP for help.
 cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
 cqlsh> USE timeline;
 cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value text, PRIMARY KEY (userid, event));
 cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 'ryan', '2013-10-07', 'attempt');
 {code}
 The last INSERT statement never returns.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6155) Big cluster upgrade test

2013-10-07 Thread Mark Dewey (JIRA)
Mark Dewey created CASSANDRA-6155:
-

 Summary: Big cluster upgrade test
 Key: CASSANDRA-6155
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6155
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Mark Dewey


I am planning on writing a test that would:
 # Launch a 20 node cluster
 # Put 100 gigs of data on it using cassandra-stress
 # Perform a rolling upgrade of the cluster
 # Read the data using cassandra-stress
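Step 3 is the delicate part: one node is drained, upgraded, and restarted before the next is touched. A Python sketch of the intended step ordering (hypothetical helper names, not an existing test harness):

```python
def rolling_upgrade_steps(nodes):
    """Yield the per-node actions for a one-node-at-a-time rolling upgrade."""
    for node in nodes:
        yield ("drain", node)    # nodetool drain: flush and stop accepting writes
        yield ("stop", node)
        yield ("install", node)  # swap in the new Cassandra version
        yield ("start", node)    # wait for the node to rejoin as UP before moving on
```

The harness would execute these steps against the 20-node cluster, verifying the stress read workload after the last node is back up.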



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6155) Big cluster upgrade test

2013-10-07 Thread Mark Dewey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788347#comment-13788347
 ] 

Mark Dewey commented on CASSANDRA-6155:
---

[~jbellis], [~dmeyer] please chime in with any suggestions, like # of columns, 
width of columns, etc.

Is there interest in having this upgrade test happen while reads and/or writes 
are happening on the cluster?

 Big cluster upgrade test
 

 Key: CASSANDRA-6155
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6155
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Mark Dewey

 I am planning on writing a test that would:
  # Launch a 20 node cluster
  # Put 100 gigs of data on it using cassandra-stress
  # Perform a rolling upgrade of the cluster
  # Read the data using cassandra-stress



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (CASSANDRA-6153) Stress stopped calculating latency stats

2013-10-07 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-6153:
-

Assignee: Mikhail Stepura

 Stress stopped calculating latency stats
 

 Key: CASSANDRA-6153
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6153
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire
Assignee: Mikhail Stepura
 Fix For: 2.1


 In trunk, cassandra-stress has stopped calculating all latency information:
 From trunk:
 {code}
 $ ccm node1 stress
 Created keyspaces. Sleeping 1s for propagation.
 total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
 89995,8999,8999,0.0,0.0,0.0,10
 304267,21427,21427,0.0,0.0,0.0,20
 514791,21052,21052,0.0,0.0,0.0,30
 727471,21268,21268,0.0,0.0,0.0,40
 926467,19899,19899,0.0,0.0,0.0,50
 100,7353,7353,0.0,0.0,0.0,54
 Averages from the middle 80% of values:
 interval_op_rate  : 21249
 interval_key_rate : 21249
 latency median: 0.0
 latency 95th percentile   : 0.0
 latency 99.9th percentile : 0.0
 Total operation time  : 00:00:54
 END
 {code}
 From 2.0:
 {code}
 $ ccm node1 stress
 Created keyspaces. Sleeping 1s for propagation.
 total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
 66720,6672,6672,0.2,25.6,201.6,10
 289577,22285,22285,0.2,3.4,201.1,20
 489105,19952,19952,0.2,1.8,201.2,30
 660916,17181,17181,0.2,1.6,87.9,40
 847452,18653,18653,0.2,1.6,108.8,50
 100,15254,15254,0.2,1.6,108.9,59
 Averages from the middle 80% of values:
 interval_op_rate  : 19517
 interval_key_rate : 19517
 latency median: 0.2
 latency 95th percentile   : 2.1
 latency 99.9th percentile : 149.8
 Total operation time  : 00:00:59
 END
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6153) Stress stopped calculating latency stats

2013-10-07 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6153:
--

Fix Version/s: 2.1

 Stress stopped calculating latency stats
 

 Key: CASSANDRA-6153
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6153
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire
Assignee: Mikhail Stepura
 Fix For: 2.1


 In trunk, cassandra-stress has stopped calculating all latency information:
 From trunk:
 {code}
 $ ccm node1 stress
 Created keyspaces. Sleeping 1s for propagation.
 total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
 89995,8999,8999,0.0,0.0,0.0,10
 304267,21427,21427,0.0,0.0,0.0,20
 514791,21052,21052,0.0,0.0,0.0,30
 727471,21268,21268,0.0,0.0,0.0,40
 926467,19899,19899,0.0,0.0,0.0,50
 100,7353,7353,0.0,0.0,0.0,54
 Averages from the middle 80% of values:
 interval_op_rate  : 21249
 interval_key_rate : 21249
 latency median: 0.0
 latency 95th percentile   : 0.0
 latency 99.9th percentile : 0.0
 Total operation time  : 00:00:54
 END
 {code}
 From 2.0:
 {code}
 $ ccm node1 stress
 Created keyspaces. Sleeping 1s for propagation.
 total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
 66720,6672,6672,0.2,25.6,201.6,10
 289577,22285,22285,0.2,3.4,201.1,20
 489105,19952,19952,0.2,1.8,201.2,30
 660916,17181,17181,0.2,1.6,87.9,40
 847452,18653,18653,0.2,1.6,108.8,50
 100,15254,15254,0.2,1.6,108.9,59
 Averages from the middle 80% of values:
 interval_op_rate  : 19517
 interval_key_rate : 19517
 latency median: 0.2
 latency 95th percentile   : 2.1
 latency 99.9th percentile : 149.8
 Total operation time  : 00:00:59
 END
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


git commit: Fix SP.sendToHintedEndpoints() javadoc

2013-10-07 Thread aleksey
Updated Branches:
  refs/heads/cassandra-1.2 d396fd47d -> 9d31ac14d


Fix SP.sendToHintedEndpoints() javadoc


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9d31ac14
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9d31ac14
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9d31ac14

Branch: refs/heads/cassandra-1.2
Commit: 9d31ac14dfec10ede53cbd6fecfd9a08c39bfa45
Parents: d396fd4
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Oct 8 01:55:29 2013 +0800
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Oct 8 01:55:29 2013 +0800

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9d31ac14/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java 
b/src/java/org/apache/cassandra/service/StorageProxy.java
index 8a6e52e..cdb0bd6 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -467,7 +467,7 @@ public class StorageProxy implements StorageProxyMBean
  * | off|   >=1  | --> DO NOT fire hints. And DO NOT wait for them to complete.
  * | off|   ANY  | --> DO NOT fire hints. And DO NOT wait for them to complete.
  *
- * @throws TimeoutException if the hints cannot be written/enqueued
+ * @throws OverloadedException if the hints cannot be written/enqueued
  */
 public static void sendToHintedEndpoints(final RowMutation rm,
   Iterable<InetAddress> targets,



[1/2] git commit: Fix SP.sendToHintedEndpoints() javadoc

2013-10-07 Thread aleksey
Updated Branches:
  refs/heads/cassandra-2.0 6a603046e -> 8e8db1f20


Fix SP.sendToHintedEndpoints() javadoc


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9d31ac14
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9d31ac14
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9d31ac14

Branch: refs/heads/cassandra-2.0
Commit: 9d31ac14dfec10ede53cbd6fecfd9a08c39bfa45
Parents: d396fd4
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Oct 8 01:55:29 2013 +0800
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Oct 8 01:55:29 2013 +0800

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9d31ac14/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java 
b/src/java/org/apache/cassandra/service/StorageProxy.java
index 8a6e52e..cdb0bd6 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -467,7 +467,7 @@ public class StorageProxy implements StorageProxyMBean
  * | off|   >=1  | --> DO NOT fire hints. And DO NOT wait for them to complete.
  * | off|   ANY  | --> DO NOT fire hints. And DO NOT wait for them to complete.
  *
- * @throws TimeoutException if the hints cannot be written/enqueued
+ * @throws OverloadedException if the hints cannot be written/enqueued
  */
 public static void sendToHintedEndpoints(final RowMutation rm,
   Iterable<InetAddress> targets,



[2/2] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-10-07 Thread aleksey
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8e8db1f2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8e8db1f2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8e8db1f2

Branch: refs/heads/cassandra-2.0
Commit: 8e8db1f20688eef9a89e112f2295d160c9c35075
Parents: 6a60304 9d31ac1
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Oct 8 01:56:55 2013 +0800
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Oct 8 01:56:55 2013 +0800

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/8e8db1f2/src/java/org/apache/cassandra/service/StorageProxy.java
--



[jira] [Updated] (CASSANDRA-6155) Big cluster upgrade test

2013-10-07 Thread Mark Dewey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Dewey updated CASSANDRA-6155:
--

Description: 
I am planning on writing a test that would:
 # Launch a 20 node cluster
 # Put 100 gigs of data on it (per node) using cassandra-stress
 # Perform a rolling upgrade of the cluster
 # Read the data using cassandra-stress

  was:
I am planning on writing a test that would:
 # Launch a 20 node cluster
 # Put 100 gigs of data on it using cassandra-stress
 # Perform a rolling upgrade of the cluster
 # Read the data using cassandra-stress


 Big cluster upgrade test
 

 Key: CASSANDRA-6155
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6155
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Mark Dewey

 I am planning on writing a test that would:
  # Launch a 20 node cluster
  # Put 100 gigs of data on it (per node) using cassandra-stress
  # Perform a rolling upgrade of the cluster
  # Read the data using cassandra-stress





[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated

2013-10-07 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788362#comment-13788362
 ] 

Alex Liu commented on CASSANDRA-6151:
-

I think it's better not to implement this, to keep the paging algorithm simple. 
Let Pig/Hive filter the result using the partition key clause on the Hive/Pig 
side, while CqlPagingRecordReader pages through the rows using only the 
user-defined where clauses on non-partition keys. Let's document this somewhere.

The paging algorithm of CqlPagingRecordReader is based on token ranges.

 CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
 

 Key: CASSANDRA-6151
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6151
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Russell Alexander Spitzer
Assignee: Alex Liu
Priority: Minor

 From 
 http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig/19211546#19211546
 The user was attempting to load a single partition using a where clause in a 
 pig load statement. 
 CQL Table
 {code}
 CREATE table data (
   occurday  text,
   seqnumber int,
   occurtimems bigint,
   unique bigint,
   fields map<text, text>,
   primary key ((occurday, seqnumber), occurtimems, unique)
 )
 {code}
 Pig Load statement Query
 {code}
 data = LOAD 
 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27'
  USING CqlStorage();
 {code}
 This results in an exception when processed by the CqlPagingRecordReader, 
 which attempts to page this query even though it contains at most one 
 partition key. This leads to an invalid CQL statement. 
 CqlPagingRecordReader Query
 {code}
 SELECT * FROM data WHERE token(occurday,seqnumber) > ? AND
 token(occurday,seqnumber) <= ? AND occurday='A Great Day' 
 AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
 {code}
 Exception
 {code}
  InvalidRequestException(why:occurday cannot be restricted by more than one 
 relation if it includes an Equal)
 {code}
 I'm not sure it is worth the special case, but a modification to not use the 
 paging record reader when the entire partition key is specified would solve 
 this issue. 





[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated

2013-10-07 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788365#comment-13788365
 ] 

Alex Liu commented on CASSANDRA-6151:
-

Or we can add a validate method to validate the user defined where clauses.
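A minimal sketch of what such a validation could look like. The class name, method, and the simple string-matching heuristic are all hypothetical illustrations, not the actual patch: it flags a user-supplied `where_clause` that restricts a partition-key column with `=`, which would collide with the reader's own `token()` range restriction.

```java
public class WhereClauseValidator {
    // True when the user-supplied where_clause restricts a partition-key
    // column with '=', which conflicts with the added token() range clause.
    static boolean restrictsPartitionKey(String whereClause, String... partitionKeyColumns) {
        String lower = whereClause.toLowerCase();
        for (String col : partitionKeyColumns)
            if (lower.matches(".*\\b" + col.toLowerCase() + "\\s*=.*"))
                return true;
        return false;
    }

    public static void main(String[] args) {
        String[] pk = { "occurday", "seqnumber" };
        // clustering-column predicate: fine
        System.out.println(restrictsPartitionKey("occurtimems > 1000", pk));                     // false
        // partition-key predicate: would have to be rejected (or bypass paging)
        System.out.println(restrictsPartitionKey("seqnumber=10 AND occurday='2013-10-01'", pk)); // true
    }
}
```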

 CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
 

 Key: CASSANDRA-6151
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6151
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Russell Alexander Spitzer
Assignee: Alex Liu
Priority: Minor

 From 
 http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig/19211546#19211546
 The user was attempting to load a single partition using a where clause in a 
 pig load statement. 
 CQL Table
 {code}
 CREATE table data (
   occurday  text,
   seqnumber int,
   occurtimems bigint,
   unique bigint,
   fields map<text, text>,
   primary key ((occurday, seqnumber), occurtimems, unique)
 )
 {code}
 Pig Load statement Query
 {code}
 data = LOAD 
 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27'
  USING CqlStorage();
 {code}
 This results in an exception when processed by the CqlPagingRecordReader, 
 which attempts to page this query even though it contains at most one 
 partition key. This leads to an invalid CQL statement. 
 CqlPagingRecordReader Query
 {code}
 SELECT * FROM data WHERE token(occurday,seqnumber) > ? AND
 token(occurday,seqnumber) <= ? AND occurday='A Great Day' 
 AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
 {code}
 Exception
 {code}
  InvalidRequestException(why:occurday cannot be restricted by more than one 
 relation if it includes an Equal)
 {code}
 I'm not sure it is worth the special case, but a modification to not use the 
 paging record reader when the entire partition key is specified would solve 
 this issue. 





[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-07 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788367#comment-13788367
 ] 

Ryan McGuire commented on CASSANDRA-4338:
-

I started to run a benchmark for this but I found CASSANDRA-6153 and 
CASSANDRA-6154 standing in my way.

[Here's the 
data|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.4338.CompressedSequentialWriter.json&metric=interval_op_rate&operation=stress-write&smoothing=4]
 for my test with [~krummas]' patch, but it's missing any sort of baseline 
because of those above bugs. 

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
  Labels: performance
 Fix For: 2.1

 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk-me.png, 
 gc-trunk.png, gc-with-patch-me.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.





[jira] [Created] (CASSANDRA-6156) Poor resilience and recovery for bootstrapping node - unable to fetch range

2013-10-07 Thread Alyssa Kwan (JIRA)
Alyssa Kwan created CASSANDRA-6156:
--

 Summary: Poor resilience and recovery for bootstrapping node - 
unable to fetch range
 Key: CASSANDRA-6156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6156
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Alyssa Kwan
 Fix For: 1.2.8


We have an 8 node cluster on 1.2.8 using vnodes.  One of our nodes failed and 
we are having lots of trouble bootstrapping it back.  On each attempt, 
bootstrapping eventually fails with a RuntimeException Unable to fetch range. 
 As far as we can tell, long GC pauses on the sender side cause heartbeat drops 
or delays, which leads the gossip controller to convict the connection and mark 
the sender dead.  We've done significant GC tuning to minimize the duration of 
pauses and raised phi_convict to its max.  It merely lets the bootstrap process 
take longer to fail.

The inability to reliably add nodes significantly affects our ability to scale.

We're not the only ones:  
http://stackoverflow.com/questions/19199349/cassandra-bootstrap-fails-with-unable-to-fetch-range

What can we do in the immediate term to bring this node in?  And what's the 
long term solution?

One possible solution would be to allow bootstrapping to be an incremental 
process with individual transfers of vnode ownership instead of attempting to 
transfer the whole set of vnodes transactionally.  (I assume that's what's 
happening now.)  I don't know what would have to change on the gossip and 
token-aware client side to support this.

Another solution would be to partition sstable files by vnode and allow 
transfer of those files directly with some sort of checkpointing of and 
incremental transfer of writes after the sstable is transferred.





[jira] [Commented] (CASSANDRA-2848) Make the Client API support passing down timeouts

2013-10-07 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788390#comment-13788390
 ] 

sankalp kohli commented on CASSANDRA-2848:
--

So what we can do is treat the RPC timeout specified on the server as a maximum. 
If the client passes a timeout lower than that, there is no need to keep the 
read request running any longer. 
This is very useful: when the cluster is on its knees, killing these extra read 
requests can help a lot. 


 Make the Client API support passing down timeouts
 -

 Key: CASSANDRA-2848
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2848
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Priority: Minor

 Having a max server RPC timeout is good for the worst case, but many applications 
 that have middleware in front of Cassandra might have higher timeout 
 requirements. In a fail-fast environment, if my application, starting at say 
 the front-end, only has 20ms to process a request, and it must connect to X 
 services down the stack, by the time it hits Cassandra we might only have 
 10ms. I propose we provide the ability to optionally specify the timeout on 
 each call we do.
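The resolution being discussed can be sketched in a few lines. This is an illustration only (class and method names invented, not Cassandra API): the server-configured RPC timeout acts as a ceiling, and a client-supplied timeout may shorten it but never extend it.

```java
public class RequestTimeout {
    // Server RPC timeout is a ceiling; a client timeout may only lower it.
    static long effectiveTimeoutMs(long serverMaxMs, Long clientRequestedMs) {
        if (clientRequestedMs == null || clientRequestedMs <= 0)
            return serverMaxMs;          // client gave no timeout: server default
        return Math.min(serverMaxMs, clientRequestedMs);
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimeoutMs(10_000, 20L));     // 20 (fail-fast client wins)
        System.out.println(effectiveTimeoutMs(10_000, 60_000L)); // 10000 (server cap holds)
        System.out.println(effectiveTimeoutMs(10_000, null));    // 10000 (default)
    }
}
```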





[jira] [Commented] (CASSANDRA-6128) Add more data mappings for Pig

2013-10-07 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788402#comment-13788402
 ] 

Alex Liu commented on CASSANDRA-6128:
-

Decimal has different precision than float/double; we will lose precision if we 
convert a decimal to a float/double. It is explained in this link: 
http://stackoverflow.com/questions/5749615/losing-precision-converting-from-java-bigdecimal-to-double

If we don't need to preserve the precision, we can use a double instead of a 
string. 
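The precision loss is easy to demonstrate with a small self-contained check (class and method names here are illustrative only): round-tripping a BigDecimal through double changes any value that has no exact binary representation.

```java
import java.math.BigDecimal;

public class DecimalPrecisionDemo {
    // True when converting the decimal to double and back changes the value.
    static boolean losesPrecision(BigDecimal d) {
        return new BigDecimal(d.doubleValue()).compareTo(d) != 0;
    }

    public static void main(String[] args) {
        // 0.1 has no exact binary representation, so the round trip drifts
        System.out.println(losesPrecision(new BigDecimal("0.1"))); // true
        // 0.5 is exactly representable in binary, so nothing is lost
        System.out.println(losesPrecision(new BigDecimal("0.5"))); // false
    }
}
```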

 Add more data mappings for Pig
 --

 Key: CASSANDRA-6128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128
 Project: Cassandra
  Issue Type: Bug
Reporter: Alex Liu
Assignee: Alex Liu
 Attachments: 6128-1.2-branch.txt


 We need to add more data mappings for
 {code}
  DecimalType
  InetAddressType
  LexicalUUIDType
  TimeUUIDType
  UUIDType
 {code}
 The existing implementation throws an exception for those data types.





[jira] [Updated] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Lyuben Todorov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lyuben Todorov updated CASSANDRA-4809:
--

Attachment: 4809__v2.patch

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch, 4809__v2.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Commented] (CASSANDRA-6149) OOM in Cassandra 2.0.1

2013-10-07 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788422#comment-13788422
 ] 

Pavel Yaskevich commented on CASSANDRA-6149:


+1

 OOM in Cassandra 2.0.1
 --

 Key: CASSANDRA-6149
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6149
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Windows 7 64 bit. Java 64-bit 1.7.0_25. Cassandra 2.0.1
Reporter: Kai Wang
Assignee: Jonathan Ellis
 Fix For: 2.0.2

 Attachments: 6149-debug.txt, 6149.txt


 I have a program to stress test Cassandra. What it does is remove/insert rows 
 with a small set of row keys as fast as possible. Two CFs are involved. When 
 I tested against C* 1.2.3 with default configurations, it ran for 24 hours and 
 C* didn't have any issue. However, after I upgraded to C* 2.0.1, C* crashes 
 on OOM within 1-2 minutes. I can consistently reproduce this.
 I built C* from the source and found out the last good changeset is 
 cfa097cdd5e28d7fe8204248e246a1fae226d2c0. As soon as I include the next 
 changeset 1e0d9513b748fae4ec0737283da71c65e9272102, C* starts to crash. 
 What's interesting is that although the change seems to have been reverted by 
 fc1a7206fe15882fd64e7ba8eb68ba9dc320275f, C* built from 
 fc1a7206fe15882fd64e7ba8eb68ba9dc320275f still has the same problem - OOM 
 within minutes.
 I didn't test against the official 2.0.0. But the C* built from 
 03045ca22b11b0e5fc85c4fabd83ce6121b5709b seems OK. I assume that's what 2.0.0 
 is.
 I use default configurations in all cases. I didn't tune anything.





[jira] [Commented] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788416#comment-13788416
 ] 

Carl Yeksigian commented on CASSANDRA-4809:
---

Sorry, missed the question before.

Overall, the proposed updates to the patch look good (I had forgotten about the 
low quality of the first patch -- done during a meetup). I'm wondering why the 
keyspace checks have been moved inside of the loop while the control flow is 
still a return. It seems we can either check it once before the loop and exit 
the method, or continue instead of returning if we want to check each mutation.
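The continue-vs-return distinction can be sketched in miniature. This is a toy stand-in for the commitlog replay loop (names and types invented, not the patch itself): skipping with `continue` only drops the filtered mutation, whereas a `return` at the same spot would silently drop every mutation after it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

public class ReplayFilter {
    // Skip mutations whose keyspace is not being restored; a `return` in
    // place of `continue` would silently drop all later mutations too.
    static int replay(List<String> mutationKeyspaces, Set<String> wanted) {
        int replayed = 0;
        for (String ks : mutationKeyspaces) {
            if (!wanted.contains(ks))
                continue;
            replayed++; // stand-in for actually applying the mutation
        }
        return replayed;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("ks1", "system", "ks1");
        System.out.println(replay(log, Set.of("ks1"))); // 2
    }
}
```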

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch, 4809__v2.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[1/3] git commit: Fix SP.sendToHintedEndpoints() javadoc

2013-10-07 Thread aleksey
Updated Branches:
  refs/heads/trunk e43b82ba6 - b966e1ad2


Fix SP.sendToHintedEndpoints() javadoc


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9d31ac14
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9d31ac14
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9d31ac14

Branch: refs/heads/trunk
Commit: 9d31ac14dfec10ede53cbd6fecfd9a08c39bfa45
Parents: d396fd4
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Oct 8 01:55:29 2013 +0800
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Oct 8 01:55:29 2013 +0800

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9d31ac14/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java 
b/src/java/org/apache/cassandra/service/StorageProxy.java
index 8a6e52e..cdb0bd6 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -467,7 +467,7 @@ public class StorageProxy implements StorageProxyMBean
  * | off|   >=1  | --> DO NOT fire hints. And DO NOT wait for them to complete.
  * | off|   ANY  | --> DO NOT fire hints. And DO NOT wait for them to complete.
  *
- * @throws TimeoutException if the hints cannot be written/enqueued
+ * @throws OverloadedException if the hints cannot be written/enqueued
  */
 public static void sendToHintedEndpoints(final RowMutation rm,
   Iterable<InetAddress> targets,



[2/3] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-10-07 Thread aleksey
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8e8db1f2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8e8db1f2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8e8db1f2

Branch: refs/heads/trunk
Commit: 8e8db1f20688eef9a89e112f2295d160c9c35075
Parents: 6a60304 9d31ac1
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Oct 8 01:56:55 2013 +0800
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Oct 8 01:56:55 2013 +0800

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/8e8db1f2/src/java/org/apache/cassandra/service/StorageProxy.java
--



[3/3] git commit: Merge branch 'cassandra-2.0' into trunk

2013-10-07 Thread aleksey
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b966e1ad
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b966e1ad
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b966e1ad

Branch: refs/heads/trunk
Commit: b966e1ad21a345838a2b50e1790e0257fab30c7f
Parents: e43b82b 8e8db1f
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Oct 8 02:39:59 2013 +0800
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Oct 8 02:39:59 2013 +0800

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b966e1ad/src/java/org/apache/cassandra/service/StorageProxy.java
--



[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

2013-10-07 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788425#comment-13788425
 ] 

Benedict commented on CASSANDRA-4718:
-

Disruptors are very difficult to use as a drop in replacement for the executor 
service, so I tried to knock up some queues that could provide similar 
performance without ripping apart the whole application. The resulting queues I 
benchmarked under high load, in isolation, against LinkedBlockingQueue, 
BlockingArrayQueue and the Disruptor, and plotted the average op costs in the 
op costs of various queues attachment*. As can be seen, these queues and the 
Disruptor are substantially faster under high load than LinkedBlockingQueue, 
however it can also be seen that:

- The average op cost for LinkedBlockingQueue is still very low, in fact only 
around 300ns at worst
- BlockingArrayQueue is considerably worse than LinkedBlockingQueue under all 
conditions

These suggest both that the overhead attributed to LinkedBlockingQueue for a 
1Mop workload (as run above) should be at most a few seconds of the overall 
cost (probably much less); and that BlockingArrayQueue is unlikely to make any 
cost incurred by LinkedBlockingQueue substantially better. This made me suspect 
the previous result might be attributable to random variance, but to be sure I 
ran a number of ccm -stress tests with the different queues, and plotted the 
results in 'stress op rate with various queues.ods', which shows the following:

1) No meaningful difference between BAQ, LBQ and SlowQueue (though the latter 
has a clear ~1% slow down)
2) UltraSlow (~10x slow down, or 2000ns spinning each op) is approximately 5% 
slower
3) The faster queue actually slows down the process, by about 9% - more than 
the queue supposedly much slower than it!

Anyway, I've been concurrently looking at where I might be able to improve 
performance independent of this, and have found the following:

A) Raw performance of local reads is ~6-7x faster than through Stress
B) Raw performance of local reads run asynchronously is ~4x faster
C) Raw performance of local reads run asynchronously using the fast queue is 
~4.7x faster
D) Performance of local reads from the Thrift server-side methods is ~3x faster
E) Performance of remote (i.e. local non-optimised) reads is ~1.5x faster

In particular (C) is interesting, as it demonstrates the queue really is faster 
in use, but I've yet to absolutely determine why that translates into an 
overall decline in throughput. It looks as though it's possible it causes 
greater congestion in LockSupport.unpark(), but this is a new piece of 
information, derived from YourKit. As these sorts of methods are difficult to 
meter accurately I don't necessarily trust it, and haven't had a chance to 
figure out what I can do with the information. If it is accurate, and I can 
figure out how to reduce the overhead, we might get a modest speed boost, which 
will accumulate as we find other places to improve.

As to the overall problem of improving throughput, it seems to me that there 
are two big avenues to explore: 

  1) the networking (software) overhead is large;
  2) possibly the cost of managing thread liveness (e.g. park/unpark/scheduler 
costs); though the evidence for this is as yet inconclusive... given the op 
rate and other evidence it doesn't seem to be synchronization overhead. I'm 
still trying to pin this down.

Once the costs here are nailed down as tight as they can go, I'm pretty 
confident we can get some noticeable improvements to the actual work being 
done, but since that currently accounts for only a fraction of the time spent 
(probably less than 20%), I'd rather wait until it was a higher percentage so 
any improvement is multiplied.


* These can be replicated by running 
org.apache.cassandra.concurrent.test.bench.Benchmark on any of the linked 
branches on github. 

https://github.com/belliottsmith/cassandra/tree/4718-lbq [using 
LinkedBlockingQueue]
https://github.com/belliottsmith/cassandra/tree/4718-baq [using 
BlockingArrayQueue]
https://github.com/belliottsmith/cassandra/tree/4718-lpbq [using a new high 
performance queue]
https://github.com/belliottsmith/cassandra/tree/4718-slow [using a 
LinkedBlockingQueue with 200ns spinning each op]
https://github.com/belliottsmith/cassandra/tree/4718-ultraslow [using a 
LinkedBlockingQueue with 2000ns spinning each op]


 More-efficient ExecutorService for improved throughput
 --

 Key: CASSANDRA-4718
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
Priority: Minor
  Labels: performance
 Attachments: baq vs trunk.png, op costs of various queues.ods, 
 

[jira] [Updated] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

2013-10-07 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-4718:


Attachment: stress op rate with various queues.ods

 More-efficient ExecutorService for improved throughput
 --

 Key: CASSANDRA-4718
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
Priority: Minor
  Labels: performance
 Attachments: baq vs trunk.png, op costs of various queues.ods, 
 PerThreadQueue.java, stress op rate with various queues.ods


 Currently all our execution stages dequeue tasks one at a time.  This can 
 result in contention between producers and consumers (although we do our best 
 to minimize this by using LinkedBlockingQueue).
 One approach to mitigating this would be to make consumer threads do more 
 work in bulk instead of just one task per dequeue.  (Producer threads tend 
 to be single-task oriented by nature, so I don't see an equivalent 
 opportunity there.)
 BlockingQueue has a drainTo(collection, int) method that would be perfect for 
 this.  However, no ExecutorService in the jdk supports using drainTo, nor 
 could I google one.
 What I would like to do here is create just such a beast and wire it into (at 
 least) the write and read stages.  (Other possible candidates for such an 
 optimization, such as the CommitLog and OutboundTCPConnection, are not 
 ExecutorService-based and will need to be one-offs.)
 AbstractExecutorService may be useful.  The implementations of 
 ICommitLogExecutorService may also be useful. (Despite the name these are not 
 actual ExecutorServices, although they share the most important properties of 
 one.)
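The batch-dequeue idea the ticket describes can be sketched roughly as follows. This is a toy illustration (names invented, not the proposed executor): the consumer pulls up to `batch` tasks per pass via `BlockingQueue.drainTo`, touching the queue's lock once per batch instead of once per task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DrainingConsumer {
    // Dequeue up to `batch` tasks in one pass instead of one at a time,
    // reducing producer/consumer contention on the queue's lock.
    static int drainAndRun(BlockingQueue<Runnable> queue, int batch) {
        List<Runnable> tasks = new ArrayList<>(batch);
        queue.drainTo(tasks, batch);
        for (Runnable task : tasks)
            task.run();
        return tasks.size();
    }

    public static void main(String[] args) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 5; i++)
            queue.add(() -> {});
        System.out.println(drainAndRun(queue, 16)); // drains all 5 queued tasks
    }
}
```

A real consumer thread would first block on `take()` for one task and then drain opportunistically, since `drainTo` never blocks on an empty queue.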





[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

2013-10-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788426#comment-13788426
 ] 

Jonathan Ellis commented on CASSANDRA-4718:
---

bq. The faster queue actually slows down the process, by about 9% - more than 
the queue supposedly much slower than it

So this actually confirms Ryan's original measurement of C*/BAQ [slow queue] 
faster than C*/LBQ [fast queue]?

 More-efficient ExecutorService for improved throughput
 --

 Key: CASSANDRA-4718
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
Priority: Minor
  Labels: performance
 Attachments: baq vs trunk.png, op costs of various queues.ods, 
 PerThreadQueue.java, stress op rate with various queues.ods


 Currently all our execution stages dequeue tasks one at a time.  This can 
 result in contention between producers and consumers (although we do our best 
 to minimize this by using LinkedBlockingQueue).
 One approach to mitigating this would be to make consumer threads do more 
 work in bulk instead of just one task per dequeue.  (Producer threads tend 
 to be single-task oriented by nature, so I don't see an equivalent 
 opportunity there.)
 BlockingQueue has a drainTo(collection, int) method that would be perfect for 
 this.  However, no ExecutorService in the jdk supports using drainTo, nor 
 could I google one.
 What I would like to do here is create just such a beast and wire it into (at 
 least) the write and read stages.  (Other possible candidates for such an 
 optimization, such as the CommitLog and OutboundTCPConnection, are not 
 ExecutorService-based and will need to be one-offs.)
 AbstractExecutorService may be useful.  The implementations of 
 ICommitLogExecutorService may also be useful. (Despite the name these are not 
 actual ExecutorServices, although they share the most important properties of 
 one.)
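For illustration only, the bulk-dequeue idea described above can be sketched as follows. This is not Cassandra code; the class and method names are invented, and a real executor would additionally block on take() for the first task and loop, whereas this sketch shows a single drain pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: a consumer pass that drains queued tasks in bulk via
// BlockingQueue.drainTo() instead of dequeuing one task at a time, which is
// the producer/consumer contention the ticket aims to reduce.
public class DrainingConsumer
{
    /** Drains up to maxBatch tasks and runs them; returns how many ran. */
    public static int drainAndRun(BlockingQueue<Runnable> queue, int maxBatch)
    {
        List<Runnable> batch = new ArrayList<>(maxBatch);
        queue.drainTo(batch, maxBatch); // one queue operation for many tasks
        for (Runnable task : batch)
            task.run();
        return batch.size();
    }

    public static void main(String[] args)
    {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 10; i++)
            queue.add(() -> {});
        System.out.println("ran " + drainAndRun(queue, 64) + " tasks in one pass");
    }
}
```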



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results

2013-10-07 Thread Li Zou (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788437#comment-13788437
 ] 

Li Zou commented on CASSANDRA-5932:
---

This morning's trunk load has a slightly different symptom, and is even more 
serious than last Friday's load, as this time just commenting out the assert 
statement in the {{MessagingService.addCallback()}} will not help.

I copy the {{/var/log/cassandra/system.log}} exception errors below.

{noformat}
ERROR [Thrift:12] 2013-10-07 14:42:39,396 Caller+0   at 
org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134)
 - Exception in thread Thread[Thrift:12,5,main]
java.lang.AssertionError: null
at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:543)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:591) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:869)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:123) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:739) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:511) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:581)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:379)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:363)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:126)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:267)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.thrift.CassandraServer.execute_prepared_cql3_query(CassandraServer.java:2061)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_prepared_cql3_query.getResult(Cassandra.java:4502)
 ~[apache-cassandra-thrift-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_prepared_cql3_query.getResult(Cassandra.java:4486)
 ~[apache-cassandra-thrift-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[libthrift-0.9.1.jar:0.9.1]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
~[libthrift-0.9.1.jar:0.9.1]
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:194)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_25]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
~[na:1.7.0_25]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]

{noformat}



 Speculative read performance data show unexpected results
 -

 Key: CASSANDRA-5932
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5932
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Aleksey Yeschenko
 Fix For: 2.0.2

 Attachments: 5932.6692c50412ef7d.compaction.png, 
 5932-6692c50412ef7d.png, 5932.6692c50412ef7d.rr0.png, 
 5932.6692c50412ef7d.rr1.png, 5932.ded39c7e1c2fa.logs.tar.gz, 5932.txt, 
 5933-128_and_200rc1.png, 5933-7a87fc11.png, 5933-logs.tar.gz, 
 5933-randomized-dsnitch-replica.2.png, 5933-randomized-dsnitch-replica.3.png, 
 5933-randomized-dsnitch-replica.png, compaction-makes-slow.png, 
 compaction-makes-slow-stats.png, eager-read-looks-promising.png, 
 eager-read-looks-promising-stats.png, eager-read-not-consistent.png, 
 eager-read-not-consistent-stats.png, node-down-increase-performance.png


 I've done a series of stress tests with eager retries enabled that show 
 undesirable behavior. I'm grouping these behaviours into one ticket as they 
 are most likely related.
 1) Killing off a node in a 4 node cluster actually increases performance.
 

[1/2] git commit: Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128

2013-10-07 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-2.0 8e8db1f20 - c374aca19


Add more data type mappings for pig.
Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3633aea4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3633aea4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3633aea4

Branch: refs/heads/cassandra-2.0
Commit: 3633aea42d7689fa0252c104f62b0646d0858624
Parents: d396fd4
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Oct 7 13:57:45 2013 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Oct 7 13:57:45 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 30 +++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 ++---
 3 files changed, 26 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3633aea4/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --git 
a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index ce92014..6ad4f9e 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@ -110,7 +110,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 List<CompositeComponent> result = comparator.deconstruct(name);
 Tuple t = TupleFactory.getInstance().newTuple(result.size());
 for (int i=0; i<result.size(); i++)
-setTupleValue(t, i, 
result.get(i).comparator.compose(result.get(i).value));
+setTupleValue(t, i, cassandraToObj(result.get(i).comparator, 
result.get(i).value));
 
 return t;
 }
@@ -124,7 +124,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 if(comparator instanceof AbstractCompositeType)
 setTupleValue(pair, 0, 
composeComposite((AbstractCompositeType)comparator,col.name()));
 else
-setTupleValue(pair, 0, comparator.compose(col.name()));
+setTupleValue(pair, 0, cassandraToObj(comparator, col.name()));
 
 // value
 if (col instanceof Column)
@@ -134,10 +134,10 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 if (validators.get(col.name()) == null)
 {
 Map<MarshallerType, AbstractType> marshallers = 
getDefaultMarshallers(cfDef);
-setTupleValue(pair, 1, 
marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value()));
+setTupleValue(pair, 1, 
cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
 }
 else
-setTupleValue(pair, 1, 
validators.get(col.name()).compose(col.value()));
+setTupleValue(pair, 1, 
cassandraToObj(validators.get(col.name()), col.value()));
 return pair;
 }
 else
@@ -327,9 +327,12 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 return DataType.LONG;
 else if (type instanceof IntegerType || type instanceof Int32Type) // 
IntegerType will overflow at 2**31, but is kept for compatibility until pig has 
a BigInteger
 return DataType.INTEGER;
-else if (type instanceof AsciiType)
-return DataType.CHARARRAY;
-else if (type instanceof UTF8Type)
+else if (type instanceof AsciiType || 
+type instanceof UTF8Type ||
+type instanceof DecimalType ||
+type instanceof InetAddressType ||
+type instanceof LexicalUUIDType ||
+type instanceof UUIDType )
 return DataType.CHARARRAY;
 else if (type instanceof FloatType)
 return DataType.FLOAT;
@@ -772,5 +775,18 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 }
 return null;
 }
+
+protected Object cassandraToObj(AbstractType validator, ByteBuffer value)
+{
+if (validator instanceof DecimalType ||
+validator instanceof InetAddressType ||
+validator instanceof LexicalUUIDType ||
+validator instanceof UUIDType)
+{
+return validator.getString(value);
+}
+else
+return validator.compose(value);
+}
 }
 


[2/2] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-10-07 Thread brandonwilliams
Merge branch 'cassandra-1.2' into cassandra-2.0

Conflicts:
src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c374aca1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c374aca1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c374aca1

Branch: refs/heads/cassandra-2.0
Commit: c374aca19ea39fbbc588a2309c669c422e0318cd
Parents: 8e8db1f 3633aea
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Oct 7 14:02:39 2013 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Oct 7 14:02:39 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 30 +++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 ++---
 3 files changed, 26 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --cc src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index 1e207b3,6ad4f9e..c881734
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@@ -124,17 -124,31 +124,17 @@@ public abstract class AbstractCassandra
  if(comparator instanceof AbstractCompositeType)
  setTupleValue(pair, 0, 
composeComposite((AbstractCompositeType)comparator,col.name()));
  else
- setTupleValue(pair, 0, comparator.compose(col.name()));
+ setTupleValue(pair, 0, cassandraToObj(comparator, col.name()));
  
  // value
 -if (col instanceof Column)
 +Map<ByteBuffer,AbstractType> validators = getValidatorMap(cfDef);
 +if (validators.get(col.name()) == null)
  {
 -// standard
 -Map<ByteBuffer,AbstractType> validators = getValidatorMap(cfDef);
 -if (validators.get(col.name()) == null)
 -{
 -Map<MarshallerType, AbstractType> marshallers = 
getDefaultMarshallers(cfDef);
 -setTupleValue(pair, 1, 
cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
 -}
 -else
 -setTupleValue(pair, 1, 
cassandraToObj(validators.get(col.name()), col.value()));
 -return pair;
 +Map<MarshallerType, AbstractType> marshallers = 
getDefaultMarshallers(cfDef);
- setTupleValue(pair, 1, 
marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value()));
++setTupleValue(pair, 1, 
cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
  }
  else
- setTupleValue(pair, 1, 
validators.get(col.name()).compose(col.value()));
 -{
 -// super
 -ArrayList<Tuple> subcols = new ArrayList<Tuple>();
 -for (IColumn subcol : col.getSubColumns())
 -subcols.add(columnToTuple(subcol, cfDef, 
parseType(cfDef.getSubcomparator_type(;
 -
 -pair.set(1, new DefaultDataBag(subcols));
 -}
++setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), 
col.value()));
  return pair;
  }
  

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
--



[jira] [Commented] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages

2013-10-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788446#comment-13788446
 ] 

Sebastián Lacuesta commented on CASSANDRA-6131:
---

Tried the patch against the 2.0.1 source tarball from the debian src file 
extracted from .dsc:
patching file debian/init
Hunk #4 FAILED at 53.
Hunk #5 FAILED at 95.
2 out of 5 hunks FAILED -- saving rejects to file debian/init.rej

content of debian/init.rej:
--- debian/init
+++ debian/init
@@ -53,10 +29,6 @@
 # Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
 . /lib/lsb/init-functions
 
-# If JNA is installed, add it to EXTRA_CLASSPATH
-#
-EXTRA_CLASSPATH=/usr/share/java/jna.jar:$EXTRA_CLASSPATH
-
 #
 # Function that returns 0 if process is running, or nonzero if not.
 #
@@ -95,7 +67,7 @@
 [ -e `dirname $PIDFILE` ] || \
 install -d -ocassandra -gcassandra -m750 `dirname $PIDFILE`
 
-export EXTRA_CLASSPATH
+
 
 start-stop-daemon -S -c cassandra -a /usr/sbin/cassandra -q -p $PIDFILE 
-t >/dev/null || return 1


 JAVA_HOME on cassandra-env.sh is ignored on Debian packages
 ---

 Key: CASSANDRA-6131
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6131
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
 Environment: I've just got upgraded to 2.0.1 package from the apache 
 repositories using apt. I had the JAVA_HOME environment variable set in 
 /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by 
 setting it on /usr/sbin/cassandra script. I can't configure java 7 system 
 wide, only for cassandra.
 Off-topic: Thanks for getting rid of the jsvc mess.
Reporter: Sebastián Lacuesta
Assignee: Eric Evans
  Labels: debian
 Fix For: 2.0.2

 Attachments: 6131.patch






--
This message was sent by Atlassian JIRA
(v6.1#6144)


[1/2] git commit: Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128

2013-10-07 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.2 9d31ac14d - bdb7bb16f
  refs/heads/trunk b966e1ad2 - 538039a70


Add more data type mappings for pig.
Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bdb7bb16
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bdb7bb16
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bdb7bb16

Branch: refs/heads/cassandra-1.2
Commit: bdb7bb16facda0fbe266390bd3213f092d02c0dc
Parents: 9d31ac1
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Oct 7 13:57:45 2013 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Oct 7 14:03:08 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 30 +++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 ++---
 3 files changed, 26 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bdb7bb16/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --git 
a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index ce92014..6ad4f9e 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@ -110,7 +110,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 List<CompositeComponent> result = comparator.deconstruct(name);
 Tuple t = TupleFactory.getInstance().newTuple(result.size());
 for (int i=0; i<result.size(); i++)
-setTupleValue(t, i, 
result.get(i).comparator.compose(result.get(i).value));
+setTupleValue(t, i, cassandraToObj(result.get(i).comparator, 
result.get(i).value));
 
 return t;
 }
@@ -124,7 +124,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 if(comparator instanceof AbstractCompositeType)
 setTupleValue(pair, 0, 
composeComposite((AbstractCompositeType)comparator,col.name()));
 else
-setTupleValue(pair, 0, comparator.compose(col.name()));
+setTupleValue(pair, 0, cassandraToObj(comparator, col.name()));
 
 // value
 if (col instanceof Column)
@@ -134,10 +134,10 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 if (validators.get(col.name()) == null)
 {
 Map<MarshallerType, AbstractType> marshallers = 
getDefaultMarshallers(cfDef);
-setTupleValue(pair, 1, 
marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value()));
+setTupleValue(pair, 1, 
cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
 }
 else
-setTupleValue(pair, 1, 
validators.get(col.name()).compose(col.value()));
+setTupleValue(pair, 1, 
cassandraToObj(validators.get(col.name()), col.value()));
 return pair;
 }
 else
@@ -327,9 +327,12 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 return DataType.LONG;
 else if (type instanceof IntegerType || type instanceof Int32Type) // 
IntegerType will overflow at 2**31, but is kept for compatibility until pig has 
a BigInteger
 return DataType.INTEGER;
-else if (type instanceof AsciiType)
-return DataType.CHARARRAY;
-else if (type instanceof UTF8Type)
+else if (type instanceof AsciiType || 
+type instanceof UTF8Type ||
+type instanceof DecimalType ||
+type instanceof InetAddressType ||
+type instanceof LexicalUUIDType ||
+type instanceof UUIDType )
 return DataType.CHARARRAY;
 else if (type instanceof FloatType)
 return DataType.FLOAT;
@@ -772,5 +775,18 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 }
 return null;
 }
+
+protected Object cassandraToObj(AbstractType validator, ByteBuffer value)
+{
+if (validator instanceof DecimalType ||
+validator instanceof InetAddressType ||
+validator instanceof LexicalUUIDType ||
+validator instanceof UUIDType)
+{
+return validator.getString(value);
+}
+else
+return validator.compose(value);
+}
 }
 


[2/2] git commit: Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128

2013-10-07 Thread brandonwilliams
Add more data type mappings for pig.
Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/538039a7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/538039a7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/538039a7

Branch: refs/heads/trunk
Commit: 538039a7001a4db0ff87dafbfe0be2877310b14f
Parents: b966e1a
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Oct 7 13:57:45 2013 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Oct 7 14:06:29 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 31 +++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 ++---
 3 files changed, 26 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/538039a7/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --git 
a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index 1e207b3..0766adf 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@ -110,7 +110,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 List<CompositeComponent> result = comparator.deconstruct(name);
 Tuple t = TupleFactory.getInstance().newTuple(result.size());
 for (int i=0; i<result.size(); i++)
-setTupleValue(t, i, 
result.get(i).comparator.compose(result.get(i).value));
+setTupleValue(t, i, cassandraToObj(result.get(i).comparator, 
result.get(i).value));
 
 return t;
 }
@@ -124,17 +124,16 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 if(comparator instanceof AbstractCompositeType)
 setTupleValue(pair, 0, 
composeComposite((AbstractCompositeType)comparator,col.name()));
 else
-setTupleValue(pair, 0, comparator.compose(col.name()));
+setTupleValue(pair, 0, cassandraToObj(comparator, col.name()));
 
 // value
-Map<ByteBuffer,AbstractType> validators = getValidatorMap(cfDef);
 if (validators.get(col.name()) == null)
 {
 Map<MarshallerType, AbstractType> marshallers = 
getDefaultMarshallers(cfDef);
-setTupleValue(pair, 1, 
marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value()));
+setTupleValue(pair, 1, 
cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
 }
 else
-setTupleValue(pair, 1, 
validators.get(col.name()).compose(col.value()));
+setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), 
col.value()));
 return pair;
 }
 
@@ -313,9 +312,12 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 return DataType.LONG;
 else if (type instanceof IntegerType || type instanceof Int32Type) // 
IntegerType will overflow at 2**31, but is kept for compatibility until pig has 
a BigInteger
 return DataType.INTEGER;
-else if (type instanceof AsciiType)
-return DataType.CHARARRAY;
-else if (type instanceof UTF8Type)
+else if (type instanceof AsciiType || 
+type instanceof UTF8Type ||
+type instanceof DecimalType ||
+type instanceof InetAddressType ||
+type instanceof LexicalUUIDType ||
+type instanceof UUIDType )
 return DataType.CHARARRAY;
 else if (type instanceof FloatType)
 return DataType.FLOAT;
@@ -758,5 +760,18 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 }
 return null;
 }
+
+protected Object cassandraToObj(AbstractType validator, ByteBuffer value)
+{
+if (validator instanceof DecimalType ||
+validator instanceof InetAddressType ||
+validator instanceof LexicalUUIDType ||
+validator instanceof UUIDType)
+{
+return validator.getString(value);
+}
+else
+return validator.compose(value);
+}
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/538039a7/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 

[jira] [Commented] (CASSANDRA-6128) Add more data mappings for Pig

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788450#comment-13788450
 ] 

Brandon Williams commented on CASSANDRA-6128:
-

Well, crap: PIG-2764

I guess we'll have to use a string for now, otherwise we box people into the 
corner of precision loss with no way out.  At least with strings they can do 
something in a UDF, so +1 and committed.
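As a hypothetical illustration of the "do something in a UDF" point (this helper is not part of the patch, and the class name is invented): a CHARARRAY produced by getString() for a DecimalType column can be parsed back exactly on the user side.

```java
import java.math.BigDecimal;

// Hypothetical user-side helper, e.g. the core of a Pig UDF: recover an exact
// decimal from the CHARARRAY the new mapping emits for DecimalType columns.
public class DecimalRoundTrip
{
    public static BigDecimal parse(String chararray)
    {
        // BigDecimal parses the string exactly; going through a double
        // instead would lose digits beyond ~15-17 significant figures.
        return new BigDecimal(chararray);
    }

    public static void main(String[] args)
    {
        String fromCassandra = "12345678901234567890.123456789";
        System.out.println(parse(fromCassandra).toPlainString());
    }
}
```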

 Add more data mappings for Pig
 --

 Key: CASSANDRA-6128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128
 Project: Cassandra
  Issue Type: Bug
Reporter: Alex Liu
Assignee: Alex Liu
 Attachments: 6128-1.2-branch.txt


 We need to add more data mappings for
 {code}
  DecimalType
  InetAddressType
  LexicalUUIDType
  TimeUUIDType
  UUIDType
 {code}
 The existing implementation throws an exception for those data types.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages

2013-10-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788446#comment-13788446
 ] 

Sebastián Lacuesta edited comment on CASSANDRA-6131 at 10/7/13 7:07 PM:


Tried the patch against the 2.0.1 source tarball from the debian src file 
extracted from .dsc:
{code}
patching file debian/init
Hunk #4 FAILED at 53.
Hunk #5 FAILED at 95.
2 out of 5 hunks FAILED -- saving rejects to file debian/init.rej
{code}
content of debian/init.rej:
{code:title=debian/init.rej|borderStyle=solid}
--- debian/init
+++ debian/init
@@ -53,10 +29,6 @@
 # Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
 . /lib/lsb/init-functions
 
-# If JNA is installed, add it to EXTRA_CLASSPATH
-#
-EXTRA_CLASSPATH=/usr/share/java/jna.jar:$EXTRA_CLASSPATH
-
 #
 # Function that returns 0 if process is running, or nonzero if not.
 #
@@ -95,7 +67,7 @@
 [ -e `dirname $PIDFILE` ] || \
 install -d -ocassandra -gcassandra -m750 `dirname $PIDFILE`
 
-export EXTRA_CLASSPATH
+
 
 start-stop-daemon -S -c cassandra -a /usr/sbin/cassandra -q -p $PIDFILE 
-t >/dev/null || return 1
{code}


was (Author: sebastianlacuesta):
Tried the patch against the 2.0.1 source tarball from the debian src file 
extracted from .dsc:
patching file debian/init
Hunk #4 FAILED at 53.
Hunk #5 FAILED at 95.
2 out of 5 hunks FAILED -- saving rejects to file debian/init.rej

content of debian/init.rej:
--- debian/init
+++ debian/init
@@ -53,10 +29,6 @@
 # Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
 . /lib/lsb/init-functions
 
-# If JNA is installed, add it to EXTRA_CLASSPATH
-#
-EXTRA_CLASSPATH=/usr/share/java/jna.jar:$EXTRA_CLASSPATH
-
 #
 # Function that returns 0 if process is running, or nonzero if not.
 #
@@ -95,7 +67,7 @@
 [ -e `dirname $PIDFILE` ] || \
 install -d -ocassandra -gcassandra -m750 `dirname $PIDFILE`
 
-export EXTRA_CLASSPATH
+
 
 start-stop-daemon -S -c cassandra -a /usr/sbin/cassandra -q -p $PIDFILE 
-t >/dev/null || return 1


 JAVA_HOME on cassandra-env.sh is ignored on Debian packages
 ---

 Key: CASSANDRA-6131
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6131
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
 Environment: I've just got upgraded to 2.0.1 package from the apache 
 repositories using apt. I had the JAVA_HOME environment variable set in 
 /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by 
 setting it on /usr/sbin/cassandra script. I can't configure java 7 system 
 wide, only for cassandra.
 Off-topic: Thanks for getting rid of the jsvc mess.
Reporter: Sebastián Lacuesta
Assignee: Eric Evans
  Labels: debian
 Fix For: 2.0.2

 Attachments: 6131.patch






--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6154) Inserts are blocked in 2.1

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788453#comment-13788453
 ] 

Brandon Williams commented on CASSANDRA-6154:
-

Bisect points at CASSANDRA-6132

 Inserts are blocked in 2.1
 --

 Key: CASSANDRA-6154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Priority: Critical

 With cluster sizes > 1 inserts are blocked indefinitely:
 {code}
 $ ccm create -v git:trunk test
 Fetching Cassandra updates...
 Current cluster is now: test
 $ ccm populate -n 2
 $ ccm start
 $ ccm node1 cqlsh
 Connected to test at 127.0.0.1:9160.
 [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 
 19.37.0]
 Use HELP for help.
 cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': 1};
 cqlsh> USE timeline;
 cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value 
 text, PRIMARY KEY (userid, event));
 cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 
 'ryan', '2013-10-07', 'attempt');
 {code}
 The last INSERT statement never returns.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6128) Add more data mappings for Pig

2013-10-07 Thread Alex Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Liu updated CASSANDRA-6128:


Description: 
We need to add more data mappings for
{code}
 DecimalType
 InetAddressType
{code}

The existing implementation throws an exception for those data types.

  was:
We need to add more data mappings for
{code}
 DecimalType
 InetAddressType
 LexicalUUIDType
 TimeUUIDType
 UUIDType
{code}

The existing implementation throws an exception for those data types.


 Add more data mappings for Pig
 --

 Key: CASSANDRA-6128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128
 Project: Cassandra
  Issue Type: Bug
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 1.2.11, 2.0.2

 Attachments: 6128-1.2-branch.txt


 We need to add more data mappings for
 {code}
  DecimalType
  InetAddressType
 {code}
 The existing implementation throws an exception for those data types.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-6154) Inserts are blocked in 2.1

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788453#comment-13788453
 ] 

Brandon Williams edited comment on CASSANDRA-6154 at 10/7/13 7:12 PM:
--

Bisect points at CASSANDRA-6132, specifically the ninja commit in 
5440a0a6767544d6ea1ba34f5d2a3e223f260fb5


was (Author: brandon.williams):
Bisect points at CASSANDRA-6132

 Inserts are blocked in 2.1
 --

 Key: CASSANDRA-6154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Priority: Critical

 With cluster sizes > 1 inserts are blocked indefinitely:
 {code}
 $ ccm create -v git:trunk test
 Fetching Cassandra updates...
 Current cluster is now: test
 $ ccm populate -n 2
 $ ccm start
 $ ccm node1 cqlsh
 Connected to test at 127.0.0.1:9160.
 [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 
 19.37.0]
 Use HELP for help.
 cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': 1};
 cqlsh> USE timeline;
 cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value 
 text, PRIMARY KEY (userid, event));
 cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 
 'ryan', '2013-10-07', 'attempt');
 {code}
 The last INSERT statement never returns.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6128) Add more data mappings for Pig

2013-10-07 Thread Alex Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Liu updated CASSANDRA-6128:


Description: 
We need to add more data mappings for
{code}
 DecimalType
 InetAddressType
 LexicalUUIDType
 TimeUUIDType
 UUIDType
{code}

The existing implementation throws an exception for those data types.

  was:
We need to add more data mappings for
{code}
 DecimalType
 InetAddressType
{code}

The existing implementation throws an exception for those data types.


 Add more data mappings for Pig
 --

 Key: CASSANDRA-6128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6128
 Project: Cassandra
  Issue Type: Bug
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 1.2.11, 2.0.2

 Attachments: 6128-1.2-branch.txt


 We need to add more data mappings for:
 {code}
  DecimalType
  InetAddressType
  LexicalUUIDType
  TimeUUIDType
  UUIDType
 {code}
 The existing implementation throws an exception for those data types.
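
 Since Pig has no native type for these, the requested mapping amounts to 
 surfacing each value as its string form (a Pig chararray) instead of composing 
 it into a Java object Pig cannot handle. A minimal sketch under that 
 assumption (class and method names are hypothetical, not the committed API):

```java
import java.net.InetAddress;
import java.util.UUID;

// Hypothetical illustration: values of types with no native Pig mapping
// (UUIDs, inet addresses, decimals) are rendered as their string form,
// which Pig can carry as a chararray.
public class PigMappingSketch
{
    // Render a composed value the way a chararray mapping would.
    static String toChararray(Object composedValue)
    {
        return String.valueOf(composedValue);
    }

    public static void main(String[] args) throws Exception
    {
        UUID id = UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8");
        // getByName with a literal IPv4 address does no DNS lookup.
        InetAddress addr = InetAddress.getByName("127.0.0.1");

        System.out.println(toChararray(id));
        System.out.println(toChararray(addr.getHostAddress()));
    }
}
```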



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name

2013-10-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788456#comment-13788456
 ] 

Jonathan Ellis commented on CASSANDRA-5202:
---

Is there anything that we want to do as part of this ticket instead of 6060?

 CFs should have globally and temporally unique CF IDs to prevent reusing 
 data from earlier incarnation of same CF name
 

 Key: CASSANDRA-5202
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5202
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.9
 Environment: OS: Windows 7, 
 Server: Cassandra 1.1.9 release drop
 Client: astyanax 1.56.21, 
 JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27)
Reporter: Marat Bedretdinov
Assignee: Yuki Morishita
  Labels: test
 Fix For: 2.1

 Attachments: 5202-1.1.txt, 5202-2.0.0.txt, astyanax-stress-driver.zip


 Attached is a driver that sequentially:
 1. Drops keyspace
 2. Creates keyspace
 3. Creates 2 column families
 4. Seeds 1M rows with 100 columns
 5. Queries these 2 column families
 The above steps are repeated 1000 times.
 The following exception is observed at random (race - SEDA?):
 ERROR [ReadStage:55] 2013-01-29 19:24:52,676 AbstractCassandraDaemon.java 
 (line 135) Exception in thread Thread[ReadStage:55,5,main]
 java.lang.AssertionError: DecoratedKey(-1, ) != 
 DecoratedKey(62819832764241410631599989027761269388, 313a31) in 
 C:\var\lib\cassandra\data\user_role_reverse_index\business_entity_role\user_role_reverse_index-business_entity_role-hf-1-Data.db
   at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:60)
   at 
 org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
   at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
   at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
   at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1367)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1229)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1164)
   at org.apache.cassandra.db.Table.getRow(Table.java:378)
   at 
 org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
   at 
 org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:822)
   at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1271)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 This exception appears in the server at the time of client submitting a query 
 request (row slice) and not at the time data is seeded. The client times out 
 and this data can no longer be queried as the same exception would always 
 occur from there on.
 Also on iteration 201, it appears that dropping column families failed and as 
 a result their recreation failed with unique column family name violation 
 (see exception below). Note that the data files are actually gone, so it 
 appears that the server runtime responsible for creating column family was 
 out of sync with the piece that dropped them:
 Starting dropping column families
 Dropped column families
 Starting dropping keyspace
 Dropped keyspace
 Starting creating column families
 Created column families
 Starting seeding data
 Total rows inserted: 100 in 5105 ms
 Iteration: 200; Total running time for 1000 queries is 232; Average running 
 time of 1000 queries is 0 ms
 Starting dropping column families
 Dropped column families
 Starting dropping keyspace
 Dropped keyspace
 Starting creating column families
 Created column families
 Starting seeding data
 Total rows inserted: 100 in 5361 ms
 Iteration: 201; Total running time for 1000 queries is 222; Average running 
 time of 1000 queries is 0 ms
 Starting dropping column families
 Starting creating column families
 Exception in thread "main" 
 com.netflix.astyanax.connectionpool.exceptions.BadRequestException: 
 BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=2468(2469), 
 attempts=1]InvalidRequestException(why:Keyspace names must be 
 case-insensitively unique (user_role_reverse_index conflicts with 
 user_role_reverse_index))
   at 
 

[jira] [Comment Edited] (CASSANDRA-5932) Speculative read performance data show unexpected results

2013-10-07 Thread Li Zou (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788437#comment-13788437
 ] 

Li Zou edited comment on CASSANDRA-5932 at 10/7/13 7:17 PM:


[~jbellis], this morning's trunk load has a slightly different symptom, and is 
even more serious than last Friday's load, as this time just commenting out the 
assert statement in the {{MessagingService.addCallback()}} will not help.

I've copied the {{/var/log/cassandra/system.log}} exception errors below.

{noformat}
ERROR [Thrift:12] 2013-10-07 14:42:39,396 Caller+0   at 
org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134)
 - Exception in thread Thread[Thrift:12,5,main]
java.lang.AssertionError: null
at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:543)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:591) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:869)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:123) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:739) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:511) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:581)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:379)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:363)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:126)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:267)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.thrift.CassandraServer.execute_prepared_cql3_query(CassandraServer.java:2061)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_prepared_cql3_query.getResult(Cassandra.java:4502)
 ~[apache-cassandra-thrift-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_prepared_cql3_query.getResult(Cassandra.java:4486)
 ~[apache-cassandra-thrift-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[libthrift-0.9.1.jar:0.9.1]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
~[libthrift-0.9.1.jar:0.9.1]
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:194)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_25]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
~[na:1.7.0_25]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]

{noformat}




was (Author: lizou):
This morning's trunk load has a slightly different symptom, and is even more 
serious than last Friday's load, as this time just commenting out the assert 
statement in the {{MessagingService.addCallback()}} will not help.

I've copied the {{/var/log/cassandra/system.log}} exception errors below.

{noformat}
ERROR [Thrift:12] 2013-10-07 14:42:39,396 Caller+0   at 
org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134)
 - Exception in thread Thread[Thrift:12,5,main]
java.lang.AssertionError: null
at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:543)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:591) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) 
~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:869)
 ~[apache-cassandra-2.1-SNAPSHOT.jar:2.1-SNAPSHOT]
at 

[jira] [Commented] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name

2013-10-07 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788471#comment-13788471
 ] 

Yuki Morishita commented on CASSANDRA-5202:
---

Add the CF ID to the directory name if we still want to distinguish one KS/CF 
directory from another.
Updating the key cache key to use the CF ID is another option, but I think that 
will be done through 6060.
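
For illustration, the directory-naming idea can be sketched in a few lines 
(the helper and the exact format are hypothetical, not the committed scheme):

```java
import java.util.UUID;

// Hypothetical sketch of the naming scheme discussed above: append the CF's
// unique ID to its data directory name, so a re-created CF with the same
// name cannot pick up SSTables from an earlier incarnation.
public class CfDirectoryNameSketch
{
    static String directoryName(String cfName, UUID cfId)
    {
        // e.g. "users-6ba7b8109dad11d180b400c04fd430c8"
        return cfName + "-" + cfId.toString().replace("-", "");
    }

    public static void main(String[] args)
    {
        UUID cfId = UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8");
        System.out.println(directoryName("users", cfId));
    }
}
```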

 CFs should have globally and temporally unique CF IDs to prevent reusing 
 data from earlier incarnation of same CF name
 

 Key: CASSANDRA-5202
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5202
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.9
 Environment: OS: Windows 7, 
 Server: Cassandra 1.1.9 release drop
 Client: astyanax 1.56.21, 
 JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27)
Reporter: Marat Bedretdinov
Assignee: Yuki Morishita
  Labels: test
 Fix For: 2.1

 Attachments: 5202-1.1.txt, 5202-2.0.0.txt, astyanax-stress-driver.zip



git commit: Fix FileCacheService regressions patch by jbellis; reviewed by pyaskevich and tested by Kai Wang for CASSANDRA-6149

2013-10-07 Thread jbellis
Updated Branches:
  refs/heads/cassandra-2.0 c374aca19 -> 01a57eea8


Fix FileCacheService regressions
patch by jbellis; reviewed by pyaskevich and tested by Kai Wang for 
CASSANDRA-6149


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/01a57eea
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/01a57eea
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/01a57eea

Branch: refs/heads/cassandra-2.0
Commit: 01a57eea841e51fb4a97329ab9fa0f59d0b826f6
Parents: c374aca
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Oct 7 14:20:42 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Oct 7 14:20:42 2013 -0500

--
 CHANGES.txt |  1 +
 .../compress/CompressedRandomAccessReader.java  |  5 ++
 .../cassandra/io/util/RandomAccessReader.java   |  2 +-
 .../apache/cassandra/io/util/SegmentedFile.java |  3 +-
 .../cassandra/service/FileCacheService.java | 87 ++--
 5 files changed, 54 insertions(+), 44 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94fa927..ddd976e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.2
+ * Fix FileCacheService regressions (CASSANDRA-6149)
  * Never return WriteTimeout for CL.ANY (CASSANDRA-6032)
  * Fix race conditions in bulk loader (CASSANDRA-6129)
  * Add configurable metrics reporting (CASSANDRA-4430)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
--
diff --git 
a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java 
b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
index b6cffa2..131a4d6 100644
--- 
a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
+++ 
b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
@@ -154,6 +154,11 @@ public class CompressedRandomAccessReader extends 
RandomAccessReader
 return checksumBytes.getInt(0);
 }
 
+public int getTotalBufferSize()
+{
+return super.getTotalBufferSize() + compressed.capacity();
+}
+
 @Override
 public long length()
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
--
diff --git a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java 
b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
index 4ceb3c4..9a03480 100644
--- a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
@@ -152,7 +152,7 @@ public class RandomAccessReader extends RandomAccessFile 
implements FileDataInpu
 return filePath;
 }
 
-public int getBufferSize()
+public int getTotalBufferSize()
 {
 return buffer.length;
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/util/SegmentedFile.java
--
diff --git a/src/java/org/apache/cassandra/io/util/SegmentedFile.java 
b/src/java/org/apache/cassandra/io/util/SegmentedFile.java
index 6231fd7..d4da177 100644
--- a/src/java/org/apache/cassandra/io/util/SegmentedFile.java
+++ b/src/java/org/apache/cassandra/io/util/SegmentedFile.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.io.util;
 
 import java.io.DataInput;
 import java.io.DataOutput;
+import java.io.File;
 import java.io.IOException;
 import java.nio.MappedByteBuffer;
 import java.util.Iterator;
@@ -57,7 +58,7 @@ public abstract class SegmentedFile
 
 protected SegmentedFile(String path, long length, long onDiskLength)
 {
-this.path = path;
+this.path = new File(path).getAbsolutePath();
 this.length = length;
 this.onDiskLength = onDiskLength;
 }
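
A note on the SegmentedFile change above: my reading of the patch is that the 
file cache keys readers by path string, so normalizing to an absolute path 
keeps a relative and an absolute spelling of the same file from producing two 
distinct cache entries. A sketch of that keying behavior (names hypothetical):

```java
import java.io.File;

// Hypothetical sketch: if a cache is keyed by path string, normalizing
// through getAbsolutePath() makes "data/ks/cf-1-Data.db" and its absolute
// form resolve to the same key instead of two separate entries.
public class PathKeySketch
{
    static String cacheKey(String path)
    {
        return new File(path).getAbsolutePath();
    }

    public static void main(String[] args)
    {
        String relative = "data/ks/cf-1-Data.db";
        String absolute = new File(relative).getAbsolutePath();
        // Both spellings now key to the same cache entry.
        System.out.println(cacheKey(relative).equals(cacheKey(absolute)));
    }
}
```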

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/service/FileCacheService.java
--
diff --git a/src/java/org/apache/cassandra/service/FileCacheService.java 
b/src/java/org/apache/cassandra/service/FileCacheService.java
index e6bc3e5..c939a6f 100644
--- a/src/java/org/apache/cassandra/service/FileCacheService.java
+++ b/src/java/org/apache/cassandra/service/FileCacheService.java
@@ -22,11 +22,9 @@ import java.util.concurrent.Callable;
 import java.util.concurrent.ConcurrentLinkedQueue;
 import 

[3/5] git commit: Fix FileCacheService regressions patch by jbellis; reviewed by pyaskevich and tested by Kai Wang for CASSANDRA-6149

2013-10-07 Thread jbellis
Fix FileCacheService regressions
patch by jbellis; reviewed by pyaskevich and tested by Kai Wang for 
CASSANDRA-6149


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/01a57eea
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/01a57eea
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/01a57eea

Branch: refs/heads/trunk
Commit: 01a57eea841e51fb4a97329ab9fa0f59d0b826f6
Parents: c374aca
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Oct 7 14:20:42 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Oct 7 14:20:42 2013 -0500

--
 CHANGES.txt |  1 +
 .../compress/CompressedRandomAccessReader.java  |  5 ++
 .../cassandra/io/util/RandomAccessReader.java   |  2 +-
 .../apache/cassandra/io/util/SegmentedFile.java |  3 +-
 .../cassandra/service/FileCacheService.java | 87 ++--
 5 files changed, 54 insertions(+), 44 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94fa927..ddd976e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.2
+ * Fix FileCacheService regressions (CASSANDRA-6149)
  * Never return WriteTimeout for CL.ANY (CASSANDRA-6032)
  * Fix race conditions in bulk loader (CASSANDRA-6129)
  * Add configurable metrics reporting (CASSANDRA-4430)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
--
diff --git 
a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java 
b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
index b6cffa2..131a4d6 100644
--- 
a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
+++ 
b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
@@ -154,6 +154,11 @@ public class CompressedRandomAccessReader extends 
RandomAccessReader
 return checksumBytes.getInt(0);
 }
 
+public int getTotalBufferSize()
+{
+return super.getTotalBufferSize() + compressed.capacity();
+}
+
 @Override
 public long length()
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
--
diff --git a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java 
b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
index 4ceb3c4..9a03480 100644
--- a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
@@ -152,7 +152,7 @@ public class RandomAccessReader extends RandomAccessFile 
implements FileDataInpu
 return filePath;
 }
 
-public int getBufferSize()
+public int getTotalBufferSize()
 {
 return buffer.length;
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/io/util/SegmentedFile.java
--
diff --git a/src/java/org/apache/cassandra/io/util/SegmentedFile.java 
b/src/java/org/apache/cassandra/io/util/SegmentedFile.java
index 6231fd7..d4da177 100644
--- a/src/java/org/apache/cassandra/io/util/SegmentedFile.java
+++ b/src/java/org/apache/cassandra/io/util/SegmentedFile.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.io.util;
 
 import java.io.DataInput;
 import java.io.DataOutput;
+import java.io.File;
 import java.io.IOException;
 import java.nio.MappedByteBuffer;
 import java.util.Iterator;
@@ -57,7 +58,7 @@ public abstract class SegmentedFile
 
 protected SegmentedFile(String path, long length, long onDiskLength)
 {
-this.path = path;
+this.path = new File(path).getAbsolutePath();
 this.length = length;
 this.onDiskLength = onDiskLength;
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/01a57eea/src/java/org/apache/cassandra/service/FileCacheService.java
--
diff --git a/src/java/org/apache/cassandra/service/FileCacheService.java 
b/src/java/org/apache/cassandra/service/FileCacheService.java
index e6bc3e5..c939a6f 100644
--- a/src/java/org/apache/cassandra/service/FileCacheService.java
+++ b/src/java/org/apache/cassandra/service/FileCacheService.java
@@ -22,11 +22,9 @@ import java.util.concurrent.Callable;
 import java.util.concurrent.ConcurrentLinkedQueue;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.TimeUnit;
+import 

[5/5] git commit: Merge remote-tracking branch 'origin/trunk' into trunk

2013-10-07 Thread jbellis
Merge remote-tracking branch 'origin/trunk' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/558a9e57
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/558a9e57
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/558a9e57

Branch: refs/heads/trunk
Commit: 558a9e57bb2f443b69ef1acb31fd90aa8b373e5d
Parents: 6990f95 538039a
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Oct 7 14:23:59 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Oct 7 14:23:59 2013 -0500

--
 .../org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java   | 1 -
 1 file changed, 1 deletion(-)
--




[1/5] git commit: Add more data type mappings for pig. Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128

2013-10-07 Thread jbellis
Updated Branches:
  refs/heads/trunk 538039a70 -> 558a9e57b


Add more data type mappings for pig.
Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6128


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3633aea4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3633aea4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3633aea4

Branch: refs/heads/trunk
Commit: 3633aea42d7689fa0252c104f62b0646d0858624
Parents: d396fd4
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Oct 7 13:57:45 2013 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Oct 7 13:57:45 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 30 +++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 ++---
 3 files changed, 26 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3633aea4/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --git 
a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index ce92014..6ad4f9e 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@ -110,7 +110,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 List<CompositeComponent> result = comparator.deconstruct(name);
 Tuple t = TupleFactory.getInstance().newTuple(result.size());
 for (int i=0; i<result.size(); i++)
-setTupleValue(t, i, 
result.get(i).comparator.compose(result.get(i).value));
+setTupleValue(t, i, cassandraToObj(result.get(i).comparator, 
result.get(i).value));
 
 return t;
 }
@@ -124,7 +124,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 if(comparator instanceof AbstractCompositeType)
 setTupleValue(pair, 0, 
composeComposite((AbstractCompositeType)comparator,col.name()));
 else
-setTupleValue(pair, 0, comparator.compose(col.name()));
+setTupleValue(pair, 0, cassandraToObj(comparator, col.name()));
 
 // value
 if (col instanceof Column)
@@ -134,10 +134,10 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 if (validators.get(col.name()) == null)
 {
 Map<MarshallerType, AbstractType> marshallers = 
getDefaultMarshallers(cfDef);
-setTupleValue(pair, 1, 
marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value()));
+setTupleValue(pair, 1, 
cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
 }
 else
-setTupleValue(pair, 1, 
validators.get(col.name()).compose(col.value()));
+setTupleValue(pair, 1, 
cassandraToObj(validators.get(col.name()), col.value()));
 return pair;
 }
 else
@@ -327,9 +327,12 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 return DataType.LONG;
 else if (type instanceof IntegerType || type instanceof Int32Type) // 
IntegerType will overflow at 2**31, but is kept for compatibility until pig has 
a BigInteger
 return DataType.INTEGER;
-else if (type instanceof AsciiType)
-return DataType.CHARARRAY;
-else if (type instanceof UTF8Type)
+else if (type instanceof AsciiType || 
+type instanceof UTF8Type ||
+type instanceof DecimalType ||
+type instanceof InetAddressType ||
+type instanceof LexicalUUIDType ||
+type instanceof UUIDType )
 return DataType.CHARARRAY;
 else if (type instanceof FloatType)
 return DataType.FLOAT;
@@ -772,5 +775,18 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 }
 return null;
 }
+
+protected Object cassandraToObj(AbstractType validator, ByteBuffer value)
+{
+if (validator instanceof DecimalType ||
+validator instanceof InetAddressType ||
+validator instanceof LexicalUUIDType ||
+validator instanceof UUIDType)
+{
+return validator.getString(value);
+}
+else
+return validator.compose(value);
+}
 }
 


[4/5] git commit: Merge branch 'cassandra-2.0' into trunk

2013-10-07 Thread jbellis
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6990f95b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6990f95b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6990f95b

Branch: refs/heads/trunk
Commit: 6990f95b1dff4e4132ae43c037d0f117487ffb6e
Parents: b966e1a 01a57ee
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Oct 7 14:20:48 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Oct 7 14:20:48 2013 -0500

--
 CHANGES.txt |  1 +
 .../hadoop/pig/AbstractCassandraStorage.java| 30 +--
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 +-
 .../compress/CompressedRandomAccessReader.java  |  5 ++
 .../cassandra/io/util/RandomAccessReader.java   |  2 +-
 .../apache/cassandra/io/util/SegmentedFile.java |  3 +-
 .../cassandra/service/FileCacheService.java | 87 ++--
 8 files changed, 80 insertions(+), 57 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/6990f95b/CHANGES.txt
--
diff --cc CHANGES.txt
index 4ec387c,ddd976e..2e6e06a
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,13 -1,5 +1,14 @@@
 +2.1
 + * Parallelize fetching rows for low-cardinality indexes (CASSANDRA-1337)
 + * change logging from log4j to logback (CASSANDRA-5883)
 + * switch to LZ4 compression for internode communication (CASSANDRA-5887)
 + * Stop using Thrift-generated Index* classes internally (CASSANDRA-5971)
 + * Remove 1.2 network compatibility code (CASSANDRA-5960)
 + * Remove leveled json manifest migration code (CASSANDRA-5996)
 +
 +
  2.0.2
+  * Fix FileCacheService regressions (CASSANDRA-6149)
   * Never return WriteTimeout for CL.ANY (CASSANDRA-6032)
   * Fix race conditions in bulk loader (CASSANDRA-6129)
   * Add configurable metrics reporting (CASSANDRA-4430)



[2/5] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-10-07 Thread jbellis
Merge branch 'cassandra-1.2' into cassandra-2.0

Conflicts:
src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c374aca1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c374aca1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c374aca1

Branch: refs/heads/trunk
Commit: c374aca19ea39fbbc588a2309c669c422e0318cd
Parents: 8e8db1f 3633aea
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Oct 7 14:02:39 2013 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Oct 7 14:02:39 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 30 +++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |  2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |  7 ++---
 3 files changed, 26 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --cc src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index 1e207b3,6ad4f9e..c881734
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@@ -124,17 -124,31 +124,17 @@@ public abstract class AbstractCassandra
  if(comparator instanceof AbstractCompositeType)
  setTupleValue(pair, 0, 
composeComposite((AbstractCompositeType)comparator,col.name()));
  else
- setTupleValue(pair, 0, comparator.compose(col.name()));
+ setTupleValue(pair, 0, cassandraToObj(comparator, col.name()));
  
  // value
 -if (col instanceof Column)
 +Map<ByteBuffer,AbstractType> validators = getValidatorMap(cfDef);
 +if (validators.get(col.name()) == null)
  {
 -// standard
 -Map<ByteBuffer,AbstractType> validators = getValidatorMap(cfDef);
 -if (validators.get(col.name()) == null)
 -{
 -Map<MarshallerType, AbstractType> marshallers = 
getDefaultMarshallers(cfDef);
 -setTupleValue(pair, 1, 
cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
 -}
 -else
 -setTupleValue(pair, 1, 
cassandraToObj(validators.get(col.name()), col.value()));
 -return pair;
 +Map<MarshallerType, AbstractType> marshallers = 
getDefaultMarshallers(cfDef);
- setTupleValue(pair, 1, 
marshallers.get(MarshallerType.DEFAULT_VALIDATOR).compose(col.value()));
++setTupleValue(pair, 1, 
cassandraToObj(marshallers.get(MarshallerType.DEFAULT_VALIDATOR), col.value()));
  }
  else
- setTupleValue(pair, 1, 
validators.get(col.name()).compose(col.value()));
 -{
 -// super
 -ArrayList<Tuple> subcols = new ArrayList<Tuple>();
 -for (IColumn subcol : col.getSubColumns())
 -subcols.add(columnToTuple(subcol, cfDef, 
parseType(cfDef.getSubcomparator_type())));
 -
 -pair.set(1, new DefaultDataBag(subcols));
 -}
++setTupleValue(pair, 1, cassandraToObj(validators.get(col.name()), 
col.value()));
  return pair;
  }
  

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c374aca1/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
--



[jira] [Commented] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name

2013-10-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788476#comment-13788476
 ] 

Jonathan Ellis commented on CASSANDRA-5202:
---

bq. Add CF ID to directory name if we still want to distinguish one KS/CF 
directory to another.

All right, I'm down to narrow the scope here to that.

 CFs should have globally and temporally unique CF IDs to prevent reusing 
 data from earlier incarnation of same CF name
 

 Key: CASSANDRA-5202
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5202
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.9
 Environment: OS: Windows 7, 
 Server: Cassandra 1.1.9 release drop
 Client: astyanax 1.56.21, 
 JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27)
Reporter: Marat Bedretdinov
Assignee: Yuki Morishita
  Labels: test
 Fix For: 2.1

 Attachments: 5202-1.1.txt, 5202-2.0.0.txt, astyanax-stress-driver.zip


 Attached is a driver that sequentially:
 1. Drops keyspace
 2. Creates keyspace
 4. Creates 2 column families
 5. Seeds 1M rows with 100 columns
 6. Queries these 2 column families
 The above steps are repeated 1000 times.
 The following exception is observed at random (race - SEDA?):
 ERROR [ReadStage:55] 2013-01-29 19:24:52,676 AbstractCassandraDaemon.java 
 (line 135) Exception in thread Thread[ReadStage:55,5,main]
 java.lang.AssertionError: DecoratedKey(-1, ) != 
 DecoratedKey(62819832764241410631599989027761269388, 313a31) in 
 C:\var\lib\cassandra\data\user_role_reverse_index\business_entity_role\user_role_reverse_index-business_entity_role-hf-1-Data.db
   at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:60)
   at 
 org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
   at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
   at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
   at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1367)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1229)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1164)
   at org.apache.cassandra.db.Table.getRow(Table.java:378)
   at 
 org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
   at 
 org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:822)
   at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1271)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 This exception appears in the server at the time of client submitting a query 
 request (row slice) and not at the time data is seeded. The client times out 
 and this data can no longer be queried as the same exception would always 
 occur from there on.
 Also on iteration 201, it appears that dropping column families failed and as 
 a result their recreation failed with unique column family name violation 
 (see exception below). Note that the data files are actually gone, so it 
 appears that the server runtime responsible for creating column family was 
 out of sync with the piece that dropped them:
 Starting dropping column families
 Dropped column families
 Starting dropping keyspace
 Dropped keyspace
 Starting creating column families
 Created column families
 Starting seeding data
 Total rows inserted: 100 in 5105 ms
 Iteration: 200; Total running time for 1000 queries is 232; Average running 
 time of 1000 queries is 0 ms
 Starting dropping column families
 Dropped column families
 Starting dropping keyspace
 Dropped keyspace
 Starting creating column families
 Created column families
 Starting seeding data
 Total rows inserted: 100 in 5361 ms
 Iteration: 201; Total running time for 1000 queries is 222; Average running 
 time of 1000 queries is 0 ms
 Starting dropping column families
 Starting creating column families
 Exception in thread main 
 com.netflix.astyanax.connectionpool.exceptions.BadRequestException: 
 BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=2468(2469), 
 attempts=1]InvalidRequestException(why:Keyspace names must be 
 case-insensitively unique (user_role_reverse_index conflicts 

[jira] [Commented] (CASSANDRA-4785) Secondary Index Sporadically Doesn't Return Rows

2013-10-07 Thread Tom van den Berge (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788480#comment-13788480
 ] 

Tom van den Berge commented on CASSANDRA-4785:
--

I'm seeing this problem, too, (cassandra 1.2.3), but my CF has caching 
KEYS_ONLY. It only happens to specific rows in the CF; not all. Also, it only 
happens on one single node in my 2-node cluster (replication factor 2). 

Storing the indexed value again solves the problem for this particular row, but 
I've seen this problem happen several times now, even on the same rows -- also 
after having fixed it as I just described. I'm not 100% sure, but I think the 
problem occurred again after having rebuilt the node.

 Secondary Index Sporadically Doesn't Return Rows
 

 Key: CASSANDRA-4785
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4785
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.5, 1.1.6
 Environment: Ubuntu 10.04
 Java 6 Sun
 Cassandra 1.1.5 upgraded from 1.1.2 -> 1.1.3 -> 1.1.5
Reporter: Arya Goudarzi
Assignee: Sam Tunnicliffe
 Attachments: entity_aliases.txt, repro.py


 I have a ColumnFamily with caching = ALL. I have 2 secondary indexes on it. I 
 have noticed if I query using the secondary index in the where clause, 
 sometimes I get the results and sometimes I don't. Until 2 weeks ago, the 
 caching option on this CF was set to NONE. So, I suspect something happened 
 in secondary index caching scheme. 
 Here are things I tried:
 1. I rebuild indexes for that CF on all nodes;
 2. I set the caching to KEYS_ONLY and rebuild the index again;
 3. I set the caching to NONE and rebuild the index again;
 None of the above helped. I suppose the caching still exists as this behavior 
 looks like a cache mismatch.
 I did a bit research, and found CASSANDRA-4197 that could be related.
 Please advise.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results

2013-10-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788485#comment-13788485
 ] 

Jonathan Ellis commented on CASSANDRA-5932:
---

There's a ticket open for trunk over at CASSANDRA-6154.

 Speculative read performance data show unexpected results
 -

 Key: CASSANDRA-5932
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5932
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Aleksey Yeschenko
 Fix For: 2.0.2

 Attachments: 5932.6692c50412ef7d.compaction.png, 
 5932-6692c50412ef7d.png, 5932.6692c50412ef7d.rr0.png, 
 5932.6692c50412ef7d.rr1.png, 5932.ded39c7e1c2fa.logs.tar.gz, 5932.txt, 
 5933-128_and_200rc1.png, 5933-7a87fc11.png, 5933-logs.tar.gz, 
 5933-randomized-dsnitch-replica.2.png, 5933-randomized-dsnitch-replica.3.png, 
 5933-randomized-dsnitch-replica.png, compaction-makes-slow.png, 
 compaction-makes-slow-stats.png, eager-read-looks-promising.png, 
 eager-read-looks-promising-stats.png, eager-read-not-consistent.png, 
 eager-read-not-consistent-stats.png, node-down-increase-performance.png


 I've done a series of stress tests with eager retries enabled that show 
 undesirable behavior. I'm grouping these behaviours into one ticket as they 
 are most likely related.
 1) Killing off a node in a 4 node cluster actually increases performance.
 2) Compactions make nodes slow, even after the compaction is done.
 3) Eager Reads tend to lessen the *immediate* performance impact of a node 
 going down, but not consistently.
 My Environment:
 1 stress machine: node0
 4 C* nodes: node4, node5, node6, node7
 My script:
 node0 writes some data: stress -d node4 -F 3000 -n 3000 -i 5 -l 2 -K 
 20
 node0 reads some data: stress -d node4 -n 3000 -o read -i 5 -K 20
 h3. Examples:
 h5. A node going down increases performance:
 !node-down-increase-performance.png!
 [Data for this test 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
 At 450s, I kill -9 one of the nodes. There is a brief decrease in performance 
 as the snitch adapts, but then it recovers... to even higher performance than 
 before.
 h5. Compactions make nodes permanently slow:
 !compaction-makes-slow.png!
 !compaction-makes-slow-stats.png!
 The green and orange lines represent trials with eager retry enabled, they 
 never recover their op-rate from before the compaction as the red and blue 
 lines do.
 [Data for this test 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.compaction.2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
 h5. Speculative Read tends to lessen the *immediate* impact:
 !eager-read-looks-promising.png!
 !eager-read-looks-promising-stats.png!
 This graph looked the most promising to me, the two trials with eager retry, 
 the green and orange line, at 450s showed the smallest dip in performance. 
 [Data for this test 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
 h5. But not always:
 !eager-read-not-consistent.png!
 !eager-read-not-consistent-stats.png!
 This is a retrial with the same settings as above, yet the 95percentile eager 
 retry (red line) did poorly this time at 450s.
 [Data for this test 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.rc1.try2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]





[jira] [Updated] (CASSANDRA-6154) Inserts are blocked in 2.1

2013-10-07 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6154:
--

Attachment: 6154.txt

Looks like merge to trunk from 2.0 was syntactically correct but semantically 
broken.  Attached.

 Inserts are blocked in 2.1
 --

 Key: CASSANDRA-6154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Priority: Critical
 Attachments: 6154.txt


 With cluster sizes > 1 inserts are blocked indefinitely:
 {code}
 $ ccm create -v git:trunk test
 Fetching Cassandra updates...
 Current cluster is now: test
 $ ccm populate -n 2
 $ ccm start
 $ ccm node1 cqlsh
 Connected to test at 127.0.0.1:9160.
 [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 
 19.37.0]
 Use HELP for help.
 cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': 1};
 cqlsh> USE timeline;
 cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value 
 text, PRIMARY KEY (userid, event));
 cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 
 'ryan', '2013-10-07', 'attempt');
 {code}
 The last INSERT statement never returns.





[jira] [Updated] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Lyuben Todorov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lyuben Todorov updated CASSANDRA-4809:
--

Attachment: (was: 4809__v2.patch)

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.
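
The requested behaviour amounts to a filter applied while replaying mutations. A minimal sketch of such a predicate is below; the class name, the comma-separated "keyspace" / "keyspace.columnfamily" spec format, and all identifiers are illustrative assumptions, not the committed patch:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical replay filter: decides whether a mutation targeting a
// given keyspace/column family should be restored from the archive.
public class ReplayFilter {
    private final Set<String> allowed; // null means "restore everything"

    public ReplayFilter(String spec) {
        allowed = spec == null ? null : new HashSet<>(Arrays.asList(spec.split(",")));
    }

    public boolean shouldReplay(String keyspace, String cf) {
        return allowed == null
            || allowed.contains(keyspace)              // whole keyspace requested
            || allowed.contains(keyspace + "." + cf);  // specific column family
    }
}
```

A replay loop would consult `shouldReplay` per mutation and skip non-matching ones, leaving the default (no spec) behaviour identical to today's full restore.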





[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

2013-10-07 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788501#comment-13788501
 ] 

Benedict commented on CASSANDRA-4718:
-

Not necessarily. I still think that was most likely variance:

- I have BAQ at same speed as LBQ in application
- a 2x slow down of LBQ -> 0.01x slow down of application
- a 10x slow down of LBQ -> 0.05x slow down of application

=> the queue speed is currently only ~1% of application cost. It's possible the 
faster queue is causing greater contention at a sync point, but this wouldn't 
work in the opposite direction if the contention at the sync point is low. 
Either way, if this were true we'd see the artificially slow queues also 
improve stress performance.

Ryan also ran some of my tests and found no difference. I wouldn't absolutely 
rule out the possibility his test was valid, though, as I did not swap out the 
queues in OutboundTcpConnection for these tests as, at the time, I was 
concerned about the calls to size() which are expensive for my test queues, and 
I wanted the queue swap to be on equal terms across the board. I realise now 
these are only called via JMX, so shouldn't stop me swapping them in.

I've just tried a quick test of directly (in process) stressing through the 
MessagingService and found no measurable difference to putting BAQ in the 
OutboundTcpConnection, though if I swap out across the board it is about 25% 
slower, which itself is interesting as this is close to a full stress, minus 
thrift.

 More-efficient ExecutorService for improved throughput
 --

 Key: CASSANDRA-4718
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
Priority: Minor
  Labels: performance
 Attachments: baq vs trunk.png, op costs of various queues.ods, 
 PerThreadQueue.java, stress op rate with various queues.ods


 Currently all our execution stages dequeue tasks one at a time.  This can 
 result in contention between producers and consumers (although we do our best 
 to minimize this by using LinkedBlockingQueue).
 One approach to mitigating this would be to make consumer threads do more 
 work in bulk instead of just one task per dequeue.  (Producer threads tend 
 to be single-task oriented by nature, so I don't see an equivalent 
 opportunity there.)
 BlockingQueue has a drainTo(collection, int) method that would be perfect for 
 this.  However, no ExecutorService in the jdk supports using drainTo, nor 
 could I google one.
 What I would like to do here is create just such a beast and wire it into (at 
 least) the write and read stages.  (Other possible candidates for such an 
 optimization, such as the CommitLog and OutboundTCPConnection, are not 
 ExecutorService-based and will need to be one-offs.)
 AbstractExecutorService may be useful.  The implementations of 
 ICommitLogExecutorService may also be useful. (Despite the name these are not 
 actual ExecutorServices, although they share the most important properties of 
 one.)
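
The batch-dequeue idea described in the ticket can be sketched as a consumer loop that blocks for one task and then opportunistically drains whatever else is already queued via `BlockingQueue.drainTo`. The class and batch size below are hypothetical illustrations of the technique, not Cassandra's eventual implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical batch-draining consumer: amortizes queue contention by
// paying the dequeue cost once per batch instead of once per task.
public class DrainingConsumer implements Runnable {
    private final BlockingQueue<Runnable> queue;
    private final int maxBatch;
    private volatile boolean running = true;

    public DrainingConsumer(BlockingQueue<Runnable> queue, int maxBatch) {
        this.queue = queue;
        this.maxBatch = maxBatch;
    }

    @Override
    public void run() {
        List<Runnable> batch = new ArrayList<>(maxBatch);
        while (running) {
            try {
                // Block for the first task, then drain up to maxBatch-1
                // more that are already waiting, without further blocking.
                batch.add(queue.take());
                queue.drainTo(batch, maxBatch - 1);
                for (Runnable task : batch)
                    task.run();
                batch.clear();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public void stop() { running = false; }
}
```

Producers still enqueue one task at a time (as the ticket notes, they are single-task by nature); only the consumer side changes.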





[jira] [Updated] (CASSANDRA-4809) Allow restoring specific column families from archived commitlog

2013-10-07 Thread Lyuben Todorov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lyuben Todorov updated CASSANDRA-4809:
--

Attachment: 4809_v2.patch

Removed two redundant System.getProperties(...)  lines.

 Allow restoring specific column families from archived commitlog
 

 Key: CASSANDRA-4809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4809
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Nick Bailey
Assignee: Lyuben Todorov
  Labels: lhf
 Fix For: 2.0.2

 Attachments: 4809.patch, 4809_v2.patch


 Currently you can only restore the entire contents of a commit log archive. 
 It would be useful to specify the keyspaces/column families you want to 
 restore from an archived commitlog.





[jira] [Commented] (CASSANDRA-5818) Duplicated error messages on directory creation error at startup

2013-10-07 Thread koray sariteke (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788514#comment-13788514
 ] 

koray sariteke commented on CASSANDRA-5818:
---

+1

 Duplicated error messages on directory creation error at startup
 

 Key: CASSANDRA-5818
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5818
 Project: Cassandra
  Issue Type: Bug
Reporter: Michaël Figuière
Assignee: koray sariteke
Priority: Trivial
 Fix For: 2.1

 Attachments: trunk-5818.patch


 When I start Cassandra without the appropriate OS access rights to the 
 default Cassandra directories, I get a flood of {{ERROR}} messages at 
 startup, whereas one per directory would be more appropriate. See bellow:
 {code}
 ERROR 13:37:39,792 Failed to create 
 /var/lib/cassandra/data/system/schema_triggers directory
 ERROR 13:37:39,797 Failed to create 
 /var/lib/cassandra/data/system/schema_triggers directory
 ERROR 13:37:39,798 Failed to create 
 /var/lib/cassandra/data/system/schema_triggers directory
 ERROR 13:37:39,798 Failed to create 
 /var/lib/cassandra/data/system/schema_triggers directory
 ERROR 13:37:39,799 Failed to create 
 /var/lib/cassandra/data/system/schema_triggers directory
 ERROR 13:37:39,800 Failed to create /var/lib/cassandra/data/system/batchlog 
 directory
 ERROR 13:37:39,801 Failed to create /var/lib/cassandra/data/system/batchlog 
 directory
 ERROR 13:37:39,801 Failed to create /var/lib/cassandra/data/system/batchlog 
 directory
 ERROR 13:37:39,802 Failed to create /var/lib/cassandra/data/system/batchlog 
 directory
 ERROR 13:37:39,802 Failed to create 
 /var/lib/cassandra/data/system/peer_events directory
 ERROR 13:37:39,803 Failed to create 
 /var/lib/cassandra/data/system/peer_events directory
 ERROR 13:37:39,803 Failed to create 
 /var/lib/cassandra/data/system/peer_events directory
 ERROR 13:37:39,804 Failed to create 
 /var/lib/cassandra/data/system/compactions_in_progress directory
 ERROR 13:37:39,805 Failed to create 
 /var/lib/cassandra/data/system/compactions_in_progress directory
 ERROR 13:37:39,805 Failed to create 
 /var/lib/cassandra/data/system/compactions_in_progress directory
 ERROR 13:37:39,806 Failed to create 
 /var/lib/cassandra/data/system/compactions_in_progress directory
 ERROR 13:37:39,807 Failed to create 
 /var/lib/cassandra/data/system/compactions_in_progress directory
 ERROR 13:37:39,808 Failed to create /var/lib/cassandra/data/system/hints 
 directory
 ERROR 13:37:39,809 Failed to create /var/lib/cassandra/data/system/hints 
 directory
 ERROR 13:37:39,809 Failed to create /var/lib/cassandra/data/system/hints 
 directory
 ERROR 13:37:39,811 Failed to create /var/lib/cassandra/data/system/hints 
 directory
 ERROR 13:37:39,811 Failed to create /var/lib/cassandra/data/system/hints 
 directory
 ERROR 13:37:39,812 Failed to create 
 /var/lib/cassandra/data/system/schema_keyspaces directory
 ERROR 13:37:39,812 Failed to create 
 /var/lib/cassandra/data/system/schema_keyspaces directory
 ERROR 13:37:39,813 Failed to create 
 /var/lib/cassandra/data/system/schema_keyspaces directory
 ERROR 13:37:39,814 Failed to create 
 /var/lib/cassandra/data/system/schema_keyspaces directory
 ERROR 13:37:39,814 Failed to create 
 /var/lib/cassandra/data/system/schema_keyspaces directory
 ERROR 13:37:39,815 Failed to create 
 /var/lib/cassandra/data/system/range_xfers directory
 ERROR 13:37:39,816 Failed to create 
 /var/lib/cassandra/data/system/range_xfers directory
 ERROR 13:37:39,817 Failed to create 
 /var/lib/cassandra/data/system/range_xfers directory
 ERROR 13:37:39,817 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,818 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,818 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,820 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,821 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,821 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,822 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,822 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,823 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,824 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,824 Failed to create 
 /var/lib/cassandra/data/system/schema_columnfamilies directory
 ERROR 13:37:39,825 Failed to create 
 
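
One plausible shape for the requested fix is to remember which directories have already been reported and emit the ERROR only once per path. The class and method names below are an illustrative sketch of that deduplication, not the attached patch:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: report each failed directory at most once,
// even when multiple startup code paths retry the creation.
public class CreateDirLogger {
    private final Set<String> reported = ConcurrentHashMap.newKeySet();

    // Returns true only the first time this directory is reported,
    // so callers produce a single ERROR line per directory.
    public boolean reportFailure(String dir) {
        if (!reported.add(dir))
            return false; // already logged once, stay quiet
        System.err.println("Failed to create " + dir + " directory");
        return true;
    }
}
```

The thread-safe set matters here because directory initialization can race across startup threads.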

[jira] [Commented] (CASSANDRA-6154) Inserts are blocked in 2.1

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788517#comment-13788517
 ] 

Brandon Williams commented on CASSANDRA-6154:
-

Not quite.

noformat
ERROR [GossipStage:1] 2013-10-07 20:09:15,849 Caller+0   at 
org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134)
 - Exception in thread Thread[GossipStage:1,5,main]
java.lang.AssertionError: null
at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:552)
 ~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:576) 
~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) 
~[main/:na]
at org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:808) 
~[main/:na]
at 
org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:849) 
~[main/:na]
at 
org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:934) 
~[main/:na]
at 
org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
 ~[main/:na]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56) 
~[main/:na]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_17]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
~[na:1.7.0_17]
noformat

 Inserts are blocked in 2.1
 --

 Key: CASSANDRA-6154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Jonathan Ellis
Priority: Critical
 Attachments: 6154.txt


 With cluster sizes > 1 inserts are blocked indefinitely:
 {code}
 $ ccm create -v git:trunk test
 Fetching Cassandra updates...
 Current cluster is now: test
 $ ccm populate -n 2
 $ ccm start
 $ ccm node1 cqlsh
 Connected to test at 127.0.0.1:9160.
 [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 
 19.37.0]
 Use HELP for help.
 cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': 1};
 cqlsh> USE timeline;
 cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value 
 text, PRIMARY KEY (userid, event));
 cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 
 'ryan', '2013-10-07', 'attempt');
 {code}
 The last INSERT statement never returns.





[jira] [Comment Edited] (CASSANDRA-6154) Inserts are blocked in 2.1

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788517#comment-13788517
 ] 

Brandon Williams edited comment on CASSANDRA-6154 at 10/7/13 8:14 PM:
--

Not quite.

{noformat}
ERROR [GossipStage:1] 2013-10-07 20:09:15,849 Caller+0   at 
org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134)
 - Exception in thread Thread[GossipStage:1,5,main]
java.lang.AssertionError: null
at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:552)
 ~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:576) 
~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) 
~[main/:na]
at org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:808) 
~[main/:na]
at 
org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:849) 
~[main/:na]
at 
org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:934) 
~[main/:na]
at 
org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
 ~[main/:na]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56) 
~[main/:na]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_17]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
~[na:1.7.0_17]
{noformat}


was (Author: brandon.williams):
Not quite.

noformat
ERROR [GossipStage:1] 2013-10-07 20:09:15,849 Caller+0   at 
org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:134)
 - Exception in thread Thread[GossipStage:1,5,main]
java.lang.AssertionError: null
at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:552)
 ~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:576) 
~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:571) 
~[main/:na]
at org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:808) 
~[main/:na]
at 
org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:849) 
~[main/:na]
at 
org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:934) 
~[main/:na]
at 
org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
 ~[main/:na]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56) 
~[main/:na]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_17]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
~[na:1.7.0_17]
noformat

 Inserts are blocked in 2.1
 --

 Key: CASSANDRA-6154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6154
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Jonathan Ellis
Priority: Critical
 Attachments: 6154.txt


 With cluster sizes > 1 inserts are blocked indefinitely:
 {code}
 $ ccm create -v git:trunk test
 Fetching Cassandra updates...
 Current cluster is now: test
 $ ccm populate -n 2
 $ ccm start
 $ ccm node1 cqlsh
 Connected to test at 127.0.0.1:9160.
 [cqlsh 4.0.1 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 
 19.37.0]
 Use HELP for help.
 cqlsh> CREATE KEYSPACE timeline WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': 1};
 cqlsh> USE timeline;
 cqlsh:timeline> CREATE TABLE user_events (userid text, event timestamp, value 
 text, PRIMARY KEY (userid, event));
 cqlsh:timeline> INSERT INTO user_events (userid, event , value ) VALUES ( 
 'ryan', '2013-10-07', 'attempt');
 {code}
 The last INSERT statement never returns.





[jira] [Commented] (CASSANDRA-5916) gossip and tokenMetadata get hostId out of sync on failed replace_node with the same IP address

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788523#comment-13788523
 ] 

Brandon Williams commented on CASSANDRA-5916:
-

First, thanks for testing, [~ravilr]!

bq. does it make sense to allow the operator to specify replace_token with the 
token(s) along with the replace_address to recover

That could work, but I find it a bit ugly and confusing, especially since 
replace_token alone is supposed to work right now, but does not.

bq. I think remaining in shadow mode may not work optimally well for cases 
where the node being replaced was down for more than hint window. So, all the 
nodes would have stopped hinting, and after replace, it would require repair to 
be ran to get the new data fed during the replace.

That is true regardless of shadow mode though, since hibernate is a dead state 
and the node doesn't go live to reset the hint timer until the replace has 
completed.

 gossip and tokenMetadata get hostId out of sync on failed replace_node with 
 the same IP address
 ---

 Key: CASSANDRA-5916
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5916
 Project: Cassandra
  Issue Type: Bug
Reporter: Brandon Williams
Assignee: Brandon Williams
 Fix For: 1.2.11

 Attachments: 5916.txt


 If you try to replace_node an existing, live hostId, it will error out.  
 However, if you're using an existing IP to do this (as in, you chose the wrong 
 uuid to replace by accident), then the newly generated hostId wipes out the 
 old one in TMD, and when you do try to replace it, replace_node will complain 
 that it does not exist.  Examination of gossipinfo still shows the old hostId; 
 however, now you can't replace it either.





[jira] [Commented] (CASSANDRA-6102) CassandraStorage broken for bigints and ints

2013-10-07 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788529#comment-13788529
 ] 

Brandon Williams commented on CASSANDRA-6102:
-

This part:

{noformat}
+// Don't want to create another TBase class, so use CfDef.populate_io_cache_on_flush
+// to store flag of compact storage cql table.
+if (cql3Table && !(parseType(cfDef.comparator_type) instanceof AbstractCompositeType))
+    cfDef.setPopulate_io_cache_on_flush(true);
+
+// Don't want to create another TBase class, so use CfDef.replicate_on_write
+// to store flag of cql table.
+if (cql3Table)
+    cfDef.setReplicate_on_write(true);
{noformat}

Feels like a hack that is going to bite us down the road when those options 
really do get removed.
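A toy illustration of why this kind of overloading bites later (a sketch with hypothetical names, not Cassandra's CfDef): once an unrelated option doubles as a type marker, its two readings become indistinguishable, and removing or repurposing the original option silently breaks one of them.

```python
# Hypothetical sketch: replicate_on_write doubles as a "this is a CQL3
# table" marker, as in the quoted patch.
cf_def = {"replicate_on_write": True}  # set here only to mark a CQL3 table

def is_cql3_table(cf):
    # Overloaded reading of the flag.
    return cf["replicate_on_write"]

def wants_replicate_on_write(cf):
    # Original reading of the flag -- now indistinguishable from the marker.
    return cf["replicate_on_write"]

# Both readers observe True; if the option is later removed or its default
# changes, one of the two meanings breaks with no type error to catch it.
assert is_cql3_table(cf_def)
assert wants_replicate_on_write(cf_def)
```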

 CassandraStorage broken for bigints and ints
 

 Key: CASSANDRA-6102
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6102
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
 Environment: Cassandra 1.2.9 & 1.2.10, Pig 0.11.1, OSX 10.8.x
Reporter: Janne Jalkanen
Assignee: Alex Liu
 Attachments: 6102-1.2-branch.txt


 I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle 
 integer values.
 Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1.  Single node for 
 testing this. 
 First a table:
 {noformat}
  CREATE TABLE testc (
  key text PRIMARY KEY,
  ivalue int,
  svalue text,
  value bigint
 ) WITH COMPACT STORAGE;
  insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65);
  select * from testc;
 key | ivalue | svalue | value
 -+++---
 foo | 10 |bar | 65
 {noformat}
 For my Pig setup, I then use libraries from different C* versions to actually 
 talk to my database (which stays on 1.2.10 all the time).
 Cassandra 1.0.12 (using cassandra_storage.jar):
 {noformat}
 testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
 dump testc
 (foo,(svalue,bar),(ivalue,10),(value,65),{})
 {noformat}
 Cassandra 1.1.10:
 {noformat}
 testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
 dump testc
 (foo,(svalue,bar),(ivalue,10),(value,65),{})
 {noformat}
 Cassandra 1.2.10:
 {noformat}
 testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
 dump testc
 foo,{(ivalue,
 ),(svalue,bar),(value,A)})
 {noformat}
 To me it appears that ints and bigints are interpreted as ascii values in 
 cass 1.2.10.  Did something change for CassandraStorage, is there a 
 regression, or am I doing something wrong?  Quick perusal of the JIRA didn't 
 reveal anything that I could directly pin on this.
 Note that using compact storage does not seem to affect the issue, though it 
 obviously changes the resulting pig format.
 In addition, trying to use Pygmalion 
 {noformat}
 tf = foreach testc generate key, 
 flatten(FromCassandraBag('ivalue,svalue,value',columns)) as 
 (ivalue:int,svalue:chararray,lvalue:long);
 dump tf
 (foo,
 ,bar,A)
 {noformat}
 So no help there. Explicitly casting the values to (long) or (int) just 
 results in a ClassCastException.
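The newline after "(ivalue," and the stray "A" in the 1.2.10 dump are consistent with the serialized bytes being rendered as ASCII instead of decoded as numbers; a quick sketch of that reading of the symptom (my assumption about the mechanism, not a statement about where the bug lives):

```python
import struct

# int 10 and bigint 65 as serialized on the wire (big-endian).
ivalue = struct.pack(">i", 10)  # 4-byte int
value = struct.pack(">q", 65)   # 8-byte bigint

# Correct decoding recovers the inserted numbers.
assert struct.unpack(">i", ivalue)[0] == 10
assert struct.unpack(">q", value)[0] == 65

# Rendering the same bytes as text reproduces the garbage in the dump:
# 10 is '\n' (the line break after "(ivalue,") and 65 is 'A'.
assert ivalue.lstrip(b"\x00") == b"\n"
assert value.lstrip(b"\x00") == b"A"
```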





[3/3] git commit: Merge branch 'cassandra-2.0' into trunk

2013-10-07 Thread yukim
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a5798165
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a5798165
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a5798165

Branch: refs/heads/trunk
Commit: a57981650453470d6ee204329edf0dd5d008d18e
Parents: 558a9e5 a2b1278
Author: Yuki Morishita yu...@apache.org
Authored: Mon Oct 7 15:33:17 2013 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Oct 7 15:33:17 2013 -0500

--
 CHANGES.txt |  1 +
 NEWS.txt|  3 +
 .../org/apache/cassandra/config/CFMetaData.java | 11 +++
 .../org/apache/cassandra/config/KSMetaData.java |  1 +
 .../org/apache/cassandra/db/SystemKeyspace.java | 25 ++
 .../CompactionHistoryTabularData.java   | 84 
 .../db/compaction/CompactionManager.java| 14 
 .../db/compaction/CompactionManagerMBean.java   |  4 +
 .../cassandra/db/compaction/CompactionTask.java | 52 ++--
 .../org/apache/cassandra/tools/NodeCmd.java | 26 ++
 .../org/apache/cassandra/tools/NodeProbe.java   |  6 ++
 .../apache/cassandra/tools/NodeToolHelp.yaml|  3 +
 12 files changed, 204 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/NEWS.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/src/java/org/apache/cassandra/config/CFMetaData.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/src/java/org/apache/cassandra/db/SystemKeyspace.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5798165/src/java/org/apache/cassandra/db/compaction/CompactionTask.java
--



[2/3] git commit: Save compaction history to system keyspace

2013-10-07 Thread yukim
Save compaction history to system keyspace

patch by lantao yan; reviewed by yukim for CASSANDRA-5078


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a2b12784
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a2b12784
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a2b12784

Branch: refs/heads/trunk
Commit: a2b12784fe3785fe96d9c0e2d7e8c72bfc88ac7c
Parents: 01a57ee
Author: lantao yan yanlan...@hotmail.com
Authored: Mon Oct 7 15:22:11 2013 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Oct 7 15:30:50 2013 -0500

--
 CHANGES.txt |  1 +
 NEWS.txt|  3 +
 .../org/apache/cassandra/config/CFMetaData.java | 11 +++
 .../org/apache/cassandra/config/KSMetaData.java |  1 +
 .../org/apache/cassandra/db/SystemKeyspace.java | 25 ++
 .../CompactionHistoryTabularData.java   | 84 
 .../db/compaction/CompactionManager.java| 14 
 .../db/compaction/CompactionManagerMBean.java   |  4 +
 .../cassandra/db/compaction/CompactionTask.java | 52 ++--
 .../org/apache/cassandra/tools/NodeCmd.java | 26 ++
 .../org/apache/cassandra/tools/NodeProbe.java   |  6 ++
 .../apache/cassandra/tools/NodeToolHelp.yaml|  3 +
 12 files changed, 204 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index ddd976e..ee631a0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -20,6 +20,7 @@
  * Allow alter keyspace on system_traces (CASSANDRA-6016)
  * Disallow empty column names in cql (CASSANDRA-6136)
  * Use Java7 file-handling APIs and fix file moving on Windows (CASSANDRA-5383)
+ * Save compaction history to system keyspace (CASSANDRA-5078)
 Merged from 1.2:
  * Limit CQL prepared statement cache by size instead of count (CASSANDRA-6107)
  * Tracing should log write failure rather than raw exceptions (CASSANDRA-6133)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 6ed8449..37fbae7 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -23,6 +23,9 @@ New features
   (See blog post at TODO)
 - Configurable metrics reporting
   (see conf/metrics-reporter-config-sample.yaml)
+- Compaction history and stats are now saved to the system keyspace
+  (system.compaction_history table). You can access history via the
+  new 'nodetool compactionhistory' command or CQL.
 
 Upgrading
 -

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/src/java/org/apache/cassandra/config/CFMetaData.java
--
diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java 
b/src/java/org/apache/cassandra/config/CFMetaData.java
index 8c4075c..bbea21e 100644
--- a/src/java/org/apache/cassandra/config/CFMetaData.java
+++ b/src/java/org/apache/cassandra/config/CFMetaData.java
@@ -268,6 +268,17 @@ public final class CFMetaData
                 + "PRIMARY KEY ((keyspace_name, columnfamily_name, generation))"
                 + ") WITH COMMENT='historic sstable read rates'");
 
+    public static final CFMetaData CompactionHistoryCf = compile("CREATE TABLE " + SystemKeyspace.COMPACTION_HISTORY_CF + " ("
+                 + "id uuid,"
+                 + "keyspace_name text,"
+                 + "columnfamily_name text,"
+                 + "compacted_at timestamp,"
+                 + "bytes_in bigint,"
+                 + "bytes_out bigint,"
+                 + "rows_merged map<int, bigint>,"
+                 + "PRIMARY KEY (id)"
+                 + ") WITH COMMENT='show all compaction history' AND DEFAULT_TIME_TO_LIVE=604800");
+
 public enum Caching
 {
 ALL, KEYS_ONLY, ROWS_ONLY, NONE;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/src/java/org/apache/cassandra/config/KSMetaData.java
--
diff --git a/src/java/org/apache/cassandra/config/KSMetaData.java 
b/src/java/org/apache/cassandra/config/KSMetaData.java
index 

[1/3] git commit: Save compaction history to system keyspace

2013-10-07 Thread yukim
Updated Branches:
  refs/heads/cassandra-2.0 01a57eea8 -> a2b12784f
  refs/heads/trunk 558a9e57b -> a57981650


Save compaction history to system keyspace

patch by lantao yan; reviewed by yukim for CASSANDRA-5078


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a2b12784
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a2b12784
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a2b12784

Branch: refs/heads/cassandra-2.0
Commit: a2b12784fe3785fe96d9c0e2d7e8c72bfc88ac7c
Parents: 01a57ee
Author: lantao yan yanlan...@hotmail.com
Authored: Mon Oct 7 15:22:11 2013 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Oct 7 15:30:50 2013 -0500

--
 CHANGES.txt |  1 +
 NEWS.txt|  3 +
 .../org/apache/cassandra/config/CFMetaData.java | 11 +++
 .../org/apache/cassandra/config/KSMetaData.java |  1 +
 .../org/apache/cassandra/db/SystemKeyspace.java | 25 ++
 .../CompactionHistoryTabularData.java   | 84 
 .../db/compaction/CompactionManager.java| 14 
 .../db/compaction/CompactionManagerMBean.java   |  4 +
 .../cassandra/db/compaction/CompactionTask.java | 52 ++--
 .../org/apache/cassandra/tools/NodeCmd.java | 26 ++
 .../org/apache/cassandra/tools/NodeProbe.java   |  6 ++
 .../apache/cassandra/tools/NodeToolHelp.yaml|  3 +
 12 files changed, 204 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index ddd976e..ee631a0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -20,6 +20,7 @@
  * Allow alter keyspace on system_traces (CASSANDRA-6016)
  * Disallow empty column names in cql (CASSANDRA-6136)
  * Use Java7 file-handling APIs and fix file moving on Windows (CASSANDRA-5383)
+ * Save compaction history to system keyspace (CASSANDRA-5078)
 Merged from 1.2:
  * Limit CQL prepared statement cache by size instead of count (CASSANDRA-6107)
  * Tracing should log write failure rather than raw exceptions (CASSANDRA-6133)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 6ed8449..37fbae7 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -23,6 +23,9 @@ New features
   (See blog post at TODO)
 - Configurable metrics reporting
   (see conf/metrics-reporter-config-sample.yaml)
+- Compaction history and stats are now saved to the system keyspace
+  (system.compaction_history table). You can access history via the
+  new 'nodetool compactionhistory' command or CQL.
 
 Upgrading
 -

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/src/java/org/apache/cassandra/config/CFMetaData.java
--
diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java 
b/src/java/org/apache/cassandra/config/CFMetaData.java
index 8c4075c..bbea21e 100644
--- a/src/java/org/apache/cassandra/config/CFMetaData.java
+++ b/src/java/org/apache/cassandra/config/CFMetaData.java
@@ -268,6 +268,17 @@ public final class CFMetaData
                 + "PRIMARY KEY ((keyspace_name, columnfamily_name, generation))"
                 + ") WITH COMMENT='historic sstable read rates'");
 
+    public static final CFMetaData CompactionHistoryCf = compile("CREATE TABLE " + SystemKeyspace.COMPACTION_HISTORY_CF + " ("
+                 + "id uuid,"
+                 + "keyspace_name text,"
+                 + "columnfamily_name text,"
+                 + "compacted_at timestamp,"
+                 + "bytes_in bigint,"
+                 + "bytes_out bigint,"
+                 + "rows_merged map<int, bigint>,"
+                 + "PRIMARY KEY (id)"
+                 + ") WITH COMMENT='show all compaction history' AND DEFAULT_TIME_TO_LIVE=604800");
+
 public enum Caching
 {
 ALL, KEYS_ONLY, ROWS_ONLY, NONE;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2b12784/src/java/org/apache/cassandra/config/KSMetaData.java
--
diff --git 

[jira] [Commented] (CASSANDRA-6137) CQL3 SELECT IN CLAUSE inconsistent

2013-10-07 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788541#comment-13788541
 ] 

Constance Eustace commented on CASSANDRA-6137:
--

We are currently simply selecting all the column keys rather than a subset of 
them with a WHERE columnkey IN (columnkeylist) clause.

Fetching the whole row this way isn't horribly worse than fetching only the 
desired columns, but it's still not ideal. 


 CQL3 SELECT IN CLAUSE inconsistent
 --

 Key: CASSANDRA-6137
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6137
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu AWS Cassandra 2.0.1 SINGLE NODE
Reporter: Constance Eustace
 Fix For: 2.0.1


 We are encountering inconsistent results from CQL3 queries with column keys 
 using IN clause in WHERE. This has been reproduced in cqlsh.
 Rowkey is e_entid
 Column key is p_prop
 This returns roughly 21 rows for 21 column keys that match p_prop.
 cqlsh> SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';
 These three queries each return one row for the requested single column key 
 in the IN clause:
 SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:complete:count');
 SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:all:count');
 SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:fail:count');
 This query returns ONLY ONE ROW (one column key), not three as I would expect 
 from the three-column-key IN clause:
 cqlsh> SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
 ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
 This query does return two rows however for the requested two column keys:
 cqlsh> SELECT 
 e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
  FROM internal_submission.Entity_Job WHERE e_entid = 
 '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in (  
   
 'urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
 cqlsh> describe table internal_submission.entity_job;
 CREATE TABLE entity_job (
   e_entid text,
   p_prop text,
   describes text,
   dndcondition text,
   e_entlinks text,
   e_entname text,
   e_enttype text,
   ingeststatus text,
   ingeststatusdetail text,
   p_flags text,
   p_propid text,
   p_proplinks text,
   p_storage text,
   p_subents text,
   p_val text,
   p_vallang text,
   p_vallinks text,
   p_valtype text,
   p_valunit text,
   p_vars text,
   partnerid text,
   referenceid text,
   size int,
   sourceip text,
   submitdate bigint,
   submitevent text,
   userid text,
   version text,
   PRIMARY KEY (e_entid, p_prop)
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='NONE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'LZ4Compressor'};
 CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes);
 CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job 
 (dndcondition);
 CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job 
 (ingeststatus);
 CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job 
 (ingeststatusdetail);
 CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job 
 (referenceid);
 CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid);
 CREATE INDEX 

[jira] [Updated] (CASSANDRA-6137) CQL3 SELECT IN CLAUSE inconsistent

2013-10-07 Thread Constance Eustace (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Constance Eustace updated CASSANDRA-6137:
-

Description: 
We are encountering inconsistent results from CQL3 queries with column keys 
using IN clause in WHERE. This has been reproduced in cqlsh and the jdbc driver.

Rowkey is e_entid
Column key is p_prop

This returns roughly 21 rows for 21 column keys that match p_prop.

cqlsh> SELECT 
e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
 FROM internal_submission.Entity_Job WHERE e_entid = 
'845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';

These three queries each return one row for the requested single column key in 
the IN clause:

SELECT 
e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
 FROM internal_submission.Entity_Job WHERE e_entid = 
'845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
('urn:bby:pcm:job:ingest:content:complete:count');
SELECT 
e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
 FROM internal_submission.Entity_Job WHERE e_entid = 
'845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
('urn:bby:pcm:job:ingest:content:all:count');
SELECT 
e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
 FROM internal_submission.Entity_Job WHERE e_entid = 
'845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
('urn:bby:pcm:job:ingest:content:fail:count');

This query returns ONLY ONE ROW (one column key), not three as I would expect 
from the three-column-key IN clause:

cqlsh> SELECT 
e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
 FROM internal_submission.Entity_Job WHERE e_entid = 
'845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in 
('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');

This query does return two rows however for the requested two column keys:

cqlsh> SELECT 
e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
 FROM internal_submission.Entity_Job WHERE e_entid = 
'845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'  AND p_prop in (

'urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');





cqlsh> describe table internal_submission.entity_job;

CREATE TABLE entity_job (
  e_entid text,
  p_prop text,
  describes text,
  dndcondition text,
  e_entlinks text,
  e_entname text,
  e_enttype text,
  ingeststatus text,
  ingeststatusdetail text,
  p_flags text,
  p_propid text,
  p_proplinks text,
  p_storage text,
  p_subents text,
  p_val text,
  p_vallang text,
  p_vallinks text,
  p_valtype text,
  p_valunit text,
  p_vars text,
  partnerid text,
  referenceid text,
  size int,
  sourceip text,
  submitdate bigint,
  submitevent text,
  userid text,
  version text,
  PRIMARY KEY (e_entid, p_prop)
) WITH
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='NONE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes);

CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job 
(dndcondition);

CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job 
(ingeststatus);

CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job 
(ingeststatusdetail);

CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job (referenceid);

CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid);

CREATE INDEX internal_submission__JobVersionIDX ON entity_job (version);

---

My suspicion is that the three-column-key IN clause is translated (improperly 
or not) to a two-column key range, with the assumption that the third column key 
is present in that range, but it isn't...
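For reference, the expected IN semantics can be modeled as plain set membership over the clustering keys of the partition. This sketch (a toy model, not Cassandra's read path; the values are made up) shows why the three-key query should match three rows:

```python
# Model of one partition of entity_job: clustering key p_prop -> p_val.
row = {
    "urn:bby:pcm:job:ingest:content:complete:count": "21",
    "urn:bby:pcm:job:ingest:content:all:count": "100",
    "urn:bby:pcm:job:ingest:content:fail:count": "0",
}

def select_in(row, wanted):
    # Expected IN semantics: one result per clustering key that is both
    # stored and listed, regardless of how the list sorts.
    wanted = set(wanted)
    return {k: v for k, v in row.items() if k in wanted}

three = select_in(row, [
    "urn:bby:pcm:job:ingest:content:complete:count",
    "urn:bby:pcm:job:ingest:content:all:count",
    "urn:bby:pcm:job:ingest:content:fail:count",
])
assert len(three) == 3  # the reported query instead returned only one row
```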


  was:
We are encountering inconsistent results from CQL3 queries with column keys 
using IN clause in WHERE. This has been reproduced in cqlsh.

Rowkey is e_entid
Column key is p_prop

This returns roughly 21 rows for 21 column keys that match p_prop.

cqlsh SELECT 
e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
 FROM internal_submission.Entity_Job WHERE e_entid = 

[jira] [Commented] (CASSANDRA-5916) gossip and tokenMetadata get hostId out of sync on failed replace_node with the same IP address

2013-10-07 Thread Ravi Prasad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788570#comment-13788570
 ] 

Ravi Prasad commented on CASSANDRA-5916:


bq. That is true regardless of shadow mode though, since hibernate is a dead 
state and the node doesn't go live to reset the hint timer until the replace 
has completed.

My understanding is that due to the generation change of the replacing node, 
gossiper.handleMajorStateChange marks the node as dead, as hibernate is one of 
the DEAD_STATES. So the other nodes mark the replacing node as dead before 
the token bootstrap starts, and hence should be storing hints for the replacing 
node from that point.
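The hinting argument above reduces to a simple rule, sketched here as a toy model (my simplification, not Cassandra internals; the exact DEAD_STATES membership is an assumption): peers store hints for any endpoint they consider dead, and hibernate is a dead state, so hinting resumes as soon as the replacing node announces itself.

```python
# Toy model of the behavior discussed above -- not Cassandra internals.
DEAD_STATES = {"hibernate", "removing", "removed", "left"}

def is_alive(gossip_state):
    return gossip_state not in DEAD_STATES

def should_store_hint(target_state):
    # Peers hint writes for any endpoint they see as down.
    return not is_alive(target_state)

# A replacing node sits in hibernate until the replace completes, so
# peers hint for it for the whole bootstrap...
assert should_store_hint("hibernate")
# ...and stop once it flips to a live state.
assert not should_store_hint("normal")
```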

 gossip and tokenMetadata get hostId out of sync on failed replace_node with 
 the same IP address
 ---

 Key: CASSANDRA-5916
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5916
 Project: Cassandra
  Issue Type: Bug
Reporter: Brandon Williams
Assignee: Brandon Williams
 Fix For: 1.2.11

 Attachments: 5916.txt


 If you try to replace_node an existing, live hostId, it will error out.  
 However, if you're using an existing IP to do this (as in, you chose the wrong 
 uuid to replace by accident), then the newly generated hostId wipes out the 
 old one in TMD, and when you do try to replace it, replace_node will complain 
 that it does not exist.  Examination of gossipinfo still shows the old hostId; 
 however, now you can't replace it either.




