[jira] [Commented] (CASSANDRA-8192) AssertionError in Memory.java

2014-11-26 Thread Andreas Schnitzerling (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225896#comment-14225896
 ] 

Andreas Schnitzerling commented on CASSANDRA-8192:
--

Hello, we are not the same person :-). It is happening on an existing dataset 
from 2.0.10. I'm actually downgrading 3 of my 12 nodes back to 2.0.1x because I 
really need a stable system now. In my experience, 2.0 and 2.1 are not fully 
compatible (CQL metadata such as tables is not shown in a mixed cluster, and 
repair is failure-prone between 2.0 and 2.1 as well). Since yesterday we have 
40 new devices under test, and today an additional 36. I need to import all 
their data into my cluster. I will upgrade in a few months, after some more 
releases. As far as I know, there is a C* mode for testing without (completely) 
joining the cluster? How do I configure it?

 AssertionError in Memory.java
 -

 Key: CASSANDRA-8192
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8192
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Windows-7-32 bit, 3GB RAM, Java 1.7.0_67
Reporter: Andreas Schnitzerling
Assignee: Joshua McKenzie
 Attachments: cassandra.bat, cassandra.yaml, 
 printChunkOffsetErrors.txt, system.log


 Since the update of 1 of 12 nodes from 2.1.0-rel to 2.1.1-rel, an exception 
 occurs during start-up.
 {panel:title=system.log}
 ERROR [SSTableBatchOpen:1] 2014-10-27 09:44:00,079 CassandraDaemon.java:153 - 
 Exception in thread Thread[SSTableBatchOpen:1,5,main]
 java.lang.AssertionError: null
   at org.apache.cassandra.io.util.Memory.size(Memory.java:307) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.init(CompressionMetadata.java:135)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:83)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:50)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:48)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:766) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:725) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:402) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:302) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:438) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) 
 ~[na:1.7.0_55]
   at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.7.0_55]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
 [na:1.7.0_55]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
 [na:1.7.0_55]
   at java.lang.Thread.run(Unknown Source) [na:1.7.0_55]
 {panel}
 In the attached log you can also still see CASSANDRA-8069 and 
 CASSANDRA-6283.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8192) AssertionError in Memory.java

2014-11-26 Thread Andreas Schnitzerling (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225907#comment-14225907
 ] 

Andreas Schnitzerling commented on CASSANDRA-8192:
--

I have another, off-topic question: if I run repair with the -pr option on 
every node, does it have the same effect as running it without -pr on one 
node? What is necessary to sync the replicas as well? (e.g. always without 
-pr, or, if using -pr, repair on every node...) (FYI: I need to use the -par 
option since I'm using Windows.) Thanks.
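The semantics being asked about can be sketched as follows; this is a hedged illustration of how the two invocations are commonly run (the keyspace name `my_keyspace` is a placeholder, and flag behavior is as understood for the Cassandra 2.x `nodetool`):

```shell
# -pr repairs only the token ranges this node is *primary* for, so it must
# be run on EVERY node in turn to cover the whole ring.
nodetool repair -par -pr my_keyspace

# Without -pr, a single run repairs ALL ranges the node replicates, which
# covers more data per invocation but does redundant work cluster-wide.
nodetool repair -par my_keyspace

# -par requests the parallel (non-snapshot) repair, needed on Windows here.
```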






[jira] [Updated] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-26 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7563:

Attachment: (was: 7563v8-diff-diff.txt)

 UserType, TupleType and collections in UDFs
 ---

 Key: CASSANDRA-7563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563
 Project: Cassandra
  Issue Type: Bug
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.0

 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 
 7563v4.txt, 7563v5.txt, 7563v6.txt, 7563v7.txt


 * is the Java Driver required as a dependency?
 * is it possible to extract parts of the Java Driver for UDT/TT/collection support?
 * CQL {{DROP TYPE}} must check UDFs
 * must check keyspace access permissions (if those exist)





[jira] [Updated] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-26 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7563:

Attachment: (was: 7563v8.txt)





[jira] [Updated] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-26 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7563:

Attachment: 7563v8.txt
7563v8-diff-diff.txt





[jira] [Updated] (CASSANDRA-8192) AssertionError in Memory.java

2014-11-26 Thread Andreas Schnitzerling (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Schnitzerling updated CASSANDRA-8192:
-
Attachment: logdata-onlinedata-ka-196504-CompressionInfo.zip
system-sstable_activity-jb-25-Filter.zip
system-compactions_in_progress-ka-47594-CompressionInfo.zip
system_AssertionTest.log

The patch found some files. I attached them including the log.






[jira] [Created] (CASSANDRA-8374) Better support of null for UDF

2014-11-26 Thread Sylvain Lebresne (JIRA)
Sylvain Lebresne created CASSANDRA-8374:
---

 Summary: Better support of null for UDF
 Key: CASSANDRA-8374
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8374
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
 Fix For: 3.0


Currently, every function needs to deal with its arguments potentially being 
{{null}}. There are many cases where that's just annoying; users should be 
able to define a function like:
{noformat}
CREATE FUNCTION addTwo(val int) RETURNS int LANGUAGE JAVA AS 'return val + 2;'
{noformat}
without it crashing as soon as a column it's applied to doesn't have a 
value for some rows (I'll note that this definition apparently cannot be 
compiled currently, which should be looked into).

In fact, I think that by default methods shouldn't have to care about {{null}} 
values: if the value is {{null}}, we should not call the method at all and 
should return {{null}}. There are still methods that may explicitly want to 
handle {{null}} (to return a default value, for instance), so maybe we can add 
an {{ALLOW NULLS}} option to the creation syntax.
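The proposed default can be sketched in plain Java; all names here (`NullSkippingUdf`, `invoke`) are illustrative, not Cassandra API — the point is only that the engine-side wrapper short-circuits on null so the user-written body never sees it:

```java
public class NullSkippingUdf {
    // The user-written body, equivalent to:
    // CREATE FUNCTION addTwo(val int) RETURNS int LANGUAGE JAVA AS 'return val + 2;'
    static Integer addTwoBody(int val) {
        return val + 2;
    }

    // Hypothetical engine-side wrapper: null in, null out, body never called.
    static Integer invoke(Integer val) {
        if (val == null)
            return null; // the default behavior proposed in the ticket
        return addTwoBody(val);
    }

    public static void main(String[] args) {
        assert invoke(null) == null; // body skipped, no NullPointerException
        assert invoke(40) == 42;
    }
}
```

A function declared with the suggested {{ALLOW NULLS}} would simply not get this wrapper and handle null itself.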






[jira] [Commented] (CASSANDRA-8354) A better story for dealing with empty values

2014-11-26 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225968#comment-14225968
 ] 

Sylvain Lebresne commented on CASSANDRA-8354:
-

Thing is, I'm not sure how we could properly convert people away from those 
empty values completely without breaking Thrift compatibility.

In particular, I'm not sure how that {{strict_cql_values}} option would work in 
practice (would that be a global yaml option that affects Thrift too, btw?).

That said, I've kind of mixed two issues in this ticket. The main reason I've 
opened this was the UDF question, but I realize that this question is actually 
already a problem with {{null}} and so I've created a separate issue for it 
(CASSANDRA-8374). Provided we fix that latter issue, it's probably ok for UDT 
to consider that empty values (for types for which they are not reasonable 
values) are always converted to {{null}} (which is already how it works in 
fact).

Still, it would be nice to change the default for CQL so that empty values are 
refused. I'm just not sure I see how to make that happen in practice without a 
syntax addition.


 A better story for dealing with empty values
 

 Key: CASSANDRA-8354
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8354
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
 Fix For: 3.0


 In CQL, a value of any type can be empty, even for types for which such 
 values don't make any sense (int, uuid, ...). Note that this is different from 
 having no value (i.e. a {{null}}). This is due to historical reasons, and we 
 can't entirely disallow it for backward compatibility, but it's pretty 
 painful when working with CQL since you always need to be defensive about 
 such largely nonsensical values.
 This is particularly annoying with UDFs: those empty values are represented as 
 {{null}} for UDFs, and that plays weirdly with UDFs that use unboxed native 
 types.
 So I would suggest that we introduce variations of the types that don't 
 accept empty byte buffers, for those types for which it's not a particularly 
 sensible value.
 Ideally we'd use those variants by default, that is:
 {noformat}
 CREATE TABLE foo (k text PRIMARY KEY, v int)
 {noformat}
 would not accept empty values for {{v}}. But
 {noformat}
 CREATE TABLE foo (k text PRIMARY KEY, v int ALLOW EMPTY)
 {noformat}
 would.
 Similarly, for UDFs, a function like:
 {noformat}
 CREATE FUNCTION incr(v int) RETURNS int LANGUAGE JAVA AS 'return v + 1';
 {noformat}
 would be guaranteed to only be applied where no empty values are allowed. A
 function that wants to handle empty values could be created with:
 {noformat}
 CREATE FUNCTION incr(v int ALLOW EMPTY) RETURNS int ALLOW EMPTY LANGUAGE JAVA 
 AS 'return (v == null) ? null : v + 1';
 {noformat}
 Of course, doing that has the problem of backward compatibility. One option 
 could be to say that if a type doesn't accept empties, but we do have an 
 empty value internally, then we convert it to some reasonably sensible default 
 value (0 for numeric values, the smallest possible uuid for uuids, etc.). 
 This way, we could allow conversion of types to and from 'ALLOW EMPTY'. And 
 maybe we'd say that existing compact tables get the 'ALLOW EMPTY' flag for 
 their types by default.
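The empty-vs-null distinction the ticket describes can be illustrated with a small sketch (not Cassandra code; `decodeInt` only mimics how an int-typed cell with a zero-length buffer ends up indistinguishable from null on the UDF side):

```java
import java.nio.ByteBuffer;

public class EmptyVsNull {
    // Null buffer = no value at all; zero-length buffer = the historical
    // "empty value"; otherwise a real 4-byte int. Both of the first two
    // surface as null to a UDF, which is the pain point.
    static Integer decodeInt(ByteBuffer bb) {
        if (bb == null)
            return null;              // no value (a real CQL null)
        if (bb.remaining() == 0)
            return null;              // empty value, also mapped to null
        return bb.getInt(bb.position());
    }

    public static void main(String[] args) {
        assert decodeInt(null) == null;
        assert decodeInt(ByteBuffer.allocate(0)) == null;  // empty, not absent
        assert decodeInt(ByteBuffer.allocate(4).putInt(0, 4)) == 4;
    }
}
```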





[jira] [Commented] (CASSANDRA-8374) Better support of null for UDF

2014-11-26 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225971#comment-14225971
 ] 

Robert Stupp commented on CASSANDRA-8374:
-

Just a note: I recommend waiting for CASSANDRA-7563 and CASSANDRA-8053 before 
tackling this ticket, to avoid a lot of merge conflicts.






[jira] [Commented] (CASSANDRA-4987) Support more queries when ALLOW FILTERING is used.

2014-11-26 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225975#comment-14225975
 ] 

Sylvain Lebresne commented on CASSANDRA-4987:
-

bq.  is this ticket for implementing the same for other types of queries such 
as UPDATE, DELETE etc with row selection predicate?

No it's not. The implementation of {{ALLOW FILTERING}} for {{SELECT}} is 
incomplete and this ticket was created to somewhat complete it.

 Support more queries when ALLOW FILTERING is used.
 --

 Key: CASSANDRA-4987
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4987
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
  Labels: cql
 Fix For: 3.0


 Even after CASSANDRA-4915, there are still a bunch of queries that we don't 
 support even if {{ALLOW FILTERING}} is used: typically, pretty much any 
 query with a restriction on a non-primary-key column, unless one of those 
 restrictions is an EQ on an indexed column.
 If {{ALLOW FILTERING}} is used, we could allow those queries as a 
 convenience.





[jira] [Commented] (CASSANDRA-8267) Only stream from unrepaired sstables during incremental repair

2014-11-26 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226035#comment-14226035
 ] 

Aleksey Yeschenko commented on CASSANDRA-8267:
--

How about we just implement CASSANDRA-8110 one release earlier?

Then we can negotiate the version like we do w/ messaging version for outbound 
connections.

 Only stream from unrepaired sstables during incremental repair
 --

 Key: CASSANDRA-8267
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8267
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


 It seems we stream from all sstables even when we do incremental repair; we 
 should limit this to streaming only from the unrepaired sstables.





[jira] [Assigned] (CASSANDRA-8375) Cleanup of generics in bounds serialization

2014-11-26 Thread Branimir Lambov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov reassigned CASSANDRA-8375:
--

Assignee: Branimir Lambov






[jira] [Created] (CASSANDRA-8375) Cleanup of generics in bounds serialization

2014-11-26 Thread Branimir Lambov (JIRA)
Branimir Lambov created CASSANDRA-8375:
--

 Summary: Cleanup of generics in bounds serialization
 Key: CASSANDRA-8375
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8375
 Project: Cassandra
  Issue Type: Improvement
Reporter: Branimir Lambov
Priority: Trivial


There is currently a single serializer for {{AbstractBounds}}, applied to both 
{{Token}} and {{RowPosition}} ranges and bounds. This serializer does not know 
which kind of bounds it needs to work with, which causes some necessarily 
unsafe conversions and requires extra code in all bounds types 
({{toRowBounds}}/{{toTokenBounds}}) to make the conversions safe, the 
application of which is easily forgotten.

As all users of this serialization know in advance the kind of range they want 
to serialize, it can be replaced by simpler, type-specific serialization.
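The cleanup idea can be sketched in a few lines of plain Java; everything here (`TypedSerializers`, `TokenBound`, `RowBound`) is illustrative, not Cassandra code — the point is that one typed serializer per bound kind lets the compiler check what a shared generic serializer had to cast for:

```java
public class TypedSerializers {
    // A generic serializer interface, parameterized by the bound kind.
    interface Serializer<T> {
        String serialize(T value);
    }

    // Two stand-ins for the two kinds of bounds the shared serializer
    // previously had to distinguish at runtime.
    static class TokenBound { final long token; TokenBound(long t) { token = t; } }
    static class RowBound   { final String key; RowBound(String k) { key = k; } }

    // Callers know statically which kind they hold, so no unsafe
    // conversions (no toRowBounds/toTokenBounds fixups) are needed.
    static final Serializer<TokenBound> TOKEN_SERIALIZER = b -> "token:" + b.token;
    static final Serializer<RowBound>   ROW_SERIALIZER   = b -> "row:" + b.key;

    public static void main(String[] args) {
        assert TOKEN_SERIALIZER.serialize(new TokenBound(42L)).equals("token:42");
        assert ROW_SERIALIZER.serialize(new RowBound("k1")).equals("row:k1");
    }
}
```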





[jira] [Updated] (CASSANDRA-8375) Cleanup of generics in bounds serialization

2014-11-26 Thread Branimir Lambov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-8375:
---
Attachment: 8375.patch






[jira] [Updated] (CASSANDRA-8375) Cleanup of generics in bounds serialization

2014-11-26 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-8375:
-
Reviewer: Joshua McKenzie






[jira] [Updated] (CASSANDRA-7933) Update cassandra-stress README

2014-11-26 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7933:
---
Fix Version/s: 2.1.3

 Update cassandra-stress README
 --

 Key: CASSANDRA-7933
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7933
 Project: Cassandra
  Issue Type: Task
Reporter: Benedict
Assignee: Liang Xie
Priority: Minor
 Fix For: 2.1.3

 Attachments: CASSANDRA-7933.txt


 There is a README in the tools/stress directory. It is completely out of date.





[jira] [Commented] (CASSANDRA-7933) Update cassandra-stress README

2014-11-26 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226206#comment-14226206
 ] 

Philip Thompson commented on CASSANDRA-7933:


[~xieliang007], you make no distinction between the options that need to be 
preceded by a dash, such as {{bin/cassandra-stress write -node}}, and those 
that do not, such as {{bin/cassandra-stress write n}}. It is also unclear what 
criteria you used to decide which of the many options belong in the README 
under Important Options.

More seriously, there is also no notice in the README that many of these 
options are really suboptions, such as 'keyspace' being part of '-schema'.

Finally, the README gives no information on how to discover the remaining 
options, or the syntax to use for any of them, which should be accomplished 
with an explanation of the help command.

I'm +0 on this, as it would be better to ship 2.1.3 with the included patch 
than with what is currently there.
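The help command mentioned above can be sketched like this; these invocations are as understood for the 2.1-era stress tool and the exact output layout may vary:

```shell
# Top-level commands and option groups:
bin/cassandra-stress help

# Suboptions of an option group, e.g. keyspace= under -schema:
bin/cassandra-stress help -schema

# Options accepted by a specific command:
bin/cassandra-stress help write
```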






[jira] [Updated] (CASSANDRA-7159) sstablemetadata command should print some more stuff

2014-11-26 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7159:
---
Fix Version/s: (was: 2.0.12)
   2.1.3

 sstablemetadata command should print some more stuff
 

 Key: CASSANDRA-7159
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7159
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Jeremiah Jordan
Assignee: Vladislav Sinjavin
Priority: Trivial
  Labels: lhf
 Fix For: 2.1.3

 Attachments: 
 CASSANDRA-7159_-_sstablemetadata_command_should_print_some_more_stuff.patch


 It would be nice if the sstablemetadata command printed out some more of the 
 stuff we track, like the min/max column names and the min/max token in the 
 file.





[2/2] cassandra git commit: Merge branch 'cassandra-2.1' into trunk

2014-11-26 Thread marcuse
Merge branch 'cassandra-2.1' into trunk

Conflicts:
CHANGES.txt
test/unit/org/apache/cassandra/io/sstable/SSTableRewriterTest.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/32b0a4e9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/32b0a4e9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/32b0a4e9

Branch: refs/heads/trunk
Commit: 32b0a4e954926a74491f24a9c9333ec16f0329f2
Parents: 999ce83 b106292
Author: Marcus Eriksson marc...@apache.org
Authored: Wed Nov 26 16:16:24 2014 +0100
Committer: Marcus Eriksson marc...@apache.org
Committed: Wed Nov 26 16:16:24 2014 +0100

--
 CHANGES.txt |   1 +
 .../cassandra/io/sstable/SSTableRewriter.java   |  74 ++--
 .../io/sstable/format/SSTableReader.java|  22 +++
 .../io/sstable/SSTableRewriterTest.java | 179 +++
 4 files changed, 225 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/32b0a4e9/CHANGES.txt
--
diff --cc CHANGES.txt
index 0529ca6,e5f7c28..162d579
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,39 -1,6 +1,40 @@@
 +3.0
 + * Fix aggregate fn results on empty selection, result column name,
 +   and cqlsh parsing (CASSANDRA-8229)
 + * Mark sstables as repaired after full repair (CASSANDRA-7586)
 + * Extend Descriptor to include a format value and refactor reader/writer 
apis (CASSANDRA-7443)
 + * Integrate JMH for microbenchmarks (CASSANDRA-8151)
 + * Keep sstable levels when bootstrapping (CASSANDRA-7460)
 + * Add Sigar library and perform basic OS settings check on startup 
(CASSANDRA-7838)
 + * Support for aggregation functions (CASSANDRA-4914)
 + * Remove cassandra-cli (CASSANDRA-7920)
 + * Accept dollar quoted strings in CQL (CASSANDRA-7769)
 + * Make assassinate a first class command (CASSANDRA-7935)
 + * Support IN clause on any clustering column (CASSANDRA-4762)
 + * Improve compaction logging (CASSANDRA-7818)
 + * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917)
 + * Do anticompaction in groups (CASSANDRA-6851)
 + * Support pure user-defined functions (CASSANDRA-7395, 7526, 7562, 7740, 
7781, 7929,
 +   7924, 7812, 8063, 7813)
 + * Permit configurable timestamps with cassandra-stress (CASSANDRA-7416)
 + * Move sstable RandomAccessReader to nio2, which allows using the
 +   FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050)
 + * Remove CQL2 (CASSANDRA-5918)
 + * Add Thrift get_multi_slice call (CASSANDRA-6757)
 + * Optimize fetching multiple cells by name (CASSANDRA-6933)
 + * Allow compilation in java 8 (CASSANDRA-7028)
 + * Make incremental repair default (CASSANDRA-7250)
 + * Enable code coverage thru JaCoCo (CASSANDRA-7226)
 + * Switch external naming of 'column families' to 'tables' (CASSANDRA-4369) 
 + * Shorten SSTable path (CASSANDRA-6962)
 + * Use unsafe mutations for most unit tests (CASSANDRA-6969)
 + * Fix race condition during calculation of pending ranges (CASSANDRA-7390)
 + * Fail on very large batch sizes (CASSANDRA-8011)
 + * Improve concurrency of repair (CASSANDRA-6455, 8208)
 +
 +
  2.1.3
+  * Handle abort() in SSTableRewriter properly (CASSANDRA-8320)
 - * Fix high size calculations for prepared statements (CASSANDRA-8231)
   * Centralize shared executors (CASSANDRA-8055)
   * Fix filtering for CONTAINS (KEY) relations on frozen collection
 clustering columns when the query is restricted to a single

http://git-wip-us.apache.org/repos/asf/cassandra/blob/32b0a4e9/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/32b0a4e9/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
--
diff --cc src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
index 54a244b,000..90f3b92
mode 100644,00..100644
--- a/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
@@@ -1,1888 -1,0 +1,1910 @@@
 +/*
 + * Licensed to the Apache Software Foundation (ASF) under one
 + * or more contributor license agreements.  See the NOTICE file
 + * distributed with this work for additional information
 + * regarding copyright ownership.  The ASF licenses this file
 + * to you under the Apache License, Version 2.0 (the
 + * "License"); you may not use this file except in compliance
 + * with the License.  You may obtain a copy of the License at
 + *
 + * http://www.apache.org/licenses/LICENSE-2.0
 + *
 + * Unless required by applicable law or agreed to in writing, software
 + * distributed under the License is 

cassandra git commit: Handle abort() properly in SSTableRewriter

2014-11-26 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 3faff8b15 -> b10629291


Handle abort() properly in SSTableRewriter

Patch by marcuse; reviewed by jmckenzie for CASSANDRA-8320


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b1062929
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b1062929
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b1062929

Branch: refs/heads/cassandra-2.1
Commit: b1062929185690567e4567e0e657b361c5105482
Parents: 3faff8b
Author: Marcus Eriksson marc...@apache.org
Authored: Tue Nov 18 07:07:30 2014 +0100
Committer: Marcus Eriksson marc...@apache.org
Committed: Wed Nov 26 16:00:51 2014 +0100

--
 CHANGES.txt |   1 +
 .../cassandra/io/sstable/SSTableReader.java |  22 +++
 .../cassandra/io/sstable/SSTableRewriter.java   |  74 ++--
 .../io/sstable/SSTableRewriterTest.java | 180 +++
 4 files changed, 226 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b1062929/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index f022b19..e5f7c28 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.3
+ * Handle abort() in SSTableRewriter properly (CASSANDRA-8320)
  * Fix high size calculations for prepared statements (CASSANDRA-8231)
  * Centralize shared executors (CASSANDRA-8055)
  * Fix filtering for CONTAINS (KEY) relations on frozen collection

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b1062929/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java 
b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
index a3e3cf5..1fe4330 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
@@ -202,6 +202,7 @@ public class SSTableReader extends SSTable
 private Object replaceLock = new Object();
 private SSTableReader replacedBy;
 private SSTableReader replaces;
+private SSTableReader sharesBfWith;
 private SSTableDeletingTask deletingTask;
 private Runnable runOnClose;
 
@@ -594,6 +595,14 @@ public class SSTableReader extends SSTable
 deleteFiles = !dfile.path.equals(replaces.dfile.path);
 }
 
+if (sharesBfWith != null)
+{
+closeBf = sharesBfWith.bf != bf;
+closeSummary = sharesBfWith.indexSummary != indexSummary;
+closeFiles = sharesBfWith.dfile != dfile;
+deleteFiles = !dfile.path.equals(sharesBfWith.dfile.path);
+}
+
 boolean deleteAll = false;
 if (release && isCompacted.get())
 {
@@ -928,6 +937,19 @@ public class SSTableReader extends SSTable
 }
 }
 
+/**
+ * this is used to avoid closing the bloom filter multiple times when 
finishing an SSTableRewriter
+ *
+ * note that the reason we don't use replacedBy is that we are not yet 
actually replaced
+ *
+ * @param newReader
+ */
+public void sharesBfWith(SSTableReader newReader)
+{
+assert openReason.equals(OpenReason.EARLY);
+this.sharesBfWith = newReader;
+}
+
 public SSTableReader cloneWithNewStart(DecoratedKey newStart, final 
Runnable runOnClose)
 {
 synchronized (replaceLock)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b1062929/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java 
b/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java
index 4d5a06f..d187e9d 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java
@@ -20,6 +20,7 @@ package org.apache.cassandra.io.sstable;
 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.HashMap;
+import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
@@ -75,6 +76,7 @@ public class SSTableRewriter
 private final ColumnFamilyStore cfs;
 
 private final long maxAge;
+private final List<SSTableReader> finished = new ArrayList<>();
 private final Set<SSTableReader> rewriting; // the readers we are 
rewriting (updated as they are replaced)
 private final Map<Descriptor, DecoratedKey> originalStarts = new 
HashMap<>(); // the start key for each reader we are rewriting
 private final Map<Descriptor, Integer> fileDescriptors = new 
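The `if (sharesBfWith != null)` hunk in the SSTableReader diff above closes each resource only when the handed-off reader does not also reference the same instance. A minimal standalone sketch of that "close only what is not shared" pattern — `Resource` and `Reader` are hypothetical stand-ins for the bloom filter, data file, and SSTableReader fields, not Cassandra classes:

```java
import java.util.concurrent.atomic.AtomicBoolean;

class Resource {
    final AtomicBoolean closed = new AtomicBoolean(false);
    void close() {
        // Guard so that double-closing is detectable in this sketch.
        if (!closed.compareAndSet(false, true))
            throw new IllegalStateException("closed twice");
    }
}

class Reader {
    Resource bf;          // bloom filter stand-in
    Resource dfile;       // data file stand-in
    Reader sharesBfWith;  // set when an early-opened reader hands off to a final one

    void releaseResources() {
        boolean closeBf = true, closeFiles = true;
        if (sharesBfWith != null) {
            // Only close what the other reader does NOT also reference.
            closeBf = sharesBfWith.bf != bf;
            closeFiles = sharesBfWith.dfile != dfile;
        }
        if (closeBf) bf.close();
        if (closeFiles) dfile.close();
    }
}

public class ShareCloseDemo {
    public static void main(String[] args) {
        Resource sharedBf = new Resource();
        Reader early = new Reader();
        Reader fin = new Reader();
        early.bf = sharedBf;          fin.bf = sharedBf;           // shared bloom filter
        early.dfile = new Resource(); fin.dfile = new Resource();  // distinct data files
        early.sharesBfWith = fin;

        early.releaseResources(); // closes early's dfile, leaves the shared bf open
        System.out.println("bf closed: " + sharedBf.closed.get());       // false
        System.out.println("dfile closed: " + early.dfile.closed.get()); // true

        fin.releaseResources();   // the shared bloom filter is closed exactly once
        System.out.println("bf closed: " + sharedBf.closed.get());       // true
    }
}
```

The point is that each shared resource reaches `close()` once, no matter which reader is released first.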

[1/2] cassandra git commit: Handle abort() properly in SSTableRewriter

2014-11-26 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/trunk 999ce832d -> 32b0a4e95


Handle abort() properly in SSTableRewriter

Patch by marcuse; reviewed by jmckenzie for CASSANDRA-8320


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b1062929
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b1062929
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b1062929

Branch: refs/heads/trunk
Commit: b1062929185690567e4567e0e657b361c5105482
Parents: 3faff8b
Author: Marcus Eriksson marc...@apache.org
Authored: Tue Nov 18 07:07:30 2014 +0100
Committer: Marcus Eriksson marc...@apache.org
Committed: Wed Nov 26 16:00:51 2014 +0100

--
 CHANGES.txt |   1 +
 .../cassandra/io/sstable/SSTableReader.java |  22 +++
 .../cassandra/io/sstable/SSTableRewriter.java   |  74 ++--
 .../io/sstable/SSTableRewriterTest.java | 180 +++
 4 files changed, 226 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b1062929/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index f022b19..e5f7c28 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.3
+ * Handle abort() in SSTableRewriter properly (CASSANDRA-8320)
  * Fix high size calculations for prepared statements (CASSANDRA-8231)
  * Centralize shared executors (CASSANDRA-8055)
  * Fix filtering for CONTAINS (KEY) relations on frozen collection

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b1062929/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java 
b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
index a3e3cf5..1fe4330 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
@@ -202,6 +202,7 @@ public class SSTableReader extends SSTable
 private Object replaceLock = new Object();
 private SSTableReader replacedBy;
 private SSTableReader replaces;
+private SSTableReader sharesBfWith;
 private SSTableDeletingTask deletingTask;
 private Runnable runOnClose;
 
@@ -594,6 +595,14 @@ public class SSTableReader extends SSTable
 deleteFiles = !dfile.path.equals(replaces.dfile.path);
 }
 
+if (sharesBfWith != null)
+{
+closeBf = sharesBfWith.bf != bf;
+closeSummary = sharesBfWith.indexSummary != indexSummary;
+closeFiles = sharesBfWith.dfile != dfile;
+deleteFiles = !dfile.path.equals(sharesBfWith.dfile.path);
+}
+
 boolean deleteAll = false;
 if (release && isCompacted.get())
 {
@@ -928,6 +937,19 @@ public class SSTableReader extends SSTable
 }
 }
 
+/**
+ * this is used to avoid closing the bloom filter multiple times when 
finishing an SSTableRewriter
+ *
+ * note that the reason we don't use replacedBy is that we are not yet 
actually replaced
+ *
+ * @param newReader
+ */
+public void sharesBfWith(SSTableReader newReader)
+{
+assert openReason.equals(OpenReason.EARLY);
+this.sharesBfWith = newReader;
+}
+
 public SSTableReader cloneWithNewStart(DecoratedKey newStart, final 
Runnable runOnClose)
 {
 synchronized (replaceLock)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b1062929/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java 
b/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java
index 4d5a06f..d187e9d 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java
@@ -20,6 +20,7 @@ package org.apache.cassandra.io.sstable;
 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.HashMap;
+import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
@@ -75,6 +76,7 @@ public class SSTableRewriter
 private final ColumnFamilyStore cfs;
 
 private final long maxAge;
+private final List<SSTableReader> finished = new ArrayList<>();
 private final Set<SSTableReader> rewriting; // the readers we are 
rewriting (updated as they are replaced)
 private final Map<Descriptor, DecoratedKey> originalStarts = new 
HashMap<>(); // the start key for each reader we are rewriting
private final Map<Descriptor, Integer> fileDescriptors = new HashMap<>(); 
// the 

[jira] [Commented] (CASSANDRA-8267) Only stream from unrepaired sstables during incremental repair

2014-11-26 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226337#comment-14226337
 ] 

Marcus Eriksson commented on CASSANDRA-8267:


hmm, I'll see how hard that would be (CASSANDRA-8110 that is)

In the meantime, I pushed a branch with a new IncrementalPrepareMessage that 
solves this issue: https://github.com/krummas/cassandra/commits/marcuse/8267

Note that it does not yet fail gracefully if you issue an incremental repair on 
an upgraded node (we might have to bump MessagingVersion to be able to do that; 
I really don't want to parse RELEASE_VERSION from gossip info). We might not 
have to fix it if #8110 works out here.

 Only stream from unrepaired sstables during incremental repair
 --

 Key: CASSANDRA-8267
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8267
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


 Seems we stream from all sstables even if we do incremental repair, we should 
 limit this to only stream from the unrepaired sstables if we do incremental 
 repair



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting

2014-11-26 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226341#comment-14226341
 ] 

Tupshin Harper commented on CASSANDRA-8371:
---

I'd also like to see an option to express max_sstable_age_days in a smaller 
unit of time. Right now, you can only set it to integer days. Particularly 
with high ingestion rates and low TTLs, I see legitimate use cases that could 
benefit from a value as low as an hour, or even less, in order to minimize any 
write amplification.

Just switching to seconds as the unit of time here would make a lot of sense 
to me. 365 days would then be expressible as 31536000. :)
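For reference, the conversion behind that number (a trivial sketch; this ticket does not define an actual seconds-based option name):

```java
// Converting a max_sstable_age expressed in days into seconds.
public class AgeUnits {
    static long daysToSeconds(long days) {
        return days * 24L * 60L * 60L; // 86,400 seconds per day
    }

    public static void main(String[] args) {
        System.out.println(daysToSeconds(365)); // 31536000
        System.out.println(daysToSeconds(1));   // 86400
    }
}
```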


 DateTieredCompactionStrategy is always compacting 
 --

 Key: CASSANDRA-8371
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8371
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: mck
Assignee: Björn Hegerfors
  Labels: compaction, performance
 Attachments: java_gc_counts_rate-month.png, read-latency.png, 
 sstables.png, vg2_iad-month.png


 Running 2.0.11 and having switched a table to 
 [DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602] we've seen that 
 disk IO and gc count increase, along with the number of reads happening in 
 the compaction hump of cfhistograms.
 Data, and generally performance, looks good, but compactions are always 
 happening, and pending compactions are building up.
 The schema for this is 
 {code}CREATE TABLE search (
   loginid text,
   searchid timeuuid,
   description text,
   searchkey text,
   searchurl text,
   PRIMARY KEY ((loginid), searchid)
 );{code}
 We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
 CQL executed against this keyspace, and traffic patterns, can be seen in 
 slides 7+8 of https://prezi.com/b9-aj6p2esft
 Attached are sstables-per-read and read-latency graphs from cfhistograms, and 
 screenshots of our munin graphs as we have gone from STCS, to LCS (week ~44), 
 to DTCS (week ~46).
 These screenshots are also found in the prezi on slides 9-11.
 [~pmcfadin], [~Bj0rn], 
 Can this be a consequence of occasional deleted rows, as is described under 
 (3) in the description of CASSANDRA-6602 ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting

2014-11-26 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-8371:
---
Description: 
Running 2.0.11 and having switched a table to 
[DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602] we've seen that 
disk IO and gc count increase, along with the number of reads happening in the 
compaction hump of cfhistograms.

Data, and generally performance, looks good, but compactions are always 
happening, and pending compactions are building up.

The schema for this is 
{code}CREATE TABLE search (
  loginid text,
  searchid timeuuid,
  description text,
  searchkey text,
  searchurl text,
  PRIMARY KEY ((loginid), searchid)
);{code}

We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
CQL executed against this keyspace, and traffic patterns, can be seen in slides 
7+8 of https://prezi.com/b9-aj6p2esft/

Attached are sstables-per-read and read-latency graphs from cfhistograms, and 
screenshots of our munin graphs as we have gone from STCS, to LCS (week ~44), 
to DTCS (week ~46).

These screenshots are also found in the prezi on slides 9-11.

[~pmcfadin], [~Bj0rn], 

Can this be a consequence of occasional deleted rows, as is described under (3) 
in the description of CASSANDRA-6602 ?


  was:
Running 2.0.11 and having switched a table to 
[DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602] we've seen that 
disk IO and gc count increase, along with the number of reads happening in the 
compaction hump of cfhistograms.

Data, and generally performance, looks good, but compactions are always 
happening, and pending compactions are building up.

The schema for this is 
{code}CREATE TABLE search (
  loginid text,
  searchid timeuuid,
  description text,
  searchkey text,
  searchurl text,
  PRIMARY KEY ((loginid), searchid)
);{code}

We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
CQL executed against this keyspace, and traffic patterns, can be seen in slides 
7+8 of https://prezi.com/b9-aj6p2esft

Attached are sstables-per-read and read-latency graphs from cfhistograms, and 
screenshots of our munin graphs as we have gone from STCS, to LCS (week ~44), 
to DTCS (week ~46).

These screenshots are also found in the prezi on slides 9-11.

[~pmcfadin], [~Bj0rn], 

Can this be a consequence of occasional deleted rows, as is described under (3) 
in the description of CASSANDRA-6602 ?



 DateTieredCompactionStrategy is always compacting 
 --

 Key: CASSANDRA-8371
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8371
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: mck
Assignee: Björn Hegerfors
  Labels: compaction, performance
 Attachments: java_gc_counts_rate-month.png, read-latency.png, 
 sstables.png, vg2_iad-month.png


 Running 2.0.11 and having switched a table to 
 [DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602] we've seen that 
 disk IO and gc count increase, along with the number of reads happening in 
 the compaction hump of cfhistograms.
 Data, and generally performance, looks good, but compactions are always 
 happening, and pending compactions are building up.
 The schema for this is 
 {code}CREATE TABLE search (
   loginid text,
   searchid timeuuid,
   description text,
   searchkey text,
   searchurl text,
   PRIMARY KEY ((loginid), searchid)
 );{code}
 We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
 CQL executed against this keyspace, and traffic patterns, can be seen in 
 slides 7+8 of https://prezi.com/b9-aj6p2esft/
 Attached are sstables-per-read and read-latency graphs from cfhistograms, and 
 screenshots of our munin graphs as we have gone from STCS, to LCS (week ~44), 
 to DTCS (week ~46).
 These screenshots are also found in the prezi on slides 9-11.
 [~pmcfadin], [~Bj0rn], 
 Can this be a consequence of occasional deleted rows, as is described under 
 (3) in the description of CASSANDRA-6602 ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting

2014-11-26 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226399#comment-14226399
 ] 

mck commented on CASSANDRA-8371:


[~Bj0rn] I'll be able to try some different settings the week after the London 
Cassandra Summit. For the moment we have switched back to LCS. If I read your 
comment correctly, I should try base_time_seconds=9483 (~2.6 hrs), based on 
our having collected ~82 GB with a TTL of 6 months? Given this is higher than 
one hour, should I still experiment with it?

 DateTieredCompactionStrategy is always compacting 
 --

 Key: CASSANDRA-8371
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8371
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: mck
Assignee: Björn Hegerfors
  Labels: compaction, performance
 Attachments: java_gc_counts_rate-month.png, read-latency.png, 
 sstables.png, vg2_iad-month.png


 Running 2.0.11 and having switched a table to 
 [DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602] we've seen that 
 disk IO and gc count increase, along with the number of reads happening in 
 the compaction hump of cfhistograms.
 Data, and generally performance, looks good, but compactions are always 
 happening, and pending compactions are building up.
 The schema for this is 
 {code}CREATE TABLE search (
   loginid text,
   searchid timeuuid,
   description text,
   searchkey text,
   searchurl text,
   PRIMARY KEY ((loginid), searchid)
 );{code}
 We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
 CQL executed against this keyspace, and traffic patterns, can be seen in 
 slides 7+8 of https://prezi.com/b9-aj6p2esft/
 Attached are sstables-per-read and read-latency graphs from cfhistograms, and 
 screenshots of our munin graphs as we have gone from STCS, to LCS (week ~44), 
 to DTCS (week ~46).
 These screenshots are also found in the prezi on slides 9-11.
 [~pmcfadin], [~Bj0rn], 
 Can this be a consequence of occasional deleted rows, as is described under 
 (3) in the description of CASSANDRA-6602 ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8376) Add support for multiple configuration files (or conf.d)

2014-11-26 Thread Omri Bahumi (JIRA)
Omri Bahumi created CASSANDRA-8376:
--

 Summary: Add support for multiple configuration files (or conf.d)
 Key: CASSANDRA-8376
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8376
 Project: Cassandra
  Issue Type: New Feature
Reporter: Omri Bahumi


I'm using Chef to generate cassandra.yaml.
Part of this file is the seed_provider, which is based on the Chef inventory.
Changes to this file (due to Chef inventory changes when adding/removing 
Cassandra nodes) cause a restart, which is not desirable.

The Chef way of handling this is to split the config file into two config 
files, one containing only the seed_provider and the other containing the 
rest of the config.
Only the latter will cause a restart to Cassandra.

This is achievable by either:
1. Specifying multiple config files to Cassandra
2. Specifying a conf.d directory
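A rough sketch of the conf.d-style merge semantics being requested, assuming a simplified flat key: value format instead of full YAML (file contents and names below are illustrative only):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConfDirMerge {
    // Parse a simplified flat "key: value" config (not full YAML).
    static Map<String, String> parse(String text) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            int i = line.indexOf(':');
            if (i > 0 && !line.trim().startsWith("#"))
                out.put(line.substring(0, i).trim(), line.substring(i + 1).trim());
        }
        return out;
    }

    // Later files override earlier ones, mirroring typical conf.d semantics.
    static Map<String, String> merge(List<String> filesInOrder) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (String file : filesInOrder)
            merged.putAll(parse(file));
        return merged;
    }

    public static void main(String[] args) {
        String base = "cluster_name: Test\nnum_tokens: 256";
        String seeds = "seed_provider: 10.0.0.1,10.0.0.2"; // regenerated by Chef
        Map<String, String> conf = merge(List.of(base, seeds));
        System.out.println(conf.get("cluster_name"));  // Test
        System.out.println(conf.get("seed_provider")); // 10.0.0.1,10.0.0.2
    }
}
```

With this split, Chef could rewrite only the seeds fragment without touching the file whose changes trigger a restart.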



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8377) Coordinated Commitlog Replay

2014-11-26 Thread Nick Bailey (JIRA)
Nick Bailey created CASSANDRA-8377:
--

 Summary: Coordinated Commitlog Replay
 Key: CASSANDRA-8377
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8377
 Project: Cassandra
  Issue Type: New Feature
Reporter: Nick Bailey
 Fix For: 3.0


Commit log archiving and replay can be used to support point in time restores 
on a cluster. Unfortunately, at the moment that is only true when the topology 
of the cluster is exactly the same as when the commitlogs were archived. This 
is because commitlogs need to be replayed on a node that is a replica for those 
writes.

To support replaying commitlogs when the topology has changed we should have a 
tool that replays the writes in a commitlog as if they were writes from a 
client and will get coordinated to the correct replicas.
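The topology problem can be illustrated with a toy token ring — a single-replica sketch with made-up node names and tokens, not Cassandra's actual partitioner or replication strategy:

```java
import java.util.Map;
import java.util.TreeMap;

public class ReplayRouting {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String name, long token) { ring.put(token, name); }
    void removeNode(long token)           { ring.remove(token); }

    // Owner = first node at or after the key's token, wrapping around the ring.
    String ownerOf(long keyToken) {
        Map.Entry<Long, String> e = ring.ceilingEntry(keyToken);
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        ReplayRouting ring = new ReplayRouting();
        ring.addNode("A", 100); ring.addNode("B", 200); ring.addNode("C", 300);
        long keyToken = 150;
        System.out.println(ring.ownerOf(keyToken)); // B

        // Topology changes: B leaves, D joins. Replaying the raw commitlog on
        // B's old disk would put the write on the wrong node, but a coordinated
        // client write re-resolves the current owner.
        ring.removeNode(200); ring.addNode("D", 180);
        System.out.println(ring.ownerOf(keyToken)); // D
    }
}
```

This is why replaying "as if they were writes from a client" fixes the restore: ownership is recomputed against the current ring instead of the archived one.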



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7431) Hadoop integration does not perform reverse DNS lookup correctly on EC2

2014-11-26 Thread Olivier Michallat (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226452#comment-14226452
 ] 

Olivier Michallat commented on CASSANDRA-7431:
--

Just wanted to mention that there is a third option coming soon: Netty 4.1 will 
ship with a built-in DNS client, which also allows reverse lookups (I've tested 
with a nightly build).

In the driver, I'm using the JNDI approach for now, but will switch to Netty 
when we upgrade to 4.1.

 Hadoop integration does not perform reverse DNS lookup correctly on EC2
 ---

 Key: CASSANDRA-7431
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7431
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Paulo Motta
Assignee: Paulo Motta
 Attachments: 2.0-CASSANDRA-7431.txt


 The split assignment on AbstractColumnFamilyInputFormat:247 performs a reverse 
 DNS lookup of Cassandra IPs in order to preserve locality in Hadoop (task 
 trackers are identified by hostnames).
 However, the reverse lookup of an EC2 IP does not yield the EC2 hostname of 
 that endpoint when running from an EC2 instance due to the use of 
 InetAddress.getHostName().
 In order to show this, consider the following piece of code:
 {code:title=DnsResolver.java|borderStyle=solid}
 import java.net.InetAddress;

 public class DnsResolver {
     public static void main(String[] args) throws Exception {
         InetAddress namenodePublicAddress = InetAddress.getByName(args[0]);
         System.out.println("getHostAddress: " + namenodePublicAddress.getHostAddress());
         System.out.println("getHostName: " + namenodePublicAddress.getHostName());
     }
 }
 {code}
 When this code is run from my machine to perform reverse lookup of an EC2 IP, 
 the output is:
 {code:none}
 ➜  java DnsResolver 54.201.254.99
 getHostAddress: 54.201.254.99
 getHostName: ec2-54-201-254-99.compute-1.amazonaws.com
 {code}
 When this code is executed from inside an EC2 machine, the output is:
 {code:none}
 ➜  java DnsResolver 54.201.254.99
 getHostAddress: 54.201.254.99
 getHostName: 54.201.254.99
 {code}
 However, when using linux tools such as host or dig, the EC2 hostname is 
 properly resolved from the EC2 instance, so there's some problem with Java's 
 InetAddress.getHostName() and EC2.
 Two consequences of this bug during AbstractColumnFamilyInputFormat split 
 definition are:
 1) If the Hadoop cluster is configured to use EC2 public DNS, the locality 
 will be lost, because Hadoop will try to match the CFIF split location 
 (public IP) with the task tracker location (public DNS), so no matches will 
 be found.
 2) If the Cassandra nodes' broadcast_address is set to public IPs, all Hadoop 
 communication will be done via the public IP, which will incur additional 
 transfer charges. If the public IP is mapped to the EC2 DNS during split 
 definition, then when the task is executed, ColumnFamilyRecordReader will 
 resolve the public DNS to the private IP of the instance, so there will be no 
 additional charges.
 A similar bug was filed in the WHIRR project: 
 https://issues.apache.org/jira/browse/WHIRR-128



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8285) OOME in Cassandra 2.0.11

2014-11-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226462#comment-14226462
 ] 

Jonathan Ellis commented on CASSANDRA-8285:
---

Can you bisect?  (Perhaps with a tiny heap to reproduce OOM faster.)

 OOME in Cassandra 2.0.11
 

 Key: CASSANDRA-8285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8285
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.11 + java-driver 2.0.8-SNAPSHOT
 Cassandra 2.0.11 + ruby-driver 1.0-beta
Reporter: Pierre Laporte
Assignee: Aleksey Yeschenko
 Attachments: OOME_node_system.log, gc-1416849312.log.gz, gc.log.gz, 
 heap-usage-after-gc-zoom.png, heap-usage-after-gc.png, system.log.gz


 We ran drivers 3-days endurance tests against Cassandra 2.0.11 and C* crashed 
 with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
 2.0.8-snapshot.
 Attached are :
 | OOME_node_system.log | The system.log of one Cassandra node that crashed |
 | gc.log.gz | The GC log on the same node |
 | heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle 
 |
 | heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |
 Workload :
 Our test executes 5 CQL statements (select, insert, select, delete, select) 
 for a given unique id, over 3 days, using multiple threads. There is no 
 change in the workload during the test.
 Symptoms :
 In the attached log, it seems something starts in Cassandra between 
 2014-11-06 10:29:22 and 2014-11-06 10:45:32.  This causes an allocation that 
 fills the heap.  We eventually get stuck in a Full GC storm and get an OOME 
 in the logs.
 I have run the java-driver tests against Cassandra 1.2.19 and 2.1.1.  The 
 error does not occur.  It seems specific to 2.0.11.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8360) In DTCS, always compact SSTables in the same time window, even if they are fewer than min_threshold

2014-11-26 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226463#comment-14226463
 ] 

Björn Hegerfors commented on CASSANDRA-8360:


OK, sounds fair. That essentially means that we want to treat the incoming 
window specially. A question worth asking is what we want the incoming window 
for. Currently it is "keep the last unit of base_time_seconds compacted at all 
times". While it respects min_threshold, a value written early in the window 
will essentially be constantly recompacted once every (min_threshold - 1) 
subsequent sstable flushes. I'm fully aware that this might be a bad idea, or 
rather, I wasn't sure it was the right thing to do. Really, it's completely 
inspired by STCS's min_sstable_size, which seems to do the same thing, i.e. not 
respect the logarithmic-complexity tree-like merging on small enough SSTables. 
(Reminds me a bit of insertion sort being fastest on small enough arrays.) So 
base_time_seconds has the same purpose. A problem is that it might be harder to 
set a good default on time than on size.

Setting min_sstable_size in STCS to 0 has a near-equivalent in DTCS: setting 
base_time_seconds to 1. The window sizes will be base_time_seconds times powers 
of min_threshold (with up to min_threshold windows of each size), starting at 1 
second. Even with this setting, data that is an hour old will be in 
near-hour-sized windows. The only meaningful difference is that SSTables 2 
seconds and 10 seconds old will not be in the same window. What I mean by this 
is that setting base_time_seconds to 1 is perfectly reasonable; it's just the 
same as setting min_sstable_size to 0 or 1 in STCS. I just want to make it 
clear that base_time_seconds is not really something that you should set to 1 
hour (3600) just because you want SSTables older than 1 hour to be in nice 
1-hour chunks. If you set it to 900 with min_threshold=4, SSTables older than 
1 hour will still be in perfect 1-hour chunks (because preceding the up-to-4 
900-second chunks comes a 4*900=3600-second chunk).
So I guess respecting min_threshold in the 'incoming window' is just as right 
as respecting min_threshold when compacting SSTables smaller than 
min_sstable_size in STCS. Which I believe it does. So there's my roundabout way 
of coming to the same conclusion as you, [~jbellis] :). I just have this 
feeling that the meaning of base_time_seconds isn't well understood.
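Björn's 900-second example can be checked numerically: window sizes grow as base_time_seconds multiplied by powers of min_threshold, so base_time_seconds=900 with min_threshold=4 still yields a 3600-second (1-hour) tier. A rough sketch of that progression (not the actual DTCS bucketing code):

```java
import java.util.ArrayList;
import java.util.List;

public class DtcsWindowSizes {
    // Window sizes grow as base * min_threshold^k: with base=900s and
    // min_threshold=4, tiers are 900s, 3600s (1h), 14400s (4h), ...
    static List<Long> windowSizes(long baseSeconds, int minThreshold, int tiers) {
        List<Long> sizes = new ArrayList<>();
        long size = baseSeconds;
        for (int k = 0; k < tiers; k++) {
            sizes.add(size);
            size *= minThreshold;
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(windowSizes(900, 4, 4)); // [900, 3600, 14400, 57600]
    }
}
```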

 In DTCS, always compact SSTables in the same time window, even if they are 
 fewer than min_threshold
 ---

 Key: CASSANDRA-8360
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8360
 Project: Cassandra
  Issue Type: Improvement
Reporter: Björn Hegerfors
Priority: Minor

 DTCS uses min_threshold to decide how many time windows of the same size that 
 need to accumulate before merging into a larger window. The age of an SSTable 
 is determined as its min timestamp, and it always falls into exactly one of 
 the time windows. If multiple SSTables fall into the same window, DTCS 
 considers compacting them, but if they are fewer than min_threshold, it 
 decides not to do it.
 When do more than one but fewer than min_threshold SSTables end up in the 
 same time window (other than the current window), you might ask? In its 
 current state, DTCS can spill some extra SSTables into bigger windows when 
 the previous window wasn't fully compacted, which happens all the time when 
 the latest window stops being the current one. Also, repairs and hints can 
 put new SSTables in old windows.
 I think, and [~jjordan] agreed in a comment on CASSANDRA-6602, that DTCS 
 should ignore min_threshold and compact tables in the same windows regardless 
 of how few they are. I guess max_threshold should still be respected.
 [~jjordan] suggested that this should apply to all windows but the current 
 window, where all the new SSTables end up. That could make sense. I'm not 
 clear on whether compacting many SSTables at once is more cost efficient or 
 not, when it comes to the very newest and smallest SSTables. Maybe compacting 
 as soon as 2 SSTables are seen is fine if the initial window size is small 
 enough? I guess the opposite could be the case too; that the very newest 
 SSTables should be compacted very many at a time?
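The proposed change amounts to a different compaction trigger for non-current windows. A hypothetical predicate illustrating the rule (my naming, not a patch):

```java
// Hypothetical sketch of the proposed trigger: in any window other than the
// current one, compact whenever there are at least 2 SSTables; the current
// window keeps the min_threshold requirement. max_threshold would still cap
// how many SSTables are compacted at once, at candidate-selection time.
public class WindowTrigger {
    static boolean shouldCompact(int count, boolean currentWindow, int minThreshold) {
        return currentWindow ? count >= minThreshold : count >= 2;
    }

    public static void main(String[] args) {
        System.out.println(shouldCompact(3, false, 4)); // true: old window, 3 SSTables
        System.out.println(shouldCompact(3, true, 4));  // false: current window
    }
}
```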



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8378) Allow Java Driver to be used for executing queries in CQL unit tests

2014-11-26 Thread Tyler Hobbs (JIRA)
Tyler Hobbs created CASSANDRA-8378:
--

 Summary: Allow Java Driver to be used for executing queries in CQL 
unit tests
 Key: CASSANDRA-8378
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8378
 Project: Cassandra
  Issue Type: Improvement
  Components: Tests
Reporter: Tyler Hobbs
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 2.1.3


In CASSANDRA-7563, a CQLTester method was added to allow a CQL query to be 
executed through the Java driver with a given protocol version.  We should 
extend this to make it possible to execute all CQLTester queries through the 
driver with a specific protocol version if a {{-D}} argument is used (similar 
to the existing argument for using prepared statements).
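Mirroring the existing prepared-statement switch, the mechanism would presumably be a system property read at test start-up. A sketch, where the property name and routing method are purely illustrative:

```java
// Illustrative sketch: route CQLTester queries through the Java driver when a
// -D flag is set, e.g. -Dcassandra.test.use_java_driver=true. The property
// name and the route() method are hypothetical, not from the patch.
public class QueryRouting {
    static final boolean USE_DRIVER = Boolean.getBoolean("cassandra.test.use_java_driver");

    static String route(String query) {
        // In the real CQLTester this would dispatch to either the internal
        // execution path or a Java driver session with a chosen protocol version.
        return USE_DRIVER ? "driver: " + query : "internal: " + query;
    }

    public static void main(String[] args) {
        // With the property unset, queries stay on the internal path.
        System.out.println(route("SELECT * FROM t"));
    }
}
```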





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226474#comment-14226474
 ] 

Tupshin Harper commented on CASSANDRA-7438:
---

[~xedin] I'm lost in too many layers of snark and indirection (not just yours). 
Can you elaborate on what strategy you actually find appealing?

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap; keys are still stored in 
 the JVM heap as ByteBuffers. 
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing API (ICache), and the implementation 
 needs to have safe memory access, low memory overhead, and as few memcpys as 
 possible.
 We might also want to make this cache configurable.





[jira] [Commented] (CASSANDRA-7159) sstablemetadata command should print some more stuff

2014-11-26 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226501#comment-14226501
 ] 

Philip Thompson commented on CASSANDRA-7159:


[~vsinjavin], a few changes to your use of braces will be needed to meet 
the [code style|http://wiki.apache.org/cassandra/CodeStyle], and I'm not sure 
what the policy on wildcard imports is. 

Functionally, when I run tools/bin/sstablemetadata with the patch, I'm getting 
a fatal ConfigurationException. {code}tools/bin/sstablemetadata 
/Users/philipthompson/cstar/cassandra/data/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-1-Data.db
12:14:07.999 [main] DEBUG o.a.c.i.s.m.MetadataSerializer - Load metadata for 
/Users/philipthompson/cstar/cassandra/data/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-1
SSTable: 
/Users/philipthompson/cstar/cassandra/data/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-1
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
12:14:08.217 [main] ERROR o.a.c.config.DatabaseDescriptor - Fatal configuration 
error
org.apache.cassandra.exceptions.ConfigurationException: Expecting URI in 
variable: [cassandra.config].  Please prefix the file with file:/// for local 
files or file://<server>/ for remote files.  Aborting.
at 
org.apache.cassandra.config.YamlConfigurationLoader.getStorageConfigURL(YamlConfigurationLoader.java:73)
 ~[main/:na]
at 
org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:84)
 ~[main/:na]
at 
org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:158)
 ~[main/:na]
at 
org.apache.cassandra.config.DatabaseDescriptor.clinit(DatabaseDescriptor.java:133)
 ~[main/:na]
at org.apache.cassandra.io.util.Memory.clinit(Memory.java:36) 
[main/:na]
at 
org.apache.cassandra.io.sstable.IndexSummary$IndexSummarySerializer.deserialize(IndexSummary.java:254)
 [main/:na]
at 
org.apache.cassandra.tools.SSTableMetadataViewer.printMinMaxTokens(SSTableMetadataViewer.java:130)
 [main/:na]
at 
org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:68)
 [main/:na]
Expecting URI in variable: [cassandra.config].  Please prefix the file with 
file:/// for local files or file://<server>/ for remote files.  Aborting.
Fatal configuration error; unable to start. See log for stacktrace.{code}

Running without the patch is working fine for me. Is there something that needs 
to be done differently for the call to {{IndexSummary.serializer.deserialize}} 
to succeed?
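For reference, the loader wants the cassandra.config system property to hold a URI, so an offline tool path that ends up touching DatabaseDescriptor (here via Memory's class init) needs something like the following set first. A minimal illustration of the URI form the loader accepts (the yaml path below is a placeholder):

```java
// Minimal illustration of the URI form that YamlConfigurationLoader expects
// in the cassandra.config system property: file:/// for local files.
// The path below is a placeholder, not a real installation path.
public class ConfigUrlDemo {
    public static void main(String[] args) {
        System.setProperty("cassandra.config", "file:///etc/cassandra/cassandra.yaml");
        System.out.println(System.getProperty("cassandra.config"));
    }
}
```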

 sstablemetadata command should print some more stuff
 

 Key: CASSANDRA-7159
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7159
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Jeremiah Jordan
Assignee: Vladislav Sinjavin
Priority: Trivial
  Labels: lhf
 Fix For: 2.1.3

 Attachments: 
 CASSANDRA-7159_-_sstablemetadata_command_should_print_some_more_stuff.patch


 It would be nice if the sstablemetadata command printed out some more of the 
 stuff we track.  Like the Min/Max column names and the min/max token in the 
 file.





[jira] [Updated] (CASSANDRA-6198) Distinguish streaming traffic at network level

2014-11-26 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-6198:
---
Reviewer: Brandon Williams  (was: Philip Thompson)

As Brandon said, Boolean.getBoolean seems like it would be cleaner here. 
Otherwise I'm +1. Sending back to Brandon for final review.
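For context on the suggestion: Boolean.getBoolean(name) looks up a system property and parses it in one step, replacing the usual two-step pattern. A small demonstration (the property name is just for the demo):

```java
// Boolean.getBoolean(name) returns true only if the named system property
// exists and equals "true" (case-insensitive) - equivalent to, but terser
// than, parsing System.getProperty by hand.
public class BooleanProp {
    public static void main(String[] args) {
        System.setProperty("cassandra.demo.flag", "true"); // demo-only property name
        boolean twoStep = Boolean.parseBoolean(System.getProperty("cassandra.demo.flag", "false"));
        boolean oneStep = Boolean.getBoolean("cassandra.demo.flag");
        System.out.println(twoStep == oneStep); // true
    }
}
```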

 Distinguish streaming traffic at network level
 --

 Key: CASSANDRA-6198
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6198
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Norman Maurer
Priority: Minor
 Fix For: 2.1.3

 Attachments: 
 0001-CASSANDRA-6198-Set-IPTOS_THROUGHPUT-on-streaming-con.txt


 It would be nice to have some information in the TCP packet which network 
 teams can inspect to distinguish between streaming traffic and other organic 
 cassandra traffic. This is very useful for monitoring WAN traffic. 
 Here are some solutions:
 1) Use a different port for streaming. 
 2) Add some IP header. 





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226516#comment-14226516
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Some short notes about the last changes in OHC:

* changed from block-oriented allocation to Unsafe or JEMalloc (if available)
* added stamped locks in off-heap (quite simple and very efficient)
* triggering cleanup + rehash via cas-side trigger works fine
* extended the benchmark tool to specify different workload characteristics 
(read/write ratio, key distribution, value length distribution - distribution 
code taken from cassandra-stress)
* still working on a good (mostly contention free) LRU strategy

One thing I noticed during benchmarking is that (concurrent?) allocations of 
large areas (several MB) take up to 50-60ms (OSX 10.10, 2.6GHz Core i7 - no 
swap, of course), while small regions are allocated quite fast (total roundtrip 
for a put is ~0.1ms at the 98th percentile). It might be viable to implement 
some mixture for memory allocation: Unsafe/JEMalloc for small regions (e.g. 
< 1MB) and pre-allocated blocks for large regions. A configuration value could 
determine the number of large-region blocks to keep immediately available. Just 
an idea...
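The mixture described above could be sketched as follows; the 1MB threshold, pool size, and class names are all illustrative (not OHC code), and ByteBuffer.allocateDirect stands in for the Unsafe/JEMalloc path:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Rough sketch of the proposed hybrid: small regions go straight to the
// allocator; large regions come from a pool of pre-allocated blocks so the
// slow multi-MB allocation happens up front, not on the put path.
// All names and thresholds here are illustrative, not OHC code.
public class HybridAllocator {
    static final int LARGE_THRESHOLD = 1 << 20; // 1 MB; configurable in the proposal
    private final ArrayDeque<ByteBuffer> largePool = new ArrayDeque<>();

    HybridAllocator(int preallocatedLargeBlocks, int largeBlockSize) {
        for (int i = 0; i < preallocatedLargeBlocks; i++)
            largePool.add(ByteBuffer.allocateDirect(largeBlockSize));
    }

    ByteBuffer allocate(int size) {
        if (size < LARGE_THRESHOLD)
            return ByteBuffer.allocateDirect(size); // fast path for small regions
        ByteBuffer block = largePool.poll();        // reuse a pre-allocated block
        return block != null ? block : ByteBuffer.allocateDirect(size);
    }

    public static void main(String[] args) {
        HybridAllocator alloc = new HybridAllocator(2, 4 << 20);
        System.out.println(alloc.allocate(128).capacity());      // 128
        System.out.println(alloc.allocate(2 << 20).capacity());  // 4194304 (pooled block)
    }
}
```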


 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap; keys are still stored in 
 the JVM heap as ByteBuffers. 
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing API (ICache), and the implementation 
 needs to have safe memory access, low memory overhead, and as few memcpys as 
 possible.
 We might also want to make this cache configurable.





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226531#comment-14226531
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

When are large regions being allocated? How common is that use case? Large 
allocations would normally only be for table resizing, right? 

Could the row cache contain very large values with wide rows?

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap; keys are still stored in 
 the JVM heap as ByteBuffers. 
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing API (ICache), and the implementation 
 needs to have safe memory access, low memory overhead, and as few memcpys as 
 possible.
 We might also want to make this cache configurable.





[jira] [Updated] (CASSANDRA-8261) Clean up schema metadata classes

2014-11-26 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-8261:
-
Attachment: 8261-isolate-serialization-code.txt

 Clean up schema metadata classes
 

 Key: CASSANDRA-8261
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8261
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
Priority: Minor
 Fix For: 3.0

 Attachments: 8261-isolate-hadcoded-system-tables.txt, 
 8261-isolate-serialization-code.txt, 8261-isolate-thrift-code.txt


 While working on CASSANDRA-6717, I've made some general cleanup changes to 
 schema metadata classes that distract from its core purpose. Also, being 
 distracted from it by other things, every time I come back to it I get a bit 
 of rebase hell.
 Thus I'm isolating those changes into a separate issue here, hoping to commit 
 them one by one, before I go back and finalize CASSANDRA-6717.
 The changes include:
 - moving all the toThrift/fromThrift conversion code to ThriftConversion, 
 where it belongs
 - moving the compiled system CFMetaData objects away from CFMetaData (to 
 SystemKeyspace and TracesKeyspace)
 - isolating legacy toSchema/fromSchema code into a separate class 
 (LegacySchemaTables - former DefsTables)
 - refactoring CFMetaData/KSMetaData fields to match CQL CREATE TABLE syntax, 
 and encapsulating more things in 
 CompactionOptions/CompressionOptions/ReplicationOptions classes
 - moving the definition classes to the new 'schema' package





[jira] [Updated] (CASSANDRA-8261) Clean up schema metadata classes

2014-11-26 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-8261:
-
Attachment: (was: 8261-isolate-serialization-code.txt)

 Clean up schema metadata classes
 

 Key: CASSANDRA-8261
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8261
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
Priority: Minor
 Fix For: 3.0

 Attachments: 8261-isolate-hadcoded-system-tables.txt, 
 8261-isolate-serialization-code.txt, 8261-isolate-thrift-code.txt


 While working on CASSANDRA-6717, I've made some general cleanup changes to 
 schema metadata classes that distract from its core purpose. Also, being 
 distracted from it by other things, every time I come back to it I get a bit 
 of rebase hell.
 Thus I'm isolating those changes into a separate issue here, hoping to commit 
 them one by one, before I go back and finalize CASSANDRA-6717.
 The changes include:
 - moving all the toThrift/fromThrift conversion code to ThriftConversion, 
 where it belongs
 - moving the compiled system CFMetaData objects away from CFMetaData (to 
 SystemKeyspace and TracesKeyspace)
 - isolating legacy toSchema/fromSchema code into a separate class 
 (LegacySchemaTables - former DefsTables)
 - refactoring CFMetaData/KSMetaData fields to match CQL CREATE TABLE syntax, 
 and encapsulating more things in 
 CompactionOptions/CompressionOptions/ReplicationOptions classes
 - moving the definition classes to the new 'schema' package





[jira] [Commented] (CASSANDRA-8261) Clean up schema metadata classes

2014-11-26 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226547#comment-14226547
 ] 

Aleksey Yeschenko commented on CASSANDRA-8261:
--

Attached a rebased version.

 Clean up schema metadata classes
 

 Key: CASSANDRA-8261
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8261
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
Priority: Minor
 Fix For: 3.0

 Attachments: 8261-isolate-hadcoded-system-tables.txt, 
 8261-isolate-serialization-code.txt, 8261-isolate-thrift-code.txt


 While working on CASSANDRA-6717, I've made some general cleanup changes to 
 schema metadata classes that distract from its core purpose. Also, being 
 distracted from it by other things, every time I come back to it I get a bit 
 of rebase hell.
 Thus I'm isolating those changes into a separate issue here, hoping to commit 
 them one by one, before I go back and finalize CASSANDRA-6717.
 The changes include:
 - moving all the toThrift/fromThrift conversion code to ThriftConversion, 
 where it belongs
 - moving the compiled system CFMetaData objects away from CFMetaData (to 
 SystemKeyspace and TracesKeyspace)
 - isolating legacy toSchema/fromSchema code into a separate class 
 (LegacySchemaTables - former DefsTables)
 - refactoring CFMetaData/KSMetaData fields to match CQL CREATE TABLE syntax, 
 and encapsulating more things in 
 CompactionOptions/CompressionOptions/ReplicationOptions classes
 - moving the definition classes to the new 'schema' package





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226552#comment-14226552
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}One thing I noticed during benchmarking is that (concurrent?){quote}
Yes, use these options; feel free to make them more configurable if you need to.
{code}
public static final String TYPE = "c";
public static final String THREADS = "t";
public static final String SIZE = "s";
public static final String ITERATIONS = "i";
public static final String PREFIX_SIZE = "p";
{code} 

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap; keys are still stored in 
 the JVM heap as ByteBuffers. 
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing API (ICache), and the implementation 
 needs to have safe memory access, low memory overhead, and as few memcpys as 
 possible.
 We might also want to make this cache configurable.





[jira] [Commented] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-26 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226557#comment-14226557
 ] 

Tyler Hobbs commented on CASSANDRA-7563:


There's one oddity around renaming a field in a UDT that's used by a UDF.  The 
function will continue to work, even if it uses the old name for the renamed 
field.  For example:

{noformat}
cqlsh> CREATE TYPE ks1.type1 (a int);
cqlsh> CREATE FUNCTION ks1.func1 (a frozen<ks1.type1>) RETURNS int LANGUAGE 
java AS $$ return Integer.valueOf(a.getInt("a")); $$;
cqlsh> CREATE TABLE ks1.table1 (a int PRIMARY KEY, b frozen<ks1.type1>);
cqlsh> INSERT INTO ks1.table1 (a, b) VALUES (0, {a: 0});
cqlsh> SELECT ks1.func1(b) FROM ks1.table1;

 ks1.func1(b)
--
0

(1 rows)
cqlsh> ALTER TYPE ks1.type1 RENAME a TO b;
cqlsh> SELECT ks1.func1(b) FROM ks1.table1;

 ks1.func1(b)
--
0

(1 rows)
{noformat}

Note that the function gets field "a", which was renamed to "b".  I think I'm 
okay with this behavior, but I'd like to check with [~slebresne] as well.

 UserType, TupleType and collections in UDFs
 ---

 Key: CASSANDRA-7563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563
 Project: Cassandra
  Issue Type: Bug
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.0

 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 
 7563v4.txt, 7563v5.txt, 7563v6.txt, 7563v7.txt, 7563v8-diff-diff.txt, 
 7563v8.txt


 * is the Java Driver required as a dependency?
 * is it possible to extract parts of the Java Driver for UDT/TT/collection 
 support?
 * CQL {{DROP TYPE}} must check UDFs
 * must check keyspace access permissions (if those exist)





[jira] [Commented] (CASSANDRA-6198) Distinguish streaming traffic at network level

2014-11-26 Thread Norman Maurer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226587#comment-14226587
 ] 

Norman Maurer commented on CASSANDRA-6198:
--

Agreed, Boolean.getBoolean(...) would be better. Should I adjust the patch?

 Distinguish streaming traffic at network level
 --

 Key: CASSANDRA-6198
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6198
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Norman Maurer
Priority: Minor
 Fix For: 2.1.3

 Attachments: 
 0001-CASSANDRA-6198-Set-IPTOS_THROUGHPUT-on-streaming-con.txt


 It would be nice to have some information in the TCP packet which network 
 teams can inspect to distinguish between streaming traffic and other organic 
 cassandra traffic. This is very useful for monitoring WAN traffic. 
 Here are some solutions:
 1) Use a different port for streaming. 
 2) Add some IP header. 





[jira] [Commented] (CASSANDRA-8321) SStablesplit behavior changed

2014-11-26 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226589#comment-14226589
 ] 

Philip Thompson commented on CASSANDRA-8321:


I applied your patch to CCM.

 SStablesplit behavior changed
 -

 Key: CASSANDRA-8321
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8321
 Project: Cassandra
  Issue Type: Bug
Reporter: Philip Thompson
Assignee: Marcus Eriksson
Priority: Minor
 Fix For: 2.1.3

 Attachments: 0001-ccm-fix-file-finding.patch, 
 0001-remove-tmplink-for-offline-compactions.patch


 The dtest sstablesplit_test.py has begun failing due to an incorrect number 
 of sstables being created after running sstablesplit.
 http://cassci.datastax.com/job/cassandra-2.1_dtest/559/changes#detail1
 is the run where the failure began.
 In 2.1.x, the test expects 7 sstables to be created after split, but instead 
 12 are being created. All of the data is there, and the sstables add up to 
 the expected size, so this simply may be a change in default behavior. The 
 test runs sstablesplit without the --size argument, and the default has not 
 changed, so it is unexpected that the behavior would change in a minor point 
 release.





[jira] [Commented] (CASSANDRA-6198) Distinguish streaming traffic at network level

2014-11-26 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226593#comment-14226593
 ] 

Brandon Williams commented on CASSANDRA-6198:
-

Please do, and I will perform a final review and commit after the holiday.

 Distinguish streaming traffic at network level
 --

 Key: CASSANDRA-6198
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6198
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Norman Maurer
Priority: Minor
 Fix For: 2.1.3

 Attachments: 
 0001-CASSANDRA-6198-Set-IPTOS_THROUGHPUT-on-streaming-con.txt


 It would be nice to have some information in the TCP packet which network 
 teams can inspect to distinguish between streaming traffic and other organic 
 cassandra traffic. This is very useful for monitoring WAN traffic. 
 Here are some solutions:
 1) Use a different port for streaming. 
 2) Add some IP header. 





[jira] [Commented] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a decent sstable leveling

2014-11-26 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226614#comment-14226614
 ] 

Nikolai Grigoriev commented on CASSANDRA-8301:
--

I have attempted to write a simple prototype (very ugly :) ) of such a tool. I 
am very interested in it because I do suffer from this problem. In fact, 
without such a tool I simply cannot bootstrap a node. I have tried, and the 
node *never* recovers. 

So, anyway, I have tried my prototype on a freshly bootstrapped node and it 
seems to be working. Instead of the initial 7.5K pending compactions I got only 
about 600; a few hours later it is down to ~450 and seems to be going down. 
cfstats also looks quite good (to me ;) ):

{code}
SSTable count: 6311
SSTables in each level: [571/4, 10, 80, 1411/1000, 4239, 0, 0, 0, 0]
{code}

I do have some sstables at L0 because the node is taking normal (heavy) traffic 
at the same time, but this number is already down from the original ~700.

I think I could try to make the prototype tool less ugly and submit 
it here, if you do not mind.
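The greedy idea behind such a tool can be sketched: since the L0 sstables are largely non-overlapping, runs of them (sorted by token) can be assigned to the highest levels first, up to each level's capacity. A toy version that only models the level-capacity bookkeeping (nothing here is the actual prototype; the tenfold-per-level capacity is the usual LCS simplification):

```java
// Toy sketch of offline level assignment: given non-overlapping SSTables
// sorted by first token, fill the highest levels first, respecting each
// level's capacity (level n holds up to 10^n SSTables in this simplification).
// Illustrative only - not the prototype attached to this ticket.
public class LevelAssigner {
    // Returns a level index for each sstable, filling maxLevel down to L1;
    // anything left over stays at 0 (i.e. remains in L0).
    static int[] assign(int sstableCount, int maxLevel) {
        int[] levels = new int[sstableCount];
        int next = 0;
        for (int level = maxLevel; level >= 1 && next < sstableCount; level--) {
            long capacity = (long) Math.pow(10, level); // simplified LCS budget
            for (long i = 0; i < capacity && next < sstableCount; i++)
                levels[next++] = level;
        }
        return levels;
    }

    public static void main(String[] args) {
        int[] levels = assign(12000, 4);
        System.out.println(levels[0]);     // 4: first sstables land in L4
        System.out.println(levels[11999]); // 0: 10000+1000+100+10 fit, the rest stay in L0
    }
}
```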

 Create a tool that given a bunch of sstables creates a decent sstable 
 leveling
 

 Key: CASSANDRA-8301
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson

 In old versions of cassandra (i.e. not trunk/3.0), when bootstrapping a new 
 node, you will end up with a ton of files in L0, and it might be extremely 
 painful to get LCS to compact into a new leveling.
 We could probably exploit the fact that we have many non-overlapping sstables 
 in L0, and offline-bump those sstables into higher levels. It does not need 
 to be perfect, just get the majority of the data into L1+ without creating 
 overlaps.
 So, the suggestion is to create an offline tool that looks at the range each 
 sstable covers and tries to bump it as high as possible in the leveling.





[jira] [Updated] (CASSANDRA-8192) AssertionError in Memory.java

2014-11-26 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-8192:
---
Fix Version/s: 2.1.3

 AssertionError in Memory.java
 -

 Key: CASSANDRA-8192
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8192
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Windows-7-32 bit, 3GB RAM, Java 1.7.0_67
Reporter: Andreas Schnitzerling
Assignee: Joshua McKenzie
 Fix For: 2.1.3

 Attachments: cassandra.bat, cassandra.yaml, 
 logdata-onlinedata-ka-196504-CompressionInfo.zip, printChunkOffsetErrors.txt, 
 system-compactions_in_progress-ka-47594-CompressionInfo.zip, 
 system-sstable_activity-jb-25-Filter.zip, system.log, system_AssertionTest.log


 Since updating 1 of 12 nodes from 2.1.0-rel to 2.1.1-rel, an exception occurs 
 during start-up.
 {panel:title=system.log}
 ERROR [SSTableBatchOpen:1] 2014-10-27 09:44:00,079 CassandraDaemon.java:153 - 
 Exception in thread Thread[SSTableBatchOpen:1,5,main]
 java.lang.AssertionError: null
   at org.apache.cassandra.io.util.Memory.size(Memory.java:307) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.init(CompressionMetadata.java:135)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:83)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:50)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:48)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:766) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:725) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:402) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:302) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:438) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) 
 ~[na:1.7.0_55]
   at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.7.0_55]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
 [na:1.7.0_55]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
 [na:1.7.0_55]
   at java.lang.Thread.run(Unknown Source) [na:1.7.0_55]
 {panel}
 In the attached log you can still see as well CASSANDRA-8069 and 
 CASSANDRA-6283.





[jira] [Commented] (CASSANDRA-8192) AssertionError in Memory.java

2014-11-26 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226626#comment-14226626
 ] 

Joshua McKenzie commented on CASSANDRA-8192:


Is either of you able to test 2.1.1 or 2.1.2 with an empty dataset and see 
if you can reproduce the assertion?  Confirmation that it's the data will help 
narrow this down.
[~alsoloplan] - Write survey mode might be what you're looking for regarding 
testing out new nodes: 
[*Link*|http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-1-live-traffic-sampling]
[~Andie78] If you could direct off-topic questions to the users mailing list, 
that will help keep JIRA cleaner for future readers.  :)

I'll poke around those files you attached and see if anything jumps out at me - 
thanks for attaching them.

 AssertionError in Memory.java
 -

 Key: CASSANDRA-8192
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8192
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Windows-7-32 bit, 3GB RAM, Java 1.7.0_67
Reporter: Andreas Schnitzerling
Assignee: Joshua McKenzie
 Fix For: 2.1.3

 Attachments: cassandra.bat, cassandra.yaml, 
 logdata-onlinedata-ka-196504-CompressionInfo.zip, printChunkOffsetErrors.txt, 
 system-compactions_in_progress-ka-47594-CompressionInfo.zip, 
 system-sstable_activity-jb-25-Filter.zip, system.log, system_AssertionTest.log


 Since updating 1 of 12 nodes from 2.1.0-rel to 2.1.1-rel, an exception occurs 
 during start-up.
 {panel:title=system.log}
 ERROR [SSTableBatchOpen:1] 2014-10-27 09:44:00,079 CassandraDaemon.java:153 - 
 Exception in thread Thread[SSTableBatchOpen:1,5,main]
 java.lang.AssertionError: null
   at org.apache.cassandra.io.util.Memory.size(Memory.java:307) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.init(CompressionMetadata.java:135)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:83)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:50)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:48)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:766) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:725) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:402) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:302) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:438) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) 
 ~[na:1.7.0_55]
   at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.7.0_55]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
 [na:1.7.0_55]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
 [na:1.7.0_55]
   at java.lang.Thread.run(Unknown Source) [na:1.7.0_55]
 {panel}
 In the attached log you can still see as well CASSANDRA-8069 and 
 CASSANDRA-6283.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8337) mmap underflow during validation compaction

2014-11-26 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-8337:
---
Attachment: 8337_v1.txt

Attaching a patch that makes JVMStabilityInspector print out the path of the 
corrupt sstable on either a stop_paranoid policy failure or a die policy 
failure.  If you can run with the above patch on a test cluster, it should tell 
us which files you're having trouble with.

If we could get one of the corrupt sstables attached to this ticket, that would 
be a big help, given that scrub reports that the tables are ok.  If not, that's 
completely understandable, but thus far I've had no luck reproducing corrupted 
data like this.

 mmap underflow during validation compaction
 ---

 Key: CASSANDRA-8337
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8337
 Project: Cassandra
  Issue Type: Bug
Reporter: Alexander Sterligov
Assignee: Joshua McKenzie
 Attachments: 8337_v1.txt, thread_dump


 During full parallel repair I often get errors like the following
 {quote}
 [2014-11-19 01:02:39,355] Repair session 116beaf0-6f66-11e4-afbb-c1c082008cbe 
 for range (3074457345618263602,-9223372036854775808] failed with error 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #116beaf0-6f66-11e4-afbb-c1c082008cbe on iss/target_state_history, 
 (3074457345618263602,-9223372036854775808]] Validation failed in 
 /95.108.242.19
 {quote}
 In the node's log there are always the same exceptions:
 {quote}
 ERROR [ValidationExecutor:2] 2014-11-19 01:02:10,847 
 JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting 
 forcefully due to:
 org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: 
 mmap segment underflow; remaining is 15 but 47 requested
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1518)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1385)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1315)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1706)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1694)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:276)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getScanners(WrappingCompactionStrategy.java:320)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:917)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:97)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:557)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
 ~[na:1.7.0_51]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_51]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  [na:1.7.0_51]
 at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
 Caused by: java.io.IOException: mmap segment underflow; remaining is 15 but 
 47 requested
 at 
 org.apache.cassandra.io.util.MappedFileDataInput.readBytes(MappedFileDataInput.java:135)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:348) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:327)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1460)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 ... 13 common frames omitted
 {quote}
 Now I'm using the die disk_failure_policy to detect such conditions faster, 
 but I get them even with the stop policy.
 Streams related to a host with such an exception hang. A thread dump is 
 attached. Only a restart helps.
 After a retry I get errors from other nodes.
 scrub doesn't help and reports that the sstables are ok.
 Sequential repairs don't cause such exceptions.
 Load is about 1000 write rps and 50 read rps per node.





[jira] [Created] (CASSANDRA-8379) Remove filename and line number flags from default logging configuration

2014-11-26 Thread Matt Brown (JIRA)
Matt Brown created CASSANDRA-8379:
-

 Summary: Remove filename and line number flags from default 
logging configuration
 Key: CASSANDRA-8379
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8379
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matt Brown
Priority: Minor


In the logging configuration that ships with the cassandra distribution 
(log4j-server.properties in 2.0, and logback.xml in 2.1), the rolling file 
appender is configured to print the file name and the line number of each 
logging event:

{code}log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line 
%L) %m%n{code}

Both the log4j and logback documentation warn that generating the filename/line 
information is not a cheap operation.

From the [log4j 
docs|http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html]:

 WARNING Generating caller location information is extremely slow and should 
 be avoided unless execution speed is not an issue.

From [logback docs|http://logback.qos.ch/manual/layouts.html]:

 Generating the file information is not particularly fast. Thus, its use 
 should be avoided unless execution speed is not an issue.

The implementation for both involves creating a new Throwable and then printing 
the stack trace for the throwable to find the file name or line number. I don't 
have data to back this up but the conventional advice that throwing exceptions 
is slow has to do with filling in the stacktrace.

It would make more sense for the logging configuration to simply use the 
logger/category name (%c) instead of the file name and to remove the line 
number part.
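
Concretely, the suggested change would look something like this (a sketch of the 
idea; the exact layout is up to the patch):

```
# before: %F (file name) and line %L require a stack walk per logging event
log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n
# after: use the logger/category name (%c) instead
log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %c %m%n
```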





[jira] [Commented] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a decent sstable leveling

2014-11-26 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226708#comment-14226708
 ] 

Marcus Eriksson commented on CASSANDRA-8301:


cool, what is your heuristic for finding the level?

I thought a bit about it and figured that we could probably estimate the level by 
ordering sstables by the number of other sstables they overlap, then putting 
the ones that overlap the most in the lowest levels.

I.e., an sstable in L1 is bound to overlap ~10 sstables in L2 and ~100 in L3, 
meaning it would overlap 110 sstables if we only have 3 levels. An sstable in L2 
would overlap 10 in L3 and only one in L1, for a total of 11, and an sstable in 
the top level would only overlap one in L2 and one in L1. This assumes L0 was 
empty when bootstrapping, which is most often wrong, and I haven't given much 
thought to how to fix that.
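
A toy sketch of that estimate (my own illustration, not anything in the Cassandra 
codebase), using the usual LCS fanout of 10 and levels numbered 1..maxLevel:

```java
// Sketch of the overlap-count estimate described above: an sstable at a given
// level overlaps ~10x more sstables at each deeper level, plus roughly one
// sstable at each shallower level.
public class OverlapEstimate {
    static final int FANOUT = 10;

    /** Estimated number of other sstables that one sstable in `level` overlaps. */
    static long estimatedOverlaps(int level, int maxLevel) {
        long total = level - 1;       // ~one overlapping sstable per shallower level
        long perLevel = 1;
        for (int j = level + 1; j <= maxLevel; j++) {
            perLevel *= FANOUT;       // ~10x more overlaps at each deeper level
            total += perLevel;
        }
        return total;
    }

    public static void main(String[] args) {
        // With 3 levels: L1 overlaps 110, L2 overlaps 11, the top level overlaps 2.
        System.out.println(estimatedOverlaps(1, 3)); // 110
        System.out.println(estimatedOverlaps(2, 3)); // 11
        System.out.println(estimatedOverlaps(3, 3)); // 2
    }
}
```

Sorting by this estimate would then put the most-overlapping sstables in the 
lowest levels, as described.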

 Create a tool that given a bunch of sstables creates a decent sstable 
 leveling
 

 Key: CASSANDRA-8301
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson

 In old versions of cassandra (i.e. not trunk/3.0), when bootstrapping a new 
 node, you will end up with a ton of files in L0 and it might be extremely 
 painful to get LCS to compact into a new leveling
 We could probably exploit the fact that we have many non-overlapping sstables 
 in L0, and offline-bump those sstables into higher levels. It does not need 
 to be perfect, just get the majority of the data into L1+ without creating 
 overlaps.
 So, suggestion is to create an offline tool that looks at the range each 
 sstable covers and tries to bump it as high as possible in the leveling.





[jira] [Updated] (CASSANDRA-8379) Remove filename and line number flags from default logging configuration

2014-11-26 Thread Matt Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Brown updated CASSANDRA-8379:
--
Attachment: trunk-8379.txt

 Remove filename and line number flags from default logging configuration
 

 Key: CASSANDRA-8379
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8379
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matt Brown
Priority: Minor
 Fix For: 2.1.3

 Attachments: trunk-8379.txt


 In the logging configuration that ships with the cassandra distribution 
 (log4j-server.properties in 2.0, and logback.xml in 2.1), the rolling file 
 appender is configured to print the file name and the line number of each 
 logging event:
 {code}log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line 
 %L) %m%n{code}
 Both the log4j and logback documentation warn that generating the 
 filename/line information is not a cheap operation.
 From the [log4j 
 docs|http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html]:
  WARNING Generating caller location information is extremely slow and should 
  be avoided unless execution speed is not an issue.
 From [logback docs|http://logback.qos.ch/manual/layouts.html]:
  Generating the file information is not particularly fast. Thus, its use 
  should be avoided unless execution speed is not an issue.
 The implementation for both involves creating a new Throwable and then 
 printing the stack trace for the throwable to find the file name or line 
 number. I don't have data to back this up but the conventional advice that 
 throwing exceptions is slow has to do with filling in the stacktrace.
 It would make more sense for the logging configuration to simply use the 
 logger/category name (%c) instead of the file name and to remove the line 
 number part.





[jira] [Updated] (CASSANDRA-8379) Remove filename and line number flags from default logging configuration

2014-11-26 Thread Matt Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Brown updated CASSANDRA-8379:
--
Attachment: cassandra-2.0-8379.txt

 Remove filename and line number flags from default logging configuration
 

 Key: CASSANDRA-8379
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8379
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matt Brown
Priority: Minor
 Fix For: 2.1.3

 Attachments: cassandra-2.0-8379.txt, trunk-8379.txt


 In the logging configuration that ships with the cassandra distribution 
 (log4j-server.properties in 2.0, and logback.xml in 2.1), the rolling file 
 appender is configured to print the file name and the line number of each 
 logging event:
 {code}log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line 
 %L) %m%n{code}
 Both the log4j and logback documentation warn that generating the 
 filename/line information is not a cheap operation.
 From the [log4j 
 docs|http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html]:
  WARNING Generating caller location information is extremely slow and should 
  be avoided unless execution speed is not an issue.
 From [logback docs|http://logback.qos.ch/manual/layouts.html]:
  Generating the file information is not particularly fast. Thus, its use 
  should be avoided unless execution speed is not an issue.
 The implementation for both involves creating a new Throwable and then 
 printing the stack trace for the throwable to find the file name or line 
 number. I don't have data to back this up but the conventional advice that 
 throwing exceptions is slow has to do with filling in the stacktrace.
 It would make more sense for the logging configuration to simply use the 
 logger/category name (%c) instead of the file name and to remove the line 
 number part.





[jira] [Commented] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a decent sstable leveling

2014-11-26 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226744#comment-14226744
 ] 

Nikolai Grigoriev commented on CASSANDRA-8301:
--

The logic I have built is very simple. And probably has some fundamental flaws 
:)

First I calculate the target size for each level (in bytes) to accommodate all 
my data - i.e. to distribute the total size of all my sstables. This also gives 
me the maximum level to target. Then I take all sstables for the given CF and 
sort them by the beginning (left) of their bounds. Then I start from the highest 
level (L4 in my example) and iterate over that list of sstables. I grab the 
first sstable, remember its bounds, and put it in the current level. Then I skip 
to the next one that does not intersect with those bounds, assign it to the 
current level, and update the bounds. And so on until the end of the list or 
until I use up the available size. Then I move to the lower level and repeat on 
the remaining sstables. And so on. The remainder goes to L0, where overlaps are 
allowed (right?).

I also had to come up with some logic to exclude the sstables that cover a large 
range of tokens. Most likely these are the ones that were recently written at 
L0 on the source node - they cover whatever was recently written into them, 
right? I exclude those from my logic and leave them for L0.

Or did I get it completely wrong?
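
The greedy pass described above could be sketched roughly like this (hypothetical 
illustration only - it ignores the per-level size budget and the wide-sstable 
exclusion, and is not the actual tool):

```java
import java.util.*;

// Greedy leveling sketch: sort sstables by left bound, then fill levels from
// the highest down with non-overlapping sstables; whatever remains (i.e.
// whatever would overlap) stays in L0, where overlaps are allowed.
public class GreedyLeveling {
    record Range(long left, long right) {}

    /** Returns level -> assigned ranges; index 0 holds the L0 remainder. */
    static List<List<Range>> assign(List<Range> sstables, int maxLevel) {
        List<Range> remaining = new ArrayList<>(sstables);
        remaining.sort(Comparator.comparingLong(Range::left));
        List<List<Range>> levels = new ArrayList<>();
        for (int i = 0; i <= maxLevel; i++) levels.add(new ArrayList<>());
        for (int level = maxLevel; level >= 1; level--) {
            long bound = Long.MIN_VALUE;   // right edge of the last range taken
            Iterator<Range> it = remaining.iterator();
            while (it.hasNext()) {
                Range r = it.next();
                if (r.left() > bound) {    // does not intersect the current bounds
                    levels.get(level).add(r);
                    bound = r.right();
                    it.remove();
                }
            }
        }
        levels.get(0).addAll(remaining);   // overlapping leftovers stay in L0
        return levels;
    }

    public static void main(String[] args) {
        List<Range> in = List.of(new Range(0, 10), new Range(5, 15), new Range(20, 30));
        List<List<Range>> levels = assign(in, 2);
        System.out.println(levels.get(2).size()); // 2: (0,10) and (20,30) fit in L2
        System.out.println(levels.get(1).size()); // 1: the overlapping (5,15) drops to L1
    }
}
```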

 Create a tool that given a bunch of sstables creates a decent sstable 
 leveling
 

 Key: CASSANDRA-8301
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson

 In old versions of cassandra (i.e. not trunk/3.0), when bootstrapping a new 
 node, you will end up with a ton of files in L0 and it might be extremely 
 painful to get LCS to compact into a new leveling
 We could probably exploit the fact that we have many non-overlapping sstables 
 in L0, and offline-bump those sstables into higher levels. It does not need 
 to be perfect, just get the majority of the data into L1+ without creating 
 overlaps.
 So, suggestion is to create an offline tool that looks at the range each 
 sstable covers and tries to bump it as high as possible in the leveling.





[jira] [Comment Edited] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a decent sstable leveling

2014-11-26 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226744#comment-14226744
 ] 

Nikolai Grigoriev edited comment on CASSANDRA-8301 at 11/26/14 8:04 PM:


The logic I have built is very simple. And probably has some fundamental flaws 
:)

First I calculate the target size for each level (in bytes) to accommodate all 
my data - i.e. to distribute the total size of all my sstables. This also gives 
me the maximum level to target. Then I take all sstables for the given CF, sort 
them by the beginning (left) of their bounds. Then I start from the highest 
level (L4 in my example) and iterate over that list of sstables. I grab the 
first sstable, remember its bounds, put it to the current level. Then skip to 
the next one that does not intersect with these bounds, assign it to the 
current level and change the bounds. And so on until the end of the list or 
until I use all available size. Then I move to the lower level and repeat it on 
the remaining sstables. And so on. The remainder goes to L0 where overlaps are 
allowed (right?).

I had to also come up with some logic to exclude the sstables that cover large 
range of tokens. Most likely these are the ones that were recently written at 
L0 on the original node - they cover whatever was recently written into them, 
right? I ignore those from my logic and leave them for L0.

Or did I get it completely wrong?


was (Author: ngrigor...@gmail.com):
The logic I have built is very simple. And probably has some fundamental flaws 
:)

First I calculate the target size for each level (in bytes) to accommodate all 
my data - i.e. to distribute the total size of all my sstables. This also gives 
me the maximum level to target. Then I take all sstables for the given CF, sort 
them by the beginning (left) of their bounds. Then I start from the highest 
level (L4 in my example) and iterate over that list of sstables. I grab the 
first sstable, remember its bounds, put it to the current level. Then skip to 
the next one that does not intersect with these bounds, assign it to the 
current level and change the bounds. And so on until the end of the list or 
until I use all available size. Then I move to the lower level and repeat it on 
the remaining sstables. And so on. The remainder goes to L0 where overlaps are 
allowed (right?).

I had to also come up with some logic to exclude the sstables that cover large 
range of tokens. Most likely these are the ones that were recently written at 
L0 on the source node - they cover whatever was recently written into them, 
right? I ignore those from my logic and leave them for L0.

Or did I get it completely wrong?

 Create a tool that given a bunch of sstables creates a decent sstable 
 leveling
 

 Key: CASSANDRA-8301
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson

 In old versions of cassandra (i.e. not trunk/3.0), when bootstrapping a new 
 node, you will end up with a ton of files in L0 and it might be extremely 
 painful to get LCS to compact into a new leveling
 We could probably exploit the fact that we have many non-overlapping sstables 
 in L0, and offline-bump those sstables into higher levels. It does not need 
 to be perfect, just get the majority of the data into L1+ without creating 
 overlaps.
 So, suggestion is to create an offline tool that looks at the range each 
 sstable covers and tries to bump it as high as possible in the leveling.





[jira] [Updated] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress

2014-11-26 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-7918:

Attachment: (was: 0001-CASSANDRA-7918-stress-graphing.patch)

 Provide graphing tool along with cassandra-stress
 -

 Key: CASSANDRA-7918
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Benedict
Assignee: Ryan McGuire
Priority: Minor

 Whilst cstar makes some pretty graphs, they're a little limited and also 
 require you to run your tests through it. It would be useful to be able to 
 graph results from any stress run easily.





cassandra git commit: Fix PaxosStateTest

2014-11-26 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk 32b0a4e95 -> c5cec0046


Fix PaxosStateTest


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c5cec004
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c5cec004
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c5cec004

Branch: refs/heads/trunk
Commit: c5cec004628c2b669d48c4cf565bfa5482a766f3
Parents: 32b0a4e
Author: Aleksey Yeschenko alek...@apache.org
Authored: Wed Nov 26 23:17:09 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Wed Nov 26 23:17:09 2014 +0300

--
 test/unit/org/apache/cassandra/service/PaxosStateTest.java | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c5cec004/test/unit/org/apache/cassandra/service/PaxosStateTest.java
--
diff --git a/test/unit/org/apache/cassandra/service/PaxosStateTest.java 
b/test/unit/org/apache/cassandra/service/PaxosStateTest.java
index d41d89f..7f4bc49 100644
--- a/test/unit/org/apache/cassandra/service/PaxosStateTest.java
+++ b/test/unit/org/apache/cassandra/service/PaxosStateTest.java
@@ -45,6 +45,7 @@ public class PaxosStateTest
 public static void setUpClass() throws Throwable
 {
 SchemaLoader.loadSchema();
+SchemaLoader.schemaDefinition("PaxosStateTest");
 }
 
 @AfterClass
@@ -56,7 +57,7 @@ public class PaxosStateTest
 @Test
 public void testCommittingAfterTruncation() throws Exception
 {
-ColumnFamilyStore cfs = 
Keyspace.open("Keyspace1").getColumnFamilyStore("Standard1");
+ColumnFamilyStore cfs = 
Keyspace.open("PaxosStateTestKeyspace1").getColumnFamilyStore("Standard1");
 DecoratedKey key = Util.dk("key" + System.nanoTime());
 CellName name = Util.cellname("col");
 ByteBuffer value = ByteBufferUtil.bytes(0);



[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress

2014-11-26 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226766#comment-14226766
 ] 

Benedict commented on CASSANDRA-7918:
-

My plane journey was spent manically trying various graphing options to give 
everything you need to assess a branch in one view, and clearly. I'd hate for 
that to go to waste. The new patch as it stands only produces the graphs we've 
always had - I'd like to see cstar and our bundled tool produce _better 
graphs_. Each of the graphs in the gnuplot output is designed to let you 
see more information; it's all normalised, coloured and scattered so you can 
distinguish the results at each moment in time and overall. Too often with the 
web output I have to simply glance at the average to tell what's going on (or 
guess-and-peck numbers for zooming in), and have to click on each different 
stat, which is laborious (and, let's be honest, we don't do it thoroughly, we 
just peck at a few... or perhaps I'm lazier than everyone else :))

To elaborate on the alternative: there are ten graphs in one view in the 
gnuplot version, scaled so you can tell everything they want you to know 
without clicking once. The left-most of each graph normalises each moment of 
each run against the base run, so that variability can easily be broken down 
across the run. The middle graph plots the raw data so you can get a feel for 
its shape, and the final graph plots the median, quartiles and deciles. The 
latencies are all plotted with selected scatters / lines to make it easy to 
distinguish which p-range we're looking at, even when they cross. GC is also 
plotted specially, as a cumulative run, since this teases out differences much 
more clearly.

I have nothing against discarding the gnuplot approach, but I'd like to see 
whatever solution we produce deliver really great graphs that allow us to make 
decisions more easily and more accurately. Right now I'd prefer to put the 
gnuplot work into cstar than the other way around. Though I can tell the hatred 
for it runs deep!

 Provide graphing tool along with cassandra-stress
 -

 Key: CASSANDRA-7918
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Benedict
Assignee: Ryan McGuire
Priority: Minor

 Whilst cstar makes some pretty graphs, they're a little limited and also 
 require you to run your tests through it. It would be useful to be able to 
 graph results from any stress run easily.





[jira] [Updated] (CASSANDRA-8087) Multiple non-DISTINCT rows returned when page_size set

2014-11-26 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-8087:
---
Attachment: 8087-2.0.txt

The root of the problem ended up being {{countCQL3Rows}} erroneously being set 
to true in {{PagedRangeCommand}} because the logic in 
{{PagedRangeCommand.countCQL3Rows()}} didn't handle the DISTINCT with static 
columns case.  In 2.1 this isn't a problem because we serialize 
{{countCQL3Rows}} as part of the message.

The attached patch attempts to update the {{PagedRangeCommand.countCQL3Rows()}} 
logic to handle DISTINCT with statics.  I believe this logic is safe, but I'm 
not 100% sure.  (It doesn't seem to cause any regressions in the tests.)  The 
patch also adds a bit of documentation and some toStrings() to clarify things 
that were confusing to me when debugging.

Last, the patch fixes an overcounting problem in 
{{SliceQueryFilter.lastCounted()}}.  This fix ended up not being required for 
this ticket, but I figured that it's good to prevent a possible future bug.  
The overcounting happens because in 
{{SliceQueryFilter.collectReducedColumns()}}, we have to call 
{{columnCounter.count()}} _before_ adding cells to the container, and we only 
break once the count _exceeds_ the limit.  So, if we exceed the limit, the 
counter will have overcounted by one.  In practice, this doesn't seem to cause 
any problems (due to the conditions under which {{collectReducedColumns()}} 
is called and when we set a limit on the slice), but it's definitely erroneous.

I also extended the failing dtest here: 
https://github.com/thobbs/cassandra-dtest/tree/CASSANDRA-8087
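
The off-by-one can be illustrated with a toy loop (my own illustration, not the 
actual Cassandra code) that follows the same count-then-break pattern:

```java
import java.util.ArrayList;
import java.util.List;

// Toy demo of the overcounting pattern described above: the counter is
// incremented before the element is added, and the loop only breaks once the
// count *exceeds* the limit, so the final count ends up at limit + 1 even
// though only `limit` elements were actually collected.
public class OvercountDemo {
    static int collectWithOvercount(int available, int limit, List<Integer> container) {
        int count = 0;
        for (int i = 0; i < available; i++) {
            count++;                   // counted before the add, like columnCounter.count()
            if (count > limit) break;  // break only once the limit is exceeded
            container.add(i);
        }
        return count;
    }

    public static void main(String[] args) {
        List<Integer> out = new ArrayList<>();
        int counted = collectWithOvercount(10, 3, out);
        System.out.println(out.size()); // 3 cells actually collected
        System.out.println(counted);    // 4: overcounted by one
    }
}
```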

 Multiple non-DISTINCT rows returned when page_size set
 --

 Key: CASSANDRA-8087
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8087
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Adam Holmberg
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 2.0.12

 Attachments: 8087-2.0.txt


 Using the following statements to reproduce:
 {code}
 CREATE TABLE test (
 k int,
 p int,
 s int static,
 PRIMARY KEY (k, p)
 );
 INSERT INTO test (k, p) VALUES (1, 1);
 INSERT INTO test (k, p) VALUES (1, 2);
 SELECT DISTINCT k, s FROM test ;
 {code}
 Native clients that set result_page_size in the query message receive 
 multiple non-distinct rows back (one per clustered value p in row k).
 This is only reproduced on 2.0.10; it does not appear in 2.1.0.
 It does not appear in cqlsh for 2.0.10 because cqlsh uses Thrift.
 See https://datastax-oss.atlassian.net/browse/PYTHON-164 for background.





[jira] [Updated] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress

2014-11-26 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-7918:

Attachment: 7918.patch

fwiw, I've updated my branch again to fix the case where you run without 
thread counts specified. It automatically breaks the run out into multiple runs, 
with " - X threads" appended to the revision name.

example: http://ryanmcguire.info/ds/jira/7918-multi-threads.html

I'll give your comments some more thought [~benedict], thanks.

 Provide graphing tool along with cassandra-stress
 -

 Key: CASSANDRA-7918
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Benedict
Assignee: Ryan McGuire
Priority: Minor
 Attachments: 7918.patch


 Whilst cstar makes some pretty graphs, they're a little limited and also 
 require you to run your tests through it. It would be useful to be able to 
 graph results from any stress run easily.





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226795#comment-14226795
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

Caching entire rows when the rows are very large seems like a problematic 
workload for a variety of reasons. The overhead of repopulating each cache 
entry on insertion is not good.

Does the storage engine always materialize entire rows into memory for every 
query?

60 milliseconds is much longer than it takes to copy several megabytes, so it is 
expensive even with large rows, although the rest of the cost of materializing 
the row might dominate.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap; keys are still stored in 
 the JVM heap as ByteBuffers.
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing API (ICache), and the implementation 
 needs to have safe memory access, low memory overhead and as few memcpys as 
 possible.
 We might also want to make this cache configurable.





cassandra git commit: Validate functionality of different JSR-223 providers in UDFs.

2014-11-26 Thread mishail
Repository: cassandra
Updated Branches:
  refs/heads/trunk c5cec0046 -> e13121318


Validate functionality of different JSR-223 providers in UDFs.

patch by Robert Stupp; reviewed by Mikhail Stepura for CASSANDRA-7874


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e1312131
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e1312131
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e1312131

Branch: refs/heads/trunk
Commit: e13121318b1a3186f75a652c28ca317edac719d4
Parents: c5cec00
Author: Robert Stupp sn...@snazy.de
Authored: Wed Nov 26 12:51:37 2014 -0800
Committer: Mikhail Stepura mish...@apache.org
Committed: Wed Nov 26 12:51:37 2014 -0800

--
 .gitignore| 12 +
 bin/cassandra.bat | 23 
 bin/cassandra.in.bat  | 21 +++
 bin/cassandra.in.sh   | 21 +++
 conf/cassandra-env.ps1| 37 ++
 lib/jsr223/clojure/README.txt |  8 ++
 lib/jsr223/groovy/README.txt  | 35 
 lib/jsr223/jaskell/README.txt |  5 
 lib/jsr223/jruby/README.txt   | 54 ++
 lib/jsr223/jython/README.txt  | 33 +++
 lib/jsr223/scala/README.txt   | 37 ++
 11 files changed, 286 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e1312131/.gitignore
--
diff --git a/.gitignore b/.gitignore
index c7cf9fd..fd37407 100644
--- a/.gitignore
+++ b/.gitignore
@@ -57,3 +57,15 @@ target/
 *.tmp
 .DS_Store
 Thumbs.db
+
+# JSR223
+lib/jsr223/clojure/*.jar
+lib/jsr223/groovy/*.jar
+lib/jsr223/jaskell/*.jar
+lib/jsr223/jruby/*.jar
+lib/jsr223/jruby/jni
+lib/jsr223/jruby/ruby
+lib/jsr223/jython/*.jar
+lib/jsr223/jython/cachedir
+lib/jsr223/scala/*.jar
+

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e1312131/bin/cassandra.bat
--
diff --git a/bin/cassandra.bat b/bin/cassandra.bat
index a16bf1a..3445af2 100644
--- a/bin/cassandra.bat
+++ b/bin/cassandra.bat
@@ -85,6 +85,29 @@ goto :eof
 
 REM -
 :okClasspath
+
+REM JSR223 - collect all JSR223 engines' jars
+for /D %%P in (%CASSANDRA_HOME%\lib\jsr223\*.*) do (
+   for %%i in (%%P\*.jar) do call :append %%i
+)
+
+REM JSR223/JRuby - set ruby lib directory
+if EXIST %CASSANDRA_HOME%\lib\jsr223\jruby\ruby (
+set JAVA_OPTS=%JAVA_OPTS% -Djruby.lib=%CASSANDRA_HOME%\lib\jsr223\jruby
+)
+REM JSR223/JRuby - set ruby JNI libraries root directory
+if EXIST %CASSANDRA_HOME%\lib\jsr223\jruby\jni (
+set JAVA_OPTS=%JAVA_OPTS% -Djffi.boot.library.path=%CASSANDRA_HOME%\lib\jsr223\jruby\jni
+)
+REM JSR223/Jython - set python.home system property
+if EXIST %CASSANDRA_HOME%\lib\jsr223\jython\jython.jar (
+set JAVA_OPTS=%JAVA_OPTS% -Dpython.home=%CASSANDRA_HOME%\lib\jsr223\jython
+)
+REM JSR223/Scala - necessary system property
+if EXIST %CASSANDRA_HOME%\lib\jsr223\scala\scala-compiler.jar (
+set JAVA_OPTS=%JAVA_OPTS% -Dscala.usejavacp=true
+)
+
 REM Include the build\classes\main directory so it works in development
 set CASSANDRA_CLASSPATH=%CLASSPATH%;%CASSANDRA_HOME%\build\classes\main;%CASSANDRA_HOME%\build\classes\thrift
 set CASSANDRA_PARAMS=-Dcassandra -Dcassandra-foreground=yes

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e1312131/bin/cassandra.in.bat
--
diff --git a/bin/cassandra.in.bat b/bin/cassandra.in.bat
index a0eced5..1b4e38e 100644
--- a/bin/cassandra.in.bat
+++ b/bin/cassandra.in.bat
@@ -49,5 +49,26 @@ set 
CASSANDRA_CLASSPATH=%CLASSPATH%;%CASSANDRA_HOME%\build\classes\main;%CASSA
 REM Add the default storage location.  Can be overridden in conf\cassandra.yaml
 set CASSANDRA_PARAMS=%CASSANDRA_PARAMS% -Dcassandra.storagedir=%CASSANDRA_HOME%\data
 
+REM JSR223 - collect all JSR223 engines' jars
+for /r %%P in (%CASSANDRA_HOME%\lib\jsr223\*.jar) do (
+set CLASSPATH=%CLASSPATH%;%%~fP
+)
+REM JSR223/JRuby - set ruby lib directory
+if EXIST %CASSANDRA_HOME%\lib\jsr223\jruby\ruby (
+set JAVA_OPTS=%JAVA_OPTS% -Djruby.lib=%CASSANDRA_HOME%\lib\jsr223\jruby
+)
+REM JSR223/JRuby - set ruby JNI libraries root directory
+if EXIST %CASSANDRA_HOME%\lib\jsr223\jruby\jni (
+set JAVA_OPTS=%JAVA_OPTS% -Djffi.boot.library.path=%CASSANDRA_HOME%\lib\jsr223\jruby\jni
+)
+REM JSR223/Jython - set python.home system property
+if EXIST %CASSANDRA_HOME%\lib\jsr223\jython\jython.jar (
+set JAVA_OPTS=%JAVA_OPTS% -Dpython.home=%CASSANDRA_HOME%\lib\jsr223\jython
+)
+REM JSR223/Scala - necessary system property
+if EXIST 

[jira] [Commented] (CASSANDRA-8370) cqlsh doesn't handle LIST statements correctly

2014-11-26 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226800#comment-14226800
 ] 

Mikhail Stepura commented on CASSANDRA-8370:


I doubt it was broken by CASSANDRA-6307, rather by CASSANDRA-6910

 cqlsh doesn't handle LIST statements correctly
 --

 Key: CASSANDRA-8370
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8370
 Project: Cassandra
  Issue Type: Bug
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe
Priority: Minor
  Labels: cqlsh
 Fix For: 2.1.3

 Attachments: 8370.txt


 {{LIST USERS}} and {{LIST PERMISSIONS}} statements are not handled correctly 
 by cqlsh in 2.1 (since CASSANDRA-6307).
 Running such a query results in errors along the lines of:
 {noformat}
 sam@easy:~/projects/cassandra$ bin/cqlsh --debug -u cassandra -p cassandra
 Using CQL driver: module 'cassandra' from 
 '/home/sam/projects/cassandra/bin/../lib/cassandra-driver-internal-only-2.1.2.zip/cassandra-driver-2.1.2/cassandra/__init__.py'
 Connected to Test Cluster at 127.0.0.1:9042.
 [cqlsh 5.0.1 | Cassandra 2.1.2-SNAPSHOT | CQL spec 3.2.0 | Native protocol v3]
 Use HELP for help.
 cassandra@cqlsh list users;
 Traceback (most recent call last):
   File bin/cqlsh, line 879, in onecmd
 self.handle_statement(st, statementtext)
   File bin/cqlsh, line 920, in handle_statement
 return self.perform_statement(cqlruleset.cql_extract_orig(tokens, srcstr))
   File bin/cqlsh, line 953, in perform_statement
 result = self.perform_simple_statement(stmt)
   File bin/cqlsh, line 989, in perform_simple_statement
 self.print_result(rows, self.parse_for_table_meta(statement.query_string))
   File bin/cqlsh, line 970, in parse_for_table_meta
 return self.get_table_meta(ks, cf)
   File bin/cqlsh, line 732, in get_table_meta
 ksmeta = self.get_keyspace_meta(ksname)
   File bin/cqlsh, line 717, in get_keyspace_meta
 raise KeyspaceNotFound('Keyspace %r not found.' % ksname)
 KeyspaceNotFound: Keyspace None not found.
 {noformat}





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226776#comment-14226776
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

The row cache can contain very large rows, AFAIK.
The idea is to pre-allocate some portion of the configured capacity for large 
blocks; new blocks could be allocated on demand (edge-triggered).
OTOH, if that amount of data is stored in the cache, that amount of time 
(20...60 ms) might be irrelevant compared to the time needed for 
serialization, so maybe it would be wasted effort. Not sure about that.
Table resizing may take as long as it takes; I am not really concerned about 
allocation time there, because no reads or writes are blocked while 
allocating the new partition (segment) table.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is only partially off heap; keys are still stored 
 on the JVM heap as ByteBuffers. 
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing APIs (ICache), and the implementation 
 needs to have safe memory access, low memory overhead, and as few memcpys as 
 possible.
 We might also want to make this cache configurable.





[jira] [Created] (CASSANDRA-8380) Only record trace if query exceeds latency threshold.

2014-11-26 Thread Matt Stump (JIRA)
Matt Stump created CASSANDRA-8380:
-

 Summary: Only record trace if query exceeds latency threshold.
 Key: CASSANDRA-8380
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8380
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matt Stump
Priority: Critical


Probabilistic tracing isn't very useful on its own because you're typically 
trying to find only the badly performing queries, which may be just 0.01% of 
requests. I would like to enable probabilistic tracing on a sample of 
queries, but then record the trace only if the request exceeded a time 
threshold. This would allow us to isolate performance problems more easily.
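As a rough illustration of the proposal (this is not Cassandra's actual 
tracing API; the class and method names here are hypothetical), trace events 
for sampled queries could be buffered in memory and kept only when the 
elapsed time crosses the threshold:

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of threshold-gated tracing: sample a fraction of
// queries up front, buffer their trace events in memory, and persist the
// trace only if the query turned out to be slow enough.
public class ConditionalTracer
{
    private final double sampleProbability;
    private final long thresholdNanos;

    public ConditionalTracer(double sampleProbability, long thresholdMillis)
    {
        this.sampleProbability = sampleProbability;
        this.thresholdNanos = TimeUnit.MILLISECONDS.toNanos(thresholdMillis);
    }

    /** Decide once per query whether to collect trace events at all. */
    public boolean shouldSample()
    {
        return ThreadLocalRandom.current().nextDouble() < sampleProbability;
    }

    /** Keep the buffered events only if the query exceeded the threshold. */
    public List<String> finish(List<String> bufferedEvents, long elapsedNanos)
    {
        return elapsedNanos >= thresholdNanos ? bufferedEvents : Collections.emptyList();
    }
}
```

Exposing the sample probability and the threshold as settings would also 
cover the configurability implied by the request.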





[jira] [Created] (CASSANDRA-8381) CFStats should record keys of largest N requests for time interval

2014-11-26 Thread Matt Stump (JIRA)
Matt Stump created CASSANDRA-8381:
-

 Summary: CFStats should record keys of largest N requests for time 
interval
 Key: CASSANDRA-8381
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8381
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matt Stump
Priority: Critical


Isolating the problem partition for a CF is incredibly difficult right now. 
If we could keep the primary keys of the largest N read or write requests for 
the previous interval, or since the counter was last cleared, it would be 
extremely useful.
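A minimal sketch of the idea (illustrative only, not Cassandra's metrics 
code): keep the keys of the N largest requests in a bounded min-heap, so 
tracking costs O(log N) per request and O(N) memory:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: track the keys of the N largest read/write requests
// seen so far using a bounded min-heap. The smallest tracked entry sits at
// the head, so it is the first evicted when a bigger request arrives.
public class TopNRequests
{
    static final class Entry
    {
        final String key;
        final long size;
        Entry(String key, long size) { this.key = key; this.size = size; }
    }

    private final int capacity;
    private final PriorityQueue<Entry> heap =
        new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.size));

    public TopNRequests(int capacity)
    {
        this.capacity = capacity;
    }

    /** Record one request; retains only the N largest seen so far. */
    public void record(String key, long size)
    {
        if (heap.size() < capacity)
            heap.add(new Entry(key, size));
        else if (heap.peek().size < size)
        {
            heap.poll();
            heap.add(new Entry(key, size));
        }
    }

    /** Keys of the tracked requests, largest first. */
    public List<String> largestKeys()
    {
        List<Entry> entries = new ArrayList<>(heap);
        entries.sort((a, b) -> Long.compare(b.size, a.size));
        List<String> keys = new ArrayList<>();
        for (Entry e : entries)
            keys.add(e.key);
        return keys;
    }
}
```

Resetting the heap at each interval boundary (or whenever the counters are 
cleared) would give the per-interval behavior described above.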





[jira] [Updated] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-26 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7563:

Attachment: 7563v9.txt
7563v9-diff-diff.txt

v9 (with changes in diff-diff) attached
* listens for UDT changes and updates the UserType instances in {{UDFunction}}
* change in CQLTester to unwrap {{InvalidRequestException}} from 
{{RuntimeException}} (wrapped in {{executeOnceInternal}})

 UserType, TupleType and collections in UDFs
 ---

 Key: CASSANDRA-7563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563
 Project: Cassandra
  Issue Type: Bug
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.0

 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 
 7563v4.txt, 7563v5.txt, 7563v6.txt, 7563v7.txt, 7563v8-diff-diff.txt, 
 7563v8.txt, 7563v9-diff-diff.txt, 7563v9.txt


 * is Java Driver as a dependency required ?
 * is it possible to extract parts of the Java Driver for UDT/TT/coll support ?
 * CQL {{DROP TYPE}} must check UDFs
 * must check keyspace access permissions (if those exist)





[jira] [Commented] (CASSANDRA-8381) CFStats should record keys of largest N requests for time interval

2014-11-26 Thread Rich Rein (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226822#comment-14226822
 ] 

Rich Rein commented on CASSANDRA-8381:
--

It would be extremely useful to have one or a few recent key values for each 
step of the histogram.
This would allow developers to see partition sizes as a side effect of key 
values, and key frequency correlated with partition sizes.

But the worst-case sizes mentioned by Matt are critical.

 CFStats should record keys of largest N requests for time interval
 --

 Key: CASSANDRA-8381
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8381
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matt Stump
Priority: Critical

 Isolating the problem partition for a CF is incredibly difficult right now. 
 If we could keep the primary keys of the largest N read or write requests 
 for the previous interval, or since the counter was last cleared, it would 
 be extremely useful.





[jira] [Updated] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-26 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7563:

Attachment: (was: 7563v9-diff-diff.txt)

 UserType, TupleType and collections in UDFs
 ---

 Key: CASSANDRA-7563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563
 Project: Cassandra
  Issue Type: Bug
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.0

 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 
 7563v4.txt, 7563v5.txt, 7563v6.txt, 7563v7.txt, 7563v8-diff-diff.txt, 
 7563v8.txt


 * is Java Driver as a dependency required ?
 * is it possible to extract parts of the Java Driver for UDT/TT/coll support ?
 * CQL {{DROP TYPE}} must check UDFs
 * must check keyspace access permissions (if those exist)





[jira] [Updated] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-26 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7563:

Attachment: (was: 7563v9.txt)

 UserType, TupleType and collections in UDFs
 ---

 Key: CASSANDRA-7563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563
 Project: Cassandra
  Issue Type: Bug
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.0

 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 
 7563v4.txt, 7563v5.txt, 7563v6.txt, 7563v7.txt, 7563v8-diff-diff.txt, 
 7563v8.txt


 * is Java Driver as a dependency required ?
 * is it possible to extract parts of the Java Driver for UDT/TT/coll support ?
 * CQL {{DROP TYPE}} must check UDFs
 * must check keyspace access permissions (if those exist)





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226861#comment-14226861
 ] 

Pavel Yaskevich commented on CASSANDRA-7438:


[~tupshin] The original idea here was to take the thing we know does the job, 
which is memcached, strip out some of the unnecessary parts, and package it 
as a lib we can use over JNI, the same way snappy and others do. But now we 
are getting into the business of re-inventing things that are pretty hard to 
get right and to test properly, so the argument against having lruc in its 
original form (that it would be hard to test/maintain) is, in my opinion, no 
longer valid.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is only partially off heap; keys are still stored 
 on the JVM heap as ByteBuffers. 
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing APIs (ICache), and the implementation 
 needs to have safe memory access, low memory overhead, and as few memcpys as 
 possible.
 We might also want to make this cache configurable.





[jira] [Updated] (CASSANDRA-8370) cqlsh doesn't handle LIST statements correctly

2014-11-26 Thread Mikhail Stepura (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Stepura updated CASSANDRA-8370:
---
Attachment: 8370v2.patch

[~beobal] I suggest moving all that logic into {{parse_for_table_meta}}. 
Attaching v2 for that. What do you think?

 cqlsh doesn't handle LIST statements correctly
 --

 Key: CASSANDRA-8370
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8370
 Project: Cassandra
  Issue Type: Bug
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe
Priority: Minor
  Labels: cqlsh
 Fix For: 2.1.3

 Attachments: 8370.txt, 8370v2.patch


 {{LIST USERS}} and {{LIST PERMISSIONS}} statements are not handled correctly 
 by cqlsh in 2.1 (since CASSANDRA-6307).
 Running such a query results in errors along the lines of:
 {noformat}
 sam@easy:~/projects/cassandra$ bin/cqlsh --debug -u cassandra -p cassandra
 Using CQL driver: module 'cassandra' from 
 '/home/sam/projects/cassandra/bin/../lib/cassandra-driver-internal-only-2.1.2.zip/cassandra-driver-2.1.2/cassandra/__init__.py'
 Connected to Test Cluster at 127.0.0.1:9042.
 [cqlsh 5.0.1 | Cassandra 2.1.2-SNAPSHOT | CQL spec 3.2.0 | Native protocol v3]
 Use HELP for help.
 cassandra@cqlsh list users;
 Traceback (most recent call last):
   File bin/cqlsh, line 879, in onecmd
 self.handle_statement(st, statementtext)
   File bin/cqlsh, line 920, in handle_statement
 return self.perform_statement(cqlruleset.cql_extract_orig(tokens, srcstr))
   File bin/cqlsh, line 953, in perform_statement
 result = self.perform_simple_statement(stmt)
   File bin/cqlsh, line 989, in perform_simple_statement
 self.print_result(rows, self.parse_for_table_meta(statement.query_string))
   File bin/cqlsh, line 970, in parse_for_table_meta
 return self.get_table_meta(ks, cf)
   File bin/cqlsh, line 732, in get_table_meta
 ksmeta = self.get_keyspace_meta(ksname)
   File bin/cqlsh, line 717, in get_keyspace_meta
 raise KeyspaceNotFound('Keyspace %r not found.' % ksname)
 KeyspaceNotFound: Keyspace None not found.
 {noformat}





[jira] [Updated] (CASSANDRA-8288) cqlsh describe needs to show 'sstable_compression': ''

2014-11-26 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-8288:
---
Fix Version/s: 2.1.3

 cqlsh describe needs to show 'sstable_compression': ''
 --

 Key: CASSANDRA-8288
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8288
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jeremiah Jordan
Assignee: Tyler Hobbs
  Labels: cqlsh
 Fix For: 2.1.3


 For uncompressed tables, cqlsh describe schema should show AND compression = 
 {'sstable_compression': ''}; otherwise, when you replay the schema, you get 
 the default of LZ4.





[jira] [Commented] (CASSANDRA-8288) cqlsh describe needs to show 'sstable_compression': ''

2014-11-26 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226913#comment-14226913
 ] 

Aleksey Yeschenko commented on CASSANDRA-8288:
--

Longer term (3.0?), I think we'd want to make this consistent with 
compaction: add an 'enabled' option and let people set it to false to disable 
compression. An empty string for 'sstable_compression' is a very ugly hack.

 cqlsh describe needs to show 'sstable_compression': ''
 --

 Key: CASSANDRA-8288
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8288
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jeremiah Jordan
Assignee: Tyler Hobbs
  Labels: cqlsh
 Fix For: 2.1.3


 For uncompressed tables, cqlsh describe schema should show AND compression = 
 {'sstable_compression': ''}; otherwise, when you replay the schema, you get 
 the default of LZ4.





[jira] [Commented] (CASSANDRA-8288) cqlsh describe needs to show 'sstable_compression': ''

2014-11-26 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226916#comment-14226916
 ] 

Tyler Hobbs commented on CASSANDRA-8288:


Related python driver ticket: 
[PYTHON-187|https://datastax-oss.atlassian.net/browse/PYTHON-187]

 cqlsh describe needs to show 'sstable_compression': ''
 --

 Key: CASSANDRA-8288
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8288
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jeremiah Jordan
Assignee: Tyler Hobbs
  Labels: cqlsh
 Fix For: 2.1.3

  Time Spent: 4m
  Remaining Estimate: 0h

 For uncompressed tables, cqlsh describe schema should show AND compression = 
 {'sstable_compression': ''}; otherwise, when you replay the schema, you get 
 the default of LZ4.





[jira] [Commented] (CASSANDRA-8061) tmplink files are not removed

2014-11-26 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226972#comment-14226972
 ] 

Benedict commented on CASSANDRA-8061:
-

[~JoshuaMcKenzie] nice spot, that's definitely a bug. It would require the 
partitions to be circa 500K in size, but it couldn't leave a file intact and 
undeleted; it could only potentially leak a file descriptor. So it's possible 
it's related to CASSANDRA-8248, but definitely not to this one. We should 
probably reopen 8248 and file it against that.


 tmplink files are not removed
 -

 Key: CASSANDRA-8061
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8061
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Linux
Reporter: Gianluca Borello
Assignee: Joshua McKenzie
Priority: Critical
 Fix For: 2.1.3

 Attachments: 8061_v1.txt, 8248-thread_dump.txt


 After installing 2.1.0, I'm experiencing a bunch of tmplink files that are 
 filling my disk. I found https://issues.apache.org/jira/browse/CASSANDRA-7803, 
 which is very similar, and I confirm it happens both on 2.1.0 and on the 
 latest commit on the cassandra-2.1 branch 
 (https://github.com/apache/cassandra/commit/aca80da38c3d86a40cc63d9a122f7d45258e4685).
 Even starting with a clean keyspace, after a few hours I get:
 {noformat}
 $ sudo find /raid0 | grep tmplink | xargs du -hs
 2.7G  
 /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Data.db
 13M   
 /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Index.db
 1.8G  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Data.db
 12M   
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Index.db
 5.2M  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Index.db
 822M  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Data.db
 7.3M  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Index.db
 1.2G  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Data.db
 6.7M  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Index.db
 1.1G  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Data.db
 11M   
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Index.db
 1.7G  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Data.db
 812K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Index.db
 122M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-208-Data.db
 744K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-739-Index.db
 660K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-193-Index.db
 796K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Index.db
 137M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Data.db
 161M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Data.db
 139M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Data.db
 940K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Index.db
 936K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Index.db
 161M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Data.db
 672K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-197-Index.db
 113M  
 

[jira] [Commented] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2014-11-26 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226992#comment-14226992
 ] 

Benedict commented on CASSANDRA-8325:
-

You might be right. The javadoc does make it quite explicit that this should 
not be permitted; however, the hotspot code in library_call.cpp 
(inline_unsafe_access and classify_unsafe_addr) _seems_ to indicate it should 
be valid and behave the same. It's hard to say for sure without getting the 
project working well enough to explore the code more fully.

However, given that it is documented as invalid usage, it does seem sensible 
to change it. But this means a potential performance penalty in one of the 
most heavily used codepaths.

 Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
 -

 Key: CASSANDRA-8325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
 Project: Cassandra
  Issue Type: Bug
 Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit 
 Server VM
Reporter: Leonid Shalupov
 Attachments: hs_err_pid1856.log, system.log


 See attached error file after JVM crash
 {quote}
 FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
 Jan 16 22:34:59 UTC 2014 
 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
 {quote}
 {quote}
  % java -version
 openjdk version 1.7.0_71
 OpenJDK Runtime Environment (build 1.7.0_71-b14)
 OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
 {quote}





[2/2] cassandra git commit: Support for UDTs, tuples, and collections in UDFs

2014-11-26 Thread tylerhobbs
Support for UDTs, tuples, and collections in UDFs

Patch by Robert Stupp; reviewed by Tyler Hobbs for CASSANDRA-7563


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/794d68b5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/794d68b5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/794d68b5

Branch: refs/heads/trunk
Commit: 794d68b51b77c2a3cb09374010b6f84231ead604
Parents: e131213
Author: Robert Stupp sn...@snazy.de
Authored: Wed Nov 26 17:49:45 2014 -0600
Committer: Tyler Hobbs ty...@datastax.com
Committed: Wed Nov 26 17:49:45 2014 -0600

--
 CHANGES.txt |2 +
 .../org/apache/cassandra/cql3/CQL3Type.java |   12 +-
 .../cql3/functions/BytesConversionFcts.java |8 +-
 .../cassandra/cql3/functions/FunctionCall.java  |9 +-
 .../cassandra/cql3/functions/FunctionName.java  |1 -
 .../cassandra/cql3/functions/Functions.java |   50 +-
 .../cql3/functions/JavaSourceUDFFactory.java|   51 +-
 .../cql3/functions/ScalarFunction.java  |3 +-
 .../cql3/functions/ScriptBasedUDF.java  |   13 +-
 .../cassandra/cql3/functions/TimeuuidFcts.java  |   10 +-
 .../cassandra/cql3/functions/TokenFct.java  |2 +-
 .../cassandra/cql3/functions/UDFunction.java|  177 ++-
 .../cassandra/cql3/functions/UuidFcts.java  |2 +-
 .../selection/AggregateFunctionSelector.java|8 +-
 .../cassandra/cql3/selection/FieldSelector.java |8 +-
 .../cql3/selection/ScalarFunctionSelector.java  |   10 +-
 .../cassandra/cql3/selection/Selection.java |   30 +-
 .../cassandra/cql3/selection/Selector.java  |6 +-
 .../cql3/selection/SimpleSelector.java  |5 +-
 .../cql3/selection/WritetimeOrTTLSelector.java  |4 +-
 .../statements/CreateFunctionStatement.java |   23 +-
 .../cql3/statements/DropFunctionStatement.java  |   10 +-
 .../cql3/statements/DropTypeStatement.java  |   11 +
 .../cql3/statements/ModificationStatement.java  |2 +-
 .../cql3/statements/SelectStatement.java|8 +-
 .../cassandra/hadoop/cql3/CqlRecordReader.java  |2 +-
 .../org/apache/cassandra/transport/Server.java  |1 +
 .../org/apache/cassandra/cql3/CQLTester.java|  232 ++-
 test/unit/org/apache/cassandra/cql3/UFTest.java | 1356 ++
 tools/lib/cassandra-driver-core-2.0.5.jar   |  Bin 544552 - 0 bytes
 30 files changed, 1643 insertions(+), 413 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/794d68b5/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 162d579..55c86dd 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,6 @@
 3.0
+ * Support UDTs, tuples, and collections in user-defined
+   functions (CASSANDRA-7563)
  * Fix aggregate fn results on empty selection, result column name,
and cqlsh parsing (CASSANDRA-8229)
  * Mark sstables as repaired after full repair (CASSANDRA-7586)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/794d68b5/src/java/org/apache/cassandra/cql3/CQL3Type.java
--
diff --git a/src/java/org/apache/cassandra/cql3/CQL3Type.java 
b/src/java/org/apache/cassandra/cql3/CQL3Type.java
index b656de8..98d1b15 100644
--- a/src/java/org/apache/cassandra/cql3/CQL3Type.java
+++ b/src/java/org/apache/cassandra/cql3/CQL3Type.java
@@ -315,6 +315,11 @@ public interface CQL3Type
 return false;
 }
 
+public String keyspace()
+{
+return null;
+}
+
 public void freeze() throws InvalidRequestException
 {
 String message = String.format(frozen is only allowed on 
collections, tuples, and user-defined types (got %s), this);
@@ -474,6 +479,11 @@ public interface CQL3Type
 this.name = name;
 }
 
+public String keyspace()
+{
+return name.getKeyspace();
+}
+
 public void freeze()
 {
 frozen = true;
@@ -485,7 +495,7 @@ public interface CQL3Type
 {
 // The provided keyspace is the one of the current 
statement this is part of. If it's different from the keyspace of
 // the UTName, we reject since we want to limit user types 
to their own keyspace (see #6643)
-if (!keyspace.equals(name.getKeyspace()))
+if (keyspace != null  
!keyspace.equals(name.getKeyspace()))
                    throw new InvalidRequestException(String.format("Statement on keyspace %s cannot refer to a user type in keyspace %s; "
                                                                    + "user types can only be used 

[1/2] cassandra git commit: Support for UDTs, tuples, and collections in UDFs

2014-11-26 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/trunk e13121318 -> 794d68b51


http://git-wip-us.apache.org/repos/asf/cassandra/blob/794d68b5/test/unit/org/apache/cassandra/cql3/UFTest.java
--
diff --git a/test/unit/org/apache/cassandra/cql3/UFTest.java 
b/test/unit/org/apache/cassandra/cql3/UFTest.java
index 4975ca9..824719b 100644
--- a/test/unit/org/apache/cassandra/cql3/UFTest.java
+++ b/test/unit/org/apache/cassandra/cql3/UFTest.java
@@ -19,50 +19,51 @@ package org.apache.cassandra.cql3;
 
 import java.math.BigDecimal;
 import java.math.BigInteger;
-import java.util.Date;
+import java.util.*;
 
-import org.junit.After;
 import org.junit.Assert;
-import org.junit.Before;
 import org.junit.Test;
 
+import com.datastax.driver.core.*;
 import org.apache.cassandra.cql3.functions.FunctionName;
 import org.apache.cassandra.cql3.functions.Functions;
 import org.apache.cassandra.exceptions.InvalidRequestException;
 import org.apache.cassandra.service.ClientState;
+import org.apache.cassandra.transport.Server;
 import org.apache.cassandra.transport.messages.ResultMessage;
 
 public class UFTest extends CQLTester
 {
-private static final String KS_FOO = "cqltest_foo";
 
-@Before
-public void createKsFoo() throws Throwable
+public static FunctionName parseFunctionName(String qualifiedName)
 {
-execute("CREATE KEYSPACE IF NOT EXISTS "+KS_FOO+" WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};");
-}
-
-@After
-public void dropKsFoo() throws Throwable
-{
-execute("DROP KEYSPACE IF EXISTS "+KS_FOO+";");
+int i = qualifiedName.indexOf('.');
+return i == -1
+   ? FunctionName.nativeFunction(qualifiedName)
+   : new FunctionName(qualifiedName.substring(0, i).trim(), 
qualifiedName.substring(i+1).trim());
 }
 
 @Test
 public void testFunctionDropOnKeyspaceDrop() throws Throwable
 {
-execute("CREATE FUNCTION " + KS_FOO + ".sin ( input double ) RETURNS double LANGUAGE java AS 'return Double.valueOf(Math.sin(input.doubleValue()));'");
+String fSin = createFunction(KEYSPACE_PER_TEST, "double",
+                             "CREATE FUNCTION %s ( input double ) " +
+                             "RETURNS double " +
+                             "LANGUAGE java " +
+                             "AS 'return Double.valueOf(Math.sin(input.doubleValue()));'");
 
-Assert.assertEquals(1, Functions.find(new FunctionName(KS_FOO, "sin")).size());
+FunctionName fSinName = parseFunctionName(fSin);
 
-assertRows(execute("SELECT function_name, language FROM system.schema_functions WHERE keyspace_name=?", KS_FOO),
-           row("sin", "java"));
+Assert.assertEquals(1, Functions.find(parseFunctionName(fSin)).size());
 
-execute("DROP KEYSPACE "+KS_FOO+";");
+assertRows(execute("SELECT function_name, language FROM system.schema_functions WHERE keyspace_name=?", KEYSPACE_PER_TEST),
+           row(fSinName.name, "java"));
 
-assertRows(execute("SELECT function_name, language FROM system.schema_functions WHERE keyspace_name=?", KS_FOO));
+dropPerTestKeyspace();
 
-Assert.assertEquals(0, Functions.find(new FunctionName(KS_FOO, "sin")).size());
+assertRows(execute("SELECT function_name, language FROM system.schema_functions WHERE keyspace_name=?", KEYSPACE_PER_TEST));
+
+Assert.assertEquals(0, Functions.find(fSinName).size());
 }
 
 @Test
@@ -70,27 +71,40 @@ public class UFTest extends CQLTester
 {
createTable("CREATE TABLE %s (key int PRIMARY KEY, d double)");
 
-execute("CREATE FUNCTION " + KS_FOO + ".sin ( input double ) RETURNS double LANGUAGE java AS 'return Double.valueOf(Math.sin(input.doubleValue()));'");
+String fSin = createFunction(KEYSPACE_PER_TEST, "double",
+                             "CREATE FUNCTION %s ( input double ) " +
+                             "RETURNS double " +
+                             "LANGUAGE java " +
+                             "AS 'return Double.valueOf(Math.sin(input.doubleValue()));'");
+
+FunctionName fSinName = parseFunctionName(fSin);
 
-Assert.assertEquals(1, Functions.find(new FunctionName(KS_FOO, "sin")).size());
+Assert.assertEquals(1, Functions.find(parseFunctionName(fSin)).size());
 
-ResultMessage.Prepared prepared = QueryProcessor.prepare("SELECT key, " + KS_FOO + ".sin(d) FROM " + KEYSPACE + '.' + currentTable(), ClientState.forInternalCalls(), false);
+ResultMessage.Prepared prepared = QueryProcessor.prepare(
+                                      String.format("SELECT key, %s(d) FROM %s.%s", fSin, KEYSPACE, currentTable()),
+                                      ClientState.forInternalCalls(), false);
 

[jira] [Commented] (CASSANDRA-6717) Modernize schema tables

2014-11-26 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227017#comment-14227017
 ] 

Tyler Hobbs commented on CASSANDRA-6717:


As of CASSANDRA-7563, whenever a UDT has a field renamed and a UDF uses that 
UDT, we're updating {{schema_functions}} to modify the argTypes and returnType. 
 This is required because the AbstractType representation of UDTs includes the 
field name.  With the changes in this ticket, we should be able to remove that 
hack ({{Functions.FunctionsMigrationListener}}).

 Modernize schema tables
 ---

 Key: CASSANDRA-6717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6717
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Aleksey Yeschenko
Priority: Minor
 Fix For: 3.0


 There are a few problems/improvements that can be done with the way we store 
 schema:
 # CASSANDRA-4988: as explained on the ticket, storing the comparator is now 
 redundant (or almost, we'd need to store whether the table is COMPACT or not 
 too, which we don't currently, but that is easy and probably a good idea anyway), it 
 can be entirely reconstructed from the infos in schema_columns (the same is 
 true of key_validator and subcomparator, and replacing default_validator by a 
 COMPACT_VALUE column in all case is relatively simple). And storing the 
 comparator as an opaque string broke concurrent updates of sub-part of said 
 comparator (concurrent collection addition or altering 2 separate clustering 
 columns typically) so it's really worth removing it.
 # CASSANDRA-4603: it's time to get rid of those ugly json maps. I'll note 
 that schema_keyspaces is a problem due to its use of COMPACT STORAGE, but I 
 think we should fix it once and for all nonetheless (see below).
 # For CASSANDRA-6382 and to allow indexing both map keys and values at the 
 same time, we'd need to be able to have more than one index definition for a 
 given column.
 # There are a few mismatches in table options between the ones stored in the 
 schema and the ones used when declaring/altering a table which would be nice 
 to fix. The compaction, compression and replication maps are one already 
 mentioned from CASSANDRA-4603, but also for some reason 
 'dclocal_read_repair_chance' in CQL is called just 'local_read_repair_chance' 
 in the schema table, and 'min/max_compaction_threshold' are column-family 
 options in the schema but just compaction options for CQL (which makes more 
 sense).
 None of those issues are major, and we could probably deal with them 
 independently but it might be simpler to just fix them all in one shot so I 
 wanted to sum them all up here. In particular, the fact that 
 'schema_keyspaces' uses COMPACT STORAGE is annoying (for the replication map, 
 but it may limit future stuff too), which suggests we should migrate it to a 
 new, non COMPACT table. And while that's arguably a detail, it wouldn't hurt 
 to rename schema_columnfamilies to schema_tables for the years to come since 
 that's the preferred vernacular for CQL.
 Overall, what I would suggest is to move all schema tables to a new keyspace, 
 named 'schema' for instance (or 'system_schema' but I prefer the shorter 
 version), and fix all the issues above at once. Since we currently don't 
 exchange schema between nodes of different versions, all we'd need is a 
 one-shot startup migration, and overall, I think it could be simpler for 
 clients to deal with one clear migration than to have to handle minor 
 individual changes all over the place. I also think it's somewhat cleaner 
 conceptually to have schema tables in their own keyspace since they are 
 replicated through a different mechanism than other system tables.
 If we do that, we could, for instance, migrate to the following schema tables 
 (details up for discussion of course):
 {noformat}
 CREATE TYPE user_type (
   name text,
   column_names list<text>,
   column_types list<text>
 )
 CREATE TABLE keyspaces (
   name text PRIMARY KEY,
   durable_writes boolean,
   replication map<string, string>,
   user_types map<string, user_type>
 )
 CREATE TYPE trigger_definition (
   name text,
   options map<text, text>
 )
 CREATE TABLE tables (
   keyspace text,
   name text,
   id uuid,
   table_type text, // COMPACT, CQL or SUPER
   dropped_columns map<text, bigint>,
   triggers map<text, trigger_definition>,
   // options
   comment text,
   compaction maptext, text,
   compression maptext, text,
   read_repair_chance double,
   dclocal_read_repair_chance double,
   gc_grace_seconds int,
   caching text,
   rows_per_partition_to_cache text,
   default_time_to_live int,
   min_index_interval int,
   max_index_interval int,
   speculative_retry text,
   populate_io_cache_on_flush boolean,
   bloom_filter_fp_chance double
   

[jira] [Updated] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-26 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7563:

Attachment: 7563v9.txt

 UserType, TupleType and collections in UDFs
 ---

 Key: CASSANDRA-7563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563
 Project: Cassandra
  Issue Type: Bug
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.0

 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 
 7563v4.txt, 7563v5.txt, 7563v6.txt, 7563v7.txt, 7563v8-diff-diff.txt, 
 7563v8.txt, 7563v9.txt


 * is the Java Driver required as a dependency?
 * is it possible to extract parts of the Java Driver for UDT/TT/coll support?
 * CQL {{DROP TYPE}} must check UDFs
 * must check keyspace access permissions (if those exist)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7039) DirectByteBuffer compatible LZ4 methods

2014-11-26 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227043#comment-14227043
 ] 

Adrien Grand commented on CASSANDRA-7039:
-

The new 1.3.0 release now supports (de)compression on top of the ByteBuffer API.

 DirectByteBuffer compatible LZ4 methods
 ---

 Key: CASSANDRA-7039
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7039
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Branimir Lambov
Priority: Minor
  Labels: performance
 Fix For: 3.0


 As we move more things off-heap, it's becoming more and more essential to be 
 able to use DirectByteBuffer (or native pointers) in various places. 
 Unfortunately LZ4 doesn't currently support this operation, despite being JNI 
 based - this means we not only have to perform unnecessary copies to 
 de/compress data from a DBB, but we can also stall GC, as any JNI method 
 operating over a Java array using GetPrimitiveArrayCritical enters a critical 
 section that prevents GC for its duration. This means STWs will be at least as 
 long as any running compression/decompression (and no GC will happen until 
 they complete, so it's additive).
 We should temporarily fork (and then resubmit upstream) jpountz-lz4 to 
 support operating over a native pointer, so that we can pass a DBB or a raw 
 pointer we have allocated ourselves. This will help improve performance when 
 flushing the new offheap memtables, as well as enable us to implement 
 CASSANDRA-6726 and finish CASSANDRA-4338.





[jira] [Commented] (CASSANDRA-8379) Remove filename and line number flags from default logging configuration

2014-11-26 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227046#comment-14227046
 ] 

Chris Lohfink commented on CASSANDRA-8379:
--

Are there really so many logging events that this is slowing anything down? 
There really isn't any in the read/write path unless logging is turned up. There 
are a number of issues I have only been able to figure out due to the 
class/line numbers. The cost to maintainability is pretty high imho unless 
benchmarks show some significant improvements.

 Remove filename and line number flags from default logging configuration
 

 Key: CASSANDRA-8379
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8379
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matt Brown
Priority: Minor
 Fix For: 2.1.3

 Attachments: cassandra-2.0-8379.txt, trunk-8379.txt


 In the logging configuration that ships with the Cassandra distribution 
 (log4j-server.properties in 2.0, and logback.xml in 2.1), the rolling file 
 appender is configured to print the file name and the line number of each 
 logging event:
 {code}log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line 
 %L) %m%n{code}
 Both the log4j and logback documentation warn that generating the 
 filename/line information is not a cheap operation.
 From the [log4j 
 docs|http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html]:
  WARNING Generating caller location information is extremely slow and should 
  be avoided unless execution speed is not an issue.
 From [logback docs|http://logback.qos.ch/manual/layouts.html]:
  Generating the file information is not particularly fast. Thus, its use 
  should be avoided unless execution speed is not an issue.
 The implementation for both involves creating a new Throwable and then 
 printing the stack trace for the throwable to find the file name or line 
 number. I don't have data to back this up but the conventional advice that 
 throwing exceptions is slow has to do with filling in the stacktrace.
 It would make more sense for the logging configuration to simply use the 
 logger/category name (%c) instead of the file name and to remove the line 
 number part.
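As a hedged illustration (this is a sketch, not the actual log4j or logback implementation), caller data such as %F and %L is typically recovered by materializing a stack trace from a freshly created Throwable, which is the expensive step the documentation warns about:

```java
// Sketch of how a logging framework can derive caller file/line data.
// Filling in the stack trace of a fresh Throwable is the costly operation.
public class CallerLocation {
    // Returns the stack frame of whoever called this method,
    // i.e. the "location" a %F / %L pattern would print.
    public static StackTraceElement callerFrame() {
        StackTraceElement[] frames = new Throwable().getStackTrace();
        // frames[0] is callerFrame itself; frames[1] is the call site
        return frames.length > 1 ? frames[1] : frames[0];
    }

    public static void main(String[] args) {
        StackTraceElement site = callerFrame();
        System.out.println(site.getFileName() + ":" + site.getLineNumber());
    }
}
```

Switching the pattern to the logger/category name (%c) avoids this entirely, because the logger name is already known at the call site.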





[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2014-11-26 Thread Rajanarayanan Thottuvaikkatumana (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227047#comment-14227047
 ] 

Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-7124:
-

[~yukim] In the StorageService.java, I have used {{AtomicInteger}} variable 
{{nextRepairCommand}} even for the cleanup job. I have noticed it now. Should I 
add one for the cleanup jobs, for example {{nextCleanupCommand}}? Any other 
changes needed for the cleanup? Please let me know. 

 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
 

 Key: CASSANDRA-7124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
  Labels: lhf
 Fix For: 3.0

 Attachments: cassandra-trunk-cleanup-7124.txt


 If {{nodetool cleanup}} or some other long-running operation takes too long 
 to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
 tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
 this for repairs with JMX notifications.  We should do something similar for 
 nodetool cleanup, compact, decommission, move, relocate, etc.





[jira] [Commented] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2014-11-26 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227049#comment-14227049
 ] 

graham sanderson commented on CASSANDRA-8325:
-

Yeah the code in library_call.cpp is quite hard to introspect, however my sense 
is this probably comes down to a mixture of the case where a null (0) 32 bit 
compressedOop pointer is not equivalent to a null 64 bit address (i.e. the case 
where the heap does not fit in the 32 gig of address space) possibly coupled 
with whether the hotspot compiler could prove at compile time whether the value 
was statically null.

It seems like a slightly strange distinction to make given that you are using a 
64 bit offset anyway (and that other methods do use them interchangeably), but 
that doesn't change the fact that they made the distinction, and apparently it 
matters at least in some implementations.

The question I was trying to get a sense of is whether this explicitly 
disallows the sorts of pointer arithmetic C* is trying to achieve, or whether 
C* is just doing it wrong. Maybe that is actually quite simple and there are 
clear distinctions between baseless pointers and static or otherwise 
field-level offsets, but someone would have to dive in (probably with the 
ability to test) to make sure.
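The distinction at issue can be shown with the two flavors of {{Unsafe.copyMemory}}: the pure-address {{(long, long, long)}} form, and the {{(Object, long, Object, long, long)}} form where a null base is documented to mean "treat the offset as an absolute address". The sketch below (hypothetical demo code, not C* source) exercises the pure-address form; whether the null-base form stays equivalent on every VM/platform is exactly the open question:

```java
import sun.misc.Unsafe;
import java.lang.reflect.Field;

// Demo of copying a value through native memory with Unsafe.
public class UnsafeCopyDemo {
    static final Unsafe UNSAFE;
    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            UNSAFE = (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Copies a long through native memory using the pure-address form.
    public static long roundTrip(long value) {
        long src = UNSAFE.allocateMemory(8);
        long dst = UNSAFE.allocateMemory(8);
        try {
            UNSAFE.putLong(src, value);
            UNSAFE.copyMemory(src, dst, 8);   // address-only form
            // the null-base form that is in question on some platforms:
            // UNSAFE.copyMemory(null, src, null, dst, 8);
            return UNSAFE.getLong(dst);
        } finally {
            UNSAFE.freeMemory(src);
            UNSAFE.freeMemory(dst);
        }
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(roundTrip(0xCAFEBABEL)));
    }
}
```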

 Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
 -

 Key: CASSANDRA-8325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
 Project: Cassandra
  Issue Type: Bug
 Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit 
 Server VM
Reporter: Leonid Shalupov
 Attachments: hs_err_pid1856.log, system.log


 See attached error file after JVM crash
 {quote}
 FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
 Jan 16 22:34:59 UTC 2014 
 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
 {quote}
 {quote}
  % java -version
 openjdk version 1.7.0_71
 OpenJDK Runtime Environment (build 1.7.0_71-b14)
 OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
 {quote}





[jira] [Comment Edited] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2014-11-26 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227049#comment-14227049
 ] 

graham sanderson edited comment on CASSANDRA-8325 at 11/27/14 12:30 AM:


Yeah the code in library_call.cpp is quite hard to introspect, however my sense 
is this probably comes down to a mixture of the case where a null (0) 32 bit 
compressedOop pointer is not equivalent to a null 64 bit address (i.e. the case 
where the heap does not fit in the 32 gig of address space) possibly coupled 
with whether the hotspot compiler could prove at compile time whether the value 
was statically null.

It seems like a slightly strange distinction to make given that you are using a 
64 bit offset anyway (and that other methods do use them interchangeably), but 
that doesn't change the fact that they made the distinction, and apparently it 
matters at least in some implementations.

The question I was trying to get a sense of is whether this explicitly 
disallows the sorts of pointer arithmetic C* is trying to achieve, or whether 
C* is just doing it wrong. Maybe that is actually quite simple and there are 
clear distinctions in the C* usage between baseless pointers and static or 
otherwise field-level offsets, but someone would have to dive in (probably 
with the ability to test) to make sure.


was (Author: graham sanderson):
Yeah the code in library_call.cpp is quite hard to introspect, however my sense 
is this probably comes down to a mixture of the case where a null (0) 32 bit 
compressedOop pointer is not equivalent to a null 64 bit address (i.e. the case 
where the heap does not fit in the 32 gig of address space) possibly coupled 
with whether the hotspot compiler could prove at compile time whether the value 
was statically null.

It seems like a slightly strange distinction to make given that you are using a 
64 bit offset anyway (and that other methods do using them interchangeably) but 
that doesn't change the fact that they made the distinction and apparently it 
matters at least in some implementations.

The question I was trying to get a sense of, is whether this explicitly 
disallows the sorts of pointer arithmetic C* is trying to achieve, or whether 
C* is just doing it wrong. Maybe that is actually quite simple and there are 
clearly distinctions between baseless pointers and static or otherwise field 
level offsets, but someone would have to have to dive in (probably with the 
ability to test) to make sure.

 Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
 -

 Key: CASSANDRA-8325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
 Project: Cassandra
  Issue Type: Bug
 Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit 
 Server VM
Reporter: Leonid Shalupov
 Attachments: hs_err_pid1856.log, system.log


 See attached error file after JVM crash
 {quote}
 FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
 Jan 16 22:34:59 UTC 2014 
 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
 {quote}
 {quote}
  % java -version
 openjdk version 1.7.0_71
 OpenJDK Runtime Environment (build 1.7.0_71-b14)
 OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
 {quote}





[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2014-11-26 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227058#comment-14227058
 ] 

Yuki Morishita commented on CASSANDRA-7124:
---

You can rename {{nextRepairCommand}} to be more generic, and using it across 
StorageService is fine.

Other comments:

- Don't create a new ListeningExecutorService every time. Use the 
CompactionExecutor.
- If you do {{Future#get}} in an async operation, it just blocks there. It ends 
up the same as a sync method.
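The advice above can be sketched as follows. This is hypothetical demo code (the names {{LongRunningOps}}, {{nextCommand}}, and {{runAsync}} are not from StorageService): one generic command counter shared across operations, and a completion callback that would fire the JMX notification, rather than calling {{Future#get}} on the submitting thread, which would make the async path synchronous again:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: generic command ids plus async completion notification.
public class LongRunningOps {
    // One counter for repair, cleanup, compact, etc.
    private static final AtomicInteger nextCommand = new AtomicInteger();

    public static CompletableFuture<String> runAsync(final String operation) {
        final int cmd = nextCommand.incrementAndGet();
        return CompletableFuture
                .supplyAsync(() -> operation + " #" + cmd + " finished")
                // runs when the task completes; the submitting thread never blocks
                .whenComplete((result, error) ->
                        System.out.println(error == null
                                ? "notify success: " + result
                                : "notify failure: " + error));
    }

    public static void main(String[] args) {
        // join() used only to keep this demo deterministic
        System.out.println(runAsync("cleanup").join());
    }
}
```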

 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
 

 Key: CASSANDRA-7124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
  Labels: lhf
 Fix For: 3.0

 Attachments: cassandra-trunk-cleanup-7124.txt


 If {{nodetool cleanup}} or some other long-running operation takes too long 
 to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
 tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
 this for repairs with JMX notifications.  We should do something similar for 
 nodetool cleanup, compact, decommission, move, relocate, etc.





[jira] [Updated] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2014-11-26 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-7124:
-
Flagged: Impediment

 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
 

 Key: CASSANDRA-7124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
  Labels: lhf
 Fix For: 3.0

 Attachments: cassandra-trunk-cleanup-7124.txt


 If {{nodetool cleanup}} or some other long-running operation takes too long 
 to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
 tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
 this for repairs with JMX notifications.  We should do something similar for 
 nodetool cleanup, compact, decommission, move, relocate, etc.





[jira] [Updated] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2014-11-26 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-7124:
-
Flagged:   (was: Impediment)

 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
 

 Key: CASSANDRA-7124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
  Labels: lhf
 Fix For: 3.0

 Attachments: cassandra-trunk-cleanup-7124.txt


 If {{nodetool cleanup}} or some other long-running operation takes too long 
 to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
 tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
 this for repairs with JMX notifications.  We should do something similar for 
 nodetool cleanup, compact, decommission, move, relocate, etc.





[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress

2014-11-26 Thread Jonathan Shook (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227159#comment-14227159
 ] 

Jonathan Shook commented on CASSANDRA-7918:
---

It would be nice to have the ability to send metrics from the client to common 
monitoring systems. It would be especially nice if you could simply use the 
same reporter configuration format that you can already use for wiring 
Cassandra to other monitoring systems, like graphite. For serious users of 
stress, this would be the preferred approach to capturing results.


 Provide graphing tool along with cassandra-stress
 -

 Key: CASSANDRA-7918
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Benedict
Assignee: Ryan McGuire
Priority: Minor
 Attachments: 7918.patch


 Whilst cstar makes some pretty graphs, they're a little limited and also 
 require you to run your tests through it. It would be useful to be able to 
 graph results from any stress run easily.





[jira] [Updated] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2014-11-26 Thread graham sanderson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

graham sanderson updated CASSANDRA-8325:

Attachment: unsafeCopy1.txt

unsafeCopy1.txt has a minimal code change that *might*, based on the 
documentation for Unsafe.java, fix this one code path (others would remain in 
the comparators).

Basically this is untested (not even on FreeBSD), so YMMV.

 Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
 -

 Key: CASSANDRA-8325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
 Project: Cassandra
  Issue Type: Bug
 Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit 
 Server VM
Reporter: Leonid Shalupov
 Attachments: hs_err_pid1856.log, system.log, unsafeCopy1.txt


 See attached error file after JVM crash
 {quote}
 FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
 Jan 16 22:34:59 UTC 2014 
 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
 {quote}
 {quote}
  % java -version
 openjdk version 1.7.0_71
 OpenJDK Runtime Environment (build 1.7.0_71-b14)
 OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
 {quote}





[jira] [Comment Edited] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2014-11-26 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227227#comment-14227227
 ] 

graham sanderson edited comment on CASSANDRA-8325 at 11/27/14 3:52 AM:
---

unsafeCopy1.txt has a minimal code change that *might*, based on the 
documentation for Unsafe.java, fix this one code path (others would remain in 
the comparators).

Basically this is untested (not even on FreeBSD), so YMMV. Feel free to try it 
and see if it fixes the exception in the copy() method.


was (Author: graham sanderson):
unsafeCopy1.txt has a minimal code change that *might* based on the 
documentation for Unsafe.java fix this one code path (others would remain in 
the compators).

basically this is untested (even not on FreeBSD) so YMMV

 Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
 -

 Key: CASSANDRA-8325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
 Project: Cassandra
  Issue Type: Bug
 Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit 
 Server VM
Reporter: Leonid Shalupov
 Attachments: hs_err_pid1856.log, system.log, unsafeCopy1.txt


 See attached error file after JVM crash
 {quote}
 FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
 Jan 16 22:34:59 UTC 2014 
 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
 {quote}
 {quote}
  % java -version
 openjdk version 1.7.0_71
 OpenJDK Runtime Environment (build 1.7.0_71-b14)
 OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
 {quote}





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227248#comment-14227248
 ] 

Jonathan Ellis commented on CASSANDRA-7438:
---

bq. The row cache can contain very large rows [partitions] AFAIK

Well, it *can*, but it's almost always a bad idea.  Not something we should 
optimize for.  (http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1)

bq. Does the storage engine always materialize entire rows [partitions] into 
memory for every query?

Only when it's pulling them from the off-heap cache.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is only partially off heap; keys are still stored 
 in the JVM heap as BB:
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing APIs (ICache), and the implementation 
 needs to have safe memory access, low memory overhead, and as few memcpys as 
 possible.
 We might also want to make this cache configurable.





[jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227248#comment-14227248
 ] 

Jonathan Ellis edited comment on CASSANDRA-7438 at 11/27/14 4:26 AM:
-

bq. The row cache can contain very large rows [partitions] AFAIK

Well, it *can*, but it's almost always a bad idea.  Not something we should 
optimize for.  (http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1)

bq. Does the storage engine always materialize entire rows [partitions] into 
memory for every query?

Only when it's pulling them from the off-heap cache.  (It deserializes onto the 
heap to filter out the requested results.)


was (Author: jbellis):
bq. The row cache can contain very large rows [partitions] AFAIK

Well, it *can*, but it's almost always a bad idea.  Not something we should 
optimize for.  (http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1)

bq. Does the storage engine always materialize entire rows [partitions] into 
memory for every query?

Only when it's pulling them from the off-heap cache.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap; keys are still stored in 
 the JVM heap as ByteBuffers.
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing API (ICache), and the implementation 
 needs to have safe memory access, low memory overhead, and as few memcpys as 
 possible.
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227250#comment-14227250
 ] 

Jonathan Ellis commented on CASSANDRA-7438:
---

Looking at the discussion, I wonder if we're overcomplicating things.  I think 
it got a bit lost in the noise when Ariel said earlier,

bq. I also wonder if splitting the cache into several instances each with a 
coarse lock per instance wouldn't result in simpler, fast-enough code. I don't 
want to advocate doing something different for performance, but rather that 
there is the possibility of a relatively simple implementation via Unsafe.

Why not start with something like that and see if it's Good Enough?  I suspect 
that at that point other bottlenecks will be much more important, so paying a 
high complexity cost to optimize the cache further would be a bad trade overall.
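Ariel's suggestion above (several cache instances, each behind one coarse lock) can be sketched roughly as below. This is a hypothetical on-heap illustration of the segmenting and locking layout only; the ticket itself targets off-heap storage via Unsafe/JNI, which this sketch deliberately does not attempt, and all class and method names here are invented for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: split the cache into N independent segments, each guarded by
// one coarse lock (the segment's monitor). Keys are spread across
// segments by hash, so threads on different segments never contend.
public class StripedCache<K, V> {
    private final Segment<K, V>[] segments;

    @SuppressWarnings("unchecked")
    public StripedCache(int nSegments, int capacityPerSegment) {
        segments = (Segment<K, V>[]) new Segment[nSegments];
        for (int i = 0; i < nSegments; i++)
            segments[i] = new Segment<>(capacityPerSegment);
    }

    private Segment<K, V> segmentFor(K key) {
        int h = key.hashCode();
        h ^= (h >>> 16);                       // spread the hash bits
        return segments[(h & 0x7fffffff) % segments.length];
    }

    public V get(K key)           { return segmentFor(key).get(key); }
    public void put(K key, V val) { segmentFor(key).put(key, val); }

    // One segment: an access-ordered LinkedHashMap gives LRU eviction;
    // the synchronized methods are the "coarse lock per instance".
    private static final class Segment<K, V> {
        private final LinkedHashMap<K, V> map;

        Segment(final int capacity) {
            map = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> e) {
                    return size() > capacity;
                }
            };
        }

        synchronized V get(K key)           { return map.get(key); }
        synchronized void put(K key, V val) { map.put(key, val); }
    }
}
```

The point of the sketch is that per-segment contention stays simple and bounded; an Unsafe-backed version would replace the LinkedHashMap body of each segment while keeping the same coarse-lock structure.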

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap; keys are still stored in 
 the JVM heap as ByteBuffers.
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing API (ICache), and the implementation 
 needs to have safe memory access, low memory overhead, and as few memcpys as 
 possible.
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2014-11-26 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227257#comment-14227257
 ] 

graham sanderson commented on CASSANDRA-8325:
-

Note also, Benedict, that I noticed in passing that MIN_COPY_THRESHOLD is only 
checked in some code paths. Was this deliberate?

 Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
 -

 Key: CASSANDRA-8325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
 Project: Cassandra
  Issue Type: Bug
 Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit 
 Server VM
Reporter: Leonid Shalupov
 Attachments: hs_err_pid1856.log, system.log, unsafeCopy1.txt


 See attached error file after JVM crash
 {quote}
 FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
 Jan 16 22:34:59 UTC 2014 
 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
 {quote}
 {quote}
  % java -version
 openjdk version "1.7.0_71"
 OpenJDK Runtime Environment (build 1.7.0_71-b14)
 OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8267) Only stream from unrepaired sstables during incremental repair

2014-11-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227303#comment-14227303
 ] 

Jonathan Ellis commented on CASSANDRA-8267:
---

Unless 8110 is a lot simpler than I think, I don't think we should be trying to 
cram it into 2.1.x.

 Only stream from unrepaired sstables during incremental repair
 --

 Key: CASSANDRA-8267
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8267
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


 Seems we stream from all sstables even if we do incremental repair, we should 
 limit this to only stream from the unrepaired sstables if we do incremental 
 repair



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting

2014-11-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227305#comment-14227305
 ] 

Jonathan Ellis commented on CASSANDRA-8371:
---

bq. And I'd also like to see an option to change max_sstable_age_days to be a 
smaller unit of time

Maybe the simplest backwards-compatible solution would be to allow it to be a 
float, so 0.042 would be an hour.  Doesn't lend itself to round hour numbers 
but it works. :)
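The fractional-days reading described above is just a unit conversion: 1/24 day ≈ 0.0417, so 0.042 days is roughly one hour. A minimal sketch of what the option would mean internally (the option name max_sstable_age_days is real; the class and helper here are hypothetical):

```java
// Sketch: interpreting a fractional max_sstable_age_days value.
// 1 hour = 1/24 day ≈ 0.0417, so 0.042 days ≈ one hour.
public class SstableAge {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Convert a (possibly fractional) day count to milliseconds,
    // which is what a compaction age cutoff would compare against.
    static long maxAgeMillis(double maxSstableAgeDays) {
        return Math.round(maxSstableAgeDays * MILLIS_PER_DAY);
    }
}
```

With this reading, 0.042 maps to 3,628,800 ms, within a minute of a true hour, which is the "doesn't lend itself to round hour numbers" caveat above.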

 DateTieredCompactionStrategy is always compacting 
 --

 Key: CASSANDRA-8371
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8371
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: mck
Assignee: Björn Hegerfors
  Labels: compaction, performance
 Attachments: java_gc_counts_rate-month.png, read-latency.png, 
 sstables.png, vg2_iad-month.png


 Running 2.0.11 and having switched a table to 
 [DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602] we've seen that 
 disk IO and gc count increase, along with the number of reads happening in 
 the compaction hump of cfhistograms.
 Data, and generally performance, looks good, but compactions are always 
 happening, and pending compactions are building up.
 The schema for this is 
 {code}CREATE TABLE search (
   loginid text,
   searchid timeuuid,
   description text,
   searchkey text,
   searchurl text,
   PRIMARY KEY ((loginid), searchid)
 );{code}
 We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
 CQL executed against this keyspace, and traffic patterns, can be seen in 
 slides 7+8 of https://prezi.com/b9-aj6p2esft/
 Attached are sstables-per-read and read-latency graphs from cfhistograms, and 
 screenshots of our munin graphs as we have gone from STCS, to LCS (week ~44), 
 to DTCS (week ~46).
 These screenshots are also found in the prezi on slides 9-11.
 [~pmcfadin], [~Bj0rn], 
 Can this be a consequence of occasional deleted rows, as is described under 
 (3) in the description of CASSANDRA-6602?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8382) Procedure to Change IP Address without Data streaming is Missing in Cassandra Documentation

2014-11-26 Thread Anuj (JIRA)
Anuj created CASSANDRA-8382:
---

 Summary: Procedure to Change IP Address without Data streaming is 
Missing in Cassandra Documentation
 Key: CASSANDRA-8382
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8382
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation & website
 Environment: Red Hat Linux , Cassandra 2.0.3
Reporter: Anuj


Use Case: 
We have a geo-redundant setup with 2 DCs (DC1 and DC2) having 3 nodes each. The 
listen addresses and seeds of all nodes are public IPs, while the rpc addresses 
are private IPs. Now, we want to decommission DC2 and change the public IPs in 
the listen addresses/seeds of the DC1 nodes to private IPs, as it will be a 
single-DC setup.

Issue: 
Cassandra doesn’t provide any standard procedure for changing the IP address of 
nodes in a cluster. We can bring down the nodes, one by one, change their IP 
addresses, and perform the procedure mentioned in “Replacing a Dead Node” at 
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html
 by specifying the public IP of the node in the replace_address option. But the 
procedure recommends that you must set the auto_bootstrap option to true. We 
don’t want any bootstrap or data streaming to happen, as the data is already on 
the nodes. So, our question is: what’s the standard procedure for changing the 
IP address of Cassandra nodes while making sure that no data streaming occurs 
and the gossip state is not corrupted?

We are using vnodes.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8150) Revaluate Default JVM tuning parameters

2014-11-26 Thread Oleg Anastasyev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227361#comment-14227361
 ] 

Oleg Anastasyev commented on CASSANDRA-8150:


Let me add 2 -cents- GC options from our production clusters, which I believe 
could be useful for all:
-XX:+CMSScavengeBeforeRemark
to reduce the rescan phase even when CMSWait is exceeded

-XX:+ParallelRefProcEnabled
-XX:+CMSParallelInitialMarkEnabled
to improve CMS parallelization

-XX:CMSMaxAbortablePrecleanTime=2

I'd also propose adding -XX:+DisableExplicitGC, because the Java RMI runtime 
invokes a System.gc every hour, or alternatively increasing 
sun.rmi.dgc.server.gcInterval to infinity. We found that on Sun JVMs 7 and 8, 
CMSWait durations are not governed by CMS when GC is initiated by System.gc, 
and this was the reason for most of the long rescan-phase pauses.

 Revaluate Default JVM tuning parameters
 ---

 Key: CASSANDRA-8150
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150
 Project: Cassandra
  Issue Type: Improvement
  Components: Config
Reporter: Matt Stump
Assignee: Brandon Williams
 Attachments: upload.png


 It's been found that the old Twitter recommendation of 100m per core, up to 
 800m, is harmful and should no longer be used.
 Instead, the formula should be 1/3 or 1/4 of max heap, with a max of 2G. 1/3 
 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess, 
 1/3 is probably better for releases greater than 2.1.
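Assuming the description refers to young-generation (HEAP_NEWSIZE) sizing, the old and proposed rules of thumb compare as in this small sketch (class and method names are invented; the fraction is the debatable parameter above):

```java
// Sketch comparing the two new-gen sizing rules of thumb.
public class HeapSizing {
    // Old recommendation: 100m per core, capped at 800m.
    static long oldNewGenMb(int cores) {
        return Math.min(100L * cores, 800L);
    }

    // Proposed: a fraction (1/3 or 1/4) of max heap, capped at 2G.
    static long newNewGenMb(long maxHeapMb, int denominator) {
        return Math.min(maxHeapMb / denominator, 2048L);
    }
}
```

For example, on a 16-core box with an 8G heap, the old rule caps at 800m while the proposed 1/4 rule gives the full 2G cap.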



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8150) Revaluate Default JVM tuning parameters

2014-11-26 Thread Oleg Anastasyev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227361#comment-14227361
 ] 

Oleg Anastasyev edited comment on CASSANDRA-8150 at 11/27/14 7:58 AM:
--

Let me add 2 -cents- GC options from our production clusters, which I believe 
could be useful for all:

-XX:+ParallelRefProcEnabled
-XX:+CMSParallelInitialMarkEnabled
to improve CMS parallelization

I'd also propose adding -XX:+DisableExplicitGC, because the Java RMI runtime 
invokes a System.gc every hour, or alternatively increasing 
sun.rmi.dgc.server.gcInterval to infinity. We found that on Sun JVMs 7 and 8, 
CMSWait durations are not governed by CMS when GC is initiated by System.gc, 
and this was the reason for most of the long rescan-phase pauses.


was (Author: m0nstermind):
Let me add 2 -cents- GC options from our production clusters, which I believe 
could be useful for all:
-XX:+CMSScavengeBeforeRemark
to reduce the rescan phase even when CMSWait is exceeded

-XX:+ParallelRefProcEnabled
-XX:+CMSParallelInitialMarkEnabled
to improve CMS parallelization

-XX:CMSMaxAbortablePrecleanTime=2

I'd also propose adding -XX:+DisableExplicitGC, because the Java RMI runtime 
invokes a System.gc every hour, or alternatively increasing 
sun.rmi.dgc.server.gcInterval to infinity. We found that on Sun JVMs 7 and 8, 
CMSWait durations are not governed by CMS when GC is initiated by System.gc, 
and this was the reason for most of the long rescan-phase pauses.

 Revaluate Default JVM tuning parameters
 ---

 Key: CASSANDRA-8150
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150
 Project: Cassandra
  Issue Type: Improvement
  Components: Config
Reporter: Matt Stump
Assignee: Brandon Williams
 Attachments: upload.png


 It's been found that the old Twitter recommendation of 100m per core, up to 
 800m, is harmful and should no longer be used.
 Instead, the formula should be 1/3 or 1/4 of max heap, with a max of 2G. 1/3 
 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess, 
 1/3 is probably better for releases greater than 2.1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

