[jira] [Commented] (CASSANDRA-4182) multithreaded compaction very slow with large single data file and a few tiny data files

2012-04-27 Thread MaHaiyang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263407#comment-13263407
 ] 

MaHaiyang commented on CASSANDRA-4182:
--

273m58.740s (multithreaded disabled) for 500 GB is only about 30 MB/s, which 
looks a little low for SSDs.
Is there some other bottleneck, such as CPU?
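
For reference, the throughput figure can be reproduced with a few lines of Java. This is only a rough sketch: the class and method names are made up for illustration, and it assumes 1 GB = 1024 MB and that the whole 500 GB is processed in each run.

```java
public class CompactionThroughput {
    /** Average throughput in MB/s for a data size and a wall-clock duration. */
    static double megabytesPerSecond(double gigabytes, long minutes, double seconds) {
        double totalSeconds = minutes * 60 + seconds;
        double megabytes = gigabytes * 1024; // assumption: 1 GB = 1024 MB
        return megabytes / totalSeconds;
    }

    public static void main(String[] args) {
        // Timings quoted in the issue for roughly 500 GB of data.
        System.out.printf("MT disabled: %.1f MB/s%n", megabytesPerSecond(500, 273, 58.740));
        System.out.printf("MT enabled:  %.1f MB/s%n", megabytesPerSecond(500, 451, 13.284));
    }
}
```

With those inputs the single-threaded run works out to roughly 31 MB/s and the multithreaded run to roughly 19 MB/s.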



 multithreaded compaction very slow with large single data file and a few tiny 
 data files
 

 Key: CASSANDRA-4182
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4182
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.9
 Environment: Redhat
 Sun JDK 1.6.0_20-b02
Reporter: Karl Mueller

 Turning on multithreaded compaction makes compaction take nearly twice 
 as long in our environment, which includes a very large SSTable and a few 
 smaller ones, relative to either 0.8.x with MT turned off or 1.0.x with MT 
 turned off.  
 compaction_throughput_mb_per_sec is set to 0.  
 We currently compact about 500 GB of data nightly due to overwrites.  
 (LevelDB will probably be enabled on the busy CFs once 1.0.x is rolled out 
 completely)  The time it takes to do the compaction is:
 451m13.284s (multithreaded)
 273m58.740s (multithreaded disabled)
 Our nodes run on SSDs and therefore have a high read and write rate available 
 to them. The primary CF they're compacting right now, with most of the data, 
 is localized to a very large file (~300+GB) and a few tiny files (1-10GB) 
 since the CF has become far less active.  
 I would expect the multithreaded compaction to be no worse than the single 
 threaded compaction, or perhaps a higher cost in CPU for the same 
 performance, but it's half the speed with the same CPU usage, or more CPU. 
 I have two graphs available from testing 2 or 3 compactions which demonstrate 
 some interesting characteristics.  1.0.9 was installed on the 21st with MT 
 turned on.  Prior stuff is 0.8.7 with MT turned off, but 1.0.9 with MT turned 
 off seems to perform as well as 0.8.7.
 http://www.xney.com/temp/cass-irq.png  (interrupts)
 http://www.xney.com/temp/cass-iostat.png (io bandwidth of disks)
 This demonstrates a large increase in rescheduling interrupts and only half 
 the bandwidth used on the disks.  I suspect this is because threads are 
 thrashing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4177) Little improvement on the messages of the exceptions thrown by ExternalClient

2012-04-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263412#comment-13263412
 ] 

Michał Michalski commented on CASSANDRA-4177:
-

Good point. This is what I get:

{noformat}12/04/27 08:58:05 INFO mapred.JobClient: Task Id : attempt_201204100823_2167_m_00_0, Status : FAILED
java.lang.RuntimeException: Could not retrieve endpoint ranges: 
    at org.apache.cassandra.hadoop.BulkRecordWriter$ExternalClient.init(BulkRecordWriter.java:263)
    at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:117)
    at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:112)
    at org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:193)
    at org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:178)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:540)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caus
{noformat}

So I guess the "Caus" at the end is what you are asking about, but for 
some reason it's truncated. I bet that's because of my app, but I 
will have to investigate it a bit.

Anyway, I still think that an exception with a fixed message like this is 
inappropriate and somewhat misleading when we could get more detailed 
exceptions of many kinds here. I've checked how it looks in other 
classes with authentication, and found that e.g. in 
ColumnFamilyRecordReader.java it's done the way I proposed (throw new 
RuntimeException(e);). So I'm not arguing that it's a must-have, 
but I still think it's a more proper way of handling this case :)
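
To illustrate the difference, here is a minimal, self-contained sketch. It is not the actual BulkRecordWriter code: the login() method and the class name are hypothetical stand-ins for whatever call fails during authentication.

```java
public class WrapCause {
    /** Hypothetical stand-in for the login step that can fail. */
    static void login() throws Exception {
        throw new Exception("Given password could not be validated for user operator");
    }

    /** Fixed message: the underlying exception is dropped. */
    static RuntimeException fixedMessage() {
        try {
            login();
            return null;
        } catch (Exception e) {
            return new RuntimeException("Could not retrieve endpoint ranges: ");
        }
    }

    /** Wrapping: the original exception travels along as the cause. */
    static RuntimeException wrapped() {
        try {
            login();
            return null;
        } catch (Exception e) {
            return new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fixedMessage().getCause());            // null, the real reason is lost
        System.out.println(wrapped().getCause().getMessage());    // the original message
    }
}
```

With the wrapped variant the stack trace also contains a "Caused by:" section, so the reader sees the real failure.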

However, answering your question - yes, you're right :)

 Little improvement on the messages of the exceptions thrown by ExternalClient
 -

 Key: CASSANDRA-4177
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4177
 Project: Cassandra
  Issue Type: Improvement
Reporter: Michał Michalski
Assignee: Michał Michalski
Priority: Trivial
 Attachments: trunk-4177.txt


 After giving BulkRecordWriter (or actually ExternalClient) the ability to 
 use authentication, I noticed that the exceptions thrown on 
 login failure are very misleading: a "Could not retrieve 
 endpoint ranges" RuntimeException is always thrown, no matter what really 
 happens. This hides the real reason for all authentication problems. I've 
 changed this line a bit so that all the messages are passed through unchanged, 
 so now I get, for example, "AuthenticationException(why:Given password in 
 password mode MD5 could not be validated for user operator)" or, in the worst 
 case, "Unexpected authentication problem", which is way more helpful, so I 
 submit this trivial but useful improvement.





[jira] [Created] (CASSANDRA-4194) CQL3: improve experience with time uuid

2012-04-27 Thread Sylvain Lebresne (JIRA)
Sylvain Lebresne created CASSANDRA-4194:
---

 Summary: CQL3: improve experience with time uuid
 Key: CASSANDRA-4194
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4194
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 1.1.1


This ticket proposes to add a timeuuid type to CQL3. I know that the uuid type 
does support version 1 UUIDs (which is fine), but my rationale is that time 
series are a very common use case for Cassandra. When modeling time series, 
it seems to me that you'd almost always want to use time UUIDs rather than 
timestamps to avoid having to care about collisions. In those cases, using a 
timeuuid type would imo have the following advantages over simply uuid:
# the type conveys the idea that this is really a date (but one that needs to 
avoid collisions). In other words, the 'time' in timeuuid has a documentation purpose.
# it validates that you only insert version 1 UUIDs. Inserting a non-time-based 
UUID when you really care about the time ordering is an important mistake, so it's 
nice to validate that this doesn't happen (it's one of the goals of the type after all).
# it'll allow parsing date values (which TimeUUIDType already does). Since 
timeuuid is really a date, it's useful and convenient to allow '2012-04-27 
11:32:02' as a value.

I'll note that imho there really is no reason not to at least allow 3), and even 
if there is strong opposition to adding a new timeuuid type (though I don't see 
why that would be a big deal) we could add the parsing of dates to uuid. But I 
do personally think that 1) and 2) are equally important and warrant the 
addition of timeuuid (and it'll feel less random to parse dates as timeuuid than 
to do it for uuid).
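
As a rough illustration of point 2), the version check itself is cheap with java.util.UUID. This is a sketch only, not the validation code of the proposed type; the class name and the sample UUID value are made up.

```java
import java.util.UUID;

public class TimeUuidCheck {
    /** True only for version 1 (time-based) UUIDs. */
    static boolean isTimeUuid(UUID uuid) {
        return uuid.version() == 1;
    }

    public static void main(String[] args) {
        // Illustrative version 1 UUID (the third group starts with '1').
        UUID timeBased = UUID.fromString("1d8a5c40-9072-11e1-b0c4-0800200c9a66");
        UUID random = UUID.randomUUID(); // version 4

        System.out.println(isTimeUuid(timeBased)); // true
        System.out.println(isTimeUuid(random));    // false

        // timestamp() is only defined for version 1 UUIDs; it counts
        // 100-nanosecond intervals since the UUID epoch (1582-10-15).
        System.out.println(timeBased.timestamp());
    }
}
```

Calling timestamp() on a non-version-1 UUID throws UnsupportedOperationException, which is the kind of mistake a dedicated type could reject at insert time.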






[jira] [Commented] (CASSANDRA-4182) multithreaded compaction very slow with large single data file and a few tiny data files

2012-04-27 Thread Karl Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263492#comment-13263492
 ] 

Karl Mueller commented on CASSANDRA-4182:
-

Yes it maxes one CPU.  That's what one CPU can do.






[jira] [Updated] (CASSANDRA-4194) CQL3: improve experience with time uuid

2012-04-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-4194:


Attachment: 0003-Adds-x-days-ago-notation-for-convenience.txt
0002-Refactor-DateType-and-TimeUUIDType-to-share-code.txt
0001-Add-CQL3-timeuuid-type.txt

Adding 3 patches for this. The first one is the actual addition of timeuuid 
(btw, I'm fine if someone prefers some other name). The other two are related 
small improvements: patch 0002 refactors the date parsing code so that it is 
shared by both DateType and TimeUUIDType, and patch 0003 adds the parsing of 
things like '4 days ago' for convenience's sake (I'm not claiming it's a super 
useful thing but I figured this could be nice for interactive sessions; I don't 
care too much about it though).
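
To give an idea of what the '4 days ago' notation involves, here is a minimal sketch. It is not taken from the attached patch: the class name, the regular expression and the accepted syntax are assumptions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RelativeDate {
    // Assumed syntax: "<N> day ago" or "<N> days ago".
    private static final Pattern DAYS_AGO = Pattern.compile("(\\d+) days? ago");
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    /** Returns epoch milliseconds for inputs such as "4 days ago", relative to nowMillis. */
    static long parse(String text, long nowMillis) {
        Matcher m = DAYS_AGO.matcher(text.trim());
        if (!m.matches())
            throw new IllegalArgumentException("unsupported date expression: " + text);
        return nowMillis - Long.parseLong(m.group(1)) * MILLIS_PER_DAY;
    }

    public static void main(String[] args) {
        System.out.println(parse("4 days ago", System.currentTimeMillis()));
    }
}
```

Passing the current time in as a parameter keeps the function deterministic, which makes it easy to test.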

 CQL3: improve experience with time uuid
 ---

 Key: CASSANDRA-4194
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4194
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: cql3
 Fix For: 1.1.1

 Attachments: 0001-Add-CQL3-timeuuid-type.txt, 
 0002-Refactor-DateType-and-TimeUUIDType-to-share-code.txt, 
 0003-Adds-x-days-ago-notation-for-convenience.txt







[jira] [Assigned] (CASSANDRA-4193) cql delete does not delete

2012-04-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne reassigned CASSANDRA-4193:
---

Assignee: Sylvain Lebresne

 cql delete does not delete 
 ---

 Key: CASSANDRA-4193
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4193
 Project: Cassandra
  Issue Type: Bug
Reporter: Jackson Chung
Assignee: Sylvain Lebresne

 tested in 1.1 and trunk branch on a single node:
 {panel}
 cqlsh:test> create table testcf_old ( username varchar , id int , name 
 varchar , stuff varchar, primary key(username,id,name)) with compact storage;
 cqlsh:test> insert into testcf_old ( username , id , name , stuff ) values 
 ('abc', 2, 'rst', 'some other bunch of craps');
 cqlsh:test> select * from testcf_old;
  username | id | name | stuff
 ----------+----+------+---------------------------
       abc |  2 |  rst | some other bunch of craps
       abc |  4 |  xyz |          a bunch of craps
 cqlsh:test> delete from testcf_old where username = 'abc' and id =2;
 cqlsh:test> select * from testcf_old;
  username | id | name | stuff
 ----------+----+------+---------------------------
       abc |  2 |  rst | some other bunch of craps
       abc |  4 |  xyz |          a bunch of craps
 {panel}
 same also when not using compact:
 {panel}
 cqlsh:test> create table testcf ( username varchar , id int , name varchar , 
 stuff varchar, primary key(username,id));
 cqlsh:test> select * from testcf;
  username | id | name                      | stuff
 ----------+----+---------------------------+------------------
       abc |  2 | some other bunch of craps |              rst
       abc |  4 |                       xyz | a bunch of craps
 cqlsh:test> delete from testcf where username = 'abc' and id =2;
 cqlsh:test> select * from testcf;
  username | id | name                      | stuff
 ----------+----+---------------------------+------------------
       abc |  2 | some other bunch of craps |              rst
       abc |  4 |                       xyz | a bunch of craps
 {panel}





[jira] [Commented] (CASSANDRA-4193) cql delete does not delete

2012-04-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263638#comment-13263638
 ] 

Sylvain Lebresne commented on CASSANDRA-4193:
-

So for the compact case, this is a dupe of CASSANDRA-3708: it in fact requires 
a range tombstone (there could be millions of records having 'abc' and 2 as 
first components). The reason for the non-compact case is very similar: this 
amounts internally to removing multiple columns. However, in that second case 
we could implement a workaround, as we know which columns are defined for the 
table. For this too, though, CASSANDRA-3708 will offer a better fix, as it will 
be more efficient to have 1 (range) tombstone rather than n, where n is the 
number of columns in the table, and there will be less special code once 
CASSANDRA-3708 is in.

So what I propose for now is to throw an error in the compact case and support 
the second one by deleting each column individually. Once CASSANDRA-3708 is in, 
we'll use it to replace the second part. Patch attached to do that.
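
To make the "n tombstones" point concrete, here is a small sketch of the workaround for the non-compact table from the report. It is a simplified model, not the attached patch: the class name is made up, and writing the internal column name as "<id>:<column>" is an assumption used only to show that one deletion is needed per declared column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerColumnDelete {
    /** One deletion per declared (non-key) column for the given clustering value. */
    static List<String> tombstones(int clusteringValue, List<String> declaredColumns) {
        List<String> cells = new ArrayList<>();
        for (String column : declaredColumns)
            cells.add(clusteringValue + ":" + column); // assumed naming, for illustration
        return cells;
    }

    public static void main(String[] args) {
        // testcf has primary key (username, id) and declared columns name and stuff.
        System.out.println(tombstones(2, Arrays.asList("name", "stuff")));
        // prints [2:name, 2:stuff]
    }
}
```

A range tombstone would replace that list with a single marker covering everything under id = 2, which is also why the compact case cannot use this workaround: there the set of names under the prefix is not known from the table definition.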

 cql delete does not delete 
 ---

 Key: CASSANDRA-4193
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4193
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Jackson Chung
Assignee: Sylvain Lebresne
  Labels: cql3
 Fix For: 1.1.1

 Attachments: 4193.txt







git commit: Make identifier and value grammar for CQL3 stricter

2012-04-27 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.1 698a2bbea -> 60aa1d03e


Make identifier and value grammar for CQL3 stricter

patch by slebresne; reviewed by jbellis for CASSANDRA-4184


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/60aa1d03
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/60aa1d03
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/60aa1d03

Branch: refs/heads/cassandra-1.1
Commit: 60aa1d03e424af03537e50d41a6b5dccd03db0b5
Parents: 698a2bb
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Fri Apr 27 15:28:01 2012 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Fri Apr 27 15:28:01 2012 +0200

--
 src/java/org/apache/cassandra/cql3/Cql.g |8 
 1 files changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/60aa1d03/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g
index f1b4718..9051d61 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -410,8 +410,8 @@ truncateStatement returns [TruncateStatement stmt]
 
 // Column Identifiers
 cident returns [ColumnIdentifier id]
-: t=( IDENT | UUID | INTEGER ) { $id = new ColumnIdentifier($t.text, false); }
-| t=QUOTED_NAME { $id = new ColumnIdentifier($t.text, true); }
+: t=IDENT   { $id = new ColumnIdentifier($t.text, false); }
+| t=QUOTED_NAME { $id = new ColumnIdentifier($t.text, true); }
 ;
 
 // Keyspace & Column family names
@@ -437,8 +437,8 @@ cidentList returns [List<ColumnIdentifier> items]
 
 // Values (includes prepared statement markers)
 term returns [Term term]
-: t=(STRING_LITERAL | UUID | IDENT | INTEGER | FLOAT ) { $term = new Term($t.text, $t.type); }
-| t=QMARK  { $term = new Term($t.text, $t.type, ++currentBindMarkerIdx); }
+: t=(STRING_LITERAL | UUID | INTEGER | FLOAT ) { $term = new Term($t.text, $t.type); }
+| t=QMARK  { $term = new Term($t.text, $t.type, ++currentBindMarkerIdx); }
 ;
 
 intTerm returns [Term integer]



[jira] [Commented] (CASSANDRA-4182) multithreaded compaction very slow with large single data file and a few tiny data files

2012-04-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263660#comment-13263660
 ] 

Jonathan Ellis commented on CASSANDRA-4182:
---

bq. a very large SStable and a few smaller ones

parallel compaction is designed around thread-per-sstable, so this is basically 
a worst-case scenario for it: you're limited by the thread handling the large 
one.

there isn't a whole lot we can do to throw multiple threads at a single 
sstable, especially when wide rows are involved.

bq. but this will impact a mixed deployment of classic compactions and the new 
leveldb

You can still have multiple concurrent compactions, each in one thread.
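
A back-of-the-envelope sketch of that limit: if each sstable is handled by exactly one thread and the work is roughly proportional to file size (both simplifying assumptions, as are the class name and the sizes below), the best possible speedup is the total size divided by the largest file.

```java
public class ParallelCompactionBound {
    /** Upper bound on speedup when each input file is processed by a single thread. */
    static double maxSpeedup(double[] sizesGb) {
        double total = 0;
        double largest = 0;
        for (double size : sizesGb) {
            total += size;
            largest = Math.max(largest, size);
        }
        return total / largest;
    }

    public static void main(String[] args) {
        // Illustrative sizes in GB: one very large sstable and a few small ones.
        double[] sizes = {300, 10, 5, 1};
        System.out.println(maxSpeedup(sizes)); // about 1.05, so almost nothing to gain
    }
}
```

With one file dominating the input, even an ideal scheduler gains about 5%, so any coordination overhead easily turns into a net loss.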






[jira] [Updated] (CASSANDRA-2889) Avoids having replicate on write tasks stacking up at CL.ONE

2012-04-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2889:


Attachment: 2889.txt

Forgot a bit about this issue. Attaching a simple patch to simply limit the 
queue size for the replicate_on_write stage. My intuition is that this is 
probably good enough so not sure if it's worth getting much more fancy.
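
For readers unfamiliar with the idea, here is a generic sketch of a bounded stage using plain java.util.concurrent. It is not the code from 2889.txt: the class name, pool sizes and the block-on-full handler are assumptions chosen to show how a full queue pushes back on the submitting thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedStage {
    /** Pool whose queue holds at most maxQueued tasks; submitters block when it is full. */
    static ThreadPoolExecutor create(int threads, int maxQueued) {
        RejectedExecutionHandler blockWhenFull = (task, pool) -> {
            try {
                pool.getQueue().put(task); // wait for space instead of failing
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException(e);
            }
        };
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                                      new ArrayBlockingQueue<Runnable>(maxQueued), blockWhenFull);
    }

    /** Submits the given number of tasks and returns how many ran to completion. */
    static int runAll(int tasks) {
        ThreadPoolExecutor pool = create(2, 4);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++)
            pool.execute(done::incrementAndGet); // blocks here while the queue is full
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(100));
    }
}
```

No task is dropped; the writer simply slows down to the pace of the stage. Note that this simple handler is not safe if tasks are still being submitted while the pool is shut down.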

 Avoids having replicate on write tasks stacking up at CL.ONE
 

 Key: CASSANDRA-2889
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2889
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
  Labels: counters
 Fix For: 1.1.1

 Attachments: 2889.txt


 The counter design involves a read on the first replica during a write. At 
 CL.ONE, this read is not involved in the latency of the operation (the write 
 is acknowledged before it). This means it is fairly easy to insert too quickly 
 at CL.ONE and have the replicate on write tasks fall behind. The goal of 
 this ticket is to protect against that.
 An option could be to bound the replicate on write task queue so that writes 
 start to block once we have too many of those in the queue. Another option 
 could be to drop the oldest tasks when they are too old, but that is probably 
 a less safe option.





git commit: eliminate clientutil dependency on commons-lang

2012-04-27 Thread eevans
Updated Branches:
  refs/heads/cassandra-1.0 eb9f96146 -> a725f80fc


eliminate clientutil dependency on commons-lang

Patch by Dave Brosius; reviewed by eevans for CASSANDRA-3665


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a725f80f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a725f80f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a725f80f

Branch: refs/heads/cassandra-1.0
Commit: a725f80fce2be6880db9572b35339d68acc398a7
Parents: eb9f961
Author: Eric Evans eev...@apache.org
Authored: Fri Apr 27 09:40:56 2012 -0500
Committer: Eric Evans eev...@apache.org
Committed: Fri Apr 27 09:40:56 2012 -0500

--
 build.xml  |1 -
 .../org/apache/cassandra/utils/ByteBufferUtil.java |   10 +++---
 2 files changed, 7 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a725f80f/build.xml
--
diff --git a/build.xml b/build.xml
index ebd80f6..233006a 100644
--- a/build.xml
+++ b/build.xml
@@ -1023,7 +1023,6 @@
 </fileset>
 <fileset dir="${build.lib}">
   <include name="**/guava*.jar" />
-  <include name="**/commons-lang*.jar" />
 </fileset>
   </classpath>
 </junit>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a725f80f/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
--
diff --git a/src/java/org/apache/cassandra/utils/ByteBufferUtil.java b/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
index c584205..0470abf 100644
--- a/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
+++ b/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
@@ -18,6 +18,12 @@
  */
 package org.apache.cassandra.utils;
 
+/*
+ * BE ADVISED: New imports added here might introduce new dependencies for
+ * the clientutil jar.  If in doubt, run the `ant test-clientutil-jar' target
+ * afterward, and ensure the tests still pass.
+ */
+
 import java.io.DataInput;
 import java.io.DataOutput;
 import java.io.IOException;
@@ -32,8 +38,6 @@ import static com.google.common.base.Charsets.UTF_8;
 import org.apache.cassandra.io.util.FileDataInput;
 import org.apache.cassandra.io.util.FileUtils;
 
-import org.apache.commons.lang.ArrayUtils;
-
 /**
  * Utility methods to make ByteBuffers less painful
  * The following should illustrate the different ways byte buffers can be used 
@@ -74,7 +78,7 @@ import org.apache.commons.lang.ArrayUtils;
  */
 public class ByteBufferUtil
 {
-public static final ByteBuffer EMPTY_BYTE_BUFFER = ByteBuffer.wrap(ArrayUtils.EMPTY_BYTE_ARRAY);
+public static final ByteBuffer EMPTY_BYTE_BUFFER = ByteBuffer.wrap(new byte[0]);
 
 public static int compareUnsigned(ByteBuffer o1, ByteBuffer o2)
 {



git commit: eliminate clientutil dependency on commons-lang

2012-04-27 Thread eevans
Updated Branches:
  refs/heads/cassandra-1.1 60aa1d03e -> 2af8591bd


eliminate clientutil dependency on commons-lang

Patch by Dave Brosius; reviewed by eevans for CASSANDRA-3665


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2af8591b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2af8591b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2af8591b

Branch: refs/heads/cassandra-1.1
Commit: 2af8591bd71838b9782c552c6724f1efe1cc836e
Parents: 60aa1d0
Author: Eric Evans eev...@apache.org
Authored: Fri Apr 27 09:40:56 2012 -0500
Committer: Eric Evans eev...@apache.org
Committed: Fri Apr 27 09:44:59 2012 -0500

--
 build.xml  |1 -
 .../org/apache/cassandra/utils/ByteBufferUtil.java |   10 +++---
 2 files changed, 7 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2af8591b/build.xml
--
diff --git a/build.xml b/build.xml
index b7df8d2..1ef21f0 100644
--- a/build.xml
+++ b/build.xml
@@ -1077,7 +1077,6 @@
 </fileset>
 <fileset dir="${build.lib}">
   <include name="**/guava*.jar" />
-  <include name="**/commons-lang*.jar" />
 </fileset>
   </classpath>
 </junit>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2af8591b/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
--
diff --git a/src/java/org/apache/cassandra/utils/ByteBufferUtil.java b/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
index 93f28df..e35e137 100644
--- a/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
+++ b/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
@@ -18,6 +18,12 @@
  */
 package org.apache.cassandra.utils;
 
+/*
+ * BE ADVISED: New imports added here might introduce new dependencies for
+ * the clientutil jar.  If in doubt, run the `ant test-clientutil-jar' target
+ * afterward, and ensure the tests still pass.
+ */
+
 import java.io.*;
 import java.nio.ByteBuffer;
 import java.nio.charset.CharacterCodingException;
@@ -29,8 +35,6 @@ import static com.google.common.base.Charsets.UTF_8;
 import org.apache.cassandra.io.util.FileDataInput;
 import org.apache.cassandra.io.util.FileUtils;
 
-import org.apache.commons.lang.ArrayUtils;
-
 /**
  * Utility methods to make ByteBuffers less painful
  * The following should illustrate the different ways byte buffers can be used
@@ -71,7 +75,7 @@ import org.apache.commons.lang.ArrayUtils;
  */
 public class ByteBufferUtil
 {
-public static final ByteBuffer EMPTY_BYTE_BUFFER = ByteBuffer.wrap(ArrayUtils.EMPTY_BYTE_ARRAY);
+public static final ByteBuffer EMPTY_BYTE_BUFFER = ByteBuffer.wrap(new byte[0]);
 
 public static int compareUnsigned(ByteBuffer o1, ByteBuffer o2)
 {
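
The replacement in the hunk above relies only on the JDK. A minimal sketch, assuming nothing beyond java.nio (the class name EmptyBufferSketch is made up for illustration and is not part of the patch), shows that wrapping a zero-length array gives a buffer with nothing to read:

```java
import java.nio.ByteBuffer;

class EmptyBufferSketch
{
    // Same construction as the patched constant; no commons-lang needed.
    static final ByteBuffer EMPTY_BYTE_BUFFER = ByteBuffer.wrap(new byte[0]);

    public static void main(String[] args)
    {
        // An empty buffer has zero capacity and nothing remaining to read.
        System.out.println("capacity=" + EMPTY_BYTE_BUFFER.capacity());
        System.out.println("remaining=" + EMPTY_BYTE_BUFFER.remaining());
    }
}
```

Because the constant is built from a plain array literal, the clientutil jar no longer needs ArrayUtils on its classpath for this class.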



[jira] [Commented] (CASSANDRA-3665) [patch] allow for clientutil.jar to be used without the base cassandra.jar for client applications

2012-04-27 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263719#comment-13263719
 ] 

Eric Evans commented on CASSANDRA-3665:
---

bq. remove the dependency on commons_lang for clientutils for something silly.

This one is applied, thanks.

As for the one meant to support {{UUIDGen.getTimeUUIDBytes()}}, make sure you 
are using {{ant test-clientutil-jar}} when you test.  That target has been 
set up to _only_ include the dependencies we're expecting, and nothing else.  
That test fails with your other patches because, as I mentioned elsewhere, 
{{FBUtilities}} is a quagmire that requires pulling in many new dependencies.

I'm going to go ahead and close this issue, since the original scope has been 
met.  If you come up with a strategy to better support the non-essential parts 
of UUIDGen, or to further disentangle the dependencies, feel free to reopen it, 
or better yet open another, more specific issue.

 [patch] allow for clientutil.jar to be used without the base cassandra.jar 
 for client applications
 --

 Key: CASSANDRA-3665
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3665
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.0.8
Reporter: Dave Brosius
Assignee: Eric Evans
Priority: Minor
 Fix For: 1.0.10

 Attachments: fail_client_utils_test.diff, fix_client_util_jar.diff, 
 remove_commons_lang_dep.diff, 
 v1-0001-CASSANDRA-3665-test-to-expose-missing-dependencies.txt, 
 v1-0002-eliminate-dependency-on-FBUtilities.txt


 clientutil.jar can't be run from a client by itself without the presence of 
 cassandra.jar which seems wrong. Added needed classes to run by itself.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2012-04-27 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263750#comment-13263750
 ] 

Yuki Morishita commented on CASSANDRA-3668:
---

Sylvain, thanks for the review. I will post updates based on your feedback.

Before that, here are the results of my performance test.
I used 1 node for vanilla C* (0.8.10, 1.1.0, 1.1.0-patched) and 1 node for 
sstableloader. I generated the sstables to stream (8 files, 1.6GB total) with 
the stress tool against Cassandra v0.8.10, then ran sstableloader with 
streaming throttling disabled. The same sstable files were used for all tests. 
Each test was run twice and the results were averaged.

|| test case || time to complete (sec) || avg. throughput (MB/s) ||
| v0.8.10| 52.38(*1) | 53 |
| v1.1.0 | 52.30 | 15 |
| v1.1.0-patched n=1 | 52.61 | 15 |
| v1.1.0-patched n=2 | 32.42 | 27 |
| v1.1.0-patched n=4 | 30.80 | 27 |
| v1.1.0-patched n=8 | 33.88 | 25 |

The n value for v1.1.0-patched is the number of threads used.
v0.8.10 sleeps 30 sec to get gossip info, so the time marked (*1) is actually 
22.38 sec.

Even with this feature, loading is not as fast as in 0.8, but it is faster 
than in current versions.

 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.0
Reporter: Manish Zope
Assignee: Yuki Morishita
Priority: Minor
  Labels: streaming
 Fix For: 1.1.1

 Attachments: 0001-Allow-multiple-connection-in-StreamInSession.patch, 
 0002-Allow-concurrent-stream-in-StreamOutSession.patch, 
 0003-Add-threads-option-to-sstableloader.patch, 
 3688-reply_before_closing_writer.txt, sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h

 One of my colleagues reported a bug regarding the degraded performance 
 of the sstable generator and sstable loader.
 ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 
 As stated in the above issue, the generator performance has been fixed, but 
 the performance of sstableloader is still an issue.
 3589 is marked as a duplicate of 3618. Both issues show resolved status, but 
 the problem with sstableloader still exists.
 I am opening another issue so that the sstableloader problem does not go 
 unnoticed.
 FYI: We have tested the generator part with the patch given in 3589. It is 
 working fine.
 Please let us know if you require further input from our side.





[jira] [Commented] (CASSANDRA-2889) Avoids having replicate on write tasks stacking up at CL.ONE

2012-04-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263758#comment-13263758
 ] 

Jonathan Ellis commented on CASSANDRA-2889:
---

Who is going to block if ROW queue fills up?  Read stage?

 Avoids having replicate on write tasks stacking up at CL.ONE
 

 Key: CASSANDRA-2889
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2889
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
  Labels: counters
 Fix For: 1.1.1

 Attachments: 2889.txt


 The counter design involves a read on the first replica during a write. At 
 CL.ONE, this read is not involved in the latency of the operation (the write 
 is acknowledged before). This means it is fairly easy to insert too quickly 
 at CL.ONE and have the replicate on write tasks falling behind. The goal of 
 this ticket is to protect against that.
 An option could be to bound the replicate on write task queue so that writes 
 start to block once we have too many of those in the queue. Another option 
 could be to drop the oldest tasks when they are too old, but that is probably 
 a less safe option.
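
The first option described above (a bounded queue that makes submitters block) can be sketched with plain java.util.concurrent classes. This is only an illustration under stated assumptions, not the attached patch: the class BoundedStageSketch, the helper boundedStage and the sizes used in runTasks are all made up. A stock ThreadPoolExecutor rejects work when its queue is full, so the sketch installs a handler that waits for free space instead.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedStageSketch
{
    // Hypothetical helper (not Cassandra's API): a fixed-size pool whose
    // submitters block once maxTasksBeforeBlock tasks are already queued.
    static ThreadPoolExecutor boundedStage(int numThreads, int maxTasksBeforeBlock)
    {
        RejectedExecutionHandler blockSubmitter = new RejectedExecutionHandler()
        {
            public void rejectedExecution(Runnable task, ThreadPoolExecutor executor)
            {
                if (executor.isShutdown())
                    throw new RejectedExecutionException("executor has shut down");
                try
                {
                    // put() waits for free space, so the submitting thread is throttled.
                    executor.getQueue().put(task);
                }
                catch (InterruptedException e)
                {
                    Thread.currentThread().interrupt();
                    throw new RejectedExecutionException(e);
                }
            }
        };
        return new ThreadPoolExecutor(numThreads, numThreads, 60, TimeUnit.SECONDS,
                                      new LinkedBlockingQueue<Runnable>(maxTasksBeforeBlock),
                                      blockSubmitter);
    }

    // Submits more tasks than the queue can hold and returns how many ran.
    static int runTasks(int taskCount)
    {
        final AtomicInteger completed = new AtomicInteger();
        ThreadPoolExecutor stage = boundedStage(2, 4);
        for (int i = 0; i < taskCount; i++)
            stage.execute(new Runnable() { public void run() { completed.incrementAndGet(); } });
        stage.shutdown();
        try
        {
            stage.awaitTermination(30, TimeUnit.SECONDS);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args)
    {
        System.out.println("completed=" + runTasks(100));
    }
}
```

No task is dropped in this variant; the cost is that whichever thread submits the work stalls while the queue is full.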





[jira] [Commented] (CASSANDRA-2889) Avoids having replicate on write tasks stacking up at CL.ONE

2012-04-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263761#comment-13263761
 ] 

Sylvain Lebresne commented on CASSANDRA-2889:
-

No, the write stage (it's the one pushing the replicate task)






[jira] [Commented] (CASSANDRA-2889) Avoids having replicate on write tasks stacking up at CL.ONE

2012-04-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263765#comment-13263765
 ] 

Jonathan Ellis commented on CASSANDRA-2889:
---

+1






[jira] [Commented] (CASSANDRA-4120) Gossip identifies hosts by UUID

2012-04-27 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263793#comment-13263793
 ] 

Eric Evans commented on CASSANDRA-4120:
---

bq. Throw them away, I would say, since we've done this before when upgrading 
majors.

OK, https://github.com/acunu/cassandra/commit/f7864d and 
https://github.com/acunu/cassandra/commit/397644 should cover that (patch 
output reflects these changes as well).

bq. No. After the big removetoken rewrite, you couldn't removetoken in a mixed 
0.8.2/0.8.3 cluster, so this isn't a big deal, as long as we document it in 
NEWS.

https://github.com/acunu/cassandra/commit/17846f


 Gossip identifies hosts by UUID
 ---

 Key: CASSANDRA-4120
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4120
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Sam Overton
Assignee: Eric Evans
  Labels: virtualnodes, vnodes

 Since there is no longer a one-to-one mapping of host to token, a UUID should 
 be used to identify a host. This impacts:
 * Gossip
 * Hinted Hand-off
 * some JMX operations (eg. assassinateEndpointUnsafe)
 _Edit: Identify host by UUID, not IP_
 _Edit: Added table of patch links._
 h3. Patches
 ||Compare||Raw diff||Description||Last updated||
 |[01-4120|https://github.com/acunu/cassandra/compare/trunk...p/4120/01_create_store_host_uuid]|[01-4120.diff|https://github.com/acunu/cassandra/compare/trunk...p/4120/01_create_store_host_uuid.diff]|Get/set
  host ID (UUID)|2012-04-26|
 |[02-4120|https://github.com/acunu/cassandra/compare/p/4120/01_create_store_host_uuid...p/4120/02_uuid_in_application_status]|[02-4120.diff|https://github.com/acunu/cassandra/compare/p/4120/01_create_store_host_uuid...p/4120/02_uuid_in_application_status.diff]|Gossip
  host ID and maintain cluster mapping|2012-04-26|
 |[03-4120|https://github.com/acunu/cassandra/compare/p/4120/02_uuid_in_application_status...p/4120/03_reference_node_by_hostid]|[03-4120.diff|https://github.com/acunu/cassandra/compare/p/4120/02_uuid_in_application_status...p/4120/03_reference_node_by_hostid.diff]|Store
  hints by host ID|2012-04-26|
 |[04-4120|https://github.com/acunu/cassandra/compare/p/4120/03_reference_node_by_hostid...p/4120/04_remove_node_by_hostid]|[04-4120.diff|https://github.com/acunu/cassandra/compare/p/4120/03_reference_node_by_hostid...p/4120/04_remove_node_by_hostid.diff]|Refactor
  {{SS.removeToken}} to use ID instead of Token|2012-04-26|
 
 _Note: These are branches managed with TopGit. If you are applying the patch 
 output manually, you will either need to filter the TopGit metadata files 
 (i.e. {{wget -O - url | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), 
 or remove them afterward ({{rm .topmsg .topdeps}})._





[jira] [Updated] (CASSANDRA-4120) Gossip identifies hosts by UUID

2012-04-27 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-4120:
--

Description: 
Since there is no longer a one-to-one mapping of host to token, a UUID should 
be used to identify a host. This impacts:

* Gossip
* Hinted Hand-off
* some JMX operations (eg. assassinateEndpointUnsafe)

_Edit: Identify host by UUID, not IP_
_Edit: Added table of patch links._

h3. Patches
||Compare||Raw diff||Description||Last updated||
|[01-4120|https://github.com/acunu/cassandra/compare/trunk...p/4120/01_create_store_host_uuid]|[01-4120.diff|https://github.com/acunu/cassandra/compare/trunk...p/4120/01_create_store_host_uuid.diff]|Get/set
 host ID (UUID)|2012-04-26|
|[02-4120|https://github.com/acunu/cassandra/compare/p/4120/01_create_store_host_uuid...p/4120/02_uuid_in_application_status]|[02-4120.diff|https://github.com/acunu/cassandra/compare/p/4120/01_create_store_host_uuid...p/4120/02_uuid_in_application_status.diff]|Gossip
 host ID and maintain cluster mapping|2012-04-26|
|[03-4120|https://github.com/acunu/cassandra/compare/p/4120/02_uuid_in_application_status...p/4120/03_reference_node_by_hostid]|[03-4120.diff|https://github.com/acunu/cassandra/compare/p/4120/02_uuid_in_application_status...p/4120/03_reference_node_by_hostid.diff]|Store
 hints by host ID|2012-04-27|
|[04-4120|https://github.com/acunu/cassandra/compare/p/4120/03_reference_node_by_hostid...p/4120/04_remove_node_by_hostid]|[04-4120.diff|https://github.com/acunu/cassandra/compare/p/4120/03_reference_node_by_hostid...p/4120/04_remove_node_by_hostid.diff]|Refactor
 {{SS.removeToken}} to use ID instead of Token|2012-04-27|



_Note: These are branches managed with TopGit. If you are applying the patch 
output manually, you will either need to filter the TopGit metadata files (i.e. 
{{wget -O - url | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), or remove 
them afterward ({{rm .topmsg .topdeps}})._

  was:
Since there is no longer a one-to-one mapping of host to token, a UUID should 
be used to identify a host. This impacts:

* Gossip
* Hinted Hand-off
* some JMX operations (eg. assassinateEndpointUnsafe)

_Edit: Identify host by UUID, not IP_
_Edit: Added table of patch links._

h3. Patches
||Compare||Raw diff||Description||Last updated||
|[01-4120|https://github.com/acunu/cassandra/compare/trunk...p/4120/01_create_store_host_uuid]|[01-4120.diff|https://github.com/acunu/cassandra/compare/trunk...p/4120/01_create_store_host_uuid.diff]|Get/set
 host ID (UUID)|2012-04-26|
|[02-4120|https://github.com/acunu/cassandra/compare/p/4120/01_create_store_host_uuid...p/4120/02_uuid_in_application_status]|[02-4120.diff|https://github.com/acunu/cassandra/compare/p/4120/01_create_store_host_uuid...p/4120/02_uuid_in_application_status.diff]|Gossip
 host ID and maintain cluster mapping|2012-04-26|
|[03-4120|https://github.com/acunu/cassandra/compare/p/4120/02_uuid_in_application_status...p/4120/03_reference_node_by_hostid]|[03-4120.diff|https://github.com/acunu/cassandra/compare/p/4120/02_uuid_in_application_status...p/4120/03_reference_node_by_hostid.diff]|Store
 hints by host ID|2012-04-26|
|[04-4120|https://github.com/acunu/cassandra/compare/p/4120/03_reference_node_by_hostid...p/4120/04_remove_node_by_hostid]|[04-4120.diff|https://github.com/acunu/cassandra/compare/p/4120/03_reference_node_by_hostid...p/4120/04_remove_node_by_hostid.diff]|Refactor
 {{SS.removeToken}} to use ID instead of Token|2012-04-26|



_Note: These are branches managed with TopGit. If you are applying the patch 
output manually, you will either need to filter the TopGit metadata files (i.e. 
{{wget -O - url | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), or remove 
them afterward ({{rm .topmsg .topdeps}})._



[jira] [Updated] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

2012-04-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-4142:


Attachment: 4142.txt

Attaching a patch to only open 1 sstableScanner per level during repair. It 
only uses the trick for repair because that is the only place where we open 
lots of sstableScanners, but I suppose we could apply the same to normal 
compaction and slightly lower the memory consumption there too.

Anyway, the patch includes a unit test to exercise the new code.

 OOM Exception during repair session with LeveledCompactionStrategy
 --

 Key: CASSANDRA-4142
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0.0
 Environment: OS: Linux CentOs 6 
 JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
 Node configuration:
 Quad-core
 10 GB RAM
 Xmx set to 2,5 GB (as computed by default).
Reporter: Romain Hardouin
Assignee: Sylvain Lebresne
 Fix For: 1.1.1

 Attachments: 4142.txt


 We encountered an OOM exception on 2 nodes during a repair session.
 Our CFs are set up to use LeveledCompactionStrategy and SnappyCompressor.
 These two options used together may be the key to the problem.
 Despite setting -XX:+HeapDumpOnOutOfMemoryError, no dump was 
 generated.
 Nonetheless, a memory analysis on a live node doing a repair reveals a 
 hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as 
 many objects as there are SSTables on disk. 
 This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore 
 each object is about 140 KB.
 Eclipse Memory Analyzer's dominator tree shows that 99% of a 
 SSTableBoundedScanner object's memory is consumed by a 
 CompressedRandomAccessReader which contains two big byte arrays.
 Cluster information:
 9 nodes
 Each node handles 35 GB (RandomPartitioner)
 This JIRA was created following this discussion:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html
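
The per-object figure quoted in the description can be checked with a few lines of arithmetic. This is a minimal sketch that assumes 1 MB = 1024 KB; the class and method names are made up for illustration.

```java
class ScannerHeapEstimate
{
    // Average heap per scanner in KB, given the total in MB and the object count.
    static double kbPerScanner(double totalMb, int scannerCount)
    {
        return totalMb * 1024 / scannerCount;
    }

    public static void main(String[] args)
    {
        // Figures from the report: 786 MB held by 5757 SSTableBoundedScanner objects.
        System.out.printf("%.1f KB per scanner%n", kbPerScanner(786, 5757));
    }
}
```

Since the cost grows linearly with the number of open scanners, opening fewer of them at once reduces the heap held by that list in proportion.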





git commit: Avoids having replicate on write tasks stacking up at CL.ONE

2012-04-27 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.1 2af8591bd -> 2ca2fb3fd


Avoids having replicate on write tasks stacking up at CL.ONE

patch by slebresne; reviewed by jbellis for CASSANDRA-2889


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2ca2fb3f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2ca2fb3f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2ca2fb3f

Branch: refs/heads/cassandra-1.1
Commit: 2ca2fb3fdc1636e2d3d7feb446f66f6ed8043cf4
Parents: 2af8591
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Fri Apr 27 19:39:31 2012 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Fri Apr 27 19:39:31 2012 +0200

--
 CHANGES.txt|1 +
 .../apache/cassandra/concurrent/StageManager.java  |   14 +-
 2 files changed, 14 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2ca2fb3f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 91a8fbe..918c146 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -19,6 +19,7 @@
  * Move CfDef and KsDef validation out of thrift (CASSANDRA-4037)
  * Expose repairing by a user provided range (CASSANDRA-3912)
  * Add way to force the cassandra-cli to refresh it's schema (CASSANDRA-4052)
+ * Avoids having replicate on write tasks stacking up at CL.ONE (CASSANDRA-2889)
 Merged from 1.0:
  * Fix super columns bug where cache is not updated (CASSANDRA-4190)
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2ca2fb3f/src/java/org/apache/cassandra/concurrent/StageManager.java
--
diff --git a/src/java/org/apache/cassandra/concurrent/StageManager.java b/src/java/org/apache/cassandra/concurrent/StageManager.java
index c57b593..4bcb75d 100644
--- a/src/java/org/apache/cassandra/concurrent/StageManager.java
+++ b/src/java/org/apache/cassandra/concurrent/StageManager.java
@@ -37,13 +37,15 @@ public class StageManager
 
 public static final long KEEPALIVE = 60; // seconds to keep extra threads alive for when idle
 
+public static final int MAX_REPLICATE_ON_WRITE_TASKS = 1024 * Runtime.getRuntime().availableProcessors();
+
 static
 {
 stages.put(Stage.MUTATION, multiThreadedConfigurableStage(Stage.MUTATION, getConcurrentWriters()));
 stages.put(Stage.READ, multiThreadedConfigurableStage(Stage.READ, getConcurrentReaders()));
 stages.put(Stage.REQUEST_RESPONSE, multiThreadedStage(Stage.REQUEST_RESPONSE, Runtime.getRuntime().availableProcessors()));
 stages.put(Stage.INTERNAL_RESPONSE, multiThreadedStage(Stage.INTERNAL_RESPONSE, Runtime.getRuntime().availableProcessors()));
-stages.put(Stage.REPLICATE_ON_WRITE, multiThreadedConfigurableStage(Stage.REPLICATE_ON_WRITE, getConcurrentReplicators()));
+stages.put(Stage.REPLICATE_ON_WRITE, multiThreadedConfigurableStage(Stage.REPLICATE_ON_WRITE, getConcurrentReplicators(), MAX_REPLICATE_ON_WRITE_TASKS));
 // the rest are all single-threaded
 stages.put(Stage.STREAM, new JMXEnabledThreadPoolExecutor(Stage.STREAM));
 stages.put(Stage.GOSSIP, new JMXEnabledThreadPoolExecutor(Stage.GOSSIP));
@@ -73,6 +75,16 @@ public class StageManager
  stage.getJmxType());
 }
 
+private static ThreadPoolExecutor multiThreadedConfigurableStage(Stage stage, int numThreads, int maxTasksBeforeBlock)
+{
+return new JMXConfigurableThreadPoolExecutor(numThreads,
+ KEEPALIVE,
+ TimeUnit.SECONDS,
+ new LinkedBlockingQueue<Runnable>(maxTasksBeforeBlock),
+ new NamedThreadFactory(stage.getJmxName()),
+ stage.getJmxType());
+}
+
 /**
  * Retrieve a stage from the StageManager
  * @param stage name of the stage to be retrieved.



[jira] [Updated] (CASSANDRA-4164) Cqlsh should support DESCRIBE on cql3-style composite CFs

2012-04-27 Thread paul cannon (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

paul cannon updated CASSANDRA-4164:
---

Attachment: (was: 4164.patch.txt)

 Cqlsh should support DESCRIBE on cql3-style composite CFs
 -

 Key: CASSANDRA-4164
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4164
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Nick Bailey
Assignee: paul cannon
  Labels: cql3
 Fix For: 1.1.1

 Attachments: 4164.patch-2.txt


 There is a discrepancy between create column family commands and then the 
 output of the describe command:
 {noformat}
 cqlsh:test> CREATE TABLE timeline (
 ... user_id varchar,
 ... tweet_id uuid,
 ... author varchar,
 ... body varchar,
 ... PRIMARY KEY (user_id, tweet_id)
 ... );
 cqlsh:test> describe columnfamily timeline;
 CREATE COLUMNFAMILY timeline (
   user_id text PRIMARY KEY
 ) WITH
   comment='' AND
   comparator='CompositeType(org.apache.cassandra.db.marshal.UUIDType,org.apache.cassandra.db.marshal.UTF8Type)' AND
   read_repair_chance=0.10 AND
   gc_grace_seconds=864000 AND
   default_validation=text AND
   min_compaction_threshold=4 AND
   max_compaction_threshold=32 AND
   replicate_on_write=True AND
   compaction_strategy_class='SizeTieredCompactionStrategy' AND
   compression_parameters:sstable_compression='org.apache.cassandra.io.compress.SnappyCompressor';
 {noformat}





[jira] [Updated] (CASSANDRA-4164) Cqlsh should support DESCRIBE on cql3-style composite CFs

2012-04-27 Thread paul cannon (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

paul cannon updated CASSANDRA-4164:
---

Attachment: 4164.patch-2.txt

Thanks, Sylvain. Branch is updated; new tag is here:

https://github.com/thepaul/cassandra/tree/pending/4164-2

New comprehensive patch attached.






[jira] [Commented] (CASSANDRA-4173) cqlsh: in cql3 mode, use cql3 quoting when outputting cql

2012-04-27 Thread paul cannon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263872#comment-13263872
 ] 

paul cannon commented on CASSANDRA-4173:


Updating patch against new 4164.patch-2.txt from CASSANDRA-4164.

 cqlsh: in cql3 mode, use cql3 quoting when outputting cql
 -

 Key: CASSANDRA-4173
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4173
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 1.1.0
Reporter: paul cannon
Assignee: paul cannon
Priority: Minor
  Labels: cql3, cqlsh
 Fix For: 1.1.1

 Attachments: 4173.patch.txt


 when cqlsh needs to output a column name or other term which needs quoting 
 (say, if you run DESCRIBE KEYSPACE and some column name has a space in it), 
 it currently only knows how to quote in the cql2 way. That is,
 {noformat}
 cqlsh:foo describe columnfamily bar
 CREATE COLUMNFAMILY bar (
   a int PRIMARY KEY,
   'b c' text
 ) WITH
 ...
 {noformat}
 cql3 does not recognize single quotes around column names, or columnfamily or 
 keyspace names either. cqlsh ought to learn how to use double-quotes instead 
 when in cql3 mode.
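
The quoting rule being asked for can be sketched as follows. cqlsh itself is written in Python, so this Java version is purely illustrative, and it rests on assumptions: the helper name maybeQuoteCql3 is made up, the "safe without quotes" pattern is a guess at a conservative rule, embedded double quotes are assumed to be escaped by doubling, and reserved keywords are not handled.

```java
import java.util.regex.Pattern;

class Cql3QuotingSketch
{
    // Assumption: a name made of lowercase letters, digits and underscores
    // (starting with a letter) can be emitted without quotes.
    private static final Pattern UNQUOTED_SAFE = Pattern.compile("[a-z][a-z0-9_]*");

    // Hypothetical helper: quote a name the cql3 way when it needs quoting.
    static String maybeQuoteCql3(String name)
    {
        if (UNQUOTED_SAFE.matcher(name).matches())
            return name;
        // Double quotes delimit the name; embedded double quotes are doubled.
        return '"' + name.replace("\"", "\"\"") + '"';
    }

    public static void main(String[] args)
    {
        System.out.println(maybeQuoteCql3("a"));
        System.out.println(maybeQuoteCql3("b c"));
    }
}
```

With a rule like this, the column from the example above would be printed as "b c" in double quotes rather than 'b c'.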

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4182) multithreaded compaction very slow with large single data file and a few tiny data files

2012-04-27 Thread Karl Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263907#comment-13263907
 ] 

Karl Mueller commented on CASSANDRA-4182:
-

Yes I figure it's a worst-case scenario pretty much.  I didn't expect it to be 
any faster than single-threaded, possibly a bit slower or taking more CPU.  

However, it's a LOT slower (~80% slower). 

I'd be happy if it were the same speed as the single thread for the worst case 
with more CPU.  
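
To put numbers on the two timings quoted in the issue, here is a small worked calculation. It is a sketch that assumes both runs compacted the same 500 GB and that 1 GB = 1024 MB; the class and method names are made up for illustration.

```java
class CompactionThroughput
{
    // Converts a "451m13.284s" style timing to seconds.
    static double toSeconds(int minutes, double seconds)
    {
        return minutes * 60 + seconds;
    }

    // Average throughput in MB/s for the given data size and elapsed time.
    static double mbPerSec(double gigabytes, double elapsedSeconds)
    {
        return gigabytes * 1024 / elapsedSeconds;
    }

    public static void main(String[] args)
    {
        double single = toSeconds(273, 58.740);  // multithreaded disabled
        double multi = toSeconds(451, 13.284);   // multithreaded enabled
        System.out.printf("single-threaded: %.1f MB/s%n", mbPerSec(500, single));
        System.out.printf("multithreaded:   %.1f MB/s%n", mbPerSec(500, multi));
        System.out.printf("time ratio:      %.2f%n", multi / single);
    }
}
```

Under those assumptions the single-threaded run averages about 31 MB/s, the multithreaded run about 19 MB/s, and the multithreaded run takes roughly 1.65 times as long.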


 multithreaded compaction very slow with large single data file and a few tiny 
 data files
 

 Key: CASSANDRA-4182
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4182
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.9
 Environment: Redhat
 Sun JDK 1.6.0_20-b02
Reporter: Karl Mueller

 Turning on multithreaded compaction makes compaction time take nearly twice 
 as long in our environment, which includes a very large SStable and a few 
 smaller ones, relative to either 0.8.x with MT turned off or 1.0.x with MT 
 turned off.  
 compaction_throughput_mb_per_sec is set to 0.  
 We currently compact about 500 GB of data nightly due to overwrites.  
 (LevelDB will probably be enabled on the busy CFs once 1.0.x is rolled out 
 completely)  The time it takes to do the compaction is:
 451m13.284s (multithreaded)
 273m58.740s (multithreaded disabled)
 Our nodes run on SSDs and therefore have a high read and write rate available 
 to them. The primary CF they're compacting right now, with most of the data, 
 is localized to a very large file (~300+GB) and a few tiny files (1-10GB) 
 since the CF has become far less active.  
 I would expect the multithreaded compaction to be no worse than the single 
 threaded compaction, or perhaps a higher cost in CPU for the same 
 performance, but it's half the speed with the same CPU usage, or more CPU. 
 I have two graphs available from testing 2 or 3 compactions which demonstrate 
 some interesting characteristics.  1.0.9 was installed on the 21st with MT 
 turned on.  Prior stuff is 0.8.7 with MT turned off, but 1.0.9 with MT turned 
 off seems to perform as well as 0.8.7.
 http://www.xney.com/temp/cass-irq.png  (interrupts)
 http://www.xney.com/temp/cass-iostat.png (io bandwidth of disks)
 This demonstrates a large increase in rescheduling interrupts and only half 
 the bandwidth used on the disks.  I suspect this is because some kind of 
 threads are thrashing or something like that.
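
As a rough check on the two timings quoted in the description, the sketch below converts them to seconds and derives the implied throughput and slowdown. It assumes the full 500 GB was processed in both runs and that 1 GB = 1024 MB; neither is stated in the report.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a `time`-style reading such as 451m13.284s to seconds."""
    return minutes * 60 + seconds

DATA_MB = 500 * 1024  # assumption: 500 GB compacted, 1 GB = 1024 MB

multithreaded_s = to_seconds(451, 13.284)
single_threaded_s = to_seconds(273, 58.740)

mt_mb_per_s = DATA_MB / multithreaded_s
st_mb_per_s = DATA_MB / single_threaded_s
slowdown = multithreaded_s / single_threaded_s

print(f"multithreaded:   {mt_mb_per_s:.1f} MB/s")
print(f"single-threaded: {st_mb_per_s:.1f} MB/s")
print(f"multithreaded run takes {slowdown:.2f}x as long")
```

Under these assumptions the single-threaded figure lands close to the roughly 30 MB/s mentioned elsewhere in the thread.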

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[4/6] git commit: Avoids having replicate on write tasks stacking up at CL.ONE

2012-04-27 Thread brandonwilliams
Avoids having replicate on write tasks stacking up at CL.ONE

patch by slebresne; reviewed by jbellis for CASSANDRA-2889


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2ca2fb3f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2ca2fb3f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2ca2fb3f

Branch: refs/heads/trunk
Commit: 2ca2fb3fdc1636e2d3d7feb446f66f6ed8043cf4
Parents: 2af8591
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Fri Apr 27 19:39:31 2012 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Fri Apr 27 19:39:31 2012 +0200

--
 CHANGES.txt|1 +
 .../apache/cassandra/concurrent/StageManager.java  |   14 +-
 2 files changed, 14 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2ca2fb3f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 91a8fbe..918c146 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -19,6 +19,7 @@
  * Move CfDef and KsDef validation out of thrift (CASSANDRA-4037)
  * Expose repairing by a user provided range (CASSANDRA-3912)
  * Add way to force the cassandra-cli to refresh it's schema (CASSANDRA-4052)
+ * Avoids having replicate on write tasks stacking up at CL.ONE (CASSANDRA-2889)
 Merged from 1.0:
  * Fix super columns bug where cache is not updated (CASSANDRA-4190)
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2ca2fb3f/src/java/org/apache/cassandra/concurrent/StageManager.java
--
diff --git a/src/java/org/apache/cassandra/concurrent/StageManager.java b/src/java/org/apache/cassandra/concurrent/StageManager.java
index c57b593..4bcb75d 100644
--- a/src/java/org/apache/cassandra/concurrent/StageManager.java
+++ b/src/java/org/apache/cassandra/concurrent/StageManager.java
@@ -37,13 +37,15 @@ public class StageManager
 
     public static final long KEEPALIVE = 60; // seconds to keep extra threads alive for when idle
 
+    public static final int MAX_REPLICATE_ON_WRITE_TASKS = 1024 * Runtime.getRuntime().availableProcessors();
+
     static
     {
         stages.put(Stage.MUTATION, multiThreadedConfigurableStage(Stage.MUTATION, getConcurrentWriters()));
         stages.put(Stage.READ, multiThreadedConfigurableStage(Stage.READ, getConcurrentReaders()));
         stages.put(Stage.REQUEST_RESPONSE, multiThreadedStage(Stage.REQUEST_RESPONSE, Runtime.getRuntime().availableProcessors()));
         stages.put(Stage.INTERNAL_RESPONSE, multiThreadedStage(Stage.INTERNAL_RESPONSE, Runtime.getRuntime().availableProcessors()));
-        stages.put(Stage.REPLICATE_ON_WRITE, multiThreadedConfigurableStage(Stage.REPLICATE_ON_WRITE, getConcurrentReplicators()));
+        stages.put(Stage.REPLICATE_ON_WRITE, multiThreadedConfigurableStage(Stage.REPLICATE_ON_WRITE, getConcurrentReplicators(), MAX_REPLICATE_ON_WRITE_TASKS));
         // the rest are all single-threaded
         stages.put(Stage.STREAM, new JMXEnabledThreadPoolExecutor(Stage.STREAM));
         stages.put(Stage.GOSSIP, new JMXEnabledThreadPoolExecutor(Stage.GOSSIP));
@@ -73,6 +75,16 @@ public class StageManager
                                                      stage.getJmxType());
     }
 
+    private static ThreadPoolExecutor multiThreadedConfigurableStage(Stage stage, int numThreads, int maxTasksBeforeBlock)
+    {
+        return new JMXConfigurableThreadPoolExecutor(numThreads,
+                                                     KEEPALIVE,
+                                                     TimeUnit.SECONDS,
+                                                     new LinkedBlockingQueue<Runnable>(maxTasksBeforeBlock),
+                                                     new NamedThreadFactory(stage.getJmxName()),
+                                                     stage.getJmxType());
+    }
+
     /**
      * Retrieve a stage from the StageManager
      * @param stage name of the stage to be retrieved.
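
The StageManager change caps the REPLICATE_ON_WRITE queue at 1024 tasks per available processor, so producers hit a limit instead of piling up work without bound. The Python sketch below illustrates the same idea with the standard library's `queue.Queue`. It is an analogy rather than Cassandra's code, and the names `max_tasks_before_block` and `try_submit` are hypothetical.

```python
import os
import queue

# Mirrors MAX_REPLICATE_ON_WRITE_TASKS: 1024 slots per available processor.
max_tasks_before_block = 1024 * (os.cpu_count() or 1)

def try_submit(tasks: queue.Queue, task) -> bool:
    """Enqueue without waiting; report False once the queue is full."""
    try:
        tasks.put_nowait(task)
        return True
    except queue.Full:
        return False

# A tiny capacity makes the limit easy to observe.
small = queue.Queue(maxsize=2)
accepted = [try_submit(small, n) for n in range(3)]
print(accepted)
```

A blocking `tasks.put(task)` would instead stall the producer until a worker frees a slot, which is the behavior the parameter name `maxTasksBeforeBlock` suggests.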



[1/6] git commit: Merge branch 'cassandra-1.1' into trunk

2012-04-27 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.1 2ca2fb3fd -> b81f5723e
  refs/heads/trunk a6c5d3c43 -> 047106291


Merge branch 'cassandra-1.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/04710629
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/04710629
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/04710629

Branch: refs/heads/trunk
Commit: 04710629100510e97a55f40e18bd5cdbe383de58
Parents: a6c5d3c b81f572
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Apr 27 16:07:35 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Apr 27 16:07:35 2012 -0500

--
 CHANGES.txt|1 +
 bin/cqlsh  |  146 ++-
 build.xml  |1 -
 pylib/cqlshlib/cqlhandling.py  |   18 ++-
 .../apache/cassandra/concurrent/StageManager.java  |   14 ++-
 src/java/org/apache/cassandra/cql3/Cql.g   |8 +-
 .../org/apache/cassandra/utils/ByteBufferUtil.java |   10 +-
 7 files changed, 181 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/04710629/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/04710629/build.xml
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/04710629/src/java/org/apache/cassandra/concurrent/StageManager.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/04710629/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
--



[5/6] git commit: eliminate clientutil dependency on commons-lang

2012-04-27 Thread brandonwilliams
eliminate clientutil dependency on commons-lang

Patch by Dave Brosius; reviewed by eevans for CASSANDRA-3665


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2af8591b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2af8591b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2af8591b

Branch: refs/heads/trunk
Commit: 2af8591bd71838b9782c552c6724f1efe1cc836e
Parents: 60aa1d0
Author: Eric Evans eev...@apache.org
Authored: Fri Apr 27 09:40:56 2012 -0500
Committer: Eric Evans eev...@apache.org
Committed: Fri Apr 27 09:44:59 2012 -0500

--
 build.xml  |1 -
 .../org/apache/cassandra/utils/ByteBufferUtil.java |   10 +++---
 2 files changed, 7 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2af8591b/build.xml
--
diff --git a/build.xml b/build.xml
index b7df8d2..1ef21f0 100644
--- a/build.xml
+++ b/build.xml
@@ -1077,7 +1077,6 @@
             </fileset>
             <fileset dir="${build.lib}">
               <include name="**/guava*.jar" />
-              <include name="**/commons-lang*.jar" />
             </fileset>
           </classpath>
         </junit>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2af8591b/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
--
diff --git a/src/java/org/apache/cassandra/utils/ByteBufferUtil.java b/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
index 93f28df..e35e137 100644
--- a/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
+++ b/src/java/org/apache/cassandra/utils/ByteBufferUtil.java
@@ -18,6 +18,12 @@
  */
 package org.apache.cassandra.utils;
 
+/*
+ * BE ADVISED: New imports added here might introduce new dependencies for
+ * the clientutil jar.  If in doubt, run the `ant test-clientutil-jar' target
+ * afterward, and ensure the tests still pass.
+ */
+
 import java.io.*;
 import java.nio.ByteBuffer;
 import java.nio.charset.CharacterCodingException;
@@ -29,8 +35,6 @@ import static com.google.common.base.Charsets.UTF_8;
 import org.apache.cassandra.io.util.FileDataInput;
 import org.apache.cassandra.io.util.FileUtils;
 
-import org.apache.commons.lang.ArrayUtils;
-
 /**
  * Utility methods to make ByteBuffers less painful
  * The following should illustrate the different ways byte buffers can be used
@@ -71,7 +75,7 @@ import org.apache.commons.lang.ArrayUtils;
  */
 public class ByteBufferUtil
 {
-    public static final ByteBuffer EMPTY_BYTE_BUFFER = ByteBuffer.wrap(ArrayUtils.EMPTY_BYTE_ARRAY);
+    public static final ByteBuffer EMPTY_BYTE_BUFFER = ByteBuffer.wrap(new byte[0]);
 
     public static int compareUnsigned(ByteBuffer o1, ByteBuffer o2)
     {



[6/6] git commit: Make identifier and value grammar for CQL3 stricter

2012-04-27 Thread brandonwilliams
Make identifier and value grammar for CQL3 stricter

patch by slebresne; reviewed by jbellis for CASSANDRA-4184


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/60aa1d03
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/60aa1d03
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/60aa1d03

Branch: refs/heads/trunk
Commit: 60aa1d03e424af03537e50d41a6b5dccd03db0b5
Parents: 698a2bb
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Fri Apr 27 15:28:01 2012 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Fri Apr 27 15:28:01 2012 +0200

--
 src/java/org/apache/cassandra/cql3/Cql.g |8 
 1 files changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/60aa1d03/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g
index f1b4718..9051d61 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -410,8 +410,8 @@ truncateStatement returns [TruncateStatement stmt]
 
 // Column Identifiers
 cident returns [ColumnIdentifier id]
-    : t=( IDENT | UUID | INTEGER ) { $id = new ColumnIdentifier($t.text, false); }
-    | t=QUOTED_NAME                { $id = new ColumnIdentifier($t.text, true); }
+    : t=IDENT       { $id = new ColumnIdentifier($t.text, false); }
+    | t=QUOTED_NAME { $id = new ColumnIdentifier($t.text, true); }
     ;
 
 // Keyspace & Column family names
@@ -437,8 +437,8 @@ cidentList returns [List<ColumnIdentifier> items]
 
 // Values (includes prepared statement markers)
 term returns [Term term]
-    : t=(STRING_LITERAL | UUID | IDENT | INTEGER | FLOAT ) { $term = new Term($t.text, $t.type); }
-    | t=QMARK                                              { $term = new Term($t.text, $t.type, ++currentBindMarkerIdx); }
+    : t=(STRING_LITERAL | UUID | INTEGER | FLOAT ) { $term = new Term($t.text, $t.type); }
+    | t=QMARK                                      { $term = new Term($t.text, $t.type, ++currentBindMarkerIdx); }
     ;
 
 intTerm returns [Term integer]



[2/6] git commit: Cqlsh supports DESCRIBE on cql3-style composite CFs. Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4164

2012-04-27 Thread brandonwilliams
Cqlsh supports DESCRIBE on cql3-style composite CFs.
Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4164


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b81f5723
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b81f5723
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b81f5723

Branch: refs/heads/cassandra-1.1
Commit: b81f5723efdf5ed79b6e152087c78e2befc9777f
Parents: 2ca2fb3
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Apr 27 16:04:39 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Apr 27 16:05:42 2012 -0500

--
 bin/cqlsh |  146 ++-
 pylib/cqlshlib/cqlhandling.py |   18 -
 2 files changed, 156 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b81f5723/bin/cqlsh
--
diff --git a/bin/cqlsh b/bin/cqlsh
index c26324a..00c6c0d 100755
--- a/bin/cqlsh
+++ b/bin/cqlsh
@@ -50,6 +50,7 @@ import ConfigParser
 import codecs
 import re
 import platform
+import warnings
 
 # cqlsh should run correctly when run out of a Cassandra source tree,
 # out of an unpacked Cassandra tarball, and after a proper package install.
@@ -57,9 +58,11 @@ cqlshlibdir = 
os.path.join(os.path.dirname(os.path.realpath(__file__)), '..', 'p
 if os.path.isdir(cqlshlibdir):
 sys.path.insert(0, cqlshlibdir)
 
-from cqlshlib import cqlhandling, pylexotron, wcwidth
+from cqlshlib import cqlhandling, cql3handling, pylexotron, wcwidth
 from cqlshlib.cqlhandling import (token_dequote, cql_dequote, cql_escape,
   maybe_cql_escape, cql_typename)
+from cqlshlib.cql3handling import (CqlTableDef, maybe_cql3_escape_name,
+   cql3_escape_value)
 
 try:
 import readline
@@ -167,6 +170,7 @@ cqlhandling.commands_end_with_newline.update((
 'assume',
 'source',
 'capture',
+'debug',
 'exit',
 'quit'
 ))
@@ -180,6 +184,7 @@ cqlhandling.CqlRuleSet.append_rules(r'''
| assumeCommand
| sourceCommand
| captureCommand
+   | debugCommand
| helpCommand
| exitCommand
;
@@ -209,6 +214,9 @@ cqlhandling.CqlRuleSet.append_rules(r'''
 captureCommand ::= CAPTURE ( fname=( stringLiteral | OFF ) )?
;
 
+debugCommand ::= DEBUG
+ ;
+
 helpCommand ::= ( HELP | ? ) [topic]=( identifier | stringLiteral )*
 ;
 
@@ -456,6 +464,16 @@ def format_value(val, casstype, output_encoding, addcolor=False, time_format='',
 
     return FormattedValue(bval, coloredval, displaywidth)
 
+def show_warning_without_quoting_line(message, category, filename, lineno, file=None, line=None):
+    if file is None:
+        file = sys.stderr
+    try:
+        file.write(warnings.formatwarning(message, category, filename, lineno, line=''))
+    except IOError:
+        pass
+warnings.showwarning = show_warning_without_quoting_line
+warnings.filterwarnings('always', category=cql3handling.UnexpectedTableStructure)
+
 class Shell(cmd.Cmd):
 default_prompt  = cqlsh 
 continue_prompt =... 
@@ -589,6 +607,18 @@ class Shell(cmd.Cmd):
 return {'build': 'unknown', 'cql': 'unknown', 'thrift': thrift_ver}
 return vers
 
+    def fetchdict(self):
+        row = self.cursor.fetchone()
+        desc = self.cursor.description
+        return dict(zip([d[0] for d in desc], row))
+
+    def fetchdict_all(self):
+        dicts = []
+        for row in self.cursor:
+            desc = self.cursor.description
+            dicts.append(dict(zip([d[0] for d in desc], row)))
+        return dicts
+
 def get_keyspace_names(self):
 return [k.name for k in self.get_keyspaces()]
 
@@ -674,6 +704,21 @@ class Shell(cmd.Cmd):
 
 # = end thrift-dependent parts =
 
+# = cql3-dependent parts =
+
+def get_columnfamily_layout(self, ksname, cfname):
+self.cursor.execute(select * from system.schema_columnfamilies
+where keyspace=:ks and columnfamily=:cf,
+{'ks': ksname, 'cf': cfname})
+layout = self.fetchdict()
+self.cursor.execute(select * from system.schema_columns
+where keyspace=:ks and columnfamily=:cf,
+{'ks': ksname, 'cf': cfname})
+cols = self.fetchdict_all()
+return CqlTableDef.from_layout(layout, cols)
+
+# = end cql3-dependent parts =
+
 def reset_statement(self):
 self.reset_prompt()
 self.statement.truncate(0)
@@ -1030,12 
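
The `fetchdict` and `fetchdict_all` helpers in this patch pair each row with the column names taken from the DB-API `cursor.description`. The sketch below shows the same pattern as standalone functions against the standard library's `sqlite3` driver, purely as a stand-in for the cqlsh cursor. The table and column names are made up for illustration.

```python
import sqlite3

def fetchdict(cursor):
    """Return the next row as a {column_name: value} dict."""
    row = cursor.fetchone()
    return dict(zip([d[0] for d in cursor.description], row))

def fetchdict_all(cursor):
    """Return every remaining row as a list of dicts."""
    names = [d[0] for d in cursor.description]
    return [dict(zip(names, row)) for row in cursor]

# Hypothetical schema, used only to have something to query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schema_columns (ks TEXT, cf TEXT, col TEXT)")
conn.executemany("INSERT INTO schema_columns VALUES (?, ?, ?)",
                 [("ks1", "users", "name"), ("ks1", "users", "email")])

cur = conn.execute("SELECT ks, cf, col FROM schema_columns ORDER BY col")
first = fetchdict(cur)      # consumes one row
rest = fetchdict_all(cur)   # consumes whatever is left
print(first)
print(rest)
```

Note that `fetchdict` as written fails if no row is available, because `fetchone()` then returns `None`; a caller that cannot rule that out would need to check first.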

[3/6] git commit: Cqlsh supports DESCRIBE on cql3-style composite CFs. Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4164

2012-04-27 Thread brandonwilliams
Cqlsh supports DESCRIBE on cql3-style composite CFs.
Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4164


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b81f5723
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b81f5723
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b81f5723

Branch: refs/heads/trunk
Commit: b81f5723efdf5ed79b6e152087c78e2befc9777f
Parents: 2ca2fb3
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Apr 27 16:04:39 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Apr 27 16:05:42 2012 -0500

--
 bin/cqlsh |  146 ++-
 pylib/cqlshlib/cqlhandling.py |   18 -
 2 files changed, 156 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b81f5723/bin/cqlsh
--
diff --git a/bin/cqlsh b/bin/cqlsh
index c26324a..00c6c0d 100755
--- a/bin/cqlsh
+++ b/bin/cqlsh
@@ -50,6 +50,7 @@ import ConfigParser
 import codecs
 import re
 import platform
+import warnings
 
 # cqlsh should run correctly when run out of a Cassandra source tree,
 # out of an unpacked Cassandra tarball, and after a proper package install.
@@ -57,9 +58,11 @@ cqlshlibdir = 
os.path.join(os.path.dirname(os.path.realpath(__file__)), '..', 'p
 if os.path.isdir(cqlshlibdir):
 sys.path.insert(0, cqlshlibdir)
 
-from cqlshlib import cqlhandling, pylexotron, wcwidth
+from cqlshlib import cqlhandling, cql3handling, pylexotron, wcwidth
 from cqlshlib.cqlhandling import (token_dequote, cql_dequote, cql_escape,
   maybe_cql_escape, cql_typename)
+from cqlshlib.cql3handling import (CqlTableDef, maybe_cql3_escape_name,
+   cql3_escape_value)
 
 try:
 import readline
@@ -167,6 +170,7 @@ cqlhandling.commands_end_with_newline.update((
 'assume',
 'source',
 'capture',
+'debug',
 'exit',
 'quit'
 ))
@@ -180,6 +184,7 @@ cqlhandling.CqlRuleSet.append_rules(r'''
| assumeCommand
| sourceCommand
| captureCommand
+   | debugCommand
| helpCommand
| exitCommand
;
@@ -209,6 +214,9 @@ cqlhandling.CqlRuleSet.append_rules(r'''
 captureCommand ::= CAPTURE ( fname=( stringLiteral | OFF ) )?
;
 
+debugCommand ::= DEBUG
+ ;
+
 helpCommand ::= ( HELP | ? ) [topic]=( identifier | stringLiteral )*
 ;
 
@@ -456,6 +464,16 @@ def format_value(val, casstype, output_encoding, addcolor=False, time_format='',
 
     return FormattedValue(bval, coloredval, displaywidth)
 
+def show_warning_without_quoting_line(message, category, filename, lineno, file=None, line=None):
+    if file is None:
+        file = sys.stderr
+    try:
+        file.write(warnings.formatwarning(message, category, filename, lineno, line=''))
+    except IOError:
+        pass
+warnings.showwarning = show_warning_without_quoting_line
+warnings.filterwarnings('always', category=cql3handling.UnexpectedTableStructure)
+
 class Shell(cmd.Cmd):
 default_prompt  = cqlsh 
 continue_prompt =... 
@@ -589,6 +607,18 @@ class Shell(cmd.Cmd):
 return {'build': 'unknown', 'cql': 'unknown', 'thrift': thrift_ver}
 return vers
 
+    def fetchdict(self):
+        row = self.cursor.fetchone()
+        desc = self.cursor.description
+        return dict(zip([d[0] for d in desc], row))
+
+    def fetchdict_all(self):
+        dicts = []
+        for row in self.cursor:
+            desc = self.cursor.description
+            dicts.append(dict(zip([d[0] for d in desc], row)))
+        return dicts
+
 def get_keyspace_names(self):
 return [k.name for k in self.get_keyspaces()]
 
@@ -674,6 +704,21 @@ class Shell(cmd.Cmd):
 
 # = end thrift-dependent parts =
 
+# = cql3-dependent parts =
+
+def get_columnfamily_layout(self, ksname, cfname):
+self.cursor.execute(select * from system.schema_columnfamilies
+where keyspace=:ks and columnfamily=:cf,
+{'ks': ksname, 'cf': cfname})
+layout = self.fetchdict()
+self.cursor.execute(select * from system.schema_columns
+where keyspace=:ks and columnfamily=:cf,
+{'ks': ksname, 'cf': cfname})
+cols = self.fetchdict_all()
+return CqlTableDef.from_layout(layout, cols)
+
+# = end cql3-dependent parts =
+
 def reset_statement(self):
 self.reset_prompt()
 self.statement.truncate(0)
@@ -1030,12 +1075,45 @@ 

git commit: test-failure-fix-SSTableReader loadNewSSTable

2012-04-27 Thread vijay
Updated Branches:
  refs/heads/trunk 047106291 -> 5fd586424


test-failure-fix-SSTableReader loadNewSSTable


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5fd58642
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5fd58642
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5fd58642

Branch: refs/heads/trunk
Commit: 5fd5864249707e2a88bf27641d40ed675c8c9e14
Parents: 0471062
Author: Vijay Parthasarathy vijay2...@gmail.com
Authored: Fri Apr 27 14:16:41 2012 -0700
Committer: Vijay Parthasarathy vijay2...@gmail.com
Committed: Fri Apr 27 14:16:41 2012 -0700

--
 .../apache/cassandra/io/sstable/SSTableWriter.java |3 ++-
 .../org/apache/cassandra/utils/FBUtilities.java|5 +
 2 files changed, 7 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5fd58642/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
index 1a225e4..d505151 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
@@ -349,10 +349,11 @@ public class SSTableWriter extends SSTable
         try
         {
             // do -Data last because -Data present should mean the sstable was completely renamed before crash
-            // don't rename -Summary component as it is not created yet and created when SSTable is loaded.
             for (Component component : Sets.difference(components, Sets.newHashSet(Component.DATA, Component.SUMMARY)))
                 FBUtilities.renameWithConfirm(tmpdesc.filenameFor(component), newdesc.filenameFor(component));
             FBUtilities.renameWithConfirm(tmpdesc.filenameFor(Component.DATA), newdesc.filenameFor(Component.DATA));
+            // rename it without confirmation because summary can be available for loadNewSSTables but not for closeAndOpenReader
+            FBUtilities.renameWithOutConfirm(tmpdesc.filenameFor(Component.SUMMARY), newdesc.filenameFor(Component.SUMMARY));
         }
         catch (IOException e)
         {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5fd58642/src/java/org/apache/cassandra/utils/FBUtilities.java
--
diff --git a/src/java/org/apache/cassandra/utils/FBUtilities.java b/src/java/org/apache/cassandra/utils/FBUtilities.java
index 16eda56..ac55d08 100644
--- a/src/java/org/apache/cassandra/utils/FBUtilities.java
+++ b/src/java/org/apache/cassandra/utils/FBUtilities.java
@@ -236,6 +236,11 @@ public class FBUtilities
         }
     }
 
+    public static void renameWithOutConfirm(String tmpFilename, String filename) throws IOException
+    {
+        new File(tmpFilename).renameTo(new File(filename));
+    }
+
     public static void serialize(TSerializer serializer, TBase struct, DataOutput out)
     throws IOException
     {
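
The new `renameWithOutConfirm` relies on `File.renameTo` returning `false` instead of throwing when the source file is missing, and it ignores that return value. The Python sketch below contrasts the two behaviors as an illustration only. The function names mirror the Java ones, and the file names are hypothetical.

```python
import os
import tempfile

def rename_with_confirm(src: str, dst: str) -> None:
    """Rename and raise if the rename did not happen."""
    os.rename(src, dst)  # raises OSError on failure

def rename_without_confirm(src: str, dst: str) -> bool:
    """Best-effort rename: failures are tolerated and reported as False."""
    try:
        os.rename(src, dst)
        return True
    except OSError:
        return False

with tempfile.TemporaryDirectory() as d:
    data_tmp = os.path.join(d, "tmp-Data.db")
    open(data_tmp, "w").close()
    rename_with_confirm(data_tmp, os.path.join(d, "Data.db"))

    # No summary file was written in this scenario, so nothing is renamed.
    summary_renamed = rename_without_confirm(os.path.join(d, "tmp-Summary.db"),
                                             os.path.join(d, "Summary.db"))
    data_present = os.path.exists(os.path.join(d, "Data.db"))

print(summary_renamed, data_present)
```

The trade-off is that a best-effort rename also hides genuine failures, such as a permissions problem, so it only suits components that can legitimately be absent.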



[jira] [Updated] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

2012-04-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4142:
--

Attachment: 4142-v2.txt

v2 attached:

- Renames KeyScanner to o.a.c.db.compaction.ICompactionScanner
- Remove AbstractCompactionIterable.getScanners in favor of ACS.getScanners, 
which wires it in to normal compactions as well.  (This lets 
ParallelCompactionIterator be more aggressive about how much memory it lets each 
scanner use, too.)
- Updated LCS.getScanners constructor to use a Multimap, which simplifies 
things a bit.

 OOM Exception during repair session with LeveledCompactionStrategy
 --

 Key: CASSANDRA-4142
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0.0
 Environment: OS: Linux CentOs 6 
 JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
 Node configuration:
 Quad-core
 10 GB RAM
 Xmx set to 2,5 GB (as computed by default).
Reporter: Romain Hardouin
Assignee: Sylvain Lebresne
 Fix For: 1.1.1

 Attachments: 4142-v2.txt, 4142.txt


 We encountered an OOM exception on 2 nodes during a repair session.
 Our CFs are set up to use LeveledCompactionStrategy and SnappyCompressor.
 These two options used together may be the key to the problem.
 Despite setting -XX:+HeapDumpOnOutOfMemoryError, no dump was 
 generated.
 Nonetheless, a memory analysis on a live node doing a repair reveals a 
 hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as 
 many objects as there are SSTables on disk. 
 This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore 
 each object is about 140 KB.
 Eclipse Memory Analyzer's dominator tree shows that 99% of a 
 SSTableBoundedScanner object's memory is consumed by a 
 CompressedRandomAccessReader which contains two big byte arrays.
 Cluster information:
 9 nodes
 Each node handles 35 GB (RandomPartitioner)
 This JIRA was created following this discussion:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html
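
The per-object figure in the report can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the description (786 MB, 5757 scanners, 2.5 GB heap) and assumes binary units (1 MB = 1024 KB, 1 GB = 1024 MB), which the report does not specify.

```python
total_mb = 786        # heap held by the ArrayList, from the report
scanner_count = 5757  # SSTableBoundedScanner objects, from the report
heap_mb = 2.5 * 1024  # Xmx of 2.5 GB, from the environment section

per_scanner_kb = total_mb * 1024 / scanner_count
heap_fraction = total_mb / heap_mb

print(f"{per_scanner_kb:.1f} KB per scanner")
print(f"{heap_fraction:.0%} of the configured heap")
```

Because the cost scales with the number of SSTables on disk, the scanner list alone takes a large share of a heap this size.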





[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2012-04-27 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-3668:
--

Attachment: (was: 
0002-Allow-concurrent-stream-in-StreamOutSession.patch)

 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.0
Reporter: Manish Zope
Assignee: Yuki Morishita
Priority: Minor
  Labels: streaming
 Fix For: 1.1.1

 Attachments: 3668-1.1.txt, 3688-reply_before_closing_writer.txt, 
 sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h

 One of my colleagues reported a bug regarding the degraded performance 
 of the sstable generator and sstable loader.
 ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589 
 As stated in the above issue, generator performance has been rectified, but 
 performance of the sstableloader is still an issue.
 3589 is marked as a duplicate of 3618. Both issues show resolved status, but the 
 problem with sstableloader still exists.
 So I am opening another issue so that the sstableloader problem does not go unnoticed.
 FYI: We have tested the generator part with the patch given in 3589. It is 
 working fine.
 Please let us know if you require further input from our side.





[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2012-04-27 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-3668:
--

Attachment: (was: 0003-Add-threads-option-to-sstableloader.patch)





[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2012-04-27 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-3668:
--

Attachment: (was: 
0001-Allow-multiple-connection-in-StreamInSession.patch)





[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2012-04-27 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-3668:
--

Attachment: 3668-1.1.txt

I rebased and attached patch. (also pushed to 
https://github.com/yukim/cassandra/tree/3668-2).

bq. I don't think the 'retries' field should be moved from StreamInSession to 
IncomingStreamReader. The ISR is attached to the streaming of one file, so when 
the node on the other side does a retry, a new ISR will be created (while the 
StreamInSession persists until closed), and so the retries field will be reset 
and we will retry infinitely. But it probably does mean it'll make more sense 
to have one retry counter per-file (changing StreamInSession.currentFiles to a 
Map<String, Integer> maybe?)

I modified to have per session retries, since we are closing session after 
retries exceed the limit.

bq. Currently, every messages related to the streaming goes through the ISR 
socket except for the message of session failure (in 
StreamInSession.closeInternal).

I moved SESSION_FAILURE to ISR.

bq. StreamOutSession.begin() shouldn't assume the queue has at least one 
element (at least the current code don't expect it)

Fixed.

bq. MessagingService: Looking at ThreadPoolExecutor code it seems that this is 
harmless but bumping the executor core pool size would make it exceed the max 
pool size.

Agree. I changed max pool size to Integer.MAX_VALUE. I also changed the name of 
variable streamingThreadsPerNode to streamExecutorCoreSize and set its default 
value to 0, in order to preserve current behavior.

bq. IncomingTcpConnection: sets the name when starting streaming but doesn't 
unset it. Probably not worth changing the name, the thread name is probably not 
the best place anyway to log activity (but it's a good idea to set a meaningful 
name in the first place).

For now I reverted this change. It just felt weird to see IncomingTCPConn 
threads with a default name (like 'Thread-10') while OutboundTCPConn threads 
get a name like 'WRITE-192.168.1.141'.

bq. In FileStreamTask.runMayThrow, instead of logging an error, we could wrap 
the IOException in a new one with the message. That way we wouldn't have 2 
messages in the log for the same error.

You are right. Fixed.

bq. Would feel more coherent to me to move the sending of the SESSION_FINISHED 
message in ISR (for instance SIS.closeIfFinished could return a boolean if the 
session is indeed finished (and we would just have to pass the file name to 
this method instead of the ISR instance)).

Moved to ISR.

bq. Not this patch's fault, but in SIS.getIncomingFiles(), adding the 
currentFiles is useless

This was actually necessary because pending files in the 'files' field do not 
get updated. But having a separate field for currently streaming files seems 
redundant, so in this version I replace the pending file in 'files' with the 
one actually being streamed.

bq. In StreamOutSession.begin() there is a commented debug line (could be 
removed)

Removed.

 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.0
Reporter: Manish Zope
Assignee: Yuki Morishita
Priority: Minor
  Labels: streaming
 Fix For: 1.1.1

 Attachments: 3668-1.1.txt, 3688-reply_before_closing_writer.txt, 
 sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h

 One of my colleagues reported a bug regarding the degraded performance 
 of the sstable generator and sstable loader.
 ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589 
 As stated in that issue, the generator performance has been fixed, but the 
 performance of sstableloader is still a problem.
 3589 is marked as a duplicate of 3618. Both issues show resolved status, but 
 the problem with sstableloader still exists.
 So I am opening another issue so that the sstableloader problem does not go 
 unnoticed.
 FYI: We have tested the generator part with the patch given in 3589. It is 
 working fine.
 Please let us know if you require further input from our side.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-4173) cqlsh: in cql3 mode, use cql3 quoting when outputting cql

2012-04-27 Thread paul cannon (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

paul cannon updated CASSANDRA-4173:
---

Attachment: 4173-2.patch.txt

New patch attached, github branch rebased, new tag at:

https://github.com/thepaul/cassandra/tree/pending/4173-2

 cqlsh: in cql3 mode, use cql3 quoting when outputting cql
 -

 Key: CASSANDRA-4173
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4173
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 1.1.0
Reporter: paul cannon
Assignee: paul cannon
Priority: Minor
  Labels: cql3, cqlsh
 Fix For: 1.1.1

 Attachments: 4173-2.patch.txt, 4173.patch.txt


 when cqlsh needs to output a column name or other term which needs quoting 
 (say, if you run DESCRIBE KEYSPACE and some column name has a space in it), 
 it currently only knows how to quote in the cql2 way. That is,
 {noformat}
 cqlsh:foo> describe columnfamily bar
 CREATE COLUMNFAMILY bar (
   a int PRIMARY KEY,
   'b c' text
 ) WITH
 ...
 {noformat}
 cql3 does not recognize single quotes around column names, or columnfamily or 
 keyspace names either. cqlsh ought to learn how to use double-quotes instead 
 when in cql3 mode.
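
The difference between the two quoting styles can be sketched in Python as follows. This is only an illustration under stated assumptions, not the cqlsh implementation: the function names are hypothetical, and the sketch ignores reserved keywords, which would also need quoting.

```python
import re

# Hypothetical helpers for illustration; not the cqlsh implementation.
_PLAIN_NAME = re.compile(r'^[a-z][0-9a-z_]*$', re.I)


def quote_name_cql2(name):
    """CQL2-style: wrap in single quotes and double any embedded single quote."""
    return "'%s'" % name.replace("'", "''")


def quote_name_cql3(name):
    """CQL3-style: wrap in double quotes and double any embedded double quote."""
    return '"%s"' % name.replace('"', '""')


def maybe_quote_name(name, cql3=True):
    """Leave simple identifiers bare; quote everything else in the chosen style."""
    if _PLAIN_NAME.match(name):
        return name
    return quote_name_cql3(name) if cql3 else quote_name_cql2(name)


print(maybe_quote_name('a'))
print(maybe_quote_name('b c'))
print(maybe_quote_name('b c', cql3=False))
```

In cql3 mode the column from the example above would be emitted with double quotes around b c instead of single quotes.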





[jira] [Commented] (CASSANDRA-2393) Ring changes should stream from the replicas that are losing responsibility

2012-04-27 Thread paul cannon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264024#comment-13264024
 ] 

paul cannon commented on CASSANDRA-2393:


Just came across this ticket by accident. Seems to be pretty well superseded by 
#2434; should it stick around?

 Ring changes should stream from the replicas that are losing responsibility
 ---

 Key: CASSANDRA-2393
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2393
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Stu Hood
Priority: Minor

 During a bootstrap or move operation, the replica that is leaving the replica 
 set should be the preferred node to stream data to the node that is joining 
 the replica set. Currently, an effectively random node will be chosen.
 Since the data held by the leaving node may not agree with the other 
 replicas, it is important for consistency that it be the one that streams.





git commit: Add missing file

2012-04-27 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.1 b81f5723e -> c5ad288c9


Add missing file


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c5ad288c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c5ad288c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c5ad288c

Branch: refs/heads/cassandra-1.1
Commit: c5ad288c9119e4eda9e60fbb6d998e6a5fbd716b
Parents: b81f572
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Apr 27 17:14:33 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Apr 27 17:14:33 2012 -0500

--
 pylib/cqlshlib/cql3handling.py |  210 +++
 1 files changed, 210 insertions(+), 0 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c5ad288c/pylib/cqlshlib/cql3handling.py
--
diff --git a/pylib/cqlshlib/cql3handling.py b/pylib/cqlshlib/cql3handling.py
new file mode 100644
index 0000000..9d27f44
--- /dev/null
+++ b/pylib/cqlshlib/cql3handling.py
@@ -0,0 +1,210 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import re
+from warnings import warn
+from .cqlhandling import cql_typename, cql_escape
+
+try:
+    import json
+except ImportError:
+    import simplejson as json
+
+class UnexpectedTableStructure(UserWarning):
+    def __init__(self, msg):
+        self.msg = msg
+
+    def __str__(self):
+        return 'Unexpected table structure; may not translate correctly to CQL. ' + self.msg
+
+keywords = set((
+    'select', 'from', 'where', 'and', 'key', 'insert', 'update', 'with',
+    'limit', 'using', 'consistency', 'one', 'quorum', 'all', 'any',
+    'local_quorum', 'each_quorum', 'two', 'three', 'use', 'count', 'set',
+    'begin', 'apply', 'batch', 'truncate', 'delete', 'in', 'create',
+    'keyspace', 'schema', 'columnfamily', 'table', 'index', 'on', 'drop',
+    'primary', 'into', 'values', 'timestamp', 'ttl', 'alter', 'add', 'type',
+    'compact', 'storage', 'order', 'by', 'asc', 'desc'
+))
+
+columnfamily_options = (
+    'comment',
+    'bloom_filter_fp_chance',
+    'caching',
+    'read_repair_chance',
+    # 'local_read_repair_chance',   -- not yet a valid cql option
+    'gc_grace_seconds',
+    'min_compaction_threshold',
+    'max_compaction_threshold',
+    'replicate_on_write',
+    'compaction_strategy_class',
+)
+
+columnfamily_map_options = (
+    ('compaction_strategy_options',
+        ()),
+    ('compression_parameters',
+        ('sstable_compression', 'chunk_length_kb', 'crc_check_chance')),
+)
+
+def cql3_escape_value(value):
+    return cql_escape(value)
+
+def cql3_escape_name(name):
+    return '"%s"' % name.replace('"', '""')
+
+valid_cql3_word_re = re.compile(r'^[a-z][0-9a-z_]*$', re.I)
+
+def is_valid_cql3_name(s):
+    return valid_cql3_word_re.match(s) is not None and s not in keywords
+
+def maybe_cql3_escape_name(name):
+    if is_valid_cql3_name(name):
+        return name
+    return cql3_escape_name(name)
+
+class CqlColumnDef:
+    index_name = None
+
+    def __init__(self, name, cqltype):
+        self.name = name
+        self.cqltype = cqltype
+
+    @classmethod
+    def from_layout(cls, layout):
+        c = cls(layout[u'column'], cql_typename(layout[u'validator']))
+        c.index_name = layout[u'index_name']
+        return c
+
+    def __str__(self):
+        indexstr = ' (index %s)' % self.index_name if self.index_name is not None else ''
+        return '<CqlColumnDef %r %r%s>' % (self.name, self.cqltype, indexstr)
+    __repr__ = __str__
+
+class CqlTableDef:
+    json_attrs = ('column_aliases', 'compaction_strategy_options', 'compression_parameters')
+    composite_type_name = 'org.apache.cassandra.db.marshal.CompositeType'
+    colname_type_name = 'org.apache.cassandra.db.marshal.UTF8Type'
+    column_class = CqlColumnDef
+    compact_storage = False
+
+    key_components = ()
+    columns = ()
+
+    def __init__(self, name):
+        self.name = name
+
+    @classmethod
+    def from_layout(cls, layout, coldefs):
+  

git commit: Add missing file

2012-04-27 Thread brandonwilliams
Updated Branches:
  refs/heads/trunk 5fd586424 -> c6f95117d


Add missing file


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c6f95117
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c6f95117
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c6f95117

Branch: refs/heads/trunk
Commit: c6f95117d486795050bbfc2886be0cc645784587
Parents: 5fd5864
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Apr 27 17:14:33 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Apr 27 17:20:12 2012 -0500

--
 pylib/cqlshlib/cql3handling.py |  210 +++
 1 files changed, 210 insertions(+), 0 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c6f95117/pylib/cqlshlib/cql3handling.py
--
diff --git a/pylib/cqlshlib/cql3handling.py b/pylib/cqlshlib/cql3handling.py
new file mode 100644
index 0000000..9d27f44
--- /dev/null
+++ b/pylib/cqlshlib/cql3handling.py
@@ -0,0 +1,210 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import re
+from warnings import warn
+from .cqlhandling import cql_typename, cql_escape
+
+try:
+    import json
+except ImportError:
+    import simplejson as json
+
+class UnexpectedTableStructure(UserWarning):
+    def __init__(self, msg):
+        self.msg = msg
+
+    def __str__(self):
+        return 'Unexpected table structure; may not translate correctly to CQL. ' + self.msg
+
+keywords = set((
+    'select', 'from', 'where', 'and', 'key', 'insert', 'update', 'with',
+    'limit', 'using', 'consistency', 'one', 'quorum', 'all', 'any',
+    'local_quorum', 'each_quorum', 'two', 'three', 'use', 'count', 'set',
+    'begin', 'apply', 'batch', 'truncate', 'delete', 'in', 'create',
+    'keyspace', 'schema', 'columnfamily', 'table', 'index', 'on', 'drop',
+    'primary', 'into', 'values', 'timestamp', 'ttl', 'alter', 'add', 'type',
+    'compact', 'storage', 'order', 'by', 'asc', 'desc'
+))
+
+columnfamily_options = (
+    'comment',
+    'bloom_filter_fp_chance',
+    'caching',
+    'read_repair_chance',
+    # 'local_read_repair_chance',   -- not yet a valid cql option
+    'gc_grace_seconds',
+    'min_compaction_threshold',
+    'max_compaction_threshold',
+    'replicate_on_write',
+    'compaction_strategy_class',
+)
+
+columnfamily_map_options = (
+    ('compaction_strategy_options',
+        ()),
+    ('compression_parameters',
+        ('sstable_compression', 'chunk_length_kb', 'crc_check_chance')),
+)
+
+def cql3_escape_value(value):
+    return cql_escape(value)
+
+def cql3_escape_name(name):
+    return '"%s"' % name.replace('"', '""')
+
+valid_cql3_word_re = re.compile(r'^[a-z][0-9a-z_]*$', re.I)
+
+def is_valid_cql3_name(s):
+    return valid_cql3_word_re.match(s) is not None and s not in keywords
+
+def maybe_cql3_escape_name(name):
+    if is_valid_cql3_name(name):
+        return name
+    return cql3_escape_name(name)
+
+class CqlColumnDef:
+    index_name = None
+
+    def __init__(self, name, cqltype):
+        self.name = name
+        self.cqltype = cqltype
+
+    @classmethod
+    def from_layout(cls, layout):
+        c = cls(layout[u'column'], cql_typename(layout[u'validator']))
+        c.index_name = layout[u'index_name']
+        return c
+
+    def __str__(self):
+        indexstr = ' (index %s)' % self.index_name if self.index_name is not None else ''
+        return '<CqlColumnDef %r %r%s>' % (self.name, self.cqltype, indexstr)
+    __repr__ = __str__
+
+class CqlTableDef:
+    json_attrs = ('column_aliases', 'compaction_strategy_options', 'compression_parameters')
+    composite_type_name = 'org.apache.cassandra.db.marshal.CompositeType'
+    colname_type_name = 'org.apache.cassandra.db.marshal.UTF8Type'
+    column_class = CqlColumnDef
+    compact_storage = False
+
+    key_components = ()
+    columns = ()
+
+    def __init__(self, name):
+        self.name = name
+
+    @classmethod
+    def from_layout(cls, layout, coldefs):
+        cf = 

[jira] [Commented] (CASSANDRA-4138) Add varint encoding to Serializing Cache

2012-04-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264084#comment-13264084
 ] 

Jonathan Ellis commented on CASSANDRA-4138:
---

It's not immediately clear to me what the changes in ByteBufferUtil are doing 
-- EDOS doesn't change writeByte, so what is breaking?  Is this backwards 
compatible?

Have you done any smoke tests to see what kind of savings you get on typical 
cached data?  In other words: is our intuition correct that this is worth the 
extra complexity?

 Add varint encoding to Serializing Cache
 

 Key: CASSANDRA-4138
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4138
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Affects Versions: 1.2
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-4138-Take1.patch, 
 0001-CASSANDRA-4138-V2.patch, 0001-CASSANDRA-4138-v4.patch, 
 0002-sizeof-changes-on-rest-of-the-code.patch, CASSANDRA-4138-v3.patch








[jira] [Commented] (CASSANDRA-4138) Add varint encoding to Serializing Cache

2012-04-27 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264108#comment-13264108
 ] 

Pavel Yaskevich commented on CASSANDRA-4138:


bq. It's not immediately clear to me what the changes in ByteBufferUtil are 
doing – EDOS doesn't change writeByte so what is breaking? is this backwards 
compatible?

I can explain the ByteBufferUtil changes: instead of writing the short manually 
(with code copied from DO), it now calls the appropriate DataOutput method, 
which handles encoding and writing the short. EDOS doesn't really need to 
change the way we write bytes; this is only about encoding integer types 
compactly. All legacy tests pass, and the change doesn't seem to touch the code 
dramatically enough to break it, since it's all about SerializingCache...

bq. Have you done any smoke tests to see what kind of savings you get on 
typical cached data? In other words: is our intuition correct that this is 
worth the extra complexity?

This is a question for Vijay so I won't interfere; I just want to note that he 
pointed out in the first comment that this saves ~10% of memory compared to 
normal DIS.





[jira] [Commented] (CASSANDRA-4138) Add varint encoding to Serializing Cache

2012-04-27 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264110#comment-13264110
 ] 

Vijay commented on CASSANDRA-4138:
--

 Have you done any smoke tests to see what kind of savings you get on 
 typical cached data?
Yes, I do see a 10% gain in space.

  is our intuition correct that this is worth the extra complexity?
For the cache itself, the only added complexity is using EDOS instead of DOS... 
the rest is generic, to support messaging and SSTable formats.






[jira] [Commented] (CASSANDRA-4138) Add varint encoding to Serializing Cache

2012-04-27 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264115#comment-13264115
 ] 

Vijay commented on CASSANDRA-4138:
--

In addition to Pavel's comment, the reason for the change is to make 
writeWithShortLength write a varint when using EDOS (which will use one byte 
instead of two in most cases).
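
As a rough illustration of why a varint length usually takes one byte, here is a small Python sketch. It uses a generic 7-bits-per-byte unsigned varint, which is an assumption made for the example and not necessarily the encoding EDOS uses; the function names are hypothetical.

```python
import struct


def encode_unsigned_varint(value):
    """Encode a non-negative int with 7 payload bits per byte (LEB128-style)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_unsigned_varint(data):
    """Decode one varint from the start of data; return (value, bytes_consumed)."""
    value = 0
    shift = 0
    for i, byte in enumerate(bytearray(data)):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")


fixed = struct.pack(">H", 12)         # fixed-width 2-byte length prefix
compact = encode_unsigned_varint(12)  # same length as a varint
print(len(fixed), len(compact))
```

In this scheme any length below 128 fits in a single byte, so short names and values save one byte each compared with a fixed two-byte prefix; larger lengths cost two or more bytes.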





[jira] [Commented] (CASSANDRA-4138) Add varint encoding to Serializing Cache

2012-04-27 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264118#comment-13264118
 ] 

Pavel Yaskevich commented on CASSANDRA-4138:


Yeah, that is the clarification I intended to give for why it is needed, but I 
somehow missed it :(





[jira] [Commented] (CASSANDRA-4138) Add varint encoding to Serializing Cache

2012-04-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264191#comment-13264191
 ] 

Jonathan Ellis commented on CASSANDRA-4138:
---

LGTM, +1.





[2/2] Add varint encoding to Serializing Cache patch by Vijay; reviewed by jbellis,xedin for CASSANDRA-4138

2012-04-27 Thread vijay
http://git-wip-us.apache.org/repos/asf/cassandra/blob/cb25a8fc/test/unit/org/apache/cassandra/utils/EncodedStreamsTest.java
--
diff --git a/test/unit/org/apache/cassandra/utils/EncodedStreamsTest.java 
b/test/unit/org/apache/cassandra/utils/EncodedStreamsTest.java
new file mode 100644
index 0000000..1907c83
--- /dev/null
+++ b/test/unit/org/apache/cassandra/utils/EncodedStreamsTest.java
@@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.utils;
+
+import static org.apache.cassandra.Util.*;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+
+import org.apache.cassandra.SchemaLoader;
+import org.apache.cassandra.db.ColumnFamily;
+import org.apache.cassandra.db.DBTypeSizes;
+import org.apache.cassandra.utils.vint.EncodedDataInputStream;
+import org.apache.cassandra.utils.vint.EncodedDataOutputStream;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+public class EncodedStreamsTest extends SchemaLoader
+{
+    private String tableName = "Keyspace1";
+    private String standardCFName = "Standard1";
+    private String counterCFName = "Counter1";
+    private String superCFName = "Super1";
+
+    @Test
+    public void testStreams() throws IOException
+    {
+        ByteArrayOutputStream byteArrayOStream1 = new ByteArrayOutputStream();
+        EncodedDataOutputStream odos = new EncodedDataOutputStream(byteArrayOStream1);
+
+        ByteArrayOutputStream byteArrayOStream2 = new ByteArrayOutputStream();
+        DataOutputStream dos = new DataOutputStream(byteArrayOStream2);
+
+        for (short i = 0; i < 10000; i++)
+        {
+            dos.writeShort(i);
+            odos.writeShort(i);
+        }
+        dos.flush();
+        odos.flush();
+
+        for (int i = Short.MAX_VALUE; i < ((int)Short.MAX_VALUE + 10000); i++)
+        {
+            dos.writeInt(i);
+            odos.writeInt(i);
+        }
+        dos.flush();
+        odos.flush();
+
+        for (long i = Integer.MAX_VALUE; i < ((long)Integer.MAX_VALUE + 10000);i++)
+        {
+            dos.writeLong(i);
+            odos.writeLong(i);
+        }
+        dos.flush();
+        odos.flush();
+        Assert.assertTrue(byteArrayOStream1.size() < byteArrayOStream2.size());
+
+        ByteArrayInputStream byteArrayIStream1 = new ByteArrayInputStream(byteArrayOStream1.toByteArray());
+        EncodedDataInputStream idis = new EncodedDataInputStream(new DataInputStream(byteArrayIStream1));
+
+        // assert reading Short
+        for (int i = 0; i < 10000; i++)
+            Assert.assertEquals(i, idis.readShort());
+
+        // assert reading Integer
+        for (int i = Short.MAX_VALUE; i < ((int)Short.MAX_VALUE + 10000); i++)
+            Assert.assertEquals(i, idis.readInt());
+
+        // assert reading Long
+        for (long i = Integer.MAX_VALUE; i < ((long)Integer.MAX_VALUE) + 1000; i++)
+            Assert.assertEquals(i, idis.readLong());
+    }
+
+    private ColumnFamily createCF()
+    {
+        ColumnFamily cf = ColumnFamily.create(tableName, standardCFName);
+        cf.addColumn(column("vijay", "try", 1));
+        cf.addColumn(column("to", "be_nice", 1));
+        return cf;
+    }
+
+    private ColumnFamily createCounterCF()
+    {
+        ColumnFamily cf = ColumnFamily.create(tableName, counterCFName);
+        cf.addColumn(counterColumn("vijay", 1L, 1));
+        cf.addColumn(counterColumn("wants", 100, 1));
+        return cf;
+    }
+
+    private ColumnFamily createSuperCF()
+    {
+        ColumnFamily cf = ColumnFamily.create(tableName, superCFName);
+        cf.addColumn(superColumn(cf, "Avatar", column("$2,782,275,172", "2009", 1)));
+        cf.addColumn(superColumn(cf, "Titanic", column("$1,925,905,151", "1997", 1)));
+        return cf;
+    }
+
+    @Test
+    public void testCFSerialization() throws IOException
+    {
+        ByteArrayOutputStream byteArrayOStream1 = new ByteArrayOutputStream();
+        EncodedDataOutputStream odos = new EncodedDataOutputStream(byteArrayOStream1);
+