[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2012-01-18 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188336#comment-13188336
 ] 

Sylvain Lebresne commented on CASSANDRA-3634:
-

bq. It can all be done on the client side if you have the full current schema 
available which, of course, is doable but expensive (in time) to get in place.

I think we could send enough info with the CqlPreparedResult, i.e., replace the 
count by a list of types, like what we do for CqlResult. That would be simpler 
for drivers than keeping the full schema somewhere and probably having to parse 
the initial prepared query to figure out what each marker corresponds to in the 
schema.

There would be the slight issue of someone changing the validation of a given 
value between preparation and execution, but I don't think it's a big deal at 
all to say that you'll have to re-prepare queries if you do that (how often do 
you actually change a value validation function anyway, and even if you do so, 
you'd better change it for something that is compatible with the previous type 
for CQL, so in fact most changes would not be a problem).
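
To illustrate the idea, here is a minimal sketch of what a driver could do if the prepare call returned one type name per bind marker. The class name MarkerTypeBinder, the type-name strings, and the encoding rules are assumptions made for this example; they are not the actual Thrift API or Cassandra's marshalling code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver-side helper: not part of Cassandra or any real driver.
class MarkerTypeBinder
{
    // Serialize one value according to the type name reported for its marker.
    static ByteBuffer serialize(String typeName, Object value)
    {
        switch (typeName)
        {
            case "UTF8Type":
            {
                if (!(value instanceof String))
                    throw new IllegalArgumentException("expected String for " + typeName);
                return ByteBuffer.wrap(((String) value).getBytes(StandardCharsets.UTF_8));
            }
            case "LongType":
            {
                if (!(value instanceof Long))
                    throw new IllegalArgumentException("expected Long for " + typeName);
                ByteBuffer bb = ByteBuffer.allocate(8);
                bb.putLong((Long) value);
                bb.flip(); // make the 8 written bytes readable
                return bb;
            }
            default:
                throw new IllegalArgumentException("unsupported type " + typeName);
        }
    }

    // Check the parameter count and types on the client, then pre-serialize.
    static List<ByteBuffer> bind(List<String> markerTypes, List<Object> values)
    {
        if (markerTypes.size() != values.size())
            throw new IllegalArgumentException("expected " + markerTypes.size() + " values");
        List<ByteBuffer> out = new ArrayList<>(values.size());
        for (int i = 0; i < values.size(); i++)
            out.add(serialize(markerTypes.get(i), values.get(i)));
        return out;
    }
}
```

With a type list in hand, a mismatch such as passing a String for a LongType marker can be rejected before anything is sent to the server, without the driver tracking the schema.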

 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (CASSANDRA-3750) Migrations and Schema CFs use disk space proportional to the square of the number of CFs

2012-01-18 Thread Sylvain Lebresne (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-3750.
-

Resolution: Duplicate

While it is not yet committed, CASSANDRA-1391 will almost surely fix that, so 
marking that one as duplicate.

 Migrations and Schema CFs use disk space proportional to the square of the 
 number of CFs
 

 Key: CASSANDRA-3750
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3750
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.1
 Environment: Linux (CentOS 5.7)
Reporter: John Chakerian
 Attachments: fit.png


 The system keyspace grows proportional to the square of the number of CFs 
 (more likely, it grows quadratically with # of schema changes in general). 
 The major offenders in the keyspace are the Migrations table and the Schema 
 table. On clusters with very large #s of CFs (in the low thousands), we think 
 that these large system tables may be contributing to various performance 
 issues.
 The approximate expression is: s = 0.0003253*n^2 + 2.58, where n is # of 
 keyspaces + # of schemas and s is the size of the system keyspace in 
 megabytes. See attached plot of the regression curve showing fit. 
 Sampled data: 
 {noformat}
 NUM_CFS SYSTEM_SIZE_IN_MB
 100 4.4
 200 15
 300 32
 400 55
 500 85
 600 120
 700 162
 800 211
 900 266
 1000 327
 {noformat}
 This was hit in 1.0.1, but is almost certainly not version specific. 
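
As a rough check of the quoted fit, the sketch below evaluates s = 0.0003253*n^2 + 2.58 at the sampled values of n and compares the result with the measured sizes. The class and method names are hypothetical; the coefficients and data points are copied from the report above.

```java
// Sketch only: class and method names are hypothetical, not Cassandra code.
class SystemKeyspaceFit
{
    // Sampled data copied from the report (NUM_CFS, SYSTEM_SIZE_IN_MB).
    static final int[] NUM_CFS = {100, 200, 300, 400, 500, 600, 700, 800, 900, 1000};
    static final double[] MEASURED_MB = {4.4, 15, 32, 55, 85, 120, 162, 211, 266, 327};

    // The fitted model from the report, in megabytes.
    static double predictedMb(int n)
    {
        return 0.0003253 * n * n + 2.58;
    }

    // Largest absolute difference between the model and the sampled data.
    static double maxAbsResidualMb()
    {
        double worst = 0;
        for (int i = 0; i < NUM_CFS.length; i++)
            worst = Math.max(worst, Math.abs(predictedMb(NUM_CFS[i]) - MEASURED_MB[i]));
        return worst;
    }

    public static void main(String[] args)
    {
        for (int i = 0; i < NUM_CFS.length; i++)
            System.out.printf("%5d measured=%7.1f predicted=%7.1f%n",
                              NUM_CFS[i], MEASURED_MB[i], predictedMb(NUM_CFS[i]));
        System.out.printf("max |residual| = %.2f MB%n", maxAbsResidualMb());
    }
}
```

For the ten sampled points the model stays within about 1.5 MB of the measurements, with the largest gap at n = 100.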





[jira] [Updated] (CASSANDRA-3545) Fix very low Secondary Index performance

2012-01-18 Thread Sylvain Lebresne (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-3545:


Fix Version/s: (was: 1.0.6)
   1.1

 Fix very low Secondary Index performance
 

 Key: CASSANDRA-3545
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3545
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Evgeny Ryabitskiy
Assignee: Sylvain Lebresne
 Fix For: 1.1

 Attachments: 0001-3545.patch, 0002-cleanup.patch


 While performing an index search plus value filtering over a large index row 
 (~100k keys per index value) in chunks of 512-1024 keys, search time is about 
 8-12 seconds, which is very slow.
 After profiling I got this picture:
 60% of search time is spent calculating MD5 hashes with MessageDigest (of 
 course, this is because of RandomPartitioner).
 33% of search time (half of all MD5 hash calculation time) is spent computing 
 MD5 twice to compare two row keys while rotating the index row to 
 startKey (when performing the search query for the next chunk).
 I see several possible performance improvements:
 1) Use a good algorithm to search for startKey in the sorted collection, one 
 that is faster than iterating over all keys. This solution comes first because 
 it is simple, needs only local code changes, and should solve the problem 
 (speeding up search several times over).
 2) Don't calculate the MD5 hash for startKey every time. It's optimal to 
 compute it once (so search will be twice as fast).
 This also needs only local code changes.
 3) Think about something faster than MD5 for hashing (like 
 TigerRandomPartitioner with Tiger/128 hash).
 This needs research, and maybe this research has already been done.
 4) Don't use Tokens (with MD5 hash for RandomPartitioner) for comparing and 
 sorting keys in index rows. In index rows, keys can be stored and compared 
 with a simple byte comparator. 
 This solution requires huge code changes.
 I'm going to start with the first solution. Further improvements can be done 
 in follow-up tickets.
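
Improvements 1) and 2) can be sketched together as follows: hash startKey exactly once, then binary-search a sorted array of tokens instead of scanning it. This is a minimal illustration under stated assumptions, not the attached patch. StartKeySearch and its methods are hypothetical names, keys are modeled as strings, and the token is taken to be the absolute value of the MD5 digest read as a BigInteger.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch; not the code from 0001-3545.patch.
class StartKeySearch
{
    // Assumed token function: |MD5(key)| as a BigInteger.
    static BigInteger md5Token(String key)
    {
        try
        {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return new BigInteger(md.digest(key.getBytes(StandardCharsets.UTF_8))).abs();
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new IllegalStateException(e);
        }
    }

    // sortedTokens must be ascending and free of duplicates.
    // Returns the index of the first token >= token(startKey),
    // or sortedTokens.length if every token is smaller.
    static int firstAtOrAfter(BigInteger[] sortedTokens, String startKey)
    {
        BigInteger start = md5Token(startKey); // hashed once, not once per comparison
        int idx = Arrays.binarySearch(sortedTokens, start);
        return idx >= 0 ? idx : -(idx + 1);
    }
}
```

The search needs O(log n) comparisons of already computed tokens, so locating the start of the next chunk no longer costs an MD5 computation per visited key.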





[jira] [Commented] (CASSANDRA-3743) Lower memory consumption used by index sampling

2012-01-18 Thread Radim Kolar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188349#comment-13188349
 ] 

Radim Kolar commented on CASSANDRA-3743:


I am working on it now

 Lower memory consumption used by index sampling
 ---

 Key: CASSANDRA-3743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3743
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0.6
Reporter: Radim Kolar

 Currently j.o.a.c.io.sstable.IndexSummary is implemented as an ArrayList of 
 KeyPosition (RowPosition key, long offset). I propose to change it to:
 RowPosition keys[]
 long offsets[]
 and use a standard binary search on it. This will lower the number of Java 
 objects used per entry from 2 (KeyPosition + RowPosition) to 1 (RowPosition).
 To build these arrays, the convenient ArrayList class can be used, followed by 
 a call to .toArray() on it.
 This is very important because index sampling uses a lot of memory on nodes 
 with billions of rows.
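
The proposed layout can be sketched as two parallel arrays searched with Arrays.binarySearch. This is an illustration only, not the attached patch: ArrayIndexSummary and its Builder are hypothetical names, and String stands in for RowPosition so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the parallel-array layout; String replaces RowPosition.
class ArrayIndexSummary
{
    private final String[] keys;  // stand-in for RowPosition keys[]
    private final long[] offsets; // long offsets[], one primitive per entry

    private ArrayIndexSummary(String[] keys, long[] offsets)
    {
        this.keys = keys;
        this.offsets = offsets;
    }

    // Offset of the greatest sampled key <= key, or -1 if key sorts before all samples.
    long floorOffset(String key)
    {
        int idx = Arrays.binarySearch(keys, key);
        if (idx < 0)
            idx = -(idx + 1) - 1; // insertion point minus one
        return idx < 0 ? -1 : offsets[idx];
    }

    // Collects samples in ArrayLists, then copies them into arrays once.
    static class Builder
    {
        private final List<String> keys = new ArrayList<>();
        private final List<Long> offsets = new ArrayList<>();

        void add(String key, long offset)
        {
            if (!keys.isEmpty() && keys.get(keys.size() - 1).compareTo(key) >= 0)
                throw new IllegalArgumentException("keys must be added in ascending order");
            keys.add(key);
            offsets.add(offset);
        }

        ArrayIndexSummary build()
        {
            long[] off = new long[offsets.size()];
            for (int i = 0; i < off.length; i++)
                off[i] = offsets.get(i);
            return new ArrayIndexSummary(keys.toArray(new String[0]), off);
        }
    }
}
```

After build() returns, each entry costs one key object plus eight bytes in the long[] array; the per-entry wrapper object is gone, and the boxed values used while building become garbage.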





[jira] [Created] (CASSANDRA-3752) bulk loader no longer finds sstables

2012-01-18 Thread Brandon Williams (Created) (JIRA)
bulk loader no longer finds sstables


 Key: CASSANDRA-3752
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3752
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1
Reporter: Brandon Williams
 Fix For: 1.1


It looks like CASSANDRA-2749 broke it:

{noformat}
 WARN 13:02:20,107 Invalid file 'Standard1' in data directory 
/var/lib/cassandra/data/Keyspace1.
{noformat}





[jira] [Updated] (CASSANDRA-3743) Lower memory consumption used by index sampling

2012-01-18 Thread Radim Kolar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated CASSANDRA-3743:
---

Attachment: cassandra-3743.txt

 Lower memory consumption used by index sampling
 ---

 Key: CASSANDRA-3743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3743
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0.6
Reporter: Radim Kolar
  Labels: optimization
 Fix For: 1.0.8

 Attachments: cassandra-3743.txt


 Currently j.o.a.c.io.sstable.IndexSummary is implemented as an ArrayList of 
 KeyPosition (RowPosition key, long offset). I propose to change it to:
 RowPosition keys[]
 long offsets[]
 and use a standard binary search on it. This will lower the number of Java 
 objects used per entry from 2 (KeyPosition + RowPosition) to 1 (RowPosition).
 To build these arrays, the convenient ArrayList class can be used, followed by 
 a call to .toArray() on it.
 This is very important because index sampling uses a lot of memory on nodes 
 with billions of rows.





[Cassandra Wiki] Update of CassandraLimitations by JonathanEllis

2012-01-18 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The CassandraLimitations page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/CassandraLimitations?action=diff&rev1=28&rev2=29

  
  == Stuff that isn't likely to change ==
   * All data for a single row must fit (on disk) on a single machine in the 
cluster. Because row keys alone are used to determine the nodes responsible for 
replicating their data, the amount of data associated with a single key has 
this upper bound.
-  * A single column value may not be larger than 2GB.
+  * A single column value may not be larger than 2GB.  (However, large values 
are read into memory when requested, so in practice a small number of MB is 
more appropriate.)
   * The maximum number of columns per row is 2 billion.
   * The key (and column names) must be under 64K bytes.
  


[1/3] git commit: change bind parms from string to bytes

2012-01-18 Thread eevans
Updated Branches:
  refs/heads/trunk 0456b7eb2 -> 7c92fc52e


change bind parms from string to bytes

Patch by eevans; reviewed by Rick Shaw for CASSANDRA-3634


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7c92fc52
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7c92fc52
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7c92fc52

Branch: refs/heads/trunk
Commit: 7c92fc52ec9aebd1906441a676cec28aa8c07967
Parents: ce29659
Author: Eric Evans eev...@sym-link.com
Authored: Thu Dec 15 09:33:42 2011 -0600
Committer: Eric Evans eev...@apache.org
Committed: Wed Jan 18 09:00:17 2012 -0600

--
 .../apache/cassandra/cql/AbstractModification.java |5 ++-
 .../org/apache/cassandra/cql/BatchStatement.java   |3 +-
 .../cassandra/cql/CreateColumnFamilyStatement.java |4 +-
 .../org/apache/cassandra/cql/DeleteStatement.java  |6 ++--
 .../org/apache/cassandra/cql/QueryProcessor.java   |   22 +++---
 src/java/org/apache/cassandra/cql/Term.java|4 +-
 .../org/apache/cassandra/cql/UpdateStatement.java  |6 ++--
 .../apache/cassandra/thrift/CassandraServer.java   |2 +-
 8 files changed, 27 insertions(+), 25 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7c92fc52/src/java/org/apache/cassandra/cql/AbstractModification.java
--
diff --git a/src/java/org/apache/cassandra/cql/AbstractModification.java 
b/src/java/org/apache/cassandra/cql/AbstractModification.java
index 38f323b..3a0b8cb 100644
--- a/src/java/org/apache/cassandra/cql/AbstractModification.java
+++ b/src/java/org/apache/cassandra/cql/AbstractModification.java
@@ -20,6 +20,7 @@
  */
 package org.apache.cassandra.cql;
 
+import java.nio.ByteBuffer;
 import java.util.List;
 
 import org.apache.cassandra.db.IMutation;
@@ -103,7 +104,7 @@ public abstract class AbstractModification
  *
  * @throws InvalidRequestException on the wrong request
  */
-public abstract List<IMutation> prepareRowMutations(String keyspace, ClientState clientState, List<String> variables)
+public abstract List<IMutation> prepareRowMutations(String keyspace, ClientState clientState, List<ByteBuffer> variables)
 throws org.apache.cassandra.thrift.InvalidRequestException;
 
 /**
@@ -117,6 +118,6 @@ public abstract class AbstractModification
  *
  * @throws InvalidRequestException on the wrong request
  */
-public abstract List<IMutation> prepareRowMutations(String keyspace, ClientState clientState, Long timestamp, List<String> variables)
+public abstract List<IMutation> prepareRowMutations(String keyspace, ClientState clientState, Long timestamp, List<ByteBuffer> variables)
 throws org.apache.cassandra.thrift.InvalidRequestException;
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7c92fc52/src/java/org/apache/cassandra/cql/BatchStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql/BatchStatement.java 
b/src/java/org/apache/cassandra/cql/BatchStatement.java
index 650b53d..2781833 100644
--- a/src/java/org/apache/cassandra/cql/BatchStatement.java
+++ b/src/java/org/apache/cassandra/cql/BatchStatement.java
@@ -20,6 +20,7 @@
  */
 package org.apache.cassandra.cql;
 
+import java.nio.ByteBuffer;
 import java.util.LinkedList;
 import java.util.List;
 
@@ -76,7 +77,7 @@ public class BatchStatement
 return timeToLive;
 }
 
-public List<IMutation> getMutations(String keyspace, ClientState clientState, List<String> variables)
+public List<IMutation> getMutations(String keyspace, ClientState clientState, List<ByteBuffer> variables)
 throws InvalidRequestException
 {
 List<IMutation> batch = new LinkedList<IMutation>();

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7c92fc52/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java 
b/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
index 0f371f7..93b8331 100644
--- a/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
+++ b/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
@@ -55,7 +55,7 @@ public class CreateColumnFamilyStatement
 }
 
 /** Perform validation of parsed params */
-private void validate(List<String> variables) throws InvalidRequestException
+private void validate(List<ByteBuffer> variables) throws InvalidRequestException
 {
 cfProps.validate();
 
@@ -164,7 +164,7 @@ public class CreateColumnFamilyStatement
  * @return a CFMetaData instance corresponding to the values 

[2/3] git commit: generated thrift code

2012-01-18 Thread eevans
generated thrift code

Patch by eevans; reviewed by Rick Shaw for CASSANDRA-3634


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ce29659a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ce29659a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ce29659a

Branch: refs/heads/trunk
Commit: ce29659ae29f8449023a27b6d80f1034767d302c
Parents: 0456b7e
Author: Eric Evans eev...@sym-link.com
Authored: Thu Dec 15 09:21:35 2011 -0600
Committer: Eric Evans eev...@apache.org
Committed: Wed Jan 18 08:59:54 2012 -0600

--
 interface/cassandra.thrift |2 +-
 .../org/apache/cassandra/thrift/Cassandra.java | 9070 +++
 2 files changed, 4496 insertions(+), 4576 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ce29659a/interface/cassandra.thrift
--
diff --git a/interface/cassandra.thrift b/interface/cassandra.thrift
index a35387b..a0298e5 100644
--- a/interface/cassandra.thrift
+++ b/interface/cassandra.thrift
@@ -709,7 +709,7 @@ service Cassandra {
* Executes a prepared CQL (Cassandra Query Language) statement by passing 
an id token and  a list of variables
* to bind and returns a CqlResult containing the results.
*/
-  CqlResult execute_prepared_cql_query(1:required i32 itemId, 2:required list<string> values)
+  CqlResult execute_prepared_cql_query(1:required i32 itemId, 2:required list<binary> values)
 throws (1:InvalidRequestException ire,
 2:UnavailableException ue,
 3:TimedOutException te,



[jira] [Resolved] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2012-01-18 Thread Eric Evans (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans resolved CASSANDRA-3634.
---

Resolution: Fixed

committed (7c92fc52)

 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.





[1/2] git commit: fix merge error; recompile Cassandra.java with thrift 0.7.0

2012-01-18 Thread eevans
Updated Branches:
  refs/heads/trunk 7c92fc52e -> bce44ff32


fix merge error; recompile Cassandra.java with thrift 0.7.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bce44ff3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bce44ff3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bce44ff3

Branch: refs/heads/trunk
Commit: bce44ff3297c36260a2adabea5f77c6b0b9dfe92
Parents: 7c92fc5
Author: Eric Evans eev...@apache.org
Authored: Wed Jan 18 10:54:33 2012 -0600
Committer: Eric Evans eev...@apache.org
Committed: Wed Jan 18 10:54:33 2012 -0600

--
 .../org/apache/cassandra/thrift/Cassandra.java | 9036 ---
 1 files changed, 4558 insertions(+), 4478 deletions(-)
--




[Cassandra Wiki] Update of Committers by JonathanEllis

2012-01-18 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The Committers page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/Committers?action=diff&rev1=18&rev2=19

Comment:
add Aaron

  ||Sylvain Lebresne||Mar 2011||Datastax||PMC member, Release manager||
  ||Pavel Yaskevich||Aug 2011||Datastax|| ||
  ||Vijay Parthasarathy||Jan 2012||Netflix|| ||
+ ||Aaron Morton||Jan 2012||Independent|| ||
  


[jira] [Commented] (CASSANDRA-3507) Proposal: separate cqlsh from CQL drivers

2012-01-18 Thread paul cannon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188586#comment-13188586
 ] 

paul cannon commented on CASSANDRA-3507:


Whatever we're going to do here for 1.1, we probably want to get started. Is 
there any further input? In particular, will C* lose points if it gets 
distributed (as a tarball, at least) without any client software?

 Proposal: separate cqlsh from CQL drivers
 -

 Key: CASSANDRA-3507
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3507
 Project: Cassandra
  Issue Type: Improvement
  Components: Packaging, Tools
Affects Versions: 1.0.3
 Environment: Debian-based systems
Reporter: paul cannon
Assignee: paul cannon
Priority: Minor
  Labels: cql, cqlsh
 Fix For: 1.1


 Whereas:
 * It has been shown to be very desirable to decouple the release cycles of 
 Cassandra from the various client CQL drivers, and
 * It is also desirable to include a good interactive CQL client with releases 
 of Cassandra, and
 * It is not desirable for Cassandra releases to depend on 3rd-party software 
 which is neither bundled with Cassandra nor readily available for every 
 target platform, but
 * Any good interactive CQL client will require a CQL driver;
 Therefore, be it resolved that:
 * cqlsh will not use an official or supported CQL driver, but will include 
 its own private CQL driver, not intended for use by anything else, and
 * the Cassandra project will still recommend installing and using a proper 
 CQL driver for client software.
 To ease maintenance, the private CQL driver included with cqlsh may very well 
 be created by copying the python CQL driver from one directory into 
 another, but the user shouldn't rely on this. Maybe we even ought to take 
 some minor steps to discourage its use for other purposes.
 Thoughts?





[jira] [Created] (CASSANDRA-3753) Update CqlPreparedResult to provide type information

2012-01-18 Thread Jonathan Ellis (Created) (JIRA)
Update CqlPreparedResult to provide type information


 Key: CASSANDRA-3753
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3753
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Affects Versions: 1.1
Reporter: Jonathan Ellis
Priority: Critical
 Fix For: 1.1


As discussed on CASSANDRA-3634, adding type information to a prepared statement 
would allow more client-side error checking.





[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2012-01-18 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188623#comment-13188623
 ] 

Jonathan Ellis commented on CASSANDRA-3634:
---

bq. I think we could send enough info with the CqlPreparedResult, i.e, replace 
the count by a list of types

Created CASSANDRA-3753 to follow up on that.

 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.





[jira] [Updated] (CASSANDRA-3740) While using BulkOutputFormat unneccessarily look for the cassandra.yaml file.

2012-01-18 Thread Brandon Williams (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-3740:


Attachment: 0003-use-output-partitioner.txt
0002-Prevent-loading-from-yaml.txt
0001-Make-DD-the-canonical-partitioner-source.txt

 While using BulkOutputFormat  unneccessarily look for the cassandra.yaml file.
 --

 Key: CASSANDRA-3740
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3740
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.1
Reporter: Samarth Gahire
Assignee: Brandon Williams
  Labels: cassandra, hadoop, mapreduce
 Fix For: 1.1

 Attachments: 0001-Make-DD-the-canonical-partitioner-source.txt, 
 0002-Prevent-loading-from-yaml.txt, 0003-use-output-partitioner.txt


 I am trying to use BulkOutputFormat to stream data from the map phase of a 
 Hadoop job. I have set the Cassandra-related configuration using ConfigHelper. 
 I have also looked into the Cassandra code, and it seems Cassandra has taken 
 care that it should not look for the cassandra.yaml file.
 But when I run the job I still get the following error:
 {
 12/01/13 11:30:04 WARN mapred.JobClient: Use GenericOptionsParser for parsing 
 the arguments. Applications should implement Tool for the same.
 12/01/13 11:30:04 INFO input.FileInputFormat: Total input paths to process : 1
 12/01/13 11:30:04 INFO mapred.JobClient: Running job: job_201201130910_0015
 12/01/13 11:30:05 INFO mapred.JobClient:  map 0% reduce 0%
 12/01/13 11:30:23 INFO mapred.JobClient: Task Id : 
 attempt_201201130910_0015_m_00_0, Status : FAILED
 java.lang.Throwable: Child Error
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.io.IOException: Task process exit with nonzero status of 1.
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
 attempt_201201130910_0015_m_00_0: Cannot locate cassandra.yaml
 attempt_201201130910_0015_m_00_0: Fatal configuration error; unable to 
 start server.
 }
 Also, please let me know how I can make this cassandra.yaml file available to 
 the Hadoop MapReduce job.





[jira] [Updated] (CASSANDRA-3749) Allow rangeSlice queries to be start/end inclusive/exclusive

2012-01-18 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3749:
--

Attachment: 3749-comments.txt

Even after updating the comments to make things a bit more clear (attached), 
I'm still confused by the remainder/split dance.  For instance, if we are 
splitting a Bounds on bounds.right, that means that remainder overlaps Bounds 
entirely and so we should add that to the ranges, but instead we skip it.

 Allow rangeSlice queries to be start/end inclusive/exclusive 
 -

 Key: CASSANDRA-3749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3749
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 1.1

 Attachments: 3749-comments.txt, 3749.patch


 Currently, given two keys k1 and k2, we can only do a rangeSlice on the 
 intervals (k1, k2] (Range) and [k1, k2] (Bounds). CQL works around this 
 manually, by querying one more row if the start is exclusive and removing 
 the start/end post-query if necessary. However, this doesn't work with the new 
 option introduced by CASSANDRA-3742. So this ticket proposes to add support 
 (internally) for doing a rangeSlice for the intervals (k1, k2) and [k1, k2).
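
The four interval shapes discussed above differ only in whether each endpoint is included. A minimal sketch, assuming a hypothetical KeyInterval class with one inclusivity flag per endpoint (Cassandra's own Range and Bounds classes are structured differently, and ring wrap-around is not modeled here):

```java
// Hypothetical illustration; Cassandra's own Range and Bounds classes differ.
class KeyInterval<T extends Comparable<T>>
{
    final T left;
    final T right;
    final boolean leftInclusive;
    final boolean rightInclusive;

    KeyInterval(T left, T right, boolean leftInclusive, boolean rightInclusive)
    {
        this.left = left;
        this.right = right;
        this.leftInclusive = leftInclusive;
        this.rightInclusive = rightInclusive;
    }

    // Assumes left <= right; wrapping around the ring is deliberately not handled.
    boolean contains(T key)
    {
        int cl = key.compareTo(left);
        int cr = key.compareTo(right);
        boolean afterLeft = leftInclusive ? cl >= 0 : cl > 0;
        boolean beforeRight = rightInclusive ? cr <= 0 : cr < 0;
        return afterLeft && beforeRight;
    }
}
```

In this model, (k1, k2] is `new KeyInterval<>(k1, k2, false, true)` and [k1, k2] is `new KeyInterval<>(k1, k2, true, true)`; the two new shapes, (k1, k2) and [k1, k2), are the remaining flag combinations.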





[jira] [Updated] (CASSANDRA-3743) Lower memory consumption used by index sampling

2012-01-18 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3743:
--

 Reviewer: yukim
Affects Version/s: (was: 1.0.6)
   1.0.0
Fix Version/s: (was: 1.0.8)
   1.1
 Assignee: Radim Kolar

 Lower memory consumption used by index sampling
 ---

 Key: CASSANDRA-3743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3743
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0.0
Reporter: Radim Kolar
Assignee: Radim Kolar
  Labels: optimization
 Fix For: 1.1

 Attachments: cassandra-3743.txt


 Currently j.o.a.c.io.sstable.IndexSummary is implemented as an ArrayList of 
 KeyPosition (RowPosition key, long offset). I propose to change it to:
 RowPosition keys[]
 long offsets[]
 and use a standard binary search on it. This will lower the number of Java 
 objects used per entry from 2 (KeyPosition + RowPosition) to 1 (RowPosition).
 To build these arrays, the convenient ArrayList class can be used, followed by 
 a call to .toArray() on it.
 This is very important because index sampling uses a lot of memory on nodes 
 with billions of rows.





[jira] [Updated] (CASSANDRA-3744) Nodetool.bat double quotes classpath

2012-01-18 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3744:
--

Attachment: 3744-v2.txt

I see what you mean.  v2 attached to generalize fix to other .bats.

 Nodetool.bat double quotes classpath
 

 Key: CASSANDRA-3744
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3744
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Windows
Reporter: Nick Bailey
Assignee: Nick Bailey
Priority: Minor
 Fix For: 1.0.8

 Attachments: 0001-Don-t-double-quote-classpath.patch, 3744-v2.txt


 Windows sucks and double quoting things breaks stuff.





[jira] [Commented] (CASSANDRA-3736) -Dreplace_token leaves old node (IP) in the gossip with the token.

2012-01-18 Thread Jackson Chung (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188817#comment-13188817
 ] 

Jackson Chung commented on CASSANDRA-3736:
--

Looks like the patch from CASSANDRA-3747 fixed this.

The replacement node still logs this once:
 INFO [GossipStage:1] 2012-01-18 23:45:56,412 Gossiper.java (line 834) Node 
/50.56.58.55 is now part of the cluster
 INFO [GossipStage:1] 2012-01-18 23:45:56,412 Gossiper.java (line 800) 
InetAddress /50.56.58.55 is now UP
 INFO [GossipStage:1] 2012-01-18 23:45:56,413 StorageService.java (line 1016) 
Nodes /50.56.58.55 and action-quick2/50.56.31.186 have the same token 
85070591730234615865843651857942052864.  Ignoring /50.56.58.55
 INFO [GossipTasks:1] 2012-01-18 23:46:05,805 Gossiper.java (line 814) 
InetAddress /50.56.58.55 is now dead.
 INFO [GossipTasks:1] 2012-01-18 23:46:26,819 Gossiper.java (line 628) 
FatClient /50.56.58.55 has been silent for 3ms, removing from gossip

but it's quiet after that.

The other node also receives the same info:

 INFO [GossipTasks:1] 2012-01-18 23:45:57,486 Gossiper.java (line 628) 
FatClient /50.56.58.55 has been silent for 3ms, removing from gossip

and the gossipinfo output of those nodes matches:


$ ./bin/nodetool -h 50.56.31.186 gossipinfo
/50.56.59.68
  RELEASE_VERSION:1.0.7-SNAPSHOT
  LOAD:6820.0
  RPC_ADDRESS:50.56.59.68
  STATUS:NORMAL,0
  SCHEMA:--1000--
action-quick2/50.56.31.186
  RELEASE_VERSION:1.0.7-SNAPSHOT
  RPC_ADDRESS:50.56.31.186
  STATUS:NORMAL,85070591730234615865843651857942052864
  LOAD:11372.0
  SCHEMA:--1000--

$ ./bin/nodetool -h 50.56.59.68 gossipinfo
action-quick/50.56.59.68
  SCHEMA:--1000--
  RELEASE_VERSION:1.0.7-SNAPSHOT
  LOAD:6820.0
  RPC_ADDRESS:50.56.59.68
  STATUS:NORMAL,0
/50.56.31.186
  SCHEMA:--1000--
  RELEASE_VERSION:1.0.7-SNAPSHOT
  LOAD:11372.0
  RPC_ADDRESS:50.56.31.186
  STATUS:NORMAL,85070591730234615865843651857942052864


 -Dreplace_token leaves old node (IP) in the gossip with the token.
 --

 Key: CASSANDRA-3736
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3736
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Jackson Chung
Assignee: Vijay
 Fix For: 1.0.8

 Attachments: 0001-CASSANDRA-3736.patch


 https://issues.apache.org/jira/browse/CASSANDRA-957 introduced 
 -Dreplace_token;
 however, the replaced IP keeps showing up in the Gossiper when starting 
 the replacement node:
 {noformat}
  INFO [Thread-2] 2012-01-12 23:59:35,162 CassandraDaemon.java (line 213) 
 Listening for thrift clients...
  INFO [GossipStage:1] 2012-01-12 23:59:35,173 Gossiper.java (line 836) Node 
 /50.56.59.68 has restarted, now UP
  INFO [GossipStage:1] 2012-01-12 23:59:35,174 Gossiper.java (line 804) 
 InetAddress /50.56.59.68 is now UP
  INFO [GossipStage:1] 2012-01-12 23:59:35,175 StorageService.java (line 988) 
 Node /50.56.59.68 state jump to normal
  INFO [GossipStage:1] 2012-01-12 23:59:35,176 Gossiper.java (line 836) Node 
 /50.56.58.55 has restarted, now UP
  INFO [GossipStage:1] 2012-01-12 23:59:35,176 Gossiper.java (line 804) 
 InetAddress /50.56.58.55 is now UP
  INFO [GossipStage:1] 2012-01-12 23:59:35,177 StorageService.java (line 1016) 
 Nodes /50.56.58.55 and action-quick2/50.56.31.186 have the same token 
 85070591730234615865843651857942052864.  Ignoring /50.56.58.55
  INFO [GossipTasks:1] 2012-01-12 23:59:45,048 Gossiper.java (line 818) 
 InetAddress /50.56.58.55 is now dead.
  INFO [GossipTasks:1] 2012-01-13 00:00:06,062 Gossiper.java (line 632) 
 FatClient /50.56.58.55 has been silent for 3ms, removing from gossip
  INFO [GossipStage:1] 2012-01-13 00:01:06,320 Gossiper.java (line 838) Node 
 /50.56.58.55 is now part of the cluster
  INFO [GossipStage:1] 2012-01-13 00:01:06,320 Gossiper.java (line 804) 
 InetAddress /50.56.58.55 is now UP
  INFO [GossipStage:1] 2012-01-13 00:01:06,321 StorageService.java (line 1016) 
 Nodes /50.56.58.55 and action-quick2/50.56.31.186 have the same token 
 85070591730234615865843651857942052864.  Ignoring /50.56.58.55
  INFO [GossipTasks:1] 2012-01-13 00:01:16,106 Gossiper.java (line 818) 
 InetAddress /50.56.58.55 is now dead.
  INFO [GossipTasks:1] 2012-01-13 00:01:37,121 Gossiper.java (line 632) 
 FatClient /50.56.58.55 has been silent for 3ms, removing from gossip
  INFO [GossipStage:1] 2012-01-13 00:02:37,352 Gossiper.java (line 838) Node 
 /50.56.58.55 is now part of the cluster
  INFO [GossipStage:1] 2012-01-13 00:02:37,353 Gossiper.java (line 804) 
 InetAddress /50.56.58.55 is now UP
  INFO [GossipStage:1] 2012-01-13 00:02:37,353 StorageService.java (line 

[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2012-01-18 Thread Yuki Morishita (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-3668:
--

Attachment: 0003-Add-threads-option-to-sstableloader.patch
0002-Allow-concurrent-stream-in-StreamOutSession.patch
0001-Allow-multiple-connection-in-StreamInSession.patch

The attached patches add a threads option (-t) to sstableloader. The option lets 
you configure the number of threads per destination.

I made the patches for trunk because in the 1.0 branch a streaming socket is 
one-to-one with an incoming stream session.

I got better throughput with 4 threads, and observed little impact on the target 
node's CPU and memory.

 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
Assignee: Yuki Morishita
 Fix For: 1.0.8

 Attachments: 0001-Allow-multiple-connection-in-StreamInSession.patch, 
 0002-Allow-concurrent-stream-in-StreamOutSession.patch, 
 0003-Add-threads-option-to-sstableloader.patch, 
 3688-reply_before_closing_writer.txt, sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h

 One of my colleagues reported a bug regarding the degraded performance 
 of the sstable generator and sstable loader.
 ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 
 As stated in that issue, the generator performance has been rectified, but the 
 performance of sstableloader is still an issue.
 3589 is marked as a duplicate of 3618. Both issues show resolved status, but the 
 problem with sstableloader still exists.
 So I am opening another issue so that the sstableloader problem does not go unnoticed.
 FYI: We have tested the generator part with the patch given in 3589. It is 
 working fine.
 Please let us know if you require further input from our side.





[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2012-01-18 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3668:
--

 Reviewer: jbellis
 Priority: Minor  (was: Major)
Affects Version/s: (was: 1.0.7)
   1.0.0
Fix Version/s: (was: 1.0.8)
   1.1

 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.0
Reporter: Manish Zope
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.1

 Attachments: 0001-Allow-multiple-connection-in-StreamInSession.patch, 
 0002-Allow-concurrent-stream-in-StreamOutSession.patch, 
 0003-Add-threads-option-to-sstableloader.patch, 
 3688-reply_before_closing_writer.txt, sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h

 One of my colleagues reported a bug regarding the degraded performance 
 of the sstable generator and sstable loader.
 ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 
 As stated in that issue, the generator performance has been rectified, but the 
 performance of sstableloader is still an issue.
 3589 is marked as a duplicate of 3618. Both issues show resolved status, but the 
 problem with sstableloader still exists.
 So I am opening another issue so that the sstableloader problem does not go unnoticed.
 FYI: We have tested the generator part with the patch given in 3589. It is 
 working fine.
 Please let us know if you require further input from our side.





[jira] [Commented] (CASSANDRA-3736) -Dreplace_token leaves old node (IP) in the gossip with the token.

2012-01-18 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188911#comment-13188911
 ] 

Vijay commented on CASSANDRA-3736:
--

Yes, and the fix attached to this ticket will also remove the node from the 
System table while replacing, so you won't even see the following message:

 INFO [GossipStage:1] 2012-01-18 23:45:56,412 Gossiper.java (line 800) 
 InetAddress /50.56.58.55 is now UP

The problem is that we remove the node only after 30 seconds. Meanwhile, 
gossip tells the other node about .55, hence the message on the other node. 
The patch fixes this by removing the information from the System table in 
the first place, so that a restart no longer triggers it to reappear. Can you 
try redoing the test? It doesn't reappear in my tests.

 -Dreplace_token leaves old node (IP) in the gossip with the token.
 --

 Key: CASSANDRA-3736
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3736
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Jackson Chung
Assignee: Vijay
 Fix For: 1.0.8

 Attachments: 0001-CASSANDRA-3736.patch


 https://issues.apache.org/jira/browse/CASSANDRA-957 introduced 
 -Dreplace_token;
 however, the replaced IP keeps showing up in the Gossiper when starting 
 the replacement node:
 {noformat}
  INFO [Thread-2] 2012-01-12 23:59:35,162 CassandraDaemon.java (line 213) 
 Listening for thrift clients...
  INFO [GossipStage:1] 2012-01-12 23:59:35,173 Gossiper.java (line 836) Node 
 /50.56.59.68 has restarted, now UP
  INFO [GossipStage:1] 2012-01-12 23:59:35,174 Gossiper.java (line 804) 
 InetAddress /50.56.59.68 is now UP
  INFO [GossipStage:1] 2012-01-12 23:59:35,175 StorageService.java (line 988) 
 Node /50.56.59.68 state jump to normal
  INFO [GossipStage:1] 2012-01-12 23:59:35,176 Gossiper.java (line 836) Node 
 /50.56.58.55 has restarted, now UP
  INFO [GossipStage:1] 2012-01-12 23:59:35,176 Gossiper.java (line 804) 
 InetAddress /50.56.58.55 is now UP
  INFO [GossipStage:1] 2012-01-12 23:59:35,177 StorageService.java (line 1016) 
 Nodes /50.56.58.55 and action-quick2/50.56.31.186 have the same token 
 85070591730234615865843651857942052864.  Ignoring /50.56.58.55
  INFO [GossipTasks:1] 2012-01-12 23:59:45,048 Gossiper.java (line 818) 
 InetAddress /50.56.58.55 is now dead.
  INFO [GossipTasks:1] 2012-01-13 00:00:06,062 Gossiper.java (line 632) 
 FatClient /50.56.58.55 has been silent for 3ms, removing from gossip
  INFO [GossipStage:1] 2012-01-13 00:01:06,320 Gossiper.java (line 838) Node 
 /50.56.58.55 is now part of the cluster
  INFO [GossipStage:1] 2012-01-13 00:01:06,320 Gossiper.java (line 804) 
 InetAddress /50.56.58.55 is now UP
  INFO [GossipStage:1] 2012-01-13 00:01:06,321 StorageService.java (line 1016) 
 Nodes /50.56.58.55 and action-quick2/50.56.31.186 have the same token 
 85070591730234615865843651857942052864.  Ignoring /50.56.58.55
  INFO [GossipTasks:1] 2012-01-13 00:01:16,106 Gossiper.java (line 818) 
 InetAddress /50.56.58.55 is now dead.
  INFO [GossipTasks:1] 2012-01-13 00:01:37,121 Gossiper.java (line 632) 
 FatClient /50.56.58.55 has been silent for 3ms, removing from gossip
  INFO [GossipStage:1] 2012-01-13 00:02:37,352 Gossiper.java (line 838) Node 
 /50.56.58.55 is now part of the cluster
  INFO [GossipStage:1] 2012-01-13 00:02:37,353 Gossiper.java (line 804) 
 InetAddress /50.56.58.55 is now UP
  INFO [GossipStage:1] 2012-01-13 00:02:37,353 StorageService.java (line 1016) 
 Nodes /50.56.58.55 and action-quick2/50.56.31.186 have the same token 
 85070591730234615865843651857942052864.  Ignoring /50.56.58.55
  INFO [GossipTasks:1] 2012-01-13 00:02:47,158 Gossiper.java (line 818) 
 InetAddress /50.56.58.55 is now dead.
  INFO [GossipStage:1] 2012-01-13 00:02:50,162 Gossiper.java (line 818) 
 InetAddress /50.56.58.55 is now dead.
  INFO [GossipStage:1] 2012-01-13 00:02:50,163 StorageService.java (line 1156) 
 Removing token 122029383590318827259508597176866581733 for /50.56.58.55
 {noformat}
 In the above, /50.56.58.55 was the replaced IP.
 I tried adding Gossiper.instance.removeEndpoint(endpoint); in 
 StorageService.java where the message 'Nodes %s and %s have the same token 
 %s.  Ignoring %s,' is logged, but that seems to have fixed this only 
 temporarily. Here is a ring output:
 {noformat}
 riptano@action-quick:~/work/cassandra$ ./bin/nodetool -h localhost ring
 Address         DC          Rack    Status State   Load      Owns     Token
                                                                       85070591730234615865843651857942052864
 50.56.59.68     datacenter1 rack1   Up     Normal  6.67 KB   85.56%

[jira] [Commented] (CASSANDRA-3743) Lower memory consumption used by index sampling

2012-01-18 Thread Radim Kolar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188950#comment-13188950
 ] 

Radim Kolar commented on CASSANDRA-3743:


The patch is against 1.0; it was expected to go into the 1.0 branch. It's a small 
change: KeyPosition was referenced by just one other class.

 Lower memory consumption used by index sampling
 ---

 Key: CASSANDRA-3743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3743
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0.0
Reporter: Radim Kolar
Assignee: Radim Kolar
  Labels: optimization
 Fix For: 1.1

 Attachments: cassandra-3743.txt


 Currently j.o.a.c.io.sstable.indexsummary is implemented as an ArrayList of 
 KeyPosition (RowPosition key, long offset). I propose to change it to:
 RowPosition keys[]
 long offsets[]
 and use a standard binary search on it. This will lower the number of Java 
 objects used per entry from 2 (KeyPosition + RowPosition) to 1 (RowPosition).
 To build these arrays, the convenient ArrayList class can be used, followed 
 by a call to .toArray() on it.
 This is very important because index sampling uses a lot of memory on nodes 
 with billions of rows.
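
 The layout proposed above could be sketched roughly as follows. This is only 
 an illustration, not the attached patch: the class name IndexSummarySketch, 
 its Builder, and the floorOffset method are hypothetical, and plain String 
 keys stand in for RowPosition. It assumes samples are added in sorted key 
 order, as an index summary would naturally produce them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: String keys stand in for RowPosition.
final class IndexSummarySketch {
    private final String[] keys;   // sorted sample keys, one object per entry
    private final long[] offsets;  // offsets[i] belongs to keys[i]; primitives, no wrapper objects

    private IndexSummarySketch(String[] keys, long[] offsets) {
        this.keys = keys;
        this.offsets = offsets;
    }

    // Collects samples in ArrayLists while building, then freezes them into arrays.
    static final class Builder {
        private final List<String> keys = new ArrayList<>();
        private final List<Long> offsets = new ArrayList<>();

        // Assumption: callers add samples in ascending key order.
        Builder add(String key, long offset) {
            keys.add(key);
            offsets.add(offset);
            return this;
        }

        IndexSummarySketch build() {
            String[] k = keys.toArray(new String[0]);
            // Unbox once so the long-lived structure holds a primitive long[].
            long[] o = new long[offsets.size()];
            for (int i = 0; i < o.length; i++) {
                o[i] = offsets.get(i);
            }
            return new IndexSummarySketch(k, o);
        }
    }

    // Returns the offset of the greatest sampled key <= key,
    // or -1 if key sorts before every sample.
    long floorOffset(String key) {
        int idx = Arrays.binarySearch(keys, key);
        if (idx < 0) {
            // binarySearch returns -(insertionPoint) - 1 on a miss;
            // the floor entry sits just before the insertion point.
            idx = -idx - 2;
        }
        return idx < 0 ? -1L : offsets[idx];
    }
}
```

 The boxed Long values exist only while the Builder is alive; once build() 
 returns, each entry costs one key object plus a slot in each array, which is 
 the saving the description is after.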
