Re: unbalanced ring

2013-02-13 Thread Alain RODRIGUEZ
Maybe people think that 1.2 means vnodes, when vnodes are actually not
mandatory. Furthermore, the advice is to upgrade first and then, once
everything is running smoothly, switch to vnodes.


2013/2/13 Brandon Williams dri...@gmail.com

 On Tue, Feb 12, 2013 at 6:13 PM, Edward Capriolo edlinuxg...@gmail.com
 wrote:
 
  Are vnodes on by default? It seems that many on the list are using this
  feature with small clusters.

 They are not.

 -Brandon



Re: Deleting old items

2013-02-13 Thread Alain RODRIGUEZ
Hi Aaron, once again thanks for this answer.

So is it possible to delete all the data inserted in some CF between 2
dates or data older than 1 month ?

No. 

Why is there no way of deleting or getting data using the internal
timestamp stored alongside every inserted column (as described here:
http://www.datastax.com/docs/1.1/ddl/column_family#standard-columns) ? Is
that a feature that could possibly be developed one day ? It could
be useful for deleting old data, or for bringing just the last week of data
to a dev cluster, for example.

With min_compaction_level_threshold did you mean min_compaction_threshold
 ? If so, why should I do that, and what are the advantages/drawbacks of
reducing this value ?

Looking at the doc I saw this: "max_compaction_threshold: Ignored in
Cassandra 1.1 and later." How can I ensure that I always keep a small
number of SSTables then ? Why is this deprecated ?

Alain


2013/2/12 aaron morton aa...@thelastpickle.com

 So is it possible to delete all the data inserted in some CF between 2
 dates or data older than 1 month ?

 No.

 You need to issue row level deletes. If you don't know the row key you'll
 need to do range scans to locate them.

 If you are deleting parts of wide rows consider reducing the
 min_compaction_level_threshold on the CF to 2

 Cheers


-
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 12/02/2013, at 4:21 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

 Hi,

 I would like to know if there is a way to delete old/unused data easily ?

 I know about TTL but there are 2 limitations of TTL:

 - AFAIK, there is no TTL on counter columns
 - TTL needs to be defined at write time, so it's too late for data already
 inserted.

 I also could use a standard delete but it seems inappropriate for such a
 massive deletion.

 In some cases, I don't know the row key and would like to delete all the
 rows starting with, let's say, 1050#...

 Even better, I understood that columns are always inserted in C* with
 (name, value, timestamp). So is it possible to delete all the data inserted
 in some CF between 2 dates or data older than 1 month ?

 Alain
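
Aaron's suggestion in this thread (range-scan to locate the rows, then issue
row level deletes) can be sketched roughly as below. This is only an
illustration under stated assumptions: the nested map is a hypothetical
in-memory stand-in for rowKey -> (columnName -> write timestamp), the class
and method names are made up, and no real client library is called.
Timestamps are assumed to follow the usual convention of microseconds since
the epoch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OldColumnScan {
    // Write timestamps are assumed to be microseconds since the epoch.
    static final long MICROS_PER_DAY = 24L * 60 * 60 * 1_000_000;

    /**
     * Walks every row (the "range scan" step) and returns "rowKey/columnName"
     * for each column whose write timestamp is older than cutoffMicros.
     * The map is a hypothetical stand-in for a column family.
     */
    static List<String> findExpired(Map<String, Map<String, Long>> rows, long cutoffMicros) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Map<String, Long>> row : rows.entrySet()) {
            for (Map.Entry<String, Long> column : row.getValue().entrySet()) {
                if (column.getValue() < cutoffMicros) {
                    expired.add(row.getKey() + "/" + column.getKey());
                }
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long nowMicros = System.currentTimeMillis() * 1000L;
        long cutoffMicros = nowMicros - 30 * MICROS_PER_DAY;

        Map<String, Long> columns = new LinkedHashMap<>();
        columns.put("old", nowMicros - 45 * MICROS_PER_DAY);
        columns.put("recent", nowMicros - 2 * MICROS_PER_DAY);

        Map<String, Map<String, Long>> rows = new LinkedHashMap<>();
        rows.put("1050#a", columns);

        // Each entry found here would still need its own delete sent by the client.
        System.out.println(findExpired(rows, cutoffMicros));
    }
}
```

The scan itself is the expensive part on a real cluster, which is why it only
makes sense when TTL was not set at write time.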





Deleting old items during compaction (WAS: Deleting old items)

2013-02-13 Thread Ilya Grebnov
Hi,

We are looking for a solution to the same problem. We have a wide column
family with counters and we want to delete old data, e.g. data older than
1 month. One potential idea was to implement a hook in the compaction code
and drop the columns we don't need. Is this a viable option?

Thanks,
Ilya

From: aaron morton [mailto:aa...@thelastpickle.com] 
Sent: Tuesday, February 12, 2013 9:01 AM
To: user@cassandra.apache.org
Subject: Re: Deleting old items

 

So is it possible to delete all the data inserted in some CF between 2 dates
or data older than 1 month ?

No.

You need to issue row level deletes. If you don't know the row key you'll
need to do range scans to locate them.

If you are deleting parts of wide rows consider reducing the
min_compaction_level_threshold on the CF to 2

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 12/02/2013, at 4:21 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

Hi,

I would like to know if there is a way to delete old/unused data easily ?

I know about TTL but there are 2 limitations of TTL:

- AFAIK, there is no TTL on counter columns
- TTL need to be defined at write time, so it's too late for data already
inserted.

I also could use a standard delete but it seems inappropriate for such a
massive.

In some cases, I don't know the row key and would like to delete all the
rows starting by, let's say, 1050#...

Even better, I understood that columns are always inserted in C* with (name,
value, timestamp). So is it possible to delete all the data inserted in some
CF between 2 dates or data older than 1 month ?

Alain

 



Write performance expectations...

2013-02-13 Thread kadey
Hello, 
New member here, and I have (yet another) question on write performance. 

I'm using Apache Cassandra version 1.1, Python 2.7 and Pycassa 1.7. 

I have a cluster of 2 datacenters, each with 3 nodes, on AWS EC2 using EBS and 
the RandomPartitioner. I'm writing to a column family in a keyspace that's 
replicated to all nodes in both datacenters, with a consistency level of 
LOCAL_QUORUM. 

I'm seeing write performance of around 2500 rows per second. 

Is this in the ballpark for this kind of configuration? 

Thanks in advance. 




Ken 


Re: Write performance expectations...

2013-02-13 Thread Alain RODRIGUEZ
Is there a particular reason for you to use EBS ? Instance stores
are recommended because they improve performance by reducing I/O
throttling.

Another thing you should be aware of is that replicating the data to all
nodes reduces your performance: every node has to handle every write, so it
is more or less as if you had only one node (performance-wise, I mean).

Also, writing to different datacenters probably induces some network latency.

You should give the EC2 instance type (m1.xlarge / m1.large / ...) if you
want some feedback about the 2500 w/s, and also give the mean size of your
rows.

Alain


2013/2/13 ka...@comcast.net

 Hello,
  New member here, and I have (yet another) question on write
 performance.

 I'm using Apache Cassandra version 1.1, Python 2.7 and Pycassa 1.7.

 I have a cluster of 2 datacenters, each with 3 nodes, on AWS EC2 using EBS
 and the RandomPartitioner. I'm writing to a column family in a keyspace
 that's replicated to all nodes in both datacenters, with a consistency
 level of LOCAL_QUORUM.

 I'm seeing write performance of around 2500 rows per second.

 Is this in the ballpark for this kind of configuration?

 Thanks in advance.

 Ken




Re: Cluster not accepting insert while one node is down

2013-02-13 Thread Alain RODRIGUEZ
We probably need more info, like the RF of your cluster and the CL of your
reads and writes. Could you also tell us whether you use vnodes or not?

I heard that Astyanax was not running very smoothly on 1.2.0, but a bit
better on 1.2.1. However, Netflix hasn't released a version of Astyanax for
C* 1.2 yet.

Alain


2013/2/13 Traian Fratean traian.frat...@gmail.com

 Hi,

 I have a cluster of 5 nodes running Cassandra 1.2.0 . I have a Java client
 with Astyanax 1.56.21.
 When a node (10.60.15.67 - *different* from the one in the stacktrace
 below) went down I get TokenRangeOfflineException and no other data gets
 inserted into *any other* node from the cluster.

 Am I having a configuration issue or is this supposed to happen?


 com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor.trackError(CountingConnectionPoolMonitor.java:81)
 -
 com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException:
 TokenRangeOfflineException: [host=10.60.15.66(10.60.15.66):9160,
 latency=2057(2057), attempts=1]UnavailableException()
 com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException:
 TokenRangeOfflineException: [host=10.60.15.66(10.60.15.66):9160,
 latency=2057(2057), attempts=1]UnavailableException()
 at
 com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
  at
 com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
 at
 com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:27)
  at
 com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$1.execute(ThriftSyncConnectionFactoryImpl.java:140)
 at
 com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
  at
 com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:255)



 Thank you,
 Traian.
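
Why the RF and CL matter for this question: a coordinator rejects a request
with UnavailableException when fewer replicas of the key are alive than the
consistency level requires. A minimal sketch of that arithmetic follows; the
class and method names are illustrative only and are not part of Cassandra
or Astyanax, and the actual RF and CL of this cluster are not known from the
thread.

```java
class QuorumMath {
    /** Replicas that must respond for QUORUM: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** A request can only be accepted when enough replicas of the key are alive. */
    static boolean canSatisfy(int requiredReplicas, int liveReplicas) {
        return liveReplicas >= requiredReplicas;
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 3; rf++) {
            int liveReplicas = rf - 1; // one replica of the key is down
            System.out.printf("RF=%d QUORUM needs %d, live=%d, ONE ok=%b, QUORUM ok=%b%n",
                    rf, quorum(rf), liveReplicas,
                    canSatisfy(1, liveReplicas), canSatisfy(quorum(rf), liveReplicas));
        }
    }
}
```

With RF=1, or with RF=2 and QUORUM, a single dead node is enough to make
writes to the keys it owns fail, whatever the client does.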



Re: Documentation/Examples for DataStax java-driver

2013-02-13 Thread Shahryar Sedghi
The source code has enough documentation in it; apparently this is how they do
it with new stuff. Start with the Cluster class, it tells you how to write. If
you still have problems let me know, I can give you sample code.


On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian d...@venarc.com wrote:

 Are there any documentation/examples available for DataStax java-driver
 besides what's in the GitHub repo?

 -- Drew




-- 
Life is what happens while you are making other plans. ~ John Lennon


Re: Documentation/Examples for DataStax java-driver

2013-02-13 Thread Gabriel Ciuloaica
The code has good documentation, and the example module has enough 
sample code to help you get started.


--Gabi

On 2/13/13 5:31 PM, Shahryar Sedghi wrote:
Source code has enough documentation in it, apparently this is how 
they do it with new stuff. Start with Cluster class, it tells you how 
to write. If you still had problem let me know, I can give you sample 
code.



On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian d...@venarc.com wrote:


Are there any documentation/examples available for DataStax
java-driver besides what's in the GitHub repo?

-- Drew




--
Life is what happens while you are making other plans. ~ John Lennon




Size Tiered - Leveled Compaction

2013-02-13 Thread Mike

Hello,

I'm investigating the transition of some of our column families from 
Size Tiered to Leveled Compaction.  I believe we have some 
high-read-load column families that would benefit tremendously.


I've stood up a test DB node to investigate the transition.  I 
successfully altered the column family, and I immediately noticed a large 
number (1000+) of pending compaction tasks become available, but no 
compactions got executed.


I tried running nodetool upgradesstables on the column family, and the 
compaction tasks didn't move.


I also noticed no changes to the size and distribution of the existing 
SSTables.


I then ran a major compaction on the column family.  All pending 
compaction tasks got run, and the SSTables had the distribution that I 
would expect from LeveledCompaction (lots and lots of 10MB files).


Couple of questions:

1) Is a major compaction required to transition from size-tiered to 
leveled compaction?
2) Are major compactions as much of a concern for LeveledCompaction as 
they are for Size Tiered?


All the documentation I found concerning transitioning from Size Tiered 
to Leveled compaction discusses the ALTER TABLE CQL command, but I haven't 
found too much on what else needs to be done after the schema change.


I did these tests with Cassandra 1.1.9.

Thanks,
-Mike


Re: Write performance expectations...

2013-02-13 Thread Tyler Hobbs
2500 inserts per second is about what a single python thread using pycassa
can do against a local node.  Are you using multiple threads for the
inserts? Multiple processes?
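
The idea behind the question (spread the inserts over several workers
instead of one sequential loop) can be sketched as follows. The sketch is in
Java rather than Python, and `insertRow` is a hypothetical stand-in that only
counts calls; it is not pycassa or any real client API. The worker count and
the striping of row ids are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class ParallelInserts {
    static final AtomicInteger inserted = new AtomicInteger();

    // Hypothetical stand-in for a real client insert; here it only counts calls.
    static void insertRow(int rowId) {
        inserted.incrementAndGet();
    }

    /** Splits rows [0, totalRows) across `workers` threads; returns how many were inserted. */
    static int insertAll(final int totalRows, final int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int worker = w;
                tasks.add(() -> {
                    int count = 0;
                    // Worker w handles rows w, w + workers, w + 2 * workers, ...
                    for (int row = worker; row < totalRows; row += workers) {
                        insertRow(row);
                        count++;
                    }
                    return count;
                });
            }
            int total = 0;
            for (Future<Integer> done : pool.invokeAll(tasks)) {
                total += done.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("insert worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(insertAll(10_000, 4));
    }
}
```

Whether this helps in practice depends on the client and on how many cores
the machine running it has, as discussed later in the thread.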


On Wed, Feb 13, 2013 at 8:21 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

 Is there a particular reason for you to use EBS ? Instance Store
 are recommended because they improve performances by reducing the I/O
 throttling.

 An other thing you should be aware of is that replicating the data to all
 node reduce your performance, it is more or less like if you had only one
 node (at performance level I mean).

 Also, writing to different datacenters probably induce some network
 latency.

 You should give the EC2 instance type (m1.xlarge / m1.large / ...) if you
 want some feedback about the 2500 w/s, and also give the mean size of your
 rows.

 Alain


 2013/2/13 ka...@comcast.net

 Hello,
  New member here, and I have (yet another) question on write
 performance.

 I'm using Apache Cassandra version 1.1, Python 2.7 and Pycassa 1.7.

 I have a cluster of 2 datacenters, each with 3 nodes, on AWS EC2 using
 EBS and the RandomPartitioner. I'm writing to a column family in a keyspace
 that's replicated to all nodes in both datacenters, with a consistency
 level of LOCAL_QUORUM.

 I'm seeing write performance of around 2500 rows per second.

 Is this in the ballpark for this kind of configuration?

 Thanks in advance.

 Ken





-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Documentation/Examples for DataStax java-driver

2013-02-13 Thread Edward Capriolo
Just an FYI. More appropriate for the client-dev list.

On Wed, Feb 13, 2013 at 10:37 AM, Gabriel Ciuloaica
gciuloa...@gmail.com wrote:
 Code has good documentation and also the example module has enough sample
 code to help you started.

 --Gabi

 On 2/13/13 5:31 PM, Shahryar Sedghi wrote:

 Source code has enough documentation in it, apparently this is how they do
 it with new stuff. Start with Cluster class, it tells you how to write. If
 you still had problem let me know, I can give you sample code.


 On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian d...@venarc.com wrote:

 Are there any documentation/examples available for DataStax java-driver
 besides what's in the GitHub repo?

 -- Drew




 --
 Life is what happens while you are making other plans. ~ John Lennon




Re: Documentation/Examples for DataStax java-driver

2013-02-13 Thread Drew Kutcharian
@Shahryar/Gabriel
I know the source code is nicely documented, but I couldn't find much info on:
1. Creating/submitting atomic/non-atomic batches.
2. Handling Counter columns
Do you have any examples for that?

@Edward
I was under the impression that the client-dev mailing list was meant for the 
developers/committers of the client libs, and that each client has its own 
mailing list (as Hector does), but I'm not sure a mailing list exists for 
DataStax's java-driver.


-- Drew



On Feb 13, 2013, at 8:06 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 Just an FYI. More appropriate for the client-dev list.
 
 On Wed, Feb 13, 2013 at 10:37 AM, Gabriel Ciuloaica
 gciuloa...@gmail.com wrote:
 Code has good documentation and also the example module has enough sample
 code to help you started.
 
 --Gabi
 
 On 2/13/13 5:31 PM, Shahryar Sedghi wrote:
 
 Source code has enough documentation in it, apparently this is how they do
 it with new stuff. Start with Cluster class, it tells you how to write. If
 you still had problem let me know, I can give you sample code.
 
 
 On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian d...@venarc.com wrote:
 
 Are there any documentation/examples available for DataStax java-driver
 besides what's in the GitHub repo?
 
 -- Drew
 
 
 
 
 --
 Life is what happens while you are making other plans. ~ John Lennon
 
 



Re: Write performance expectations...

2013-02-13 Thread kadey
I'm not using multi-threads/processes. I'll try multi-threading to see if I get 
a boost. 

Thanks. 


Ken 


- Original Message -
From: Tyler Hobbs ty...@datastax.com 
To: user@cassandra.apache.org 
Sent: Wednesday, February 13, 2013 11:06:30 AM 
Subject: Re: Write performance expectations... 


2500 inserts per second is about what a single python thread using pycassa can 
do against a local node. Are you using multiple threads for the inserts? 
Multiple processes? 




On Wed, Feb 13, 2013 at 8:21 AM, Alain RODRIGUEZ  arodr...@gmail.com  wrote: 



Is there a particular reason for you to use EBS ? Instance Store are 
recommended because they improve performances by reducing the I/O throttling. 


An other thing you should be aware of is that replicating the data to all node 
reduce your performance, it is more or less like if you had only one node (at 
performance level I mean). 


Also, writing to different datacenters probably induce some network latency. 


You should give the EC2 instance type (m1.xlarge / m1.large / ...) if you want 
some feedback about the 2500 w/s, and also give the mean size of your rows. 


Alain 



2013/2/13  ka...@comcast.net  





Hello, 
New member here, and I have (yet another) question on write performance. 

I'm using Apache Cassandra version 1.1, Python 2.7 and Pycassa 1.7. 

I have a cluster of 2 datacenters, each with 3 nodes, on AWS EC2 using EBS and 
the RandomPartitioner. I'm writing to a column family in a keyspace that's 
replicated to all nodes in both datacenters, with a consistency level of 
LOCAL_QUORUM. 

I'm seeing write performance of around 2500 rows per second. 

Is this in the ballpark for this kind of configuration? 

Thanks in advance. 




Ken 







-- 
Tyler Hobbs 
DataStax 


Re: Documentation/Examples for DataStax java-driver

2013-02-13 Thread Shahryar Sedghi
The API lets you build your own batch by building a query. I do not
use that, nor counter columns. I do not build a query, I write the CQL
directly, like:

String batchInsert = "BEGIN BATCH " +
    "INSERT INTO xyz (a, b, c) " +
    "VALUES (?, ?, ?) " +
    "INSERT INTO def (a, b, c) " +
    "VALUES (?, ?, ?) " +
    "APPLY BATCH;";

// Prepare once, set the consistency level, then bind the six values.
PreparedStatement prBatchInsert = session.prepare(batchInsert);
prBatchInsert.setConsistencyLevel(ConsistencyLevel.QUORUM);
BoundStatement query = prBatchInsert.bind(1, 2, 3, 1, 2, 3);
session.execute(query);

I got session through this:

cluster = Cluster.builder()
    .addContactPoint(getInitParameter("cassandraCluster"))
    .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
    .build();
session = cluster.connect(getInitParameter("keyspace"));

I also have queries where I use BEGIN UNLOGGED BATCH instead of BEGIN BATCH.

Hopefully it helps


On Wed, Feb 13, 2013 at 12:23 PM, Drew Kutcharian d...@venarc.com wrote:

 @Shahryar/Gabriel
 I know the source code is nicely documented, but I couldn't find much info
 on:
 1. Creating/submitting atomic/non-atomic batches.
 2. Handling Counter columns
 Do you have any examples for that?

 @Edward
 I was under impression that client-dev mailing list was to be used by the
 developers/committers of the client libs and each client has their own
 mailing list such as hector, but I'm not sure there exist a mailing list
 for DataStax's java-driver.


 -- Drew



 On Feb 13, 2013, at 8:06 AM, Edward Capriolo edlinuxg...@gmail.com
 wrote:

  Just an FYI. More appropriate for the client-dev list.
 
  On Wed, Feb 13, 2013 at 10:37 AM, Gabriel Ciuloaica
  gciuloa...@gmail.com wrote:
  Code has good documentation and also the example module has enough
 sample
  code to help you started.
 
  --Gabi
 
  On 2/13/13 5:31 PM, Shahryar Sedghi wrote:
 
  Source code has enough documentation in it, apparently this is how they
 do
  it with new stuff. Start with Cluster class, it tells you how to write.
 If
  you still had problem let me know, I can give you sample code.
 
 
  On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian d...@venarc.com
 wrote:
 
  Are there any documentation/examples available for DataStax java-driver
  besides what's in the GitHub repo?
 
  -- Drew
 
 
 
 
  --
  Life is what happens while you are making other plans. ~ John Lennon
 
 




-- 
Life is what happens while you are making other plans. ~ John Lennon


Re: Documentation/Examples for DataStax java-driver

2013-02-13 Thread Michaël Figuière
As mentioned previously, the code comes with some detailed Javadoc. But
you're right as well that Javadoc isn't enough. At DataStax we're currently
working on documentation for our drivers that will be as detailed as what
we provide for Apache Cassandra.

Meanwhile we'll also add more code samples to make it easier for newcomers.



Michaël

On Wed, Feb 13, 2013 at 10:07 AM, Shahryar Sedghi shsed...@gmail.com wrote:

 The API lets you build your own batch by building a query. I do not
 use that, nor counter columns. I do not build a query, I write the CQL
 directly, like:

 String batchInsert = "BEGIN BATCH " +
     "INSERT INTO xyz (a, b, c) " +
     "VALUES (?, ?, ?) " +
     "INSERT INTO def (a, b, c) " +
     "VALUES (?, ?, ?) " +
     "APPLY BATCH;";

 // Prepare once, set the consistency level, then bind the six values.
 PreparedStatement prBatchInsert = session.prepare(batchInsert);
 prBatchInsert.setConsistencyLevel(ConsistencyLevel.QUORUM);
 BoundStatement query = prBatchInsert.bind(1, 2, 3, 1, 2, 3);
 session.execute(query);

 I got session through this:

 cluster = Cluster.builder()
     .addContactPoint(getInitParameter("cassandraCluster"))
     .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
     .build();
 session = cluster.connect(getInitParameter("keyspace"));

 I also have queries where I use BEGIN UNLOGGED BATCH instead of BEGIN BATCH.

 Hopefully it helps


 On Wed, Feb 13, 2013 at 12:23 PM, Drew Kutcharian d...@venarc.com wrote:

 @Shahryar/Gabriel
 I know the source code is nicely documented, but I couldn't find much
 info on:
 1. Creating/submitting atomic/non-atomic batches.
 2. Handling Counter columns
 Do you have any examples for that?

 @Edward
 I was under impression that client-dev mailing list was to be used by the
 developers/committers of the client libs and each client has their own
 mailing list such as hector, but I'm not sure there exist a mailing list
 for DataStax's java-driver.


 -- Drew



 On Feb 13, 2013, at 8:06 AM, Edward Capriolo edlinuxg...@gmail.com
 wrote:

  Just an FYI. More appropriate for the client-dev list.
 
  On Wed, Feb 13, 2013 at 10:37 AM, Gabriel Ciuloaica
  gciuloa...@gmail.com wrote:
  Code has good documentation and also the example module has enough
 sample
  code to help you started.
 
  --Gabi
 
  On 2/13/13 5:31 PM, Shahryar Sedghi wrote:
 
  Source code has enough documentation in it, apparently this is how
 they do
  it with new stuff. Start with Cluster class, it tells you how to write.
 If
  you still had problem let me know, I can give you sample code.
 
 
  On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian d...@venarc.com
 wrote:
 
  Are there any documentation/examples available for DataStax
 java-driver
  besides what's in the GitHub repo?
 
  -- Drew
 
 
 
 
  --
  Life is what happens while you are making other plans. ~ John Lennon
 
 




 --
 Life is what happens while you are making other plans. ~ John Lennon



Re: Documentation/Examples for DataStax java-driver

2013-02-13 Thread Edward Capriolo
@Drew

This list is for Cassandra users, and the DataStax java-driver is
not actually part of Cassandra. If every user comes here to talk about
their driver/ORM/problems they are having with code that is not part
of Cassandra, this list will get noisy.

IMHO client-dev is the right place for these types of topics.
Occasionally a cross post makes sense.

Edward

On Wed, Feb 13, 2013 at 12:23 PM, Drew Kutcharian d...@venarc.com wrote:
 @Shahryar/Gabriel
 I know the source code is nicely documented, but I couldn't find much info on:
 1. Creating/submitting atomic/non-atomic batches.
 2. Handling Counter columns
 Do you have any examples for that?

 @Edward
 I was under impression that client-dev mailing list was to be used by the 
 developers/committers of the client libs and each client has their own 
 mailing list such as hector, but I'm not sure there exist a mailing list for 
 DataStax's java-driver.


 -- Drew



 On Feb 13, 2013, at 8:06 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 Just an FYI. More appropriate for the client-dev list.

 On Wed, Feb 13, 2013 at 10:37 AM, Gabriel Ciuloaica
 gciuloa...@gmail.com wrote:
 Code has good documentation and also the example module has enough sample
 code to help you started.

 --Gabi

 On 2/13/13 5:31 PM, Shahryar Sedghi wrote:

 Source code has enough documentation in it, apparently this is how they do
 it with new stuff. Start with Cluster class, it tells you how to write. If
 you still had problem let me know, I can give you sample code.


 On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian d...@venarc.com wrote:

 Are there any documentation/examples available for DataStax java-driver
 besides what's in the GitHub repo?

 -- Drew




 --
 Life is what happens while you are making other plans. ~ John Lennon





Re: [VOTE] Release Mojo's Cassandra Maven Plugin 1.2.0-1

2013-02-13 Thread Stephen Connolly
More that I'm looking for somebody who is actively using C* to test it (there
are a couple of users... the lot of you who asked me to roll another release).
I will roll a 1.2.1 once I close this vote... I could close with lazy
consensus, but I'd feel more comfortable if it has had some testing ;-)

On Wednesday, 13 February 2013, Michael Kjellman wrote:

 Considering that 1.2.1 is out, and looking at your project very quickly
 (looks interesting)/overlaps a bit with CCMBridge no?/ I'd def say +1 :)

 From: Stephen Connolly stephen.alan.conno...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Wednesday, February 13, 2013 1:27 PM
 To: d...@mojo.codehaus.org, dev d...@cassandra.apache.org,
 user@cassandra.apache.org
 Subject: Re: [VOTE] Release Mojo's Cassandra Maven Plugin 1.2.0-1

 Ping

 On Monday, 4 February 2013, Stephen Connolly wrote:
 Hi,

 I'd like to release version 1.2.0-1 of Mojo's Cassandra Maven Plugin
 to sync up with the 1.2.0 release of Apache Cassandra. (a 1.2.1-1 will
 follow shortly after this release, but it should be possible to use the
 xpath://project/build/plugins/plugin/dependencies/dependency override of
 cassandra-server to use C* releases from the 1.2.x stream now that the link
 errors have been resolved, so that is less urgent)

 We solved 1 issue:

 http://jira.codehaus.org/secure/ReleaseNote.jspa?projectId=12121&version=18467

 Staging Repository:
 https://nexus.codehaus.org/content/repositories/orgcodehausmojo-013/

 Site:
 http://mojo.codehaus.org/cassandra-maven-plugin/index.html

 SCM Tag:
 https://svn.codehaus.org/mojo/tags/cassandra-maven-plugin-1.2.0-1@17921

  [ ] +1 Yeah! fire ahead oh and the blind man on the galloping horse
 says it looks fine too.
  [ ] 0 Mehhh! like I care, I don't have any opinions either, I'd
 follow somebody else if only I could decide who
  [ ] -1 No! wait up there I have issues (in general like, ya know,
 and being a trouble-maker is only one of them)

 The vote is open for 72h and will succeed by lazy consensus.

 Guide to testing staged releases:
 http://maven.apache.org/guides/development/guide-testing-releases.html

 Cheers

 -Stephen

 P.S.
  In the interest of ensuring (more is) better testing, and as is now
 tradition for Mojo's Cassandra Maven Plugin, this vote is
 also open to any subscribers of the dev and user@cassandra.apache.org
 mailing lists that want to test or use this plugin.



Re: Documentation/Examples for DataStax java-driver

2013-02-13 Thread Drew Kutcharian
That's kinda what I was thinking too, just wanted to see if there's a built-in 
way.


On Feb 13, 2013, at 10:07 AM, Shahryar Sedghi shsed...@gmail.com wrote:

 The API lets you build your own batch by building a query. I do not
 use that, nor counter columns. I do not build a query, I write the CQL
 directly, like:

 String batchInsert = "BEGIN BATCH " +
     "INSERT INTO xyz (a, b, c) " +
     "VALUES (?, ?, ?) " +
     "INSERT INTO def (a, b, c) " +
     "VALUES (?, ?, ?) " +
     "APPLY BATCH;";

 // Prepare once, set the consistency level, then bind the six values.
 PreparedStatement prBatchInsert = session.prepare(batchInsert);
 prBatchInsert.setConsistencyLevel(ConsistencyLevel.QUORUM);
 BoundStatement query = prBatchInsert.bind(1, 2, 3, 1, 2, 3);
 session.execute(query);

 I got session through this:

 cluster = Cluster.builder()
     .addContactPoint(getInitParameter("cassandraCluster"))
     .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
     .build();
 session = cluster.connect(getInitParameter("keyspace"));

 I also have queries where I use BEGIN UNLOGGED BATCH instead of BEGIN BATCH.
 
 Hopefully it helps
 
 
 On Wed, Feb 13, 2013 at 12:23 PM, Drew Kutcharian d...@venarc.com wrote:
 @Shahryar/Gabriel
 I know the source code is nicely documented, but I couldn't find much info on:
 1. Creating/submitting atomic/non-atomic batches.
 2. Handling Counter columns
 Do you have any examples for that?
 
 @Edward
 I was under impression that client-dev mailing list was to be used by the 
 developers/committers of the client libs and each client has their own 
 mailing list such as hector, but I'm not sure there exist a mailing list for 
 DataStax's java-driver.
 
 
 -- Drew
 
 
 
 On Feb 13, 2013, at 8:06 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
 
  Just an FYI. More appropriate for the client-dev list.
 
  On Wed, Feb 13, 2013 at 10:37 AM, Gabriel Ciuloaica
  gciuloa...@gmail.com wrote:
  Code has good documentation and also the example module has enough sample
  code to help you started.
 
  --Gabi
 
  On 2/13/13 5:31 PM, Shahryar Sedghi wrote:
 
  Source code has enough documentation in it, apparently this is how they do
  it with new stuff. Start with Cluster class, it tells you how to write. If
  you still had problem let me know, I can give you sample code.
 
 
  On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian d...@venarc.com wrote:
 
  Are there any documentation/examples available for DataStax java-driver
  besides what's in the GitHub repo?
 
  -- Drew
 
 
 
 
  --
  Life is what happens while you are making other plans. ~ John Lennon
 
 
 
 
 
 
 -- 
 Life is what happens while you are making other plans. ~ John Lennon



Re: Write performance expectations...

2013-02-13 Thread Ken Adey
On a single processor EC2 instance, however, multiprocessing would be 
useless.


Ken

On 2/13/2013 5:29 PM, Ben Bromhead wrote:
If you are using CPython (most likely) remember to use the 
multiprocessing interface rather than multithreading to avoid the 
global interpreter lock.


Cheers

Ben

On Thu, Feb 14, 2013 at 4:35 AM, ka...@comcast.net wrote:


I'm not using multi-threads/processes. I'll try multi-threading to
see if I get a boost.

Thanks.

Ken



From: Tyler Hobbs ty...@datastax.com
To: user@cassandra.apache.org
Sent: Wednesday, February 13, 2013 11:06:30 AM
Subject: Re: Write performance expectations...


2500 inserts per second is about what a single python thread using
pycassa can do against a local node.  Are you using multiple
threads for the inserts? Multiple processes?


On Wed, Feb 13, 2013 at 8:21 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

Is there a particular reason for you to use EBS ? Instance
Store are recommended because they improve performances by
reducing the I/O throttling.

An other thing you should be aware of is that replicating the
data to all node reduce your performance, it is more or less
like if you had only one node (at performance level I mean).

Also, writing to different datacenters probably induce some
network latency.

You should give the EC2 instance type (m1.xlarge / m1.large /
...) if you want some feedback about the 2500 w/s, and also
give the mean size of your rows.

Alain


2013/2/13 ka...@comcast.net

Hello,
 New member here, and I have (yet another) question on
write performance.

I'm using Apache Cassandra version 1.1, Python 2.7 and
Pycassa 1.7.

I have a cluster of 2 datacenters, each with 3 nodes, on
AWS EC2 using EBS and the RandomPartitioner. I'm writing to
a column family in a keyspace that's replicated to all
nodes in both datacenters, with a consistency level of
LOCAL_QUORUM.

I'm seeing write performance of around 2500 rows per second.

Is this in the ballpark for this kind of configuration?

Thanks in advance.

Ken





-- 
Tyler Hobbs

DataStax http://datastax.com/




Cassandra Geospatial Search

2013-02-13 Thread Drew Kutcharian
Hi Guys,

Has anyone on this mailing list tried to build a bounding box style (get the 
records inside a known bounding box) geospatial search? I've been researching 
this a bit and it seems like the only attempt at this was by the SimpleGeo guys, 
but there isn't much public info out there on how they did it besides a video.

-- Drew



Re: Cassandra Geospatial Search

2013-02-13 Thread Joe Stein
what about using geo hashes http://geohash.org/dr5ru2mevjppe

store as column names the geo hashes

geohash#dr5ru2mevjppe
geohash#dr5ru2mevjpp
geohash#dr5ru2mevjp
geohash#dr5ru2mevj
geohash#dr5ru2mev
geohash#dr5ru2me
geohash#dr5ru2m
geohash#dr5ru2
geohash#dr5ru
geohash#dr5

the rows are what you want to return

do a MultigetSliceQuery like this
https://github.com/joestein/skeletor/blob/master/src/test/scala/skeletor/SkeletorSpec.scala#L171

in the column value you can hold some JSON objects or further serialized
relationships from there, maybe a persisted graph structure

here are my slides on how we do this and what for
http://files.meetup.com/1794037/jstein.meetup.cassandra2002.pptx
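
A rough sketch of the column-name scheme above, in Python. The function names and the "geohash#" tag argument are illustrative, not taken from skeletor, and the encoder is the standard public geohash algorithm rather than anything specific to this setup.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=12):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def prefix_column_names(geohash, min_length=3, tag="geohash#"):
    """Column names for every prefix of `geohash`, longest first."""
    return [tag + geohash[:n]
            for n in range(len(geohash), min_length - 1, -1)]


# Column names for the example hash from the message above.
for name in prefix_column_names("dr5ru2mevjppe"):
    print(name)
```

Note that this emits every prefix length down to min_length, including the 4-character one that the list above skips. Shorter prefixes cover larger cells, so a lookup on a short prefix matches everything stored inside that cell.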

On Wed, Feb 13, 2013 at 8:42 PM, Drew Kutcharian d...@venarc.com wrote:

 Hi Guys,

 Has anyone on this mailing list tried to build a bounding box style (get
 the records inside a known bounding box) geospatial search? I've been
 researching this a bit and it seems like the only attempt at this was by
 the SimpleGeo guys, but there isn't much public info out there on how they
 did it besides a video.

 -- Drew




-- 

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
*/


Re: [VOTE] Release Mojo's Cassandra Maven Plugin 1.2.0-1

2013-02-13 Thread Mikhail Mazursky
+1. Please, release it.


2013/2/14 Stephen Connolly stephen.alan.conno...@gmail.com

 More, I'm looking for somebody who is actively using C* to test it (there
 are a couple of users... the lot of you who asked me to roll another
 release). I will roll a 1.2.1 once I close this vote... I could close with
 lazy consensus, but would feel more comfortable if it has had some testing ;-)


 On Wednesday, 13 February 2013, Michael Kjellman wrote:

 Considering that 1.2.1 is out, and looking at your project very quickly
 (looks interesting)/overlaps a bit with CCMBridge no?/ I'd def say +1 :)

 From: Stephen Connolly stephen.alan.conno...@gmail.com
 Reply-To: user@cassandra.apache.org

 Date: Wednesday, February 13, 2013 1:27 PM
 To: d...@mojo.codehaus.org, dev d...@cassandra.apache.org,
 user@cassandra.apache.org

 Subject: Re: [VOTE] Release Mojo's Cassandra Maven Plugin 1.2.0-1

 Ping

 On Monday, 4 February 2013, Stephen Connolly wrote:
 Hi,

 I'd like to release version 1.2.0-1 of Mojo's Cassandra Maven Plugin
 to sync up with the 1.2.0 release of Apache Cassandra. (a 1.2.1-1 will
 follow shortly after this release, but it should be possible to use the
 xpath://project/build/plugins/plugin/dependencies/dependency override of
 cassandra-server to use C* releases from the 1.2.x stream now that the link
 errors have been resolved, so that is less urgent)

 We solved 1 issue:

 http://jira.codehaus.org/secure/ReleaseNote.jspa?projectId=12121&version=18467

 Staging Repository:
 https://nexus.codehaus.org/content/repositories/orgcodehausmojo-013/

 Site:
 http://mojo.codehaus.org/cassandra-maven-plugin/index.html

 SCM Tag:
 https://svn.codehaus.org/mojo/tags/cassandra-maven-plugin-1.2.0-1@17921

  [ ] +1 Yeah! fire ahead oh and the blind man on the galloping horse
 says it looks fine too.
  [ ] 0 Mehhh! like I care, I don't have any opinions either, I'd
 follow somebody else if only I could decide who
  [ ] -1 No! wait up there I have issues (in general like, ya know,
 and being a trouble-maker is only one of them)

 The vote is open for 72h and will succeed by lazy consensus.

 Guide to testing staged releases:
 http://maven.apache.org/guides/development/guide-testing-releases.html

 Cheers

 -Stephen

 P.S.
  In the interest of ensuring (more is) better testing, and as is now
 tradition for Mojo's Cassandra Maven Plugin, this vote is
 also open to any subscribers of the dev and user@cassandra.apache.org
 mailing lists that want to test or use this plugin.