Re: I don't understand paging through a table by primary key.

2014-05-30 Thread DuyHai Doan
Hello Kevin Can you be more specific on the issue you're facing ? What is the table design ? What kind of query are you doing ? Regards On Fri, May 30, 2014 at 7:10 AM, Kevin Burton bur...@spinn3r.com wrote: I'm trying to grok this but I can't figure it out in CQL world. I'd like to

Insert failed after some time in cassandra with timeout

2014-05-30 Thread Sharaf Ali
I have installed Cassandra 2.0 on a CentOS 6.5 server, and while testing with simple records everything works fine. Now I have to upload 600 billion rows. When I use COPY in cqlsh, it fails after 5 minutes with an rpc timeout, with approximately 0.2 million rows inserted. I then opted for pycasso and

Re: Multi-DC Environment Question

2014-05-30 Thread Vasileios Vlachos
Thanks for your responses. Ben, thanks for the link. Basically you sort of confirmed that if down_time > max_hint_window_in_ms, the only way to bring DC1 up to date is anti-entropy repair. Read consistency level is irrelevant to the problem I described, as I am reading at LOCAL_QUORUM. In this situation
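
The rule of thumb confirmed in this reply can be sketched in a few lines of Python. This is only an illustration, not Cassandra code: `needs_anti_entropy_repair` is a hypothetical helper name, and the 3-hour value is assumed to match the stock `max_hint_window_in_ms` setting, so check your own cassandra.yaml before relying on it.

```python
# Hypothetical helper (not part of Cassandra): decide whether hinted handoff
# alone can bring a node or datacenter up to date after an outage.

# Assumed value of max_hint_window_in_ms in cassandra.yaml (3 hours).
ASSUMED_MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def needs_anti_entropy_repair(down_time_ms,
                              max_hint_window_ms=ASSUMED_MAX_HINT_WINDOW_MS):
    """Return True when the outage lasted longer than the hint window.

    Once a target has been down longer than the window, hints are no longer
    stored for it, so the missed writes have to be recovered by repair.
    """
    return down_time_ms > max_hint_window_ms


if __name__ == "__main__":
    one_hour_ms = 60 * 60 * 1000
    print(needs_anti_entropy_repair(1 * one_hour_ms))  # outage inside the window
    print(needs_anti_entropy_repair(4 * one_hour_ms))  # outage beyond the window
```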

Impact of Bloom filter false positive rate

2014-05-30 Thread Thomas GERBET
Hi, I'm currently working on some properties of Bloom filters and this is the first time I have used Cassandra, so I'm sorry if my question seems dumb. Basically, I am trying to see the impact of the Bloom filter false positive rate on performance. My test case is: 1. I create a table with: create table
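
As background for this kind of experiment, the textbook approximation for a Bloom filter's false positive rate is p ≈ (1 - e^(-k·n/m))^k, for n inserted items, m bits, and k hash functions. The sketch below only evaluates that formula; the function names are illustrative, and it makes no claim about how Cassandra sizes its own filters.

```python
import math


def bloom_false_positive_rate(n_items, m_bits, k_hashes):
    """Approximate false positive probability: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def optimal_hash_count(n_items, m_bits):
    """Hash count that minimises the rate for a given size: (m/n) * ln 2."""
    return max(1, round(m_bits / n_items * math.log(2)))


if __name__ == "__main__":
    n, m = 1000, 10000                 # 10 bits per stored item
    k = optimal_hash_count(n, m)
    print(k, bloom_false_positive_rate(n, m, k))
```

With 10 bits per item the optimal hash count comes out at 7 and the approximate rate is a little under 1%; fewer bits per item raise the rate, which means more wasted SSTable reads in exchange for less memory.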

Re: Write Failed, COPY on cqlsh with rpc_timeout

2014-05-30 Thread Patricia Gorla
Sharaf, Do the logs show any errors while you're trying to insert into Cassandra? -- Patricia Gorla @patriciagorla Consultant Apache Cassandra Consulting http://www.thelastpickle.com http://thelastpickle.com

Re: Managing truststores with inter-node encryption

2014-05-30 Thread Jeremy Jongsma
It appears that only adding the CA certificate to the truststore is sufficient for this. On Thu, May 22, 2014 at 10:05 AM, Jeremy Jongsma jer...@barchart.com wrote: The docs say that each node needs every other node's certificate in its local truststore:

Re: Anyone using Astyanax in production besides Netflix itself?

2014-05-30 Thread user 01
Is anyone already using Astyanax in a production cluster? Which C* version do you use with Astyanax?

Reading Cassandra Data From Pig/Hadoop

2014-05-30 Thread Alex McLintock
I am reasonably experienced with Hadoop and Pig but less so with Cassandra. I have been banging my head against the wall as all the documentation assumes I know something... I am using Apache's tarball of Cassandra 1.something and I see that there are some example pig scripts and a shell script

Re: Anyone using Astyanax in production besides Netflix itself?

2014-05-30 Thread Jeremy Powell
My team uses astyanax for 3 different c* clusters in production. we're on c* 1.2.xx. works well for our requirements - we don't use cql, mostly just time series data. But cutting this short, most people who ask about astyanax get redirected to their user group (

Re: I don't understand paging through a table by primary key.

2014-05-30 Thread Robert Coli
On Thu, May 29, 2014 at 10:10 PM, Kevin Burton bur...@spinn3r.com wrote: I'd like to efficiently page through a table via primary key. This way I only involve one node at a time and the reads on disk are This is only true if you use an Ordered Partitioner, which almost no one does? I

Re: Reading Cassandra Data From Pig/Hadoop

2014-05-30 Thread James Schappet
To specify your cassandra cluster, you only need to define one node: In your profile or batch command, set and export these variables: export PIG_HOME=PATH TO PIG INSTALL export PIG_INITIAL_ADDRESS=localhost export PIG_RPC_PORT=9160 # the partitioner must match your cassandra partitioner

Re: I don't understand paging through a table by primary key.

2014-05-30 Thread Russell Bradberry
I think what you want is a “clustering column”.  When you model your data, you specify “partition columns”, which are synonymous with the old thrift style “keys”, and clustering columns.  When creating your PRIMARY KEY, you specify the partition column first then each subsequent column in the

Re: I don't understand paging through a table by primary key.

2014-05-30 Thread Kevin Burton
The specific issue is I have a fairly large table, which is immutable, and I need to get it in a form where it can be downloaded, page by page, via an API. This would involve reading the whole table. I'd like to page through it by key order to efficiently read the rows to minimize random reads.

Re: I don't understand paging through a table by primary key.

2014-05-30 Thread Russell Bradberry
Then the data model you chose is incorrect.  As Rob Coli mentioned, you can not page through partitions that are ordered unless you are using an ordered partitioner.  Your only option is to store the data differently.  When using Cassandra you have to remember to “model your queries, not your

Re: Reading Cassandra Data From Pig/Hadoop

2014-05-30 Thread Kevin Burton
There's a pig-with-cassandra script somewhere you should be using. It adds the jars, etc. One issue is that you need to call register on the .jars from your pig scripts. Honestly, someone should write an example pig setup with modern hadoop, all the right register commands, real UPDATE queries

A

2014-05-30 Thread Ruchir Jha
Sent from my iPhone

RE: Write Failed, COPY on cqlsh with rpc_timeout

2014-05-30 Thread Sharaf Ali
Dear Patricia, Here is the trace of the error for your reference. The other thing is that it is a single-node server only. The keyspace was created using CREATE KEYSPACE mykeyspace WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1}; The table was created using: CREATE TABLE details ( id bigint

Cassandra Summit 2014 - San Francisco, CA

2014-05-30 Thread Brady Gentile
This year’s Cassandra Summit will be held on September 10th and 11th at The Westin St. Francis in San Francisco, CA. We invite you to submit your talk, register for free tickets, enroll in our 1-day Cassandra training (early bird pricing ends May 31st), and apply for a seat at Cassandra Summit

Re: I don't understand paging through a table by primary key.

2014-05-30 Thread DuyHai Doan
Hello Kevin One possible data model: CREATE TABLE myLog( day int //day format as yyyyMMdd, date timeuuid, log_message text, PRIMARY KEY(day,date) ); For each day, you can query paging by date (timeuuid format). SELECT log_message FROM myLog where day = 20140530 AND date... LIMIT xxx
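
The paging pattern suggested here can be sketched as a small query builder. This is an illustration under assumptions, not the exact query from the message: the truncated condition is assumed to be a `date > ?` comparison on the clustering column, `next_page_query` is a hypothetical helper, and the bound `LIMIT ?` assumes a Cassandra version that accepts a bind marker there.

```python
import uuid


def next_page_query(day, last_date=None, page_size=100):
    """Return (cql, params) for the next page of one day's partition.

    day       -- partition key, an int such as 20140530
    last_date -- timeuuid of the last row already returned, or None for
                 the first page (assumed paging condition: date > last_date)
    """
    if last_date is None:
        cql = "SELECT date, log_message FROM myLog WHERE day = ? LIMIT ?"
        return cql, (day, page_size)
    cql = ("SELECT date, log_message FROM myLog "
           "WHERE day = ? AND date > ? LIMIT ?")
    return cql, (day, last_date, page_size)


if __name__ == "__main__":
    print(next_page_query(20140530))
    # A freshly generated time-based UUID stands in for the last row seen.
    print(next_page_query(20140530, last_date=uuid.uuid1(), page_size=50))
```

Because every row of a given day lives in one partition and is stored in `date` order, each page is a sequential slice read from the replicas that own that day.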

Shouldn't cqlsh have an option for no formatting and no headers?

2014-05-30 Thread Kevin Burton
I do this all the time with mysql… dump some database table to an output file so that I can use it in a script. but cqlsh insists on formatting the output. there should be an option for no headers and no whitespace formatting of the results. I mean I can work around it for now… but it's not

Re: Shouldn't cqlsh have an option for no formatting and no headers?

2014-05-30 Thread Russell Bradberry
cqlsh isn’t designed for dumping data. I think you want COPY  http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/copy_r.html On May 30, 2014 at 2:32:24 PM, Kevin Burton (bur...@spinn3r.com) wrote: I do this all the time with mysql… dump some database table to an output file so

Re: backend query of a Cassandra db

2014-05-30 Thread Bobby Chowdary
There are a few ways you can do this; it really depends on whether you prefer to have a separate cluster or use the same nodes, etc. 1. If you have DSE, it has hadoop/hive integrated, or you can use the open source hive handler by Tuplejump  https://github.com/tuplejump/cash 2. Spark/Shark : Using Tuplejump Calliope

Re: Managing truststores with inter-node encryption

2014-05-30 Thread Ben Bromhead
Java ssl sockets need to be able to build a chain of trust. So having either a node's public cert or the root cert in the truststore works (as you found out). To get cassandra to use cipher suites > 128 bit you will need to install the JCE unlimited strength jurisdiction policy files. You will know