sstable loader

2015-03-27 Thread Rahul Bhardwaj
Hi All, can we use sstableloader to load an external flat file or CSV file? If yes, kindly share the steps or a manual. I need to load 40 million records into a table of around 70 columns. Regards, Rahul Bhardwaj

Re: Java Driver 2.1 reading counter values from row

2015-03-27 Thread Amila Paranawithana
Hi All, This is possible with cassandra-driver-core-2.1.5, with 'row.getLong(sum)'. Thanks On Fri, Mar 27, 2015 at 2:51 PM, Amila Paranawithana amila1...@gmail.com wrote: in Apache Cassandra Java Driver 2.1 how to read counter type values from a row when iterating over result set. eg: If I

Re: Replication to second data center with different number of nodes

2015-03-27 Thread Sibbald, Charles
I would recommend you utilise Cassandra’s vnodes config and let it manage this itself. This means it will create these and manage them all on its own, and allows quick and easy scaling and bootstrapping. From: Björn Hachmann bjoern.hachm...@metrigo.de

Re: upgrade from 1.0.12 to 1.1.12

2015-03-27 Thread Jason Wee
Rob, the cluster is now upgraded to Cassandra 1.0.12 (default sstable version hd, in Descriptor.java) and I ensure all sstables in the current cluster are version hd before upgrading to Cassandra 1.1. I have also checked that in Cassandra 1.1.12 the sstable version is hf. So I guess nodetool upgradesstables

Replication to second data center with different number of nodes

2015-03-27 Thread Björn Hachmann
Hi, we currently plan to add a second data center to our Cassandra-Cluster. I have read about this procedure in the documentation (eg. https://www.datastax.com/documentation/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html), but at least one question remains: Do I have to provide

High latencies for simple queries

2015-03-27 Thread Artur Siekielski
I'm running Cassandra locally and I see that the execution time for the simplest queries is 1-2 milliseconds. By a simple query I mean either an INSERT or a SELECT from a small table with short keys. While this number is not high, it's about 10-20 times slower than PostgreSQL (even if INSERTs are

Re: Arbitrary nested tree hierarchy data model

2015-03-27 Thread Fabian Siddiqi
Hi Robert, We're trying to do something similar to the OP and finding it a bit difficult. Would it be possible to provide more details about how you're doing it? Thanks. On Fri, Mar 27, 2015 at 3:15 AM, Robert Wille rwi...@fold3.com wrote: I have a cluster which stores tree structures. I keep

Java Driver 2.1 reading counter values from row

2015-03-27 Thread Amila Paranawithana
In the Apache Cassandra Java Driver 2.1, how do I read counter type values from a row when iterating over a result set? E.g., if I have a counter table called 'countertable' with a key and a counter column 'sum', how can I read the value of the counter column using the Java driver? If I say row.getInt(sum), this

Re: sstable loader

2015-03-27 Thread Amila Paranawithana
Hi, This post[1] may be useful. But note that this was done with an older Cassandra version, so there may be a newer way to do this. [1]. http://amilaparanawithana.blogspot.com/2012/06/bulk-loading-external-data-to-cassandra.html Thanks, On Fri, Mar 27, 2015 at 11:40 AM, Rahul Bhardwaj

Re: Replication to second data center with different number of nodes

2015-03-27 Thread Sibbald, Charles
http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__num_tokens So go with a default 256, and leave initial token empty: num_tokens: 256 # initial_token: Cassandra will always give each node the same number of
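A rough way to see why a per-node token count such as 256 tends to even out ownership is to simulate a ring. The sketch below is not Cassandra's token allocator: it assumes tokens are drawn uniformly at random from the Murmur3 token range, and the helper name `simulate_ownership` is made up for illustration.

```python
import random

# Token range of the Murmur3 partitioner (signed 64-bit integers).
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1
RING_SIZE = MAX_TOKEN - MIN_TOKEN + 1


def simulate_ownership(num_nodes, num_tokens=256, seed=0):
    """Return the fraction of the ring owned by each node (hypothetical helper).

    Assumption: every node picks num_tokens tokens uniformly at random.
    """
    rng = random.Random(seed)
    ring = sorted(
        (rng.randint(MIN_TOKEN, MAX_TOKEN), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    # Each token owns the range from the previous token (exclusive) up to
    # itself; for the first token, ring[i - 1] is the last token, so the
    # modulo turns the negative difference into the wrap-around distance.
    for i, (token, node) in enumerate(ring):
        prev_token = ring[i - 1][0]
        owned[node] += (token - prev_token) % RING_SIZE
    return [share / RING_SIZE for share in owned]


shares = simulate_ownership(num_nodes=3)
```

With many tokens per node the shares cluster around 1/num_nodes; lowering `num_tokens` in the simulation makes the spread visibly wider.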

('Unable to complete the operation against any hosts', {})

2015-03-27 Thread Rahul Bhardwaj
Hi All, We are using Cassandra version 2.1.2 with cqlsh 5.0.1 (a cluster of three nodes with RF 2). I need to load around 40 million records into a Cassandra table. I have created batches of 1 million records (a batch of 1 records also gives the same error) in CSV format. When I use the COPY command
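One way to keep each load step small is to split the CSV into fixed-size chunks before passing them to whichever loader is used. The sketch below only does the splitting. It assumes the file has a header row, the name `chunk_rows` and the chunk size are illustrative, and nothing here is known to resolve the error reported above.

```python
import csv
import io
from itertools import islice


def chunk_rows(csv_file, chunk_size):
    """Yield (header, rows) pairs with at most chunk_size data rows each."""
    reader = csv.reader(csv_file)
    header = next(reader, None)
    if header is None:
        return  # empty file: nothing to yield
    while True:
        rows = list(islice(reader, chunk_size))
        if not rows:
            break
        yield header, rows


# An in-memory file stands in for the real CSV export.
sample = io.StringIO("id,name\n1,a\n2,b\n3,c\n")
chunks = list(chunk_rows(sample, chunk_size=2))
```

Each chunk carries the header, so it can be written out as a standalone file or turned into individual inserts.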

Re: Replication to second data center with different number of nodes

2015-03-27 Thread Björn Hachmann
2015-03-27 11:58 GMT+01:00 Sibbald, Charles charles.sibb...@bskyb.com: Cassandra’s Vnodes config Thank you. Yes, we are using vnodes! The num_tokens parameter controls the number of vnodes assigned to a specific node. Maybe I am seeing problems where there are none. Let me rephrase my

Re: Delayed events processing / queue (anti-)pattern

2015-03-27 Thread Brice Dutheil
Would it help here to not actually issue a delete statement but instead use date based compaction and a dynamically calculated ttl that is some safe distance in the future from your key? I’m not sure about this part, *date based compaction*. Do you mean DateTieredCompactionStrategy? Anyway
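The "dynamically calculated ttl" idea can be sketched as a small calculation: take the time at which the event should fire, add a safety margin, and express the distance from now in seconds. The function name `ttl_seconds` and the one-hour default margin are assumptions for illustration, not values from the thread.

```python
from datetime import datetime, timedelta, timezone


def ttl_seconds(fire_at, now, safety_margin=timedelta(hours=1)):
    """Seconds from now until safety_margin after fire_at, never below 1.

    The floor of 1 avoids a TTL of 0, which CQL treats as "no expiry".
    """
    expires_at = fire_at + safety_margin
    return max(1, int((expires_at - now).total_seconds()))


now = datetime(2015, 3, 27, 12, 0, tzinfo=timezone.utc)
fire_at = now + timedelta(minutes=30)
ttl = ttl_seconds(fire_at, now)  # 30 minutes plus the 1 hour margin
```

The result would then be supplied as the TTL of the write, so the row expires on its own some time after the event has been processed instead of being deleted explicitly.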

Re: Arbitrary nested tree hierarchy data model

2015-03-27 Thread List
On 3/26/15 10:15 PM, Robert Wille wrote: I have a cluster which stores tree structures. I keep several hundred unrelated trees. The largest has about 180 million nodes, and the smallest has 1 node. The largest fanout is almost 400K. Depth is arbitrary, but in practice is probably less than

Re: Arbitrary nested tree hierarchy data model

2015-03-27 Thread Jonathan Haddad
I'd be interested to see that data model. I think the entire list would benefit! On Thu, Mar 26, 2015 at 8:16 PM Robert Wille rwi...@fold3.com wrote: I have a cluster which stores tree structures. I keep several hundred unrelated trees. The largest has about 180 million nodes, and the smallest

Re: upgrade from 1.0.12 to 1.1.12

2015-03-27 Thread Jonathan Haddad
Running upgrade is a noop if the tables don't need to be upgraded. I consider the cost of this to be less than the cost of missing an upgrade. On Thu, Mar 26, 2015 at 4:23 PM Robert Coli rc...@eventbrite.com wrote: On Wed, Mar 25, 2015 at 7:16 PM, Jonathan Haddad j...@jonhaddad.com wrote:

Re: Delayed events processing / queue (anti-)pattern

2015-03-27 Thread Thunder Stumpges
Yeah that's the one :) sorry, was on my phone and didn't want to look up the exact name. Cheers, Thunder On Mar 27, 2015 6:17 AM, Brice Dutheil brice.duth...@gmail.com wrote: Would it help here to not actually issue a delete statement but instead use date based compaction and a dynamically

Re: High latencies for simple queries

2015-03-27 Thread Tyler Hobbs
Just to check, are you concerned about minimizing that latency or maximizing throughput? I'll assume that latency is what you're actually concerned about. A fair amount of that latency is probably happening in the python driver. Although it can easily execute ~8k operations per second (using CPython),

Re: High latencies for simple queries

2015-03-27 Thread Artur Siekielski
Yes, I'm concerned about the latency. Throughput can be high even when using Python: http://datastax.github.io/python-driver/performance.html. But in my scenarios I need to run queries sequentially, so latencies matter. And Cassandra requires issuing more queries than SQL databases so these

Re: High latencies for simple queries

2015-03-27 Thread Artur Siekielski
I think that in your example Postgres spends most of its time waiting for fsync() to complete. On Linux, for a battery-backed RAID controller, it's safe to mount an ext4 filesystem with the barrier=0 option, which improves fsync() performance a lot. I have partitions mounted with this option and I did a

Re: High latencies for simple queries

2015-03-27 Thread Tyler Hobbs
Since you're executing queries sequentially, you may want to look into using callback chaining to avoid the cross-thread signaling that results in the 1ms latencies. Basically, just use session.execute_async() and attach a callback to the returned future that will execute your next query. The
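The callback chaining described here can be sketched roughly as follows. The helper `run_chain` and its `on_done`/`on_error` parameters are hypothetical names. The only driver calls assumed are `session.execute_async()` and `add_callbacks(callback, errback)` on the returned future, and the session is passed in rather than created here.

```python
def run_chain(session, queries, on_done, on_error):
    """Run queries strictly one after another without blocking in between.

    Each query is issued from the callback of the previous one, so the
    calling thread never waits on a future. on_done receives the list of
    results; on_error receives the exception of the first failing query.
    """
    remaining = iter(queries)
    results = []

    def issue_next():
        query = next(remaining, None)
        if query is None:
            on_done(results)
            return
        future = session.execute_async(query)
        future.add_callbacks(handle_result, on_error)

    def handle_result(rows):
        results.append(rows)
        issue_next()

    issue_next()
```

With a real session the callbacks run on the driver's event loop thread, so `on_done` should hand the results back to the application (for example by setting a `threading.Event`) rather than doing heavy work itself.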

Re: High latencies for simple queries

2015-03-27 Thread Ben Bromhead
Latency can be so variable even when testing things locally. I quickly fired up postgres and did the following with psql: ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i)); CREATE TABLE ben=# \timing Timing is on. ben=# INSERT INTO foo VALUES(2, 'yay'); INSERT 0 1 Time: 1.162 ms ben=# INSERT

Re: cassandra source code

2015-03-27 Thread Divya Divs
Hi, I have run the Cassandra source in Eclipse Juno by following this document: http://brianoneill.blogspot.in/2015/03/getting-started-with-cassandra.html. But I'm getting exceptions. Please help me solve this. INFO 17:43:40 Node localhost/127.0.0.1 state jump to normal INFO 17:43:41 Netty

Re: High latencies for simple queries

2015-03-27 Thread Laing, Michael
I use callback chaining with the python driver and can confirm that it is very fast. You can chain the chains together to perform sequential processing. I do this when retrieving metadata and then the referenced payload for example, when the metadata has been inverted and the payload is larger

Re: Arbitrary nested tree hierarchy data model

2015-03-27 Thread Ben Bromhead
+1 would love to see how you do it On 27 March 2015 at 07:18, Jonathan Haddad j...@jonhaddad.com wrote: I'd be interested to see that data model. I think the entire list would benefit! On Thu, Mar 26, 2015 at 8:16 PM Robert Wille rwi...@fold3.com wrote: I have a cluster which stores tree

Re: Arbitrary nested tree hierarchy data model

2015-03-27 Thread Jack Krupansky
Hmmm... If you serialize the tree properly in a partition, you could always read an entire sub-tree as a single slice (consecutive CQL rows.) Is there much more to it? -- Jack Krupansky On Fri, Mar 27, 2015 at 7:35 PM, Ben Bromhead b...@instaclustr.com wrote: +1 would love to see how you do it
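The "sub-tree as a single slice" idea can be illustrated in memory. This is only a sketch under assumptions: each node is identified by its path from the root as a tuple of integer child ids, and a sorted Python list stands in for a partition whose rows are ordered by that path. It is not the data model of any poster in this thread.

```python
from bisect import bisect_left


def subtree_slice(sorted_paths, prefix):
    """Return every path at or below prefix as one contiguous slice.

    Assumes sorted_paths is sorted and prefix is a non-empty tuple of ints.
    """
    start = bisect_left(sorted_paths, prefix)
    # Smallest tuple that sorts after every descendant of prefix:
    # same parent path, last component increased by one.
    upper = prefix[:-1] + (prefix[-1] + 1,)
    end = bisect_left(sorted_paths, upper)
    return sorted_paths[start:end]


paths = sorted([(1,), (1, 2), (1, 2, 3), (1, 3), (2,), (2, 1)])
below = subtree_slice(paths, (1, 2))
```

Because tuples compare component by component, all descendants of a node sort directly after it, which is what lets a single range read return a whole sub-tree.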

Re: Arbitrary nested tree hierarchy data model

2015-03-27 Thread Robert Wille
Okay, this is going to be a pretty long post, but I think it's an interesting data model, and hopefully someone will find it worth going through. First, I think it will be easier to understand the modeling choices I made if you see the end product. Go to