Hi All,
Can we use sstableloader for loading an external flat file or CSV file?
If yes, kindly share the steps or a manual.
I need to load 40 million rows into a table with around 70 columns.
Regards:
Rahul Bhardwaj
Hi All,
This is possible with cassandra-driver-core 2.1.5, using
row.getLong("sum").
Thanks
On Fri, Mar 27, 2015 at 2:51 PM, Amila Paranawithana amila1...@gmail.com
wrote:
In Apache Cassandra Java Driver 2.1, how can I read counter-type values from a
row when iterating over a result set?
eg: If I
I would recommend you utilise Cassandra's vnodes config and let it manage this
itself.
This means it will create and manage them all on its own, and allows
quick and easy scaling and bootstrapping.
From: Björn Hachmann
bjoern.hachm...@metrigo.de
Rob, the cluster is now upgraded to Cassandra 1.0.12 (default hd version,
per Descriptor.java), and I ensured all sstables in the current cluster were
hd version before upgrading to Cassandra 1.1. I have also checked that in
Cassandra 1.1.12 the sstables are version hf. So I guess
nodetool upgradesstables
Hi,
we currently plan to add a second data center to our Cassandra cluster. I
have read about this procedure in the documentation (e.g.
https://www.datastax.com/documentation/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html),
but at least one question remains:
Do I have to provide
I'm running Cassandra locally and I see that the execution time for the
simplest queries is 1-2 milliseconds. By a simple query I mean either
INSERT or SELECT from a small table with short keys.
While this number is not high, it's about 10-20 times slower than
PostgreSQL (even if INSERTs are
Hi Robert,
We're trying to do something similar to the OP and finding it a bit
difficult. Would it be possible to provide more details about how you're
doing it?
Thanks.
On Fri, Mar 27, 2015 at 3:15 AM, Robert Wille rwi...@fold3.com wrote:
I have a cluster which stores tree structures. I keep
in Apache Cassandra Java Driver 2.1 how to read counter type values from a
row when iterating over result set.
eg: If I have a counter table called 'countertable' with key and a counter
colum 'sum' how can I read the value of the counter column using Java
driver?
If I say, row.getInt(sum) this
Hi,
This post [1] may be useful, but note that it was done with an older
Cassandra version, so there may be a newer way to do this.
[1].
http://amilaparanawithana.blogspot.com/2012/06/bulk-loading-external-data-to-cassandra.html
Thanks,
On Fri, Mar 27, 2015 at 11:40 AM, Rahul Bhardwaj
http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__num_tokens
So go with a default 256, and leave initial token empty:
num_tokens: 256
# initial_token:
Cassandra will always give each node the same number of
Hi All,
We are using Cassandra version 2.1.2 with cqlsh 5.0.1 (a cluster of three
nodes with RF 2).
I need to load around 40 million records into a Cassandra table. I
have created batches of 1 million records (a batch of 1 record also gives the
same error) in CSV format. When I use the COPY command
2015-03-27 11:58 GMT+01:00 Sibbald, Charles charles.sibb...@bskyb.com:
Cassandra’s Vnodes config
Thank you. Yes, we are using vnodes! The num_tokens parameter controls the
number of vnodes assigned to a specific node.
Maybe I am seeing problems where there are none.
Let me rephrase my
Would it help here to not actually issue a delete statement but instead use
date based compaction and a dynamically calculated ttl that is some safe
distance in the future from your key?
I’m not sure about this part *date based compaction*; do you mean
DateTieredCompactionStrategy?
Anyway
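As a sketch of the "dynamically calculated ttl" idea above: compute, at write time, how long each row must survive and bind that into `INSERT ... USING TTL ?`. The function name and the one-week safety margin below are assumptions for illustration, not from the thread:

```python
from datetime import datetime, timezone

# Assumed slack so rows never expire before the app stops needing them --
# the "safe distance in the future from your key" mentioned in the thread.
SAFETY_MARGIN_S = 7 * 24 * 3600

def dynamic_ttl(expiry, now=None):
    """Seconds from `now` until `expiry`, plus the safety margin; this is
    the value you would bind into INSERT ... USING TTL ? instead of
    issuing a DELETE later."""
    now = now or datetime.now(timezone.utc)
    return max(0, int((expiry - now).total_seconds())) + SAFETY_MARGIN_S
```

With TTL-expired rows and a time-based compaction strategy, whole sstables can be dropped at once instead of carrying tombstones from explicit deletes.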
On 3/26/15 10:15 PM, Robert Wille wrote:
I have a cluster which stores tree structures. I keep several hundred unrelated
trees. The largest has about 180 million nodes, and the smallest has 1 node.
The largest fanout is almost 400K. Depth is arbitrary, but in practice is
probably less than
I'd be interested to see that data model. I think the entire list would
benefit!
On Thu, Mar 26, 2015 at 8:16 PM Robert Wille rwi...@fold3.com wrote:
I have a cluster which stores tree structures. I keep several hundred
unrelated trees. The largest has about 180 million nodes, and the smallest
Running upgrade is a noop if the tables don't need to be upgraded. I
consider the cost of this to be less than the cost of missing an upgrade.
On Thu, Mar 26, 2015 at 4:23 PM Robert Coli rc...@eventbrite.com wrote:
On Wed, Mar 25, 2015 at 7:16 PM, Jonathan Haddad j...@jonhaddad.com
wrote:
Yeah that's the one :) sorry, was on my phone and didn't want to look up
the exact name.
Cheers,
Thunder
On Mar 27, 2015 6:17 AM, Brice Dutheil brice.duth...@gmail.com wrote:
Would it help here to not actually issue a delete statement but instead
use date based compaction and a dynamically
Just to check, are you concerned about minimizing that latency or
maximizing throughput?
I'll assume that latency is what you're actually concerned about. A fair amount
of that latency is probably happening in the Python driver. Although it
can easily execute ~8k operations per second (using CPython),
Yes, I'm concerned about the latency. Throughput can be high even when
using Python: http://datastax.github.io/python-driver/performance.html.
But in my scenarios I need to run queries sequentially, so latencies
matter. And Cassandra requires issuing more queries than SQL databases do,
so these
I think that in your example Postgres spends most of its time waiting for
fsync() to complete. On Linux, with a battery-backed RAID controller,
it's safe to mount an ext4 filesystem with the barrier=0 option, which
improves fsync() performance a lot. I have partitions mounted with this
option and I did a
Since you're executing queries sequentially, you may want to look into
using callback chaining to avoid the cross-thread signaling that results in
the 1ms latencies. Basically, just use session.execute_async() and attach
a callback to the returned future that will execute your next query. The
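The callback-chaining pattern described above can be sketched as follows. The StubSession/StubFuture classes are stand-ins so the example runs without a live cluster; with the real cassandra-driver you would use Session.execute_async(), whose returned ResponseFuture exposes the same add_callback() shape:

```python
class StubFuture:
    """Stand-in for the driver's ResponseFuture: fires the callback with
    the result rows (here, immediately and on the caller's thread)."""
    def __init__(self, rows):
        self._rows = rows

    def add_callback(self, fn):
        fn(self._rows)

class StubSession:
    """Stand-in for cassandra.cluster.Session."""
    def execute_async(self, query):
        return StubFuture([query])  # pretend each query returns one row

results = []

def run_chain(session, queries):
    """Run queries sequentially by issuing the next one from inside the
    previous query's callback, instead of blocking on each result."""
    def on_rows(rows):
        results.extend(rows)
        if queries:
            session.execute_async(queries.pop(0)).add_callback(on_rows)
    session.execute_async(queries.pop(0)).add_callback(on_rows)

run_chain(StubSession(), ["SELECT 1", "SELECT 2", "SELECT 3"])
```

With the real driver the callback fires on an event-loop thread, so each follow-up query is issued without the roughly 1 ms cross-thread wakeup a blocking execute() pays per request.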
Latency can be so variable even when testing things locally. I quickly
fired up postgres and did the following with psql:
ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i));
CREATE TABLE
ben=# \timing
Timing is on.
ben=# INSERT INTO foo VALUES(2, 'yay');
INSERT 0 1
Time: 1.162 ms
ben=# INSERT
Hi,
I have run the source of Cassandra in Eclipse Juno by following this
document:
http://brianoneill.blogspot.in/2015/03/getting-started-with-cassandra.html.
But I'm getting exceptions. Please help me solve this.
INFO 17:43:40 Node localhost/127.0.0.1 state jump to normal
INFO 17:43:41 Netty
I use callback chaining with the python driver and can confirm that it is
very fast.
You can chain the chains together to perform sequential processing. I do
this when retrieving metadata and then the referenced payload, for
example, when the metadata has been inverted and the payload is larger
+1 would love to see how you do it
On 27 March 2015 at 07:18, Jonathan Haddad j...@jonhaddad.com wrote:
I'd be interested to see that data model. I think the entire list would
benefit!
On Thu, Mar 26, 2015 at 8:16 PM Robert Wille rwi...@fold3.com wrote:
I have a cluster which stores tree
Hmmm... If you serialize the tree properly in a partition, you could always
read an entire sub-tree as a single slice (consecutive CQL rows.) Is there
much more to it?
-- Jack Krupansky
On Fri, Mar 27, 2015 at 7:35 PM, Ben Bromhead b...@instaclustr.com wrote:
+1 would love to see how you do it
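A minimal pure-Python sketch of the single-slice idea above, assuming a hypothetical materialized-path clustering column (not the OP's actual schema): when rows are clustered by a path string, every descendant of a node sorts directly after it, so a subtree is one contiguous range of rows.

```python
import bisect

# Hypothetical layout (an assumption, not the OP's schema): one CQL row per
# tree node, clustered by a path string, so Cassandra keeps each partition's
# rows in path-sorted order.
rows = sorted([
    ("r", "root"),
    ("r.a", "child a"),
    ("r.a.x", "grandchild x"),
    ("r.a.y", "grandchild y"),
    ("r.b", "child b"),
])
paths = [p for p, _ in rows]

def subtree_slice(path):
    """Descendants of `path` all start with path + '.', so they form one
    contiguous run right after it -- the same shape as a single CQL range
    query on the clustering column. '/' is the character after '.', so
    path + '/' is an exclusive upper bound that skips siblings like 'r.ab'."""
    lo = bisect.bisect_left(paths, path)
    hi = bisect.bisect_left(paths, path + "/")
    return rows[lo:hi]
```

Reading the whole tree or any subtree is then one slice, which maps to a single sequential read of consecutive CQL rows on the server.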
Okay, this is going to be a pretty long post, but I think it's an interesting
data model, and hopefully someone will find it worth going through.
First, I think it will be easier to understand the modeling choices I made if
you see the end product. Go to