Re: Filter data on row key in Cassandra Hadoop's Random Partitioner

2012-12-12 Thread Шамим
You can use Apache Pig to load the data and filter it by row key; filtering in Pig is very fast. Regards Shamim 11.12.2012, 20:46, Ayush V. ayushv...@gmail.com: I'm working on Cassandra Hadoop integration (MapReduce). We have used the Random Partitioner to insert data to gain faster writes. Now we have

RE: cassandra vs couchbase benchmark

2012-12-12 Thread Viktor Jevdokimov
Pure marketing comparing apples to oranges. Was Cassandra usage optimized? - What consistency level was used? (fastest reads with ONE) - Was the Cassandra client token aware? (sends each request to the appropriate node) - Was dynamic snitch turned off? (prevents forwarding a request to another replica if

Re: Batch mutation streaming

2012-12-12 Thread Ben Hood
Hey Aaron, That sounds sensible - thanks for the heads up. Cheers, Ben On Dec 10, 2012, at 0:47, aaron morton aa...@thelastpickle.com wrote: (and if the message is being decoded on the server side as a complete message, then presumably the same resident memory consumption applies there

Re: Why Secondary indexes is so slowly by my test?

2012-12-12 Thread Hiller, Dean
You could always try PlayOrm's query capability on top of Cassandra ;) ... it works for us. Dean From: Chengying Fang cyf...@ngnsoft.com Reply-To: user@cassandra.apache.org Date:

Re: Vnode migration path

2012-12-12 Thread Eric Evans
On Tue, Dec 11, 2012 at 4:28 PM, Michael Kjellman mkjell...@barracuda.com wrote: Awesome (and very welcomed news), what kind of failure conditions can we expect if a node goes down during the migration? A shuffle is just a bunch of moves mapped out ahead of time, and worked through by each node

Null Error Running pig_cassandra

2012-12-12 Thread James Schappet
When trying to run the example-script.pig, I get the following error, null error. tsunami:pig schappetj$ bin/pig_cassandra -x local example-script.pig Using /Library/pig-0.10.0/pig-0.10.0.jar. 2012-12-12 11:02:54,079 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0 (r1328203)

Re: Cassandra on EC2 - describe_ring() is giving private IPs

2012-12-12 Thread santi kumar
When I configured rpc_address with the public IP, Cassandra would not start up. It throws 'unable to create thrift socket' on the public IP. When I changed it to the private IP, it was fine. java.lang.RuntimeException: Unable to create thrift socket to / 107.21.80.94:9160 at

Re: cassandra vs couchbase benchmark

2012-12-12 Thread Radim Kolar
If the dataset fits into memory, and the data used in the test almost fits into memory, then Cassandra is slow compared to other leading NoSQL databases; the ratio can go up to 10:1. Check the Infinispan benchmarks. A common usage pattern is to put memcached on top of Cassandra. Cassandra is good if you have way

Re: Cassandra on EC2 - describe_ring() is giving private IPs

2012-12-12 Thread santi kumar
Yes, that worked. Thanks for the pointer. Once broadcast_address points to the public IP, the endpoints come back with the public IP, so Hector's NodeAutoDiscoveryService matches them with the existing host and does not treat it as a new node. On Wed, Dec 12, 2012 at 11:10 PM, Andrey Ilinykh

Re: bug with cqlsh for foreign charater

2012-12-12 Thread aaron morton
Can you please put together a test case using CQL 3 to write and read the data and create a ticket at https://issues.apache.org/jira/browse/CASSANDRA ? Thanks Aaron - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On

Re: Multiple Data Center shows very uneven load

2012-12-12 Thread aaron morton
c:\SERVERS\apache-cassandra-1.1.6\bin>nodetool -h 11.111.111.1 ring Starting NodeTool Address DC Rack Status State Load Effective-Ownership Token Token(bytes[6c03]) 11.111.111.1 VA SVA Up Normal 1.44 GB 33.33%

Re: Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)

2012-12-12 Thread aaron morton
sliceRangeQuery.setRange(Character.MIN_VALUE, Character.MAX_VALUE, false, Integer.MAX_VALUE); Try selecting a smaller number of rows. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com - Aaron Morton

Re: Multiple Data Center shows very uneven load

2012-12-12 Thread Sergey Olefir
I do have a (good?) reason for ByteOrderedPartitioner - I need to be able to do range queries. At the same time I'm aware of the need to balance the cluster - so I'm hashing my keys and prefixing them with Latin letters from a to p (16 in total) - hence the tokens I'm using. Based on my

Re: Multiple Data Center shows very uneven load

2012-12-12 Thread Sergey Olefir
Nick Bailey-2 wrote: Dropping a keyspace causes a snapshot to be taken of the keyspace before it is removed from the schema. So it won't actually delete any data. You can manually delete the data from /var/lib/cassandra/<ks>/<cf[s]>/snapshots Indeed, it looks like snapshot is on the file

Re: Null Error Running pig_cassandra

2012-12-12 Thread aaron morton
There are about 3 checks that should have caught the null. at org.apache.cassandra.hadoop.ConfigHelper.getInputSlicePredicate(ConfigHelper.java:176) This line does not match the source code for the 1.2.0-beta3 tag. Can you try it with the 1.1.7 bin distro? Cheers -

Re: upgrade from 0.8.5 to 1.1.6, now it cannot find schema

2012-12-12 Thread aaron morton
in-vm cassandra Embedded? The location of the SSTables has changed in 1.1; they are now in /var/lib/cassandra/data/KS_NAME/CF_NAME/SSTable.data Is the data in the right place? Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton

Re: Multiple Data Center shows very uneven load

2012-12-12 Thread aaron morton
Try nodetool drain. It will flush everything to disk and the commit log will be truncated. HH can be ignored. If you really want them gone they can be purged using the JMX interface, or you can stop the node and delete the sstables. Cheers - Aaron Morton Freelance Cassandra

Datastax C*ollege Credit Webinar Series : Create your first Java App w/ Cassandra

2012-12-12 Thread Brian O'Neill
FWIW -- I'm presenting tomorrow for the Datastax C*ollege Credit Webinar Series: http://brianoneill.blogspot.com/2012/12/presenting-for-datastax-college-credit.html I hope to make CQL part of the presentation and show how it integrates with the Java APIs. If you are interested, drop in. -brian

Re: Why Secondary indexes is so slowly by my test?

2012-12-12 Thread Chengying Fang
You are right, Dean. It's due to the heavy result returned by the query, not the index itself. According to my test, if the result has fewer than 5000 rows, it's very quick. But how do I limit the result? A row limit seems a good choice. But if I do so, some rows I want may be missed because the row order

Re: Why Secondary indexes is so slowly by my test?

2012-12-12 Thread aaron morton
The IndexClause for the get_indexed_slices takes a start key. You can page the results from your secondary index query by making multiple calls with a sane count and including a start key. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton
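
A rough sketch of the paging loop Aaron describes, in Python. It does not use a real Cassandra client: `fetch_page(start_key, count)` is a hypothetical stand-in for a get_indexed_slices call, and `make_fake_fetch` is an in-memory fake that returns rows in sorted key order purely for illustration. It assumes the start key is inclusive, so every page after the first repeats the last row of the previous page and that row is dropped.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Tuple[str, Dict[str, str]]  # (row key, columns)


def page_rows(fetch_page: Callable[[str, int], List[Row]],
              page_size: int = 100) -> Iterator[Row]:
    """Yield every row by calling fetch_page repeatedly with a start key.

    fetch_page is hypothetical: it must return up to `count` rows whose
    keys are at or after `start_key` (inclusive), in a stable order.
    """
    if page_size < 2:
        # With an inclusive start key, a page of 1 would only ever
        # return the row we already have.
        raise ValueError("page_size must be at least 2")
    start_key = ""      # empty key means "start from the beginning"
    first_page = True
    while True:
        fetched = fetch_page(start_key, page_size)
        rows = fetched
        # Drop the overlapping row carried over from the previous page.
        if not first_page and rows and rows[0][0] == start_key:
            rows = rows[1:]
        if not rows:
            return
        yield from rows
        if len(fetched) < page_size:
            return      # short page: nothing left to read
        start_key = rows[-1][0]
        first_page = False


def make_fake_fetch(all_rows: List[Row]) -> Callable[[str, int], List[Row]]:
    """Build an in-memory fake of fetch_page over rows sorted by key."""
    ordered = sorted(all_rows, key=lambda r: r[0])

    def fetch_page(start_key: str, count: int) -> List[Row]:
        return [r for r in ordered if r[0] >= start_key][:count]

    return fetch_page


if __name__ == "__main__":
    data = [("k%02d" % i, {"n": str(i)}) for i in range(25)]
    keys = [k for k, _ in page_rows(make_fake_fetch(data), page_size=10)]
    print(len(keys), keys[0], keys[-1])
```

Keeping the count modest (the "sane count" above) bounds how much each call has to return, which is the point of paging instead of asking for every matching row at once.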