You can use Apache Pig to load the data and filter it by row key; filtering in
Pig is very fast.
Regards
Shamim
11.12.2012, 20:46, Ayush V. ayushv...@gmail.com:
I'm working on Cassandra Hadoop integration (MapReduce). We used
RandomPartitioner to insert data for faster writes. Now we have
Pure marketing comparing apples to oranges.
Was Cassandra usage optimized?
- What consistency level was used? (fastest reads with ONE)
- Was the Cassandra client token aware? (sends each request to the appropriate node)
- Was dynamic snitch turned off? (prevents forwarding requests to another replica if
Hey Aaron,
That sounds sensible - thanks for the heads up.
Cheers,
Ben
On Dec 10, 2012, at 0:47, aaron morton aa...@thelastpickle.com wrote:
(and if the message is being decoded on the server side as a complete
message, then presumably the same resident memory consumption applies there
You could always try PlayOrm's query capability on top of Cassandra ;) ... it
works for us.
Dean
From: Chengying Fang cyf...@ngnsoft.com
Reply-To: user@cassandra.apache.org
user@cassandra.apache.org
Date:
On Tue, Dec 11, 2012 at 4:28 PM, Michael Kjellman
mkjell...@barracuda.com wrote:
Awesome (and very welcome news). What kind of failure conditions can we
expect if a node goes down during the migration?
A shuffle is just a bunch of moves mapped out ahead of time, and
worked through by each node
When trying to run example-script.pig, I get the following error (a null
error).
tsunami:pig schappetj$ bin/pig_cassandra -x local example-script.pig
Using /Library/pig-0.10.0/pig-0.10.0.jar.
2012-12-12 11:02:54,079 [main] INFO org.apache.pig.Main - Apache Pig
version 0.10.0 (r1328203)
When I configured rpc_address with the public IP, Cassandra would not start.
It throws 'unable to create thrift socket' on the public IP. When I changed
it to the private IP, it was fine.
java.lang.RuntimeException: Unable to create thrift socket to /107.21.80.94:9160
at
If the dataset fits into memory (and the data used in the test almost fits
into memory), then Cassandra is slow compared to other leading NoSQL
databases; the ratio can reach 10:1. Check the Infinispan benchmarks. A common
usage pattern is to put memcached on top of Cassandra.
Cassandra is good if you have way
Yes, that worked. Thanks for the pointer. Once broadcast_address is
pointed to the public IP, the endpoints come back with the public IP, so
Hector's NodeAutoDiscoveryService matches them with the existing host and does
not treat it as a new node.
On Wed, Dec 12, 2012 at 11:10 PM, Andrey Ilinykh
Can you please put together a test case using CQL 3 to write and read the data
and create a ticket at https://issues.apache.org/jira/browse/CASSANDRA ?
Thanks
Aaron
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
On
c:\SERVERS\apache-cassandra-1.1.6\bin>nodetool -h 11.111.111.1 ring
Starting NodeTool
Address       DC  Rack  Status  State   Load     Effective-Ownership  Token
                                                                      Token(bytes[6c03])
11.111.111.1  VA  SVA   Up      Normal  1.44 GB  33.33%
sliceRangeQuery.setRange(Character.MIN_VALUE, Character.MAX_VALUE, false,
Integer.MAX_VALUE);
Try selecting a smaller number of rows.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
I do have a (good?) reason for ByteOrderedPartitioner - I need to be able to
do range queries. At the same time I'm aware of the need to balance the
cluster - so I'm hashing my keys and prefixing them with latin letters from
a to p (16 in total) - hence the tokens I'm using.
Based on my
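For readers following the key scheme described above, here is a minimal sketch of one way to read it. It assumes the prefix letter is derived from a hash of the original key and that the original key is kept after the prefix; the poster's actual scheme may differ. The names `PREFIXES`, `prefixed_key` and `bucket_ranges` are made up for illustration.

```python
import hashlib

# The 16 Latin letters a..p used as bucket prefixes (taken from the post).
PREFIXES = "abcdefghijklmnop"


def prefixed_key(raw_key: str) -> str:
    """Return raw_key with a one-letter bucket prefix chosen from its hash.

    Assumption: the prefix comes from the first byte of an MD5 digest and the
    original key is preserved after it. This is only one possible reading.
    """
    digest = hashlib.md5(raw_key.encode("utf-8")).digest()
    bucket = digest[0] % len(PREFIXES)  # value in 0..15
    return PREFIXES[bucket] + raw_key


def bucket_ranges(start: str, end: str):
    """Return the 16 (start, end) key ranges that together cover one logical range.

    Keys are spread over 16 prefixes, so a logical range query has to be
    issued once per prefix and the results merged by the caller.
    """
    return [(p + start, p + end) for p in PREFIXES]
```

The trade-off under this reading is that writes spread across 16 key regions, while each logical range query becomes 16 physical range queries.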
Nick Bailey-2 wrote
Dropping a keyspace causes a snapshot to be taken of the keyspace before it
is removed from the schema. So it won't actually delete any data. You can
manually delete the data from /var/lib/cassandra/<ks>/<cf[s]>/snapshots
Indeed, it looks like snapshot is on the file
There are about 3 checks that should have caught the null.
at org.apache.cassandra.hadoop.ConfigHelper.getInputSlicePredicate(ConfigHelper.java:176)
This line does not match the source code for the 1.2.0-beta3 tag.
Can you try it with the 1.1.7 bin distro ?
Cheers
-
in-vm cassandra
Embedded ?
The location of the SSTables has changed in 1.1; they are now in
/var/lib/cassandra/data/KS_NAME/CF_NAME/SSTable.data. Is the data in the right
place?
Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
Try nodetool drain. It will flush everything to disk and the commit log will be
truncated.
HH can be ignored. If you really want them gone they can be purged using the
JMX interface, or you can stop the node and delete the sstables.
Cheers
-
Aaron Morton
Freelance Cassandra
FWIW --
I'm presenting tomorrow for the Datastax C*ollege Credit Webinar Series:
http://brianoneill.blogspot.com/2012/12/presenting-for-datastax-college-credit.html
I hope to make CQL part of the presentation and show how it integrates
with the Java APIs.
If you are interested, drop in.
-brian
You are right, Dean. It's due to the heavy result returned by the query, not
the index itself. According to my test, if the result has fewer than 5000 rows,
it's very quick. But how to limit the result? A row limit seems like a good
choice. But if I do so, some rows I want may be missed because the row order
The IndexClause for the get_indexed_slices takes a start key. You can page the
results from your secondary index query by making multiple calls with a sane
count and including a start key.
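
The paging approach described above can be sketched roughly as follows. This is not a real client API: `fetch_page(start_key, count)` is a hypothetical stand-in for whatever call your client uses to run the indexed query, and the sketch assumes it returns a list of `(key, columns)` tuples in the server's order. Because the start key may be returned again as the first row of the next page, that row is skipped when it matches.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Tuple[str, Dict]


def page_rows(fetch_page: Callable[[str, int], List[Row]],
              page_size: int = 100,
              start_key: str = "") -> Iterator[Row]:
    """Yield all rows by making repeated calls with a bounded count.

    fetch_page is a hypothetical wrapper around the real query call.
    """
    if page_size < 2:
        # With a page size of 1 and an inclusive start key we would never advance.
        raise ValueError("page_size must be at least 2")
    last_key = None
    while True:
        rows = fetch_page(start_key, page_size)
        # Skip the first row if it repeats the last row of the previous page.
        fresh = rows[1:] if rows and rows[0][0] == last_key else rows
        for row in fresh:
            yield row
        if len(rows) < page_size:
            return  # a short page means there is nothing left
        start_key = last_key = rows[-1][0]
```

Each call stays small, so no single request has to build the whole result set in memory.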
Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton