Re: RE: Hector samples -- where?

2010-05-26 Thread Ran Tavory
it's here http://github.com/rantav/hector/blob/master/src/test/java/me/prettyprint/cassandra/service/KeyspaceTest.java On Wed, May 26, 2010 at 8:18 AM, Nicholas Sun nick@raytheon.com wrote: Could you please provide some indication as to their location? Thanks. Nick *From:* Ran

batch mutation : how to delete whole row?

2010-05-26 Thread gabriele renzi
Hi everyone, in our test code we perform a dummy clear by reading all the rows and deleting them (while waiting for cassandra 0.7 CASSANDRA-531). A couple of days ago I updated our code to perform this operation using batchMutate, but there seems to be no way to perform a deletion of the whole

Re: batch mutation : how to delete whole row?

2010-05-26 Thread Sylvain Lebresne
This has been fixed in 0.7 (https://issues.apache.org/jira/browse/CASSANDRA-1027). Not sure this has been merged in 0.6 though. On Wed, May 26, 2010 at 9:05 AM, gabriele renzi rff@gmail.com wrote: Hi everyone, in our test code we perform a dummy clear by reading all the rows and deleting

Re: batch mutation : how to delete whole row?

2010-05-26 Thread Mishail
You could either use one remove(keyspace, key, column_path, timestamp, consistency_level) call per key, or wait until https://issues.apache.org/jira/browse/CASSANDRA-494 is fixed (to use SliceRange in the Deletion) gabriele renzi wrote: Is it correct that I cannot perform a row delete via

Re: batch mutation : how to delete whole row?

2010-05-26 Thread gabriele renzi
On Wed, May 26, 2010 at 9:54 AM, Mishail mishail.mish...@gmail.com wrote: You could either use 1 remove(keyspace, key, column_path, timestamp, consistency_level) per each key, or wait till https://issues.apache.org/jira/browse/CASSANDRA-494 fixed (to use SliceRange in the Deletion) thanks,

Re: Order Preserving Partitioner

2010-05-26 Thread David Boxenhorn
Just in case you don't know: You can do range searches on keys even with Random Partitioner, you just won't get the results in order. If this is good enough for you (e.g. if you can order the results on the client, or if you just need to get the right answer, but not the right order), then you
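
The "order the results on the client" option mentioned above can be sketched in a few lines of Java. This is only an illustration: ClientSideOrdering and sortKeys are hypothetical names, not part of Cassandra or Hector, and the input list stands in for whatever keys a range query returned in partitioner order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Cassandra or Hector.
final class ClientSideOrdering {
    // Returns a sorted copy of the row keys, leaving the caller's list untouched.
    static List<String> sortKeys(List<String> unorderedKeys) {
        List<String> sorted = new ArrayList<>(unorderedKeys);
        Collections.sort(sorted); // natural (lexicographic) String order
        return sorted;
    }

    public static void main(String[] args) {
        // Stand-in for keys returned by the cluster in arbitrary order.
        List<String> fromCluster = Arrays.asList("user:42", "user:07", "user:19");
        System.out.println(sortKeys(fromCluster));
    }
}
```

Sorting on the client only restores order within the rows already fetched; it does not change which rows the cluster returns.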

Re: Questions regarding batch mutates and transactions

2010-05-26 Thread Ran Tavory
The summary of your question is: is batch_mutate atomic in the general sense, meaning when used with multiple keys, multiple column families etc, correct? On Wed, May 26, 2010 at 12:45 PM, Todd Nine t...@spidertracks.co.nz wrote: Hey guys, I originally asked this on the Hector group, but no

RE: Moving/copying columns in between ColumnFamilies

2010-05-26 Thread Dop Sun
There is no single API call to achieve this. It’s read and write API calls, plus a delete (for a move), I guess. From: Utku Can Topçu [mailto:u...@topcu.gen.tr] Sent: Wednesday, May 26, 2010 9:09 PM To: user@cassandra.apache.org Subject: Moving/copying columns in between ColumnFamilies
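
The read, write, then delete sequence described above might look like the following Java sketch. It is an assumption-laden model, not client code: each column family is represented as an in-memory map (row key to column name to value), and ColumnMover, copyColumn and moveColumn are hypothetical names. A real client would issue the equivalent get, insert and remove calls.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model; a column family is rowKey -> (columnName -> value).
final class ColumnMover {
    // Copy: read the column from the source, then write it to the target.
    // Returns false if the row or column does not exist in the source.
    static boolean copyColumn(Map<String, Map<String, String>> source,
                              Map<String, Map<String, String>> target,
                              String rowKey, String column) {
        Map<String, String> row = source.get(rowKey);
        if (row == null || !row.containsKey(column)) {
            return false;
        }
        target.computeIfAbsent(rowKey, k -> new HashMap<>()).put(column, row.get(column));
        return true;
    }

    // Move: a copy followed by a delete from the source.
    // The two steps are separate operations, so this is not atomic.
    static boolean moveColumn(Map<String, Map<String, String>> source,
                              Map<String, Map<String, String>> target,
                              String rowKey, String column) {
        if (!copyColumn(source, target, rowKey, column)) {
            return false;
        }
        source.get(rowKey).remove(column);
        return true;
    }
}
```

Deleting only after the write succeeds means a failure partway through leaves a duplicate rather than losing the column.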

Re: Avro Example Code

2010-05-26 Thread Jeff Hammerbacher
I've got a mostly working Avro server and client for HBase at http://github.com/hammer/hbase-trunk-with-avro and http://github.com/hammer/pyhbase. If you replace scan with slice, it shouldn't be too much different for Cassandra... On Mon, May 17, 2010 at 10:31 AM, Wellman, David da...@tynt.com

Re: Avro Example Code

2010-05-26 Thread David Wellman
Fantastic! Thank you. On May 26, 2010, at 8:38 AM, Jeff Hammerbacher wrote: I've got a mostly working Avro server and client for HBase at http://github.com/hammer/hbase-trunk-with-avro and http://github.com/hammer/pyhbase. If you replace scan with slice, it shouldn't be too much different

using more than 50% of disk space

2010-05-26 Thread Sean Bridges
We're investigating Cassandra, and we are looking for a way to get Cassandra to use more than 50% of its data disks. Is this possible? For major compactions, it looks like we can use more than 50% of the disk if we use multiple similarly sized column families. If we had 10 column families of the

Two threads inserting columns into same key followed by read gets unexpected results

2010-05-26 Thread Scott McCarty
Hi, I'm seeing a problem with inserting columns into one key using multiple threads and I'm not sure if it's a bug or if it's my misunderstanding of how insert/get_slice should work. My setup is that I have two separate client processes, each with a single thread, writing concurrently to

Re: Order Preserving Partitioner

2010-05-26 Thread Peter Hsu
Correct me if I'm wrong here. Even though you can get your results with Random Partitioner, it's a lot less efficient if you're going across different machines to get your results. If you're doing a lot of range queries, it makes sense to have things ordered sequentially so that if you do

Re: using more than 50% of disk space

2010-05-26 Thread Sean Bridges
So after CASSANDRA-579, anti compaction won't be done on the source node, and we can use more than 50% of the disk space if we use multiple column families? Thanks, Sean On Wed, May 26, 2010 at 10:01 AM, Stu Hood stu.h...@rackspace.com wrote: See

Subscribe

2010-05-26 Thread Nazario Parsacala
Sent from my iPhone

Doing joins between column families

2010-05-26 Thread Dodong Juan
So I am not sure if you guys are familiar with OCM. Basically it is an ORM for Cassandra. Been testing it. So I have created a model that has the following object relationship. OCM generates the code from this that allows me to do easy programmatic query from Java to Cassandra.

Re: nodetool move looks stuck

2010-05-26 Thread Jonathan Ellis
Are there any exceptions in the log like the one in https://issues.apache.org/jira/browse/CASSANDRA-1019 ? If so you'll need to restart the moving node and try again. On Wed, May 26, 2010 at 3:54 AM, Ran Tavory ran...@gmail.com wrote: I ran nodetool move on one of the nodes and it seems stuck

Re: Order Preserving Partitioner

2010-05-26 Thread Jonathan Shook
I don't think that queries on a key range are valid unless you are using OPP. As far as hashing the key for OPP goes, I take it to be the same as not using OPP. It's really a matter of where it gets done, but it has much the same effect. (I think) Jonathan On Wed, May 26, 2010 at 12:51 PM, Peter

Re: Doing joins between column families

2010-05-26 Thread Charlie Mason
On Wed, May 26, 2010 at 7:45 PM, Dodong Juan dodongj...@gmail.com wrote: So I am not sure if you guys are familiar with OCM . Basically it is an ORM for Cassandra. Been testing it In case anyone is interested I have posted a reply on the OCM issue tracker where this was also raised.

Re: Error reporting Key cache hit rate with cfstats or with JMX

2010-05-26 Thread Ran Tavory
If I disable row cache the numbers look good - key cache hit rate is 0, so it seems to be related to row cache. Interestingly, after running for a really long time and with both row and keys caches I do start to see Key cache hit rate 0 but the numbers are so small that it doesn't make sense. I

Re: Doing joins between column families

2010-05-26 Thread Jonathan Shook
I wrote some Iterable* methods to do this for column families that share key structure with OPP. It is on the hector examples page. Caveat emptor. It does iterative chunking of the working set for each column family, so that you can set the nominal transfer size when you construct the

Best Timestamp?

2010-05-26 Thread Steven Haar
What is the best timestamp to use while using Cassandra with C#? I have been using DateTime.Now.Ticks, but I have seen others using different things. Thanks.

Re: Best Timestamp?

2010-05-26 Thread Mark Robson
On 26 May 2010 22:42, Steven Haar sh...@vintagesoftware.com wrote: What is the best timestamp to use while using Cassandra with C#? I have been using DateTime.Now.Ticks, but I have seen others using different things. The standard that most clients seem to use is epoch-microseconds, or

Re: Best Timestamp?

2010-05-26 Thread Miguel Verde
Right, in C# this would be (not the most efficient way, but you get the idea): long timestamp = (DateTime.Now.Ticks - new DateTime(1970, 1, 1).Ticks)/10; On Wed, May 26, 2010 at 4:50 PM, Mark Robson mar...@gmail.com wrote: On 26 May 2010 22:42, Steven Haar sh...@vintagesoftware.com wrote:

Re: Best Timestamp?

2010-05-26 Thread Mark Robson
On 26 May 2010 22:56, Miguel Verde miguelitov...@gmail.com wrote: Right, in C# this would be (not the most efficient way, but you get the idea): long timestamp = (DateTime.Now.Ticks - new DateTime(1970, 1, 1).Ticks)/10; Yeah, you're fine provided: a) All your client applications (which
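
For comparison, the epoch-microseconds arithmetic from this thread can be sketched in Java. EpochMicros and its method names are hypothetical. The constant is the .NET tick count at 1970-01-01T00:00:00, which is what new DateTime(1970, 1, 1).Ticks evaluates to. The sketch assumes the ticks come from a UTC clock (DateTime.UtcNow rather than DateTime.Now), since local time would shift the result by the zone offset.

```java
// Hypothetical helper showing the tick-to-microsecond conversion discussed above.
final class EpochMicros {
    // .NET ticks are 100 ns units counted from 0001-01-01T00:00:00.
    // This is the tick count at the Unix epoch, 1970-01-01T00:00:00.
    static final long TICKS_AT_UNIX_EPOCH = 621_355_968_000_000_000L;

    // Converts a .NET tick count (assumed to be UTC) to microseconds since the Unix epoch.
    static long fromDotNetTicks(long utcTicks) {
        return (utcTicks - TICKS_AT_UNIX_EPOCH) / 10; // 10 ticks per microsecond
    }

    // Current time in epoch microseconds, at millisecond resolution.
    static long now() {
        return System.currentTimeMillis() * 1000L;
    }

    public static void main(String[] args) {
        System.out.println(now());
    }
}
```

Whatever unit is chosen, every client writing to the same cluster needs to use the same one, because the timestamps are compared with each other to decide which write wins.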

Re: Error reporting Key cache hit rate with cfstats or with JMX

2010-05-26 Thread Jonathan Ellis
It sure sounds like you're seeing the "my row cache contains the entire hot data set, so the key cache only gets the cold reads" effect. On Wed, May 26, 2010 at 2:54 PM, Ran Tavory ran...@gmail.com wrote: If I disable row cache the numbers look good - key cache hit rate is 0, so it seems to be

Thoughts on adding complex queries to Cassandra

2010-05-26 Thread Jeremy Davis
Are there any thoughts on adding a more complex query to Cassandra? At a high level what I'm wondering is: Would it be possible/desirable/in keeping with the Cassandra plan, to add something like a javascript blob on to a get range slice etc, that does some further filtering on the results before

RE: Thoughts on adding complex queries to Cassandra

2010-05-26 Thread Nicholas Sun
I'm very curious about this topic as well. Mainly, I'd like to know whether this functionality is handled through Hadoop Map/Reduce operations. Nick From: Jeremy Davis [mailto:jerdavis.cassan...@gmail.com] Sent: Wednesday, May 26, 2010 3:31 PM To: user@cassandra.apache.org Subject: Thoughts on

Cassandra's 2GB row limit and indexing

2010-05-26 Thread Richard West
Hi all, I'm currently looking at new database options for a URL shortener in order to scale well with increased traffic as we add new features. Cassandra seems to be a good fit for many of our requirements, but I'm struggling a bit to find ways of designing certain indexes in Cassandra due to its

Re: Cassandra's 2GB row limit and indexing

2010-05-26 Thread Jonathan Shook
The example is a little confusing. .. but .. 1) sharding You can square the capacity by having a 2-level map. CF1-row-value-CF2-row-value This means finding some natural subgrouping or hash that provides a good distribution. 2) hashing You can also use some additional key hashing to spread the
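
One way to read the hashing suggestion above is to split a large logical row into a fixed number of bucket rows, picking the bucket from a hash of the item being indexed. The Java sketch below is an assumption about what that could look like, not the poster's code: BucketedRowKey, rowKeyFor and the "index:bucket" key layout are all hypothetical.

```java
// Hypothetical helper: spreads one logical index row across several physical rows.
final class BucketedRowKey {
    // Builds a row key of the form "<indexName>:<bucket>", where the bucket is
    // derived from the item's hash so the same item always lands in the same row.
    static String rowKeyFor(String indexName, String itemKey, int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        // floorMod keeps the bucket in [0, buckets) even for negative hash codes.
        int bucket = Math.floorMod(itemKey.hashCode(), buckets);
        return indexName + ":" + bucket;
    }

    public static void main(String[] args) {
        System.out.println(rowKeyFor("clicks-by-url", "http://example.com/a", 16));
    }
}
```

Reading the whole index back then means querying every bucket row and merging the results on the client, which is the cost of keeping each row under the size limit.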

Re: Anyone using hadoop/MapReduce integration currently?

2010-05-26 Thread 朱蓝天
2010/5/26 Utku Can Topçu u...@topcu.gen.tr Hi Jeremy, Why are you using Cassandra versus using data stored in HDFS or HBase? - I'm thinking of using it for realtime streaming of user data. While streaming the requests, I'm also using Lucandra for indexing the data in realtime. It's a

Re: Error reporting Key cache hit rate with cfstats or with JMX

2010-05-26 Thread Ran Tavory
so the row cache contains both rows and keys and if I have large enough row cache (in particular if row cache size equals key cache size) then it's just wasteful to keep another key cache and I should eliminate the key-cache, correct? On Thu, May 27, 2010 at 1:21 AM, Jonathan Ellis