Re: Cassandra error with large connection

2010-02-02 Thread JKnight JKnight
Thank you very much, Mr Jonathan. On Mon, Feb 1, 2010 at 11:04 AM, Jonathan Ellis jbel...@gmail.com wrote: On Mon, Feb 1, 2010 at 10:03 AM, Jonathan Ellis jbel...@gmail.com wrote: I see a lot of CLOSE_WAIT TCP connections. Also, this sounds like you are not properly pooling client
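A lingering CLOSE_WAIT socket generally means the remote side has closed the connection while the local process still holds it open, which is one reason the reply points at client-side pooling. The following is only a minimal sketch of a fixed-size pool, not code from this thread: `ConnectionPool` and the `factory` callable are hypothetical names, and `factory` is assumed to return an object with a `close()` method (for a real client it would open the Thrift transport).

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool that hands out and takes back reusable connections."""

    def __init__(self, factory, size):
        # factory is a hypothetical zero-argument callable that opens one connection.
        self._idle = queue.LifoQueue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        # Block until a connection is free instead of opening a new socket per request.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection, even if the caller raised.
            self._idle.put(conn)

    def close_all(self):
        # Explicitly close every idle connection so no socket is left half-open.
        while not self._idle.empty():
            self._idle.get_nowait().close()
```

A caller would wrap each request in `with pool.connection() as conn:` and call `pool.close_all()` on shutdown. A production pool would also need to detect and replace broken connections, which this sketch leaves out.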

Re: Sample applications

2010-02-02 Thread Erik Holstad
Hi Carlos! I'm also really new to Cassandra but here are a couple of links that I found useful: http://wiki.apache.org/cassandra/ClientExamples http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model and one of the presentations like:

RE: Sample applications

2010-02-02 Thread Carlos Sanchez
Thanks Erik From: Erik Holstad [mailto:erikhols...@gmail.com] Sent: Tuesday, February 02, 2010 9:08 AM To: cassandra-user@incubator.apache.org Subject: Re: Sample applications Hi Carlos! I'm also really new to Cassandra but here are a couple of links that I found useful:

How to retrieve keys from Cassandra ?

2010-02-02 Thread Sébastien Pierre
Hi all, I would like to know how to retrieve the list of keys available for a specific column. There is the get_key_range method, but it is only available when using the OrderPreservingPartitioner -- I use a RandomPartitioner. Does this mean that when using a RandomPartitioner, you

Re: Best design in Cassandra

2010-02-02 Thread Erik Holstad
On Mon, Feb 1, 2010 at 3:31 PM, Brandon Williams dri...@gmail.com wrote: On Mon, Feb 1, 2010 at 5:20 PM, Erik Holstad erikhols...@gmail.com wrote: Hey! Have a couple of questions about the best way to use Cassandra. Using the random partitioner + the multi_get calls vs order preservation +

Re: How to retrieve keys from Cassandra ?

2010-02-02 Thread Jonathan Ellis
More or less (but see https://issues.apache.org/jira/browse/CASSANDRA-745, in 0.6). Think of it this way: when you have a few billion keys, how useful is it to list them? -Jonathan 2010/2/2 Sébastien Pierre sebastien.pie...@gmail.com: Hi all, I would like to know how to retrieve the list of

Re: How to retrieve keys from Cassandra ?

2010-02-02 Thread Erik Holstad
Hi Sebastien! I'm totally new to Cassandra, but as far as I know there is no way of getting just the keys that are in the database, they are not stored separately but only with the data itself. Why do you want a list of keys, what are you going to use them for? Maybe there is another way of

Re: Best design in Cassandra

2010-02-02 Thread Brandon Williams
On Tue, Feb 2, 2010 at 9:27 AM, Erik Holstad erikhols...@gmail.com wrote: A supercolumn can still only compare subcolumns in a single way. Yeah, I know that, but you can have a super column per sort order without having to restart the cluster. You get a CompareWith for the columns, and a

Re: Best design in Cassandra

2010-02-02 Thread Erik Holstad
On Tue, Feb 2, 2010 at 7:45 AM, Brandon Williams dri...@gmail.com wrote: On Tue, Feb 2, 2010 at 9:27 AM, Erik Holstad erikhols...@gmail.com wrote: A supercolumn can still only compare subcolumns in a single way. Yeah, I know that, but you can have a super column per sort order without having

Re: Did CASSANDRA-647 get fixed in 0.5?

2010-02-02 Thread Omer van der Horst Jansen
Here it is: https://issues.apache.org/jira/browse/CASSANDRA-752 From: Jonathan Ellis jbel...@gmail.com To: cassandra-user@incubator.apache.org Sent: Mon, February 1, 2010 5:22:13 PM Subject: Re: Did CASSANDRA-647 get fixed in 0.5? Can you create a ticket for

Re: How to retrieve keys from Cassandra ?

2010-02-02 Thread Sébastien Pierre
Hi all, It's basically for knowing what's inside the db, as I've been toying with Cassandra for some time, I have keys that are no longer useful and should be removed. I'm also storing HTTP logs in cassandra, where keys follow this convention campaign:CAMPAIGN_ID:MMDD. So for instance, if
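With a key convention like the one above, the keys for a known campaign and date range can be computed on the client rather than listed from the cluster. This is a minimal sketch under stated assumptions: `log_key` and `keys_for_range` are hypothetical helper names, and the date part is formatted as month and day only because the convention is written here as MMDD.

```python
from datetime import date, timedelta


def log_key(campaign_id, day):
    # Follows the convention campaign:CAMPAIGN_ID:MMDD quoted in the message.
    return "campaign:%s:%s" % (campaign_id, day.strftime("%m%d"))


def keys_for_range(campaign_id, first_day, last_day):
    """Return one row key per day, from first_day to last_day inclusive."""
    keys = []
    day = first_day
    while day <= last_day:
        keys.append(log_key(campaign_id, day))
        day += timedelta(days=1)
    return keys


# Keys for campaign "42" over three days spanning a month boundary.
print(keys_for_range("42", date(2010, 1, 31), date(2010, 2, 2)))
```

The resulting list could then be passed to a multi-key read. It does not help with finding stale keys whose names are no longer known, which is the other half of the question.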

Re: How to retrieve keys from Cassandra ?

2010-02-02 Thread Sébastien Pierre
Hi Jonathan, In my case, I'll have many more columns (thousands to millions) than keys in logs (campaign x days), so it's not an issue to retrieve all of them. Also, if you assume that you can't retrieve values from Cassandra, just because you're using the wrong key (say you're using user/10

Re: How to retrieve keys from Cassandra ?

2010-02-02 Thread Brandon Williams
2010/2/2 Sébastien Pierre sebastien.pie...@gmail.com Hi Jonathan, In my case, I'll have much more columns (thousands to millions) than keys in logs (campaign x days), so it's not an issue to retrieve all of them. If that's the case, your dataset is small enough that you could maintain an

Reverse sort order comparator?

2010-02-02 Thread Erik Holstad
Hey! I'm looking for a comparator that sorts columns in reverse order, for example on bytes? I saw that you can write your own comparator class, but just thought that someone must have done that already. -- Regards Erik

Re: Reverse sort order comparator?

2010-02-02 Thread Jonathan Ellis
you can scan in reverse (reversed=True in SliceRange) w/o needing a custom comparator. On Tue, Feb 2, 2010 at 11:21 AM, Erik Holstad erikhols...@gmail.com wrote: Hey! I'm looking for a comparator that sorts columns in reverse order, for example on bytes? I saw that you can write your own
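To illustrate how a reversed slice combines with a count for paging, here is a small in-memory sketch. It is not the Thrift API: `slice_columns` and `page_backwards` are hypothetical stand-ins that mimic, as an assumption, the SliceRange behaviour of an inclusive start, an empty start meaning "begin at the end" when reversed, and a count limit.

```python
from bisect import bisect_left, bisect_right


def slice_columns(columns, start="", finish="", reversed_=False, count=100):
    """In-memory stand-in for a slice over one row's sorted column names."""
    names = sorted(columns)
    if not reversed_:
        lo = bisect_left(names, start) if start else 0
        hi = bisect_right(names, finish) if finish else len(names)
        return names[lo:hi][:count]
    # Reversed: start is the upper bound and finish the lower bound.
    hi = bisect_right(names, start) if start else len(names)
    lo = bisect_left(names, finish) if finish else 0
    return names[lo:hi][::-1][:count]


def page_backwards(columns, page_size):
    """Yield pages from the last column towards the first, without fetching everything."""
    start = ""
    while True:
        # After the first page, ask for one extra column because start is inclusive.
        extra = 1 if start else 0
        page = slice_columns(columns, start=start, reversed_=True, count=page_size + extra)
        page = page[extra:]  # drop the column already returned by the previous page
        if not page:
            return
        yield page
        start = page[-1]


# Prints [['e', 'd'], ['c', 'b'], ['a']]
print(list(page_backwards(["a", "b", "c", "d", "e"], 2)))
```

Each call touches only one page of columns, so paging from the back does not require reading the whole row first.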

Re: Reverse sort order comparator?

2010-02-02 Thread Brandon Williams
On Tue, Feb 2, 2010 at 11:21 AM, Erik Holstad erikhols...@gmail.com wrote: Hey! I'm looking for a comparator that sorts columns in reverse order, for example on bytes? I saw that you can write your own comparator class, but just thought that someone must have done that already. When you

Re: Reverse sort order comparator?

2010-02-02 Thread Erik Holstad
Thanks guys! So I want to use sliceRange but thinking about using the count parameter. For example give me the first x columns, next call I would like to call it with a start value and a count. If I was to use the reverse param in sliceRange I would have to fetch all the columns first, right?

Re: Reverse sort order comparator?

2010-02-02 Thread Brandon Williams
On Tue, Feb 2, 2010 at 11:29 AM, Erik Holstad erikhols...@gmail.com wrote: Thanks guys! So I want to use sliceRange but thinking about using the count parameter. For example give me the first x columns, next call I would like to call it with a start value and a count. If I was to use the

Key/row names?

2010-02-02 Thread Erik Holstad
Is there a way to use a byte[] as the key instead of a string? If not, what is the main reason for using strings for the key when the columns and the values can be byte[]? Is it just to be able to use it as the key in a Map etc., or are there other reasons? -- Regards Erik

Re: Reverse sort order comparator?

2010-02-02 Thread Erik Holstad
On Tue, Feb 2, 2010 at 9:35 AM, Brandon Williams dri...@gmail.com wrote: On Tue, Feb 2, 2010 at 11:29 AM, Erik Holstad erikhols...@gmail.com wrote: Thanks guys! So I want to use sliceRange but thinking about using the count parameter. For example give me the first x columns, next call I

Re: Key/row names?

2010-02-02 Thread Jonathan Ellis
On Tue, Feb 2, 2010 at 11:36 AM, Erik Holstad erikhols...@gmail.com wrote: Is there a way to use a byte[] as the key instead of a string? no. If not what is the main reason for using strings for the key but the columns and the values can be byte[]? historical baggage. we might switch to

Using column plus value or only column?

2010-02-02 Thread Erik Holstad
Sorry that there are a lot of questions from me this week, just trying to better understand the best way to use Cassandra :) Let us say that you know the length of your key, everything is standardized, are there people out there that just tag the value onto the key so that you don't have to pay

Re: Reverse sort order comparator?

2010-02-02 Thread Erik Holstad
On Tue, Feb 2, 2010 at 9:57 AM, Brandon Williams dri...@gmail.com wrote: On Tue, Feb 2, 2010 at 11:39 AM, Erik Holstad erikhols...@gmail.com wrote: Wow that sounds really good. So you are saying if I set it to reverse sort order and count 10 for the first round I get the last 10, for the

Re: Key/row names?

2010-02-02 Thread Erik Holstad
Thank you! On Tue, Feb 2, 2010 at 9:41 AM, Jonathan Ellis jbel...@gmail.com wrote: On Tue, Feb 2, 2010 at 11:36 AM, Erik Holstad erikhols...@gmail.com wrote: Is there a way to use a byte[] as the key instead of a string? no. If not what is the main reason for using strings for the key

Re: How to retrieve keys from Cassandra ?

2010-02-02 Thread Jean-Denis Greze
Ok, so 0.6's https://issues.apache.org/jira/browse/CASSANDRA-745 permits someone using RandomPartitioner to pass start= and finish= to get all of the rows in their cluster, although in an extremely inefficient way. We are in a situation like Pierre's, where we need to know what's currently in the

Re: get_slice() slow if more number of columns present in a SCF.

2010-02-02 Thread Nathan McCall
Thank you for the benchmarks. What version of Cassandra are you using? I had about 80% performance improvement on single node reads after using a trunk build with the results from https://issues.apache.org/jira/browse/CASSANDRA-688 (result caching) and playing around with the configuration. I am

Re: How to retrieve keys from Cassandra ?

2010-02-02 Thread Jonathan Ellis
On Tue, Feb 2, 2010 at 12:51 PM, Jean-Denis Greze jeande...@6coders.com wrote: Anyway, partially to address the efficiency concern, I've been playing around with the idea of having 745-like functionality on a per-node basis: a call to get all of the keys on a particular node as opposed to the

Re: get_slice() slow if more number of columns present in a SCF.

2010-02-02 Thread Brandon Williams
On Tue, Feb 2, 2010 at 9:27 AM, envio user enviou...@gmail.com wrote: All, Here are some tests[batch_insert() and get_slice()] I performed on cassandra. snip I am ok with TEST1A and TEST1B. I want to populate the SCF with 500 columns and read 25 columns per key. snip This test is

Re: Using column plus value or only column?

2010-02-02 Thread Nathan McCall
If I understand you correctly, I think I have a decent example. I have a ColumnFamily which models user preferences for a site in our system: UserPreferences : { 123_EDD43E57589F12032AF73E23A6AF3F47 : { favorite_color : red, ... } } I structured it this way because we have a lot of

Re: Using column plus value or only column?

2010-02-02 Thread Erik Holstad
Thanks Nate for the example. I was thinking more along the lines of something like: If you have a family Data : { row1 : { col1:val1 }, row2 : { col1:val2 }, ... } Using Sorts : { sort_row : { sortKey1_datarow1: [], sortKey2_datarow2: [] } } Instead of Sorts : {
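The layout with empty values can be sketched with plain dictionaries. This is an illustration only, assuming (as in the original question) that every sort key has a known, fixed length so the column name can be split again; `SORT_KEY_WIDTH`, `index_column`, `split_index_column` and `rows_in_sort_order` are hypothetical names, and sorting the dictionary keys stands in for the column family's comparator.

```python
SORT_KEY_WIDTH = 8  # assumption: every sort key is padded to this fixed width


def index_column(sort_key, row_key):
    """Build a column name that carries both the sort key and the data row key."""
    if len(sort_key) != SORT_KEY_WIDTH:
        raise ValueError("sort key must be exactly %d characters" % SORT_KEY_WIDTH)
    return sort_key + "_" + row_key


def split_index_column(name):
    """Recover (sort_key, row_key) from a column name built by index_column."""
    return name[:SORT_KEY_WIDTH], name[SORT_KEY_WIDTH + 1:]


def rows_in_sort_order(sort_row, reverse=False):
    """Return data row keys in the order implied by the column names."""
    names = sorted(sort_row, reverse=reverse)
    return [split_index_column(name)[1] for name in names]


data = {"row1": {"col1": "val1"}, "row2": {"col1": "val2"}}
sorts = {
    "sort_row": {
        index_column("00000002", "row1"): b"",  # the value stays empty
        index_column("00000001", "row2"): b"",
    }
}

# Walk the index front to back, then look each row up in Data.
for row_key in rows_in_sort_order(sorts["sort_row"]):
    print(row_key, data[row_key])
```

The trade-off is that the row key is parsed out of the column name on every read, in exchange for not storing or transferring a separate value.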

order-preserving partitioner per CF?

2010-02-02 Thread Wojciech Kaczmarek
Hi, I've been evaluating Cassandra for a few days and I'd say it has a really high coolness factor! :) My biggest question so far is about the order-preserving partitioner. I'd like to have such a partitioner for a specific column family, having the random partitioner for others. Is it possible wrt the current

Re: order-preserving partitioner per CF?

2010-02-02 Thread Jonathan Ellis
On Tue, Feb 2, 2010 at 2:53 PM, Wojciech Kaczmarek kaczmare...@gmail.com wrote: Hi, I'm evaluating Cassandra since few days and I'd say it has really high coolness factor! :) My biggest question so far is about order-preserving partitioner. I'd like to have such partitioner for a specific

Re: easy interface to Cassandra

2010-02-02 Thread Ted Zlatanov
On Tue, 19 Jan 2010 08:09:13 -0600 Ted Zlatanov t...@lifelogs.com wrote: TZ My proposal is as follows: TZ - provide an IPluggableAPI interface; classes that implement it are TZ essentially standalone Cassandra servers. Maybe this can just TZ parallel Thread and implement Runnable. TZ -

Re: order-preserving partitioner per CF?

2010-02-02 Thread Wojciech Kaczmarek
On Tue, Feb 2, 2010 at 21:57, Jonathan Ellis jbel...@gmail.com wrote: My biggest question so far is about order-preserving partitioner. I'd like to have such partitioner for a specific column family, having random partitioner for others. Is it possible wrt to the current architecture? No.

Re: Using column plus value or only column?

2010-02-02 Thread Nathan McCall
Erik, Sure, you could and depending on the workload, that might be quite efficient for small pieces of data. However, this also sounds like something that might be better addressed with the addition of a SuperColumn on Sorts and getting rid of Data altogether: Sorts : { sort_row_1 : {

Re: Using column plus value or only column?

2010-02-02 Thread Erik Holstad
@Nathan So what I'm planning to do is to store multiple sort orders for the same data, where they all use the same data table, just fetched in different orders, so to say. I want to be able to read the different sort orders from the front and from the back to get both regular and reverse sort

Re: order-preserving partitioner per CF?

2010-02-02 Thread Wojciech Kaczmarek
Yeah excellent. I checked that it's doable to convert the data to another Partitioner using json backup tools - cool. I will probably write my own partitioner so it's good I won't lose my test data (though I assume I need to pack all my data back to one node, export to json, delete sstables, change

Re: order-preserving partitioner per CF?

2010-02-02 Thread Jonathan Ellis
just remember that you can't mix nodes w/ different partitioner types in the same cluster. On Tue, Feb 2, 2010 at 5:04 PM, Wojciech Kaczmarek kaczmare...@gmail.com wrote: Yeah excellent. I checked that it's doable to convert the data to another Partitioner using json backup tools - cool. I

Re: Using column plus value or only column?

2010-02-02 Thread Nathan McCall
Erik, You can do an inverse with 'reversed=true' in SliceRange as part of the SlicePredicate for both get_slice and get_range_slice. I have not tried reversed=true on SuperColumn results, but I don't think there is any difference there - what can't be changed is how things are ordered but direction

Re: Using column plus value or only column?

2010-02-02 Thread Erik Holstad
Hey Nate! What I wanted to do with the get_range_slice was to receive the keys in the inverted order, so that I could do offset/limit queries on key ranges in reverse order. Like you said, this can be done for both columns and super columns with help of the SliceRange, but not on keys afaik, but

Re: Using column plus value or only column?

2010-02-02 Thread Jonathan Ellis
Right, we don't currently support scanning rows in reverse order, but that is only because nobody has wanted it badly enough to code it. :) On Tue, Feb 2, 2010 at 6:06 PM, Erik Holstad erikhols...@gmail.com wrote: Hey Nate! What I wanted to do with the get_range_slice was to receive the keys in

Re: Using column plus value or only column?

2010-02-02 Thread Erik Holstad
I don't understand what you mean ;) Will see what happens when we are done with this first project, will see if we can get some time to give back. -- Regards Erik

Re: Using column plus value or only column?

2010-02-02 Thread Nathan McCall
Ok - I was afraid I was going to miss something with the generic example before - my apologies on that. You cannot impose an order on keys like that as far as I am aware. I think maintaining a Sort CF as you had originally is a decent approach. Cheers, -Nate On Tue, Feb 2, 2010 at 4:06 PM, Erik

Re: Using column plus value or only column?

2010-02-02 Thread Erik Holstad
Don't be silly, thanks a lot for helping me out! -- Regards Erik

Re: How do cassandra clients failover?

2010-02-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Mon, Feb 1, 2010 at 7:38 PM, Jonathan Ellis jbel...@gmail.com wrote: No.  Thrift is just an RPC mechanism.  Whether RRDNS, software or hardware load balancing, or client-based failover like Gary describes is best is not a one-size-fits-all answer. Everyone who uses Cassandra would need to