On Sun, Apr 25, 2010 at 11:14 AM, Bob Hutchison
<hutch-li...@recursive.ca> wrote:
>
> Hi,
>
> I'm new to Cassandra and trying to work out how to do something that I've 
> implemented any number of times (e.g. with TokyoCabinet, Perst, even the 
> filesystem using grep :-). I've managed to get some of this working in 
> Cassandra but not all.
>
> So here's the core of the situation.
>
> I have this opaque chunk of data that I want to store in Cassandra and then 
> find it again.
>
> I can generate a key when the data is created very easily, and I've stored it 
> in a straightforward manner: in a column with a key whose value is the data. 
> And I can retrieve it when I know the key. No difficulties here at all, works 
> fine.
>
> Now I want to index this data taking what I imagine to be a pretty typical 
> approach.
>
> Let's say there are two many-to-one indexes: 'colour' and 'size'. Each colour 
> value will have more than one chunk of data, same for size.
>
> What I thought I'd do is make a super column and index the chunk of data kind 
> of like: { 'colour' => { 'blue' => 1 }, 'size' => { 'large' => 1}} with the 
> key equal to the key of the chunk of data. And Cassandra stores it without 
> error like that. So using the Ruby gem, it'd be something along the lines of:
>
>  cassandra.insert(:Indexes, key_of_the_chunk_of_data,
>                   { 'colour' => { 'blue' => 1 }, 'size' => { 'large' => 1 } })
>
> Q1: is this a reasonable approach? It *seems* to be what I've read is 
> supposed to be done. The 1 is meaningless. Anyway, it executes without error 
> in Ruby.

No. In order to index your data, you need to invert it: the indexed value
becomes the key, and it points back at the keys of your data. Since you're
working in Ruby I'd recommend CassandraObject:
http://github.com/nzKoz/cassandra_object. It has indexing built in.
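
A minimal sketch of what "inverting" could look like, assuming one index
column family per attribute. FakeColumnFamily is a hypothetical in-memory
stand-in so the example runs without a Cassandra server; store_chunk and
keys_for are illustrative names too, and none of this is CassandraObject's
actual implementation.

```ruby
# Hypothetical in-memory stand-in for a column family:
# row key => { column name => value }. Not part of the cassandra gem.
class FakeColumnFamily
  def initialize
    @rows = Hash.new { |rows, key| rows[key] = {} }
  end

  # Merge the given columns into the row, as a column insert would.
  def insert(key, columns)
    @rows[key].merge!(columns)
  end

  # Return the row's columns, or an empty hash if the row does not exist.
  def get(key)
    @rows.fetch(key, {})
  end
end

# Store the chunk under its own key, then write one inverted entry per
# attribute: row key = attribute value, column name = chunk key.
def store_chunk(data, indexes, key, blob, attributes)
  data.insert(key, { 'blob' => blob })
  attributes.each do |name, value|
    indexes.fetch(name).insert(value, { key => '' }) # the value is a placeholder
  end
end

# Looking up "all blue chunks" is a single row read on the index.
def keys_for(indexes, name, value)
  indexes.fetch(name).get(value).keys
end

data    = FakeColumnFamily.new
indexes = { 'colour' => FakeColumnFamily.new, 'size' => FakeColumnFamily.new }

store_chunk(data, indexes, 'chunk-1', 'aaa', { 'colour' => 'blue', 'size' => 'large' })
store_chunk(data, indexes, 'chunk-2', 'bbb', { 'colour' => 'red',  'size' => 'large' })
store_chunk(data, indexes, 'chunk-3', 'ccc', { 'colour' => 'blue', 'size' => 'small' })

p keys_for(indexes, 'colour', 'blue') # => ["chunk-1", "chunk-3"]
p keys_for(indexes, 'size', 'large')  # => ["chunk-1", "chunk-2"]
```

With this layout the lookup is a plain get by key on the index row, so no
get_range and no super column name are needed. Against a real cluster the
same shape would presumably be written with the gem's insert and get calls on
an index column family of your own naming.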

-ryan

> Q2: what is the syntax of the (Ruby) query to find the keys of all 'blue' 
> chunks of data? I'm assuming get_range is the correct method, but what are 
> the parameters? The docs say: get_range(column_family, options={}) but that 
> seems to be missing a bit of detail, in particular the super column name.
>
> Q2a: So I know there's a :start and :finish key supported in the options 
> hash, inclusive, exclusive respectively. How do you define a range for equals 
> with a UTF8 key? Surely not 'blue'.succ?? or by some kind of suffix??
>
> Q2b: How do you specify the super column name 'colour'? Looking at the (Ruby) 
> source of the get_range method, I'm unconvinced that this is implemented 
> (a constant '' seems to be used where the super column name would make 
> sense).
>
> Anyway I ended up hacking at the Ruby gem's source to use the column name 
> where the '' was in the original, and didn't really get anywhere useful (I 
> can find nothing, or everything, nothing in between).
>
> Q3: If I am correct about what is supposed to be done, does the Ruby gem 
> support it?
>
> Q4: Does anyone know of some Ruby code that does an indexed lookup that they 
> could point me at? (I've found lots of code that indexes but nothing that 
> searches by the index.)
>
> I'll try to take a look at some of the other Cassandra client implementations 
> and see if I can get this model to work. Maybe just a Ruby problem?? With any 
> luck, it'll be me messing up.
>
> If it'd help I can post the source of what I have, but it'll need some 
> cleanup. Let me know.
>
> Thanks for taking the time to read this far :-)
>
> Bob
>
> ----
> Bob Hutchison
> Recursive Design Inc.
> http://www.recursive.ca/
> weblog: http://xampl.com/so
>
>
