On Sun, Apr 25, 2010 at 11:14 AM, Bob Hutchison <hutch-li...@recursive.ca> wrote:
>
> Hi,
>
> I'm new to Cassandra and trying to work out how to do something that I've
> implemented any number of times (e.g. TokyoCabinet, Perst, even the
> filesystem using grep :-) I've managed to get some of this working in
> Cassandra but not all.
>
> So here's the core of the situation.
>
> I have this opaque chunk of data that I want to store in Cassandra and then
> find it again.
>
> I can generate a key when the data is created very easily, and I've stored it
> in a straightforward manner: in a column with a key whose value is the data.
> And I can retrieve it when I know the key. No difficulties here at all, works
> fine.
>
> Now I want to index this data taking what I imagine to be a pretty typical
> approach.
>
> Let's say there are two many-to-one indexes: 'colour' and 'size'. Each colour
> value will have more than one chunk of data, same for size.
>
> What I thought I'd do is make a super column and index the chunk of data kind
> of like: { 'colour' => { 'blue' => 1 }, 'size' => { 'large' => 1 } } with the
> key equal to the key of the chunk of data. And Cassandra stores it without
> error like that. So using the Ruby gem, it'd be something along the lines of:
>
> cassandra.insert(:Indexes, key-of-the-chunk-of-data, { 'colour' => { 'blue' => 1 }, 'size' => { 'large' => 1 } })
>
> Q1: is this a reasonable approach? It *seems* to be what I've read is
> supposed to be done. The 1 is meaningless. Anyway, it executes without error
> in Ruby.

No. In order to index your data, you need to invert it. Since you're working in Ruby, I'd recommend CassandraObject: http://github.com/nzKoz/cassandra_object. It has indexing built in.

-ryan

> Q2: what is the syntax of the (Ruby) query to find the keys of all 'blue'
> chunks of data? I'm assuming get_range is the correct method, but what are
> the parameters? The docs say: get_range(column_family, options={}) but that
> seems to be missing a bit of detail, in particular the super column name.
>
> Q2a: So I know there's a :start and :finish key supported in the options
> hash, inclusive, exclusive respectively. How do you define a range for equals
> with a UTF8 key? Surely not 'blue'.succ?? or by some kind of suffix??
>
> Q2b: How do you specify the super column name 'colour'? Looking at the (Ruby)
> source of the get_range method and I'm unconvinced that this is implemented
> (seems to be a constant '' used where the super column name makes sense to
> be.)
>
> Anyway I ended up hacking at the Ruby gem's source to use the column name
> where the '' was in the original, and didn't really get anywhere useful (I
> can find nothing, or everything, nothing in between).
>
> Q3: If I am correct about what is supposed to be done, does the Ruby gem
> support it?
>
> Q4: Does anyone know of some Ruby code that does an indexed lookup that they
> could point me at? (lots of code that indexes but nothing that searches by
> the index)
>
> I'll try to take a look at some of the other Cassandra client implementations
> and see if I can get this model to work. Maybe just a Ruby problem?? With any
> luck, it'll be me messing up.
>
> If it'd help I can post the source of what I have, but it'll need some
> cleanup. Let me know.
>
> Thanks for taking the time to read this far :-)
>
> Bob
>
> ----
> Bob Hutchison
> Recursive Design Inc.
> http://www.recursive.ca/
> weblog: http://xampl.com/so
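
To illustrate what "inverting" the data could look like, here is a minimal sketch. In the layout from the question, the row key is the chunk's key and the indexed values sit underneath it, so finding every 'blue' chunk means scanning all rows. Inverted, the row key is the indexed value and the column names are the keys of the matching chunks, so a lookup is a single row read. This is an in-memory stand-in only, not the cassandra gem and not CassandraObject: `FakeStore`, `store_chunk`, `find_keys`, the `:Chunks` column family and the `colour:blue` row-key format are all names assumed for illustration.

```ruby
# Hypothetical in-memory stand-in for a column-family store.
# It only mimics the shape "column family -> row key -> { column => value }".
class FakeStore
  def initialize
    @data = Hash.new { |cfs, cf| cfs[cf] = Hash.new { |rows, key| rows[key] = {} } }
  end

  # Merge the given columns into the row, creating the row if needed.
  def insert(column_family, key, columns)
    @data[column_family][key].merge!(columns)
  end

  # Return the row's columns, or an empty hash if the row does not exist.
  def get(column_family, key)
    @data[column_family].fetch(key, {})
  end
end

# Store the chunk under its own key, then write one inverted index entry
# per attribute: row key = "index:value", column name = chunk key.
def store_chunk(store, key, data, attrs)
  store.insert(:Chunks, key, { 'data' => data })
  attrs.each do |index_name, value|
    # The column value is a meaningless placeholder, like the 1 in the question.
    store.insert(:Indexes, "#{index_name}:#{value}", { key => '1' })
  end
end

# An equality lookup is a single row read; the column names are the chunk keys.
def find_keys(store, index_name, value)
  store.get(:Indexes, "#{index_name}:#{value}").keys.sort
end

store = FakeStore.new
store_chunk(store, 'chunk-1', 'opaque A', { 'colour' => 'blue', 'size' => 'large' })
store_chunk(store, 'chunk-2', 'opaque B', { 'colour' => 'red',  'size' => 'large' })
store_chunk(store, 'chunk-3', 'opaque C', { 'colour' => 'blue', 'size' => 'small' })

blue_keys = find_keys(store, 'colour', 'blue')
blue_chunks = blue_keys.map { |k| store.get(:Chunks, k)['data'] }
puts blue_keys.inspect
puts blue_chunks.inspect
```

With this layout no range query is needed for an equality match, which sidesteps the :start/:finish question in Q2a. Against a real cluster the same idea would presumably map onto the gem's insert call shown in the question plus a plain row read, but that mapping is an assumption and has not been checked against any particular gem version.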