Thx Mike, makes perfect sense. I'm using OpenTSDB, so my schema is fixed. Metric is at the front of my key (my [A]) and dataserver is at the end (my [C]). I need to be able to query by either one, and simply inverting the rowkey, while leaving the cf:cq and value as is, lets me keep using the OpenTSDB APIs.
My initial attempt works, but I'm getting socket timeouts when I increase volume. I have some more debugging to do.

Thx

On Jun 14, 2013 10:46 AM, "Michael Segel" wrote:

> Not to beat a dead horse...
>
> I did want to touch a bit more on the schema design issues and
> considerations.
>
> If you have a really wide composite key and you're only storing a single
> cell, you will end up with a very long (tall) table.
>
> Does this make sense?
>
> Would it make more sense in using a smaller key and then storing multiple
> cells with part of the rowkey as a column qualifier?
>
> Using your example... you have [A,B,C] as your rowkey and then Column1
> with a value.
>
> You could make the row key [A, B] with the column qualifier [C] storing
> the value there.
>
> Does that make sense?
>
> -Mike
>
> On Jun 13, 2013, at 9:51 PM, Michel Segel wrote:
>
> > Ok...
> >
> > But then you are duplicating the data, so you will have to reconcile the
> > two sets and there is a possibility that the data sets are out of sync.
> >
> > I don't know your entire Schema, but if the row key is larger than the
> > value, you may want to think about changing the Schema.
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel
> >
> > On Jun 13, 2013, at 9:34 PM, rob mancuso wrote:
> >
> >> Thx Mike, for the most part.
> >>
> >> My key is substantially larger than my value, so I was thinking of leaving
> >> the cq->value stuff as is and just inverting the rowkey.
> >>
> >> So the original table would have
> >>
> >> [A, B, C] cf1:cq1 val1
> >>
> >> And the secondary table would have
> >>
> >> [C, B, A] cf1:cq1 val1
> >>
> >> On Jun 10, 2013 3:42 PM, "Michael Segel" wrote:
> >>
> >>> If I understand you ...
> >>>
> >>> You have the row key = [A,B,C]
> >>> You want to create an inverted mapping of Key [C] => {[A,B,C]}
> >>>
> >>> That is to say that your inverted index would be all of the rows where the
> >>> value of C = x.
> >>> And x is some value.
> >>>
> >>> You shouldn't have to worry about column qualifiers, just the values of A, B
> >>> and C.
> >>>
> >>> In this case, the columns in your index will also be the values of the
> >>> tuples.
> >>> You really don't need C because you already have it, but then you'd need
> >>> to remember to add it to the pair (A, B) that you are storing.
> >>> I'd say waste the space and store (A,B,C) but that's just me.
> >>>
> >>> Is that what you want to do?
> >>>
> >>> -Mike
> >>>
> >>> On Jun 9, 2013, at 12:16 PM, rob mancuso wrote:
> >>>
> >>>> Thx Anoop, I believe this is what I'm looking for.
> >>>>
> >>>> Regarding my use case, my rowkey is [A,B,C], but i also have a requirement
> >>>> to access data by [C] only. So I'm looking to use a post-put coprocessor
> >>>> to maintain one secondary index table where the rowkey starts with [C]. My
> >>>> cqs are numerics representing time and can be any number btw 1 and 3600 (ie
> >>>> seconds within an hour). Because I won't know the cq value for each
> >>>> incoming put (just the cf), I need something to deconstruct the put into a
> >>>> list of cqs ...which I believe you've provided with getFamilyMap.
> >>>>
> >>>> Thx again!
> >>>>
> >>>> On Jun 9, 2013 12:47 AM, "Anoop John" wrote:
> >>>>
> >>>>> You want to have an index per every CF+CQ right? You want to maintain diff
> >>>>> tables for diff columns?
> >>>>>
> >>>>> Put is having getFamilyMap method Map CF vs List KVs. From this List of
> >>>>> KVs you can get all the CQ names and values etc..
> >>>>>
> >>>>> -Anoop-
> >>>>>
> >>>>> On Sat, Jun 8, 2013 at 11:24 PM, rob mancuso wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I'm looking to write a post-put observer coprocessor to maintain a
> >>>>>> secondary index. Basically, my current rowkey design is a composite of
> >>>>>> A,B,C and I want to be able to also access data by C. So all i'm looking
> >>>>>> to do is invert the rowkey and apply it for all cf:cq values that come in.
> >>>>>>
> >>>>>> My problem (i think), is that in all the good examples i've seen, they all
> >>>>>> deconstruct the Put by calling put.get(<cf>,<cq>)...implying they know the
> >>>>>> qualifier ahead of time. I'm looking to specify the family and generate a
> >>>>>> put to the secondary index table for all qualifiers ...not knowing or
> >>>>>> caring what the qualifier is.
> >>>>>>
> >>>>>> Any pointers would be appreciated,
> >>>>>> Thx - Rob
> >>>>>>
> >>>>>> Is there a way
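
For anyone reading the thread later, here is a rough sketch of the mirror-row idea discussed above: take a row keyed [A, B, C], build the secondary row keyed [C, B, A], and copy every qualifier and value across without naming any qualifier up front. It deliberately uses plain Java collections as a stand-in for the HBase types so it runs without a cluster. `InvertedKeyIndex`, `Row`, `invertKey` and `toIndexRow` are made-up names for illustration, and the string key parts are an assumption, not OpenTSDB's real key encoding. In an actual post-put coprocessor the cells would presumably come from the Put's family map that Anoop mentions rather than from a `Map<Integer, String>`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the coprocessor logic; no HBase classes are used.
class InvertedKeyIndex {

    // One row write: ordered key components plus qualifier -> value cells
    // for a single column family. Qualifiers are seconds within the hour.
    static final class Row {
        final List<String> keyParts;
        final Map<Integer, String> cells;

        Row(List<String> keyParts, Map<Integer, String> cells) {
            this.keyParts = keyParts;
            this.cells = cells;
        }
    }

    // Returns a new list with the key components in reverse order.
    static List<String> invertKey(List<String> keyParts) {
        List<String> inverted = new ArrayList<>(keyParts);
        Collections.reverse(inverted);
        return inverted;
    }

    // Builds the secondary-table row: inverted key, cells copied unchanged.
    static Row toIndexRow(Row primary) {
        Map<Integer, String> copied = new LinkedHashMap<>();
        // Walk whatever qualifiers arrived; none are known ahead of time.
        for (Map.Entry<Integer, String> cell : primary.cells.entrySet()) {
            copied.put(cell.getKey(), cell.getValue());
        }
        return new Row(invertKey(primary.keyParts), copied);
    }

    public static void main(String[] args) {
        Map<Integer, String> cells = new LinkedHashMap<>();
        cells.put(17, "val1");
        cells.put(3599, "val2");
        Row primary = new Row(List.of("A", "B", "C"), cells);

        Row index = toIndexRow(primary);
        System.out.println(index.keyParts + " " + index.cells);
    }
}
```

Note that this only shows the key and cell mapping. It does nothing about the reconciliation concern Mike raises: two tables written separately can still drift out of sync.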
