Re: observer coprocessor question regarding puts

Michael Segel Fri, 14 Jun 2013 07:47:19 -0700

Not to beat a dead horse... 

I did want to touch a bit more on the schema design issues and considerations.


If you have a really wide composite key and you're only storing a single cell, 
you will end up with a very long (tall) table. 

Does this make sense? 

Would it make more sense to use a smaller key and store multiple cells, 
with part of the rowkey as a column qualifier? 

Using your example... you have [A,B,C] as your rowkey and then Column1 with a 
value. 

You could make the row key [A, B] with the column qualifier [C] storing the 
value there. 
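
To make the difference concrete, here is a minimal sketch in plain Java (no 
HBase classes, just sorted maps standing in for a table). The class name, the 
"|" separator and the sample values are assumptions for illustration only; a 
real composite key would more likely be byte-encoded.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the two layouts; no HBase classes are used.
final class WideRowSketch {
    // Hypothetical separator for the composite key parts.
    static final String SEP = "|";

    // Tall layout: rowkey [A,B,C], a single cell under "Column1".
    static void putTall(Map<String, Map<String, String>> table,
                        String a, String b, String c, String value) {
        String rowKey = a + SEP + b + SEP + c;
        table.computeIfAbsent(rowKey, k -> new TreeMap<>()).put("Column1", value);
    }

    // Wide layout: rowkey [A,B], with C used as the column qualifier.
    static void putWide(Map<String, Map<String, String>> table,
                        String a, String b, String c, String value) {
        String rowKey = a + SEP + b;
        table.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(c, value);
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> tall = new TreeMap<>();
        Map<String, Map<String, String>> wide = new TreeMap<>();
        for (String c : new String[] {"c1", "c2", "c3"}) {
            putTall(tall, "a1", "b1", c, "val-" + c);
            putWide(wide, "a1", "b1", c, "val-" + c);
        }
        // Same three values either way: one row per value in the tall
        // layout, one row holding three cells in the wide layout.
        System.out.println("tall rows: " + tall.size());
        System.out.println("wide rows: " + wide.size());
    }
}
```

The values stored are the same in both layouts; only the split between rowkey 
and qualifier changes, which is what shortens the table.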

Does that make sense? 

-Mike

On Jun 13, 2013, at 9:51 PM, Michel Segel <[email protected]> wrote:

> Ok...
> 
> But then you are duplicating the data, so you will have to reconcile the two 
> sets and there is a possibility that the data sets are out of sync.
> 
> I don't know your entire Schema, but if the row key is larger than the value, 
> you may want to think about changing the Schema.
> 
> 
> Sent from a remote device. Please excuse any typos...
> 
> Mike Segel
> 
> On Jun 13, 2013, at 9:34 PM, rob mancuso <[email protected]> wrote:
> 
>> Thx Mike, for the most part.
>> 
>> My key is substantially larger than my value, so I was thinking of leaving
>> the cq->value stuff as is and just inverting the rowkey.
>> 
>> So the original table would have
>> 
>> [A, B, C] cf1:cq1 val1
>> 
>> And the secondary table would have
>> 
>> [C, B, A] cf1:cq1 val1
>> 
>> On Jun 10, 2013 3:42 PM, "Michael Segel" <[email protected]> wrote:
>> 
>>> 
>>> If I understand you ...
>>> 
>>> You have the row key = [A,B,C]
>>> You want to create an inverted mapping of  Key [C] => {[A,B,C]}
>>> 
>>> That is to say that your inverted index would be all of the rows where the
>>> value of C = x  .
>>> And x is some value.
>>> 
>>> You shouldn't have to worry about column qualifiers, just the values of A, B
>>> and C.
>>> 
>>> In this case, the columns in your index will also be the values of the
>>> tuples.
>>> You really don't need C because you already have it, but then you'd need
>>> to remember to add it to the pair (A, B) that you are storing.
>>> I'd say waste the space and store (A,B,C) but that's just me.
>>> 
>>> 
>>> Is that what you want to do?
>>> 
>>> -Mike
>>> 
>>> On Jun 9, 2013, at 12:16 PM, rob mancuso <[email protected]> wrote:
>>> 
>>>> Thx Anoop, I believe this is what I'm looking for.
>>>> 
>>>> Regarding my use case,  my rowkey is [A,B,C], but i also have a requirement
>>>> to access data by [C] only.  So I'm looking to use a post-put coprocessor
>>>> to maintain one secondary index table where the rowkey starts with [C]. My
>>>> cqs are numerics representing time and can be any number btw 1 and 3600 (ie
>>>> seconds within an hour). Because I won't know the cq value for each
>>>> incoming put (just the cf), I need something to deconstruct the put into a
>>>> list of cqs ...which I believe you've provided with getFamilyMap.
>>>> 
>>>> Thx again!
>>>> On Jun 9, 2013 12:47 AM, "Anoop John" <[email protected]> wrote:
>>>> 
>>>>> You want to have an index per every CF+CQ right?  You want to maintain diff
>>>>> tables for diff columns?
>>>>> 
>>>>> Put has a getFamilyMap method that returns a Map of CF to List of KVs.
>>>>> From this List of KVs you can get all the CQ names and values etc..
>>>>> 
>>>>> -Anoop-
>>>>> 
>>>>> On Sat, Jun 8, 2013 at 11:24 PM, rob mancuso <[email protected]> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I'm looking to write a post-put observer coprocessor to maintain a
>>>>>> secondary index.  Basically, my current rowkey design is a composite of
>>>>>> A,B,C and I want to be able to also access data by C.  So all i'm looking
>>>>>> to do is invert the rowkey and apply it for all cf:cq values that come in.
>>>>>> 
>>>>>> My problem (i think), is that in all the good examples i've seen, they all
>>>>>> deconstruct the Put by calling put.get(<cf>,<cq>)...implying they know the
>>>>>> qualifier ahead of time.  I'm looking to specify the family and generate a
>>>>>> put to the secondary index table for all qualifiers ...not knowing or
>>>>>> caring what the qualifier is.
>>>>>> 
>>>>>> Any pointers would be appreciated,
>>>>>> Thx - Rob
>>>>>> 
>>>>>> Is there a way
>>> 
>>> 
>
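
For reference, the approach discussed in the thread (walk every family and 
qualifier carried by the incoming Put, then write the same cells under the 
inverted rowkey) might look roughly like the sketch below. It is plain Java 
with stand-in types rather than HBase classes: Cell, IndexPut, the "|" 
separator and the method names are all hypothetical, and the Map of family to 
List of cells only mimics the shape Anoop describes for getFamilyMap. A real 
coprocessor would do this work inside a post-put hook and handle byte[] keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Plain-Java stand-in for the index maintenance logic; no HBase classes.
final class InvertedIndexSketch {
    // Hypothetical separator for the composite key parts.
    static final String SEP = "|";

    // Stand-in for a KeyValue: only the qualifier and the value.
    static final class Cell {
        final String qualifier;
        final String value;
        Cell(String qualifier, String value) {
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // Stand-in for one cell to be written to the secondary index table.
    static final class IndexPut {
        final String rowKey;
        final String family;
        final String qualifier;
        final String value;
        IndexPut(String rowKey, String family, String qualifier, String value) {
            this.rowKey = rowKey;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // Reverses the order of the key parts: A|B|C becomes C|B|A.
    static String invert(String rowKey) {
        List<String> parts = Arrays.asList(rowKey.split(Pattern.quote(SEP)));
        Collections.reverse(parts);
        return String.join(SEP, parts);
    }

    // Emits one index cell per incoming cell, without knowing the
    // qualifiers ahead of time.
    static List<IndexPut> toIndexPuts(String rowKey, Map<String, List<Cell>> familyMap) {
        String indexKey = invert(rowKey);
        List<IndexPut> out = new ArrayList<>();
        for (Map.Entry<String, List<Cell>> family : familyMap.entrySet()) {
            for (Cell cell : family.getValue()) {
                out.add(new IndexPut(indexKey, family.getKey(), cell.qualifier, cell.value));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Cell>> familyMap = new TreeMap<>();
        familyMap.put("cf1", Arrays.asList(new Cell("1", "val1"), new Cell("3600", "val2")));
        for (IndexPut p : toIndexPuts("A|B|C", familyMap)) {
            System.out.println(p.rowKey + " " + p.family + ":" + p.qualifier + " " + p.value);
        }
    }
}
```

As Mike points out above, this duplicates the data, so the two tables can 
drift out of sync and need some way to be reconciled.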
