Re: HBase Design : Column name v/s Version

Sagar Naik Fri, 24 Jan 2014 11:38:00 -0800

I do not have to purge the data.
I always need all the versions.

But Dhaval raised a valid point about 100K versions and the lack of
pagination support based on versions.


-Sagar
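
As a rough illustration of the trade-off discussed in this thread, here is a
toy in-memory model in Python. It is not the HBase client API: `ToyRow`,
`put`, `get_versions`, and `get_by_prefix` are hypothetical names, and the
model only assumes that a cell is addressed by (qualifier, version) and that a
later write to the same address replaces the earlier one. Under that
assumption it sketches why timestamp versions can collide in Schema 2, why
client-assigned version numbers avoid that, and how Schema 1 could be paged by
column.

```python
# Toy in-memory model of a single row. NOT the HBase client API.
# Assumption: a cell is addressed by (qualifier, version); writing the same
# address twice keeps only the later value.

class ToyRow:
    def __init__(self):
        self.cells = {}  # (qualifier, version) -> value

    def put(self, qualifier, version, value):
        # A second write to the same (qualifier, version) overwrites the first.
        self.cells[(qualifier, version)] = value

    def get_versions(self, qualifier):
        """Schema 2 read: all versions of one qualifier, newest first."""
        found = [(v, val) for (q, v), val in self.cells.items() if q == qualifier]
        return sorted(found, reverse=True)

    def get_by_prefix(self, prefix, offset=0, limit=None):
        """Schema 1 read: latest value of each qualifier starting with prefix,
        in sorted qualifier order, optionally paged by offset/limit."""
        latest = {}
        for (q, v), val in self.cells.items():
            if q.startswith(prefix) and (q not in latest or v > latest[q][0]):
                latest[q] = (v, val)
        names = sorted(latest)
        end = None if limit is None else offset + limit
        return [(q, latest[q][1]) for q in names[offset:end]]


values = ["value_1", "value_2,value_2.a", "value_3"]
SAME_TIMESTAMP = 1000  # hypothetical: all three writes land in the same instant

# Schema 1: version number appended to the qualifier (data_1, data_2, ...).
schema1 = ToyRow()
for i, value in enumerate(values, start=1):
    schema1.put(f"data_{i}", SAME_TIMESTAMP, value)

# Schema 2 with timestamps as versions: identical timestamps collide.
schema2_ts = ToyRow()
for value in values:
    schema2_ts.put("data", SAME_TIMESTAMP, value)

# Schema 2 with client-side auto-incrementing version numbers: no collision.
schema2_seq = ToyRow()
for i, value in enumerate(values, start=1):
    schema2_seq.put("data", i, value)

# Paging over columns is straightforward in Schema 1.
page = schema1.get_by_prefix("data_", offset=1, limit=2)

print(len(schema1.get_by_prefix("data_")))   # all three columns survive
print(len(schema2_ts.get_versions("data")))  # only the last write survives
print(len(schema2_seq.get_versions("data"))) # all three versions survive
print(page)
```

In real HBase, column-based paging for Schema 1 would likely go through
something like ColumnPaginationFilter. I am not aware of an equivalent filter
that pages over the versions of a single column, which is the concern raised
above for rows with 100K versions.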

On 1/24/14 11:23 AM, "Vladimir Rodionov" <[email protected]> wrote:

>One downside of using synthetic versions is that you won't be able to use
>TTL, which gives you automatic purging of stale data for free.
>Have you already thought about how to purge old data?
>
>Best regards,
>Vladimir Rodionov
>Principal Platform Engineer
>Carrier IQ, www.carrieriq.com
>e-mail: [email protected]
>
>________________________________________
>From: Sagar Naik [[email protected]]
>Sent: Friday, January 24, 2014 10:46 AM
>To: [email protected]; Dhaval Shah
>Subject: Re: HBase Design : Column name v/s Version
>
>Thanks for clarifying,
>
>I will be using custom version numbers (auto-incrementing on the client
>side) and not timestamps.
>Two clients do not update the same row.
>
>
>-Sagar
>
>On 1/24/14 10:33 AM, "Dhaval Shah" <[email protected]> wrote:
>
>>I am talking about schema 2. Schema 1 would definitely work. Schema 2 can
>>have version collisions if you decide to use timestamps as versions.
>>
>>Regards,
>>
>>Dhaval
>>
>>
>>----- Original Message -----
>>From: Sagar Naik <[email protected]>
>>To: "[email protected]" <[email protected]>; Dhaval Shah
>><[email protected]>
>>Cc:
>>Sent: Friday, 24 January 2014 1:07 PM
>>Subject: Re: HBase Design : Column name v/s Version
>>
>>I am not sure I understand you correctly.
>>I assume you are talking about schema 1.
>>In this case I am appending the version number to the column name.
>>
>>The column_names are different (data_1/data_2) for value_1 and value_2
>>respectively.
>>
>>
>>-Sagar
>>
>>
>>On 1/24/14 9:47 AM, "Dhaval Shah" <[email protected]> wrote:
>>
>>>Versions in HBase are timestamps by default. If you intend to continue
>>>using the timestamps, what will happen when someone writes value_1 and
>>>value_2 at the exact same time?
>>>
>>>Regards,
>>>
>>>Dhaval
>>>
>>>
>>>----- Original Message -----
>>>From: Sagar Naik <[email protected]>
>>>To: "[email protected]" <[email protected]>
>>>Cc:
>>>Sent: Friday, 24 January 2014 12:27 PM
>>>Subject: HBase Design : Column name v/s Version
>>>
>>>Hi,
>>>
>>>I have a choice to maintain the data either in column values or as
>>>versioned data.
>>>This data is not a versioned copy per se.
>>>
>>>The access pattern is to get all the data every time.
>>>
>>>So the schema choices are:
>>>Schema 1:
>>>1. column_name/qualifier => data_1. column_value => value_1
>>>1.a. column_name/qualifier => data_2. column_value => value_2,value_2.a
>>>1.b. column_name/qualifier => data_3. column_value => value_3
>>>
>>>To get all the values for "data", I will have to use ColumnPrefixFilter
>>>with the prefix set to "data".
>>>
>>>Schema 2:
>>>2. column_name/qualifier => data. version => 1, column_value => value_1
>>>2.a. column_name/qualifier => data. version => 2, column_value => value_2,value_2.a
>>>2.b. column_name/qualifier => data. version => 3, column_value => value_3
>>>
>>>To get all the values for "data", I will do a simple get operation to get
>>>all the versions.
>>>
>>>The number of versions can range from 10 to 100K.
>>>
>>>Get operation performance should beat the Filter performance.
>>>Comparing 100K values will be costly as the number of versions increases.
>>>
>>>I would like to know if there are drawbacks in going the version route.
>>>
>>>
>>>
>>>
>>>-Sagar
>>>
>>
>
>
