Please see http://hbase.apache.org/book.html#schema.versions


On Fri, Jan 24, 2014 at 9:27 AM, Sagar Naik <[email protected]> wrote:

> Hi,
>
> I have a choice to maintain my data either in column values or as
> versioned data.
> This data is not a versioned copy per se.
>
> The access pattern is to get all the data every time.
>
> So the schema choices are :
> Schema 1:
> 1. column_name/qualifier => data_1. column_value => value_1
> 1.a. column_name/qualifier => data_2. column_value => value_2,value_2.a
>
> 1.b. column_name/qualifier => data_3. column_value => value_3
>
> To get all the values for "data", I will have to use ColumnPrefixFilter
> with the prefix set to "data".
>
> Schema 2:
> 2. column_name/qualifier => data. version=> 1, column_value => value_1
>
> 2.a. column_name/qualifier => data. version=> 2, column_value =>
> value_2,value_2.a
>
> 2.b. column_name/qualifier => data. version=> 3, column_value => value_3
> To get all the values for "data" , I will do a simple get operation to get
> all the versions.
>
> The number of versions can range from 10 to 100K.
>
> The Get operation's performance should beat the filter's.
> Comparing 100K values will be costly as the number of versions increases.
>
> I would like to know if there are drawbacks in going the version route.
>
>
>
>
> -Sagar
>
>
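
To make the two layouts in the quoted message concrete, here is a rough
in-memory sketch of one row, written against plain Java collections. It is
not the HBase client API: the class RowModel and its methods are
hypothetical names made up for illustration, and maxVersions only stands in
for the column family's max-versions setting. The model trims old versions
immediately on write, which is a simplification; as far as I know HBase
physically removes excess versions at major compaction.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of a single row; NOT the HBase client API.
class RowModel {
    // qualifier (sorted) -> version (newest first) -> value
    private final NavigableMap<String, NavigableMap<Long, String>> cells = new TreeMap<>();
    private final int maxVersions; // stand-in for the family's max-versions setting

    RowModel(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    void put(String qualifier, long version, String value) {
        NavigableMap<Long, String> versions = cells.computeIfAbsent(
                qualifier, q -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()));
        versions.put(version, value); // same qualifier + version replaces the old value
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry is the oldest version
        }
    }

    // Schema 1 read: newest value of every qualifier starting with the prefix.
    List<String> valuesByPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, NavigableMap<Long, String>> e
                : cells.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // qualifiers are sorted, so the prefix range has ended
            }
            out.add(e.getValue().firstEntry().getValue());
        }
        return out;
    }

    // Schema 2 read: every retained version of one qualifier, newest first.
    List<String> allVersions(String qualifier) {
        NavigableMap<Long, String> versions = cells.get(qualifier);
        return versions == null ? new ArrayList<>() : new ArrayList<>(versions.values());
    }

    public static void main(String[] args) {
        RowModel schema1 = new RowModel(1);
        schema1.put("data_1", 1L, "value_1");
        schema1.put("data_2", 1L, "value_2,value_2.a");
        schema1.put("data_3", 1L, "value_3");
        System.out.println(schema1.valuesByPrefix("data"));

        RowModel schema2 = new RowModel(3);
        schema2.put("data", 1L, "value_1");
        schema2.put("data", 2L, "value_2,value_2.a");
        schema2.put("data", 3L, "value_3");
        System.out.println(schema2.allVersions("data"));
    }
}
```

Two things the model makes visible for the version route: the max-versions
setting has to be at least as large as the biggest count you expect (100K
here), otherwise older entries drop out, and two writes that carry the same
version number replace one another instead of accumulating.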
