Please see http://hbase.apache.org/book.html#schema.versions
On Fri, Jan 24, 2014 at 9:27 AM, Sagar Naik <[email protected]> wrote:

> Hi,
>
> I have a choice to maintain the data either in column values or as
> versioned data.
> This data is not a versioned copy per se.
>
> The access pattern is to get all the data every time.
>
> So the schema choices are:
>
> Schema 1:
> 1.   column_name/qualifier => data_1. column_value => value_1
> 1.a. column_name/qualifier => data_2. column_value => value_2,value_2.a
> 1.b. column_name/qualifier => data_3. column_value => value_3
>
> To get all the values for "data", I will have to use ColumnPrefixFilter
> with the prefix set to "data".
>
> Schema 2:
> 2.   column_name/qualifier => data. version => 1, column_value => value_1
> 2.a. column_name/qualifier => data. version => 2, column_value =>
>      value_2,value_2.a
> 2.b. column_name/qualifier => data. version => 3, column_value => value_3
>
> To get all the values for "data", I will do a simple get operation to get
> all the versions.
>
> The number of versions can range from 10 to 100K.
>
> Get operation perf should beat the Filter perf.
> Comparing 100K values will be costly as the # of versions increases.
>
> I would like to know if there are drawbacks in going the version route.
>
> -Sagar
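
One rough way to picture the two layouts from the question is the toy in-memory model below. It is plain Java over sorted maps, not HBase client code: the class name, method names, and the maxVersions parameter are all made up for illustration. The cap is only meant to stand in for the column family's max-versions setting, which (as far as I understand it) would have to be raised to cover the largest expected count for Schema 2 to keep everything.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of the two schemas. Hypothetical names; NOT the HBase API.
final class SchemaChoiceModel {

    // Schema 1: one cell per qualifier (data_1, data_2, ...), kept sorted by qualifier.
    // A prefix read seeks to the first matching qualifier and stops at the first mismatch.
    static List<String> readByPrefix(NavigableMap<String, String> row, String prefix) {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> cell : row.tailMap(prefix, true).entrySet()) {
            if (!cell.getKey().startsWith(prefix)) {
                break; // sorted order: nothing after this point can match
            }
            values.add(cell.getValue());
        }
        return values;
    }

    // Schema 2: a single qualifier ("data") holding many versions, newest first.
    static NavigableMap<Long, String> newVersionedCell() {
        return new TreeMap<>(Collections.reverseOrder());
    }

    // Writing an existing version replaces its value; versions beyond the cap are dropped,
    // oldest first (assumption: this mimics a max-versions limit on the column family).
    static void putVersion(NavigableMap<Long, String> cell, long version,
                           String value, int maxVersions) {
        cell.put(version, value);
        while (cell.size() > maxVersions) {
            cell.pollLastEntry(); // last entry in reverse order is the oldest version
        }
    }

    // Reading "all versions" returns whatever survived the cap, newest first.
    static List<String> readAllVersions(NavigableMap<Long, String> cell) {
        return new ArrayList<>(cell.values());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("data_1", "value_1");
        row.put("data_2", "value_2,value_2.a");
        row.put("data_3", "value_3");
        row.put("other", "unrelated");
        System.out.println("Schema 1: " + readByPrefix(row, "data"));

        NavigableMap<Long, String> cell = newVersionedCell();
        int maxVersions = 2; // deliberately too small, to show what gets lost
        putVersion(cell, 1L, "value_1", maxVersions);
        putVersion(cell, 2L, "value_2,value_2.a", maxVersions);
        putVersion(cell, 3L, "value_3", maxVersions);
        System.out.println("Schema 2: " + readAllVersions(cell));
    }
}
```

Two things the sketch makes visible, under the assumptions above: with Schema 2, anything past the version cap is silently gone and two writes with the same version collapse into one, while with Schema 1 the qualifiers sort as strings, so data_10 would come before data_2 unless the suffix is zero-padded.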
