I am not sure I understand you correctly. I assume you are talking about schema 1. In that case I am appending the version number to the column name.
The column_names are different (data_1/data_2) for value_1 and value_2 respectively.

-Sagar

On 1/24/14 9:47 AM, "Dhaval Shah" <[email protected]> wrote:

>Versions in HBase are timestamps by default. If you intend to continue
>using the timestamps, what will happen when someone writes value_1 and
>value_2 at the exact same time?
>
>Regards,
>
>Dhaval
>
>
>----- Original Message -----
>From: Sagar Naik <[email protected]>
>To: "[email protected]" <[email protected]>
>Cc:
>Sent: Friday, 24 January 2014 12:27 PM
>Subject: HBase Design : Column name v/s Version
>
>Hi,
>
>I have a choice to maintain to data either in column values or as
>versioned data.
>This data is not a versioned copy per se.
>
>The access pattern on this get all the data every time
>
>So the schema choices are :
>Schema 1:
>1. column_name/qualifier => data_1. column_value => value_1
>1.a. column_name/qualifier => data_2. column_value => value_2,value_2.a
>
>1.b. column_name/qualifier => data_3. column_value => value_3
>
>To get all the values for "data", I will have to use ColumnPrefixFilter
>with prefix set "data"
>
>Schema 2:
>2. column_name/qualifier => data. version=> 1, column_value => value_1
>
>2.a. column_name/qualifier => data. version=> 2, column_value =>
>value_2,value_2.a
>
>2.b. column_name/qualifier => data. version=> 3, column_value => value_3
>To get all the values for "data" , I will do a simple get operation to get
>all the versions.
>
>Number of versions can go from: 10 to 100K
>
>Get operation perf should beat the Filter perf.
>Comparing 100K values will be costly as the # versions increase.
>
>I would like to know if there are drawbacks in going the version route.
>
>-Sagar
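
To make the timestamp question in this thread concrete, here is a rough sketch of the two schemas. It is a toy in-memory model written in Python, not the HBase client API: the ToyRow class, its method names, and the same_ts value are all made up for illustration. The only HBase behaviour it assumes is that a cell is addressed by qualifier plus version within a row and family, so a second put to the same coordinates replaces the first.

```python
# Toy in-memory model of a single HBase row (one column family).
# This is NOT the HBase client API; ToyRow and its methods are hypothetical.


class ToyRow:
    def __init__(self):
        # (qualifier, version) -> value; a repeated key overwrites the old value
        self.cells = {}

    def put(self, qualifier, version, value):
        self.cells[(qualifier, version)] = value

    def get_versions(self, qualifier):
        """Schema 2 style read: every version of one qualifier, newest first."""
        hits = [(v, val) for (q, v), val in self.cells.items() if q == qualifier]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [val for _, val in hits]

    def get_by_prefix(self, prefix):
        """Schema 1 style read: newest value of each qualifier with the prefix."""
        latest = {}
        for (q, v), val in self.cells.items():
            if q.startswith(prefix) and (q not in latest or v > latest[q][0]):
                latest[q] = (v, val)
        return {q: latest[q][1] for q in sorted(latest)}


same_ts = 1390585620000  # hypothetical write time in ms, shared by both puts

# Schema 1: distinct qualifiers, so identical timestamps do not collide.
schema1 = ToyRow()
schema1.put("data_1", same_ts, "value_1")
schema1.put("data_2", same_ts, "value_2")

# Schema 2 with default timestamps: same qualifier and same version.
schema2_ts = ToyRow()
schema2_ts.put("data", same_ts, "value_1")
schema2_ts.put("data", same_ts, "value_2")  # replaces value_1

# Schema 2 with explicit version numbers supplied by the writer.
schema2_explicit = ToyRow()
schema2_explicit.put("data", 1, "value_1")
schema2_explicit.put("data", 2, "value_2")

print(schema1.get_by_prefix("data"))
print(schema2_ts.get_versions("data"))
print(schema2_explicit.get_versions("data"))
```

In this model schema 1 keeps both values, schema 2 with a shared timestamp keeps only the last write, and schema 2 with writer-assigned version numbers keeps both. So the version route would seem to depend on the writer always supplying a unique version for each value.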
