Hi,

Thanks for your detailed explanation.
A single customer can have multiple addresses. For example, the same customer can hold a home address, an office address, etc., hence I grouped the address fields into a separate column family.

1. Is my approach correct?
2. What can we have as a rowkey for both these column families?
3. I think the customer number is sequential, hence I am planning to include YYYYMMDD along with the customer number in the rowkey. Is that fine?

Regards,
Rams

On 24-Dec-2012, at 7:54 PM, Jean-Marc Spaggiari <[email protected]> wrote:

> Hi Rams,
>
> How are you going to access your data?
>
> HBase will create one cell (which means rowkey+timestamp+...+data) for
> each cell.
>
> Are you really going to sometimes access Address Line1 without
> accessing Address Line2?
>
> Are you really going to access the City without accessing the State?
>
> If not, why not just put a JSON object with all this data in a single cell?
>
> So at the end your table will look like:
>
> *Table Name : Customer*
>
> *Field Name             Column Family*
> Customer Information   CF1
> Address                CF1
>
> In Customer Information you bundle:
> Customer Number        CF1
> DOB                    CF1
> FName                  CF1
> MName                  CF1
> LName                  CF1
>
> And in Address you bundle:
> Address Type           CF2
> Address Line1          CF2
> Address Line2          CF2
> Address Line3          CF2
> Address Line4          CF2
> State                  CF2
> City                   CF2
> Country                CF2
>
> But if you always access the address when you access the customer
> information, then the best way might be to just put all those fields in
> a single JSON object, and have just one CF and one column in your table...
>
> Regarding the key, if your customer number is sequential and you insert
> based on this field, you will hotspot one server at a time... If the
> number is "random", then it's ok.
>
> HTH.
>
> JM
>
> 2012/12/24, Mohammad Tariq <[email protected]>:
>> It is. But why do you want to do that? You will run into issues once your
>> data starts growing. Each cell, along with the actual value, stores a few
>> additional things: *row*, *column* and the *version*. As a result you will
>> lose space if you do that.
>>
>> Best Regards,
>> Tariq
>> +91-9741563634
>> https://mtariq.jux.com/
>>
>>
>> On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> Is it ok to have the same column in different column families?
>>>
>>> regards,
>>> Rams
>>>
>>> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <[email protected]>
>>> wrote:
>>>
>>>> You are creating 2 different rows here. A cf means how columns are clubbed
>>>> together as a single entity which is represented by that cf. But here
>>>> you are creating 2 different rows having one cf each, CF1 and CF2
>>>> respectively. If you want to have 1 row with 2 cfs, you have to use the
>>>> same rowkey for both the cfs.
>>>>
>>>>
>>>> Best Regards,
>>>> Tariq
>>>> +91-9741563634
>>>> https://mtariq.jux.com/
>>>>
>>>>
>>>> On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> *Table Name : Customer*
>>>>>
>>>>> *Field Name        Column Family*
>>>>> Customer Number   CF1
>>>>> DOB               CF1
>>>>> FName             CF1
>>>>> MName             CF1
>>>>> LName             CF1
>>>>> Address Type      CF2
>>>>> Address Line1     CF2
>>>>> Address Line2     CF2
>>>>> Address Line3     CF2
>>>>> Address Line4     CF2
>>>>> State             CF2
>>>>> City              CF2
>>>>> Country           CF2
>>>>>
>>>>> Is it good to have the rowkey as follows for the same table?
>>>>>
>>>>> Rowkey Design:
>>>>> --------------
>>>>> For CF1 : Customer Number + YYYYMMDD (business date)
>>>>> For CF2 : Customer Number + Address Type
>>>>>
>>>>> Note :
>>>>> Address Type can be any of HOME/OFFICE/OTHERS
>>>>>
>>>>> regards,
>>>>> Rams
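
To make the two suggestions in this thread concrete (avoid a purely sequential rowkey, and bundle the address fields into one JSON value), here is a minimal Python sketch. It only builds the key and cell bytes and does not talk to HBase. The hash-derived bucket prefix is one common way to spread sequential keys and is an assumption of this sketch, not something proposed in the thread. The names `NUM_BUCKETS`, `salted_rowkey` and `address_cell`, the key layout, and the sample values are all hypothetical.

```python
import hashlib
import json
from typing import Tuple

NUM_BUCKETS = 16  # assumed bucket count for illustration; not from the thread


def salted_rowkey(customer_number: int, buckets: int = NUM_BUCKETS) -> bytes:
    """Build one rowkey per customer, prefixed with a hash-derived bucket.

    The prefix keeps consecutive customer numbers from sorting next to
    each other, while staying deterministic so the key can be recomputed
    for a point lookup.
    """
    digest = hashlib.md5(str(customer_number).encode("ascii")).digest()
    bucket = digest[0] % buckets
    return f"{bucket:02d}|{customer_number:010d}".encode("ascii")


def address_cell(address_type: str, address: dict) -> Tuple[bytes, bytes]:
    """Return (column qualifier, value) for one address.

    The address type (HOME/OFFICE/OTHERS) becomes the qualifier, and all
    address fields are bundled into a single JSON value.
    """
    qualifier = address_type.upper().encode("ascii")
    value = json.dumps(address, sort_keys=True).encode("utf-8")
    return qualifier, value


if __name__ == "__main__":
    key = salted_rowkey(1001)  # made-up customer number
    qualifier, value = address_cell("home", {"city": "Chennai"})  # made-up data
    print(key, qualifier, value)
```

With this layout both column families share the same rowkey, which matches the point above that one row with two cfs needs the same rowkey for both. The trade-off of the bucket prefix is that a range scan over customer numbers has to be issued once per bucket.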
