Hi,

I am also looking for an answer to a similar question. In my scenario we will
have petabytes of data to handle. Currently I am working with a schema that has
3-4 column families. What major issues can we face if we have multiple column
families?

I have read that each column family is stored in its own HFiles on the
region server, so searching by row ID and column family is useful because the
read only has to go to the HFiles for that specific column family.
If we use a flat table structure instead, we will end up with more tables and
with replicated data, because the data sets depend on each other.
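
To make my understanding concrete, here is a minimal toy model in Python. It is
not HBase code and does not use the HBase client API; the class ToyRegion, its
methods, and the family names are hypothetical names I made up for
illustration. It only sketches the idea that each column family has its own
store, so a read restricted to one family touches only that store, while a
read without a family has to consult all of them.

```python
class ToyRegion:
    """Toy in-memory model with one store per column family (not real HBase)."""

    def __init__(self, families):
        # One independent store per family, standing in for per-family HFiles.
        self.stores = {family: {} for family in families}
        # Log of which stores each read consulted, for illustration only.
        self.stores_read = []

    def put(self, row, family, qualifier, value):
        self.stores[family].setdefault(row, {})[qualifier] = value

    def get(self, row, families=None):
        # Without a family restriction, every store must be consulted.
        targets = list(self.stores) if families is None else list(families)
        result = {}
        for family in targets:
            self.stores_read.append(family)
            cells = self.stores[family].get(row)
            if cells:
                result[family] = dict(cells)
        return result


# Hypothetical schema with three families.
region = ToyRegion(["info", "orders", "audit"])
region.put("row1", "info", "name", "A")
region.put("row1", "orders", "o1", "P")

# Read scoped to one family: only that store is touched.
scoped = region.get("row1", families=["orders"])
scoped_reads = list(region.stores_read)

# Unscoped read: all three stores are touched.
region.stores_read.clear()
full = region.get("row1")
full_reads = list(region.stores_read)

print(scoped_reads, full_reads)
```

In this sketch the scoped read consults one store and the unscoped read
consults three, which is the benefit I am hoping for. What the model does not
show is the cost side of having several families, and that is what I would
like to understand.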

Please advise.


-----Original Message-----
From: Imran M Yousuf [mailto:[email protected]]
Sent: Saturday, September 10, 2011 6:55 AM
To: [email protected]
Subject: Re: Using multiple column families

Hi J-D,

Thanks for your feedback.

(replies inline)
On Sat, Sep 10, 2011 at 5:39 AM, Jean-Daniel Cryans <[email protected]> wrote:
> 20k rows? If this is your only use case, you don't need HBase :)
>

It's one of several use cases.

> If it's 20k rows times a gazillion columns per row, then I would
> recommend flattening out the rows instead.
>

Well, our guess at the moment is that there would not be more than 500 cells
per family to start with.

> If it's just one small table among others, then you probably won't be
> bothered by the multiple families.
>

We actually have many other tables which are flattened out to a single column 
family and this is one table for which we are using more than 1 column family.

Thanks once again.

Imran

> J-D
>
> On Thu, Sep 8, 2011 at 10:07 PM, Imran M Yousuf <[email protected]> wrote:
>> Hi,
>>
>> Firstly, I have read in the mailing list before that having more than
>> 1 column family is not recommended. I am more interested to know
>> whether it is a problem in my use case as well or not.
>>
>> I have a strong entity and it has 6 weak entities, all with a 1-to-many
>> relationship to the strong entity. Furthermore, they are all
>> loaded in a mutually exclusive manner, i.e. if A is the strong entity and
>> its weak entities are P, Q, R, S, T, U, then no 2 weak
>> entities are accessed at once. Moreover, their lifecycles are
>> independent of each other. My current implementation has one
>> column family for the strong entity and one for each weak entity.
>> So for a given row I only load one column family at a time. The
>> obvious advantages are that
>> - deleting the strong entity automatically deletes the weak entities, as
>> they are all in a single row;
>> - deleting all weak entities of one kind for a specific strong entity
>> is as simple as deleting all cells in a column family for a row.
>>
>> Our assumption (much higher than what we expect) is
>> that we will not have more than 20k rows in that table. Under these
>> circumstances, how bad is it to have 7 column families?
>>
>> We would be glad if you would kindly share thoughts and feedback on this 
>> issue.
>>
>> Thank you,
>>
>> --
>> Imran M Yousuf
>> Entrepreneur & CEO
>> Smart IT Engineering Ltd.
>> Dhaka, Bangladesh
>> Twitter: @imyousuf - http://twitter.com/imyousuf
>> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
>> Mobile: +880-1711402557
>>
>



