It depends on how you access the table.  Three to four column families
may be an appropriate schema if you are mostly accessing individual cfs.
It's when you do cross-cf accesses that things can slow down (if most of
your accesses fetch all of the data, then just have one cf).  Multiple
cfs, if all are active at the same time, can also make the server's
internal accounting a little messy.  We've not spent much time studying
and optimizing for this case, e.g. multi-cf flushing, compacting, and
querying.  Because of this, query times can be slower.
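
To make the access-pattern point concrete, here is a toy sketch in Java.
It is not HBase code: the class FamilyStoreModel and all of its methods
are made up for illustration, and the only assumption it encodes is that
each column family is kept in its own store, so a read has to visit one
store per family it asks for.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only, not HBase code: one sorted in-memory store per column family. */
public class FamilyStoreModel {
    // family -> row -> qualifier -> value
    private final Map<String, Map<String, Map<String, String>>> stores = new TreeMap<>();
    // Number of per-family stores visited by reads since the last reset.
    private int storesTouched = 0;

    public FamilyStoreModel(String... families) {
        for (String family : families) {
            stores.put(family, new TreeMap<>());
        }
    }

    private Map<String, Map<String, String>> requireStore(String family) {
        Map<String, Map<String, String>> store = stores.get(family);
        if (store == null) {
            throw new IllegalArgumentException("unknown family: " + family);
        }
        return store;
    }

    public void put(String row, String family, String qualifier, String value) {
        requireStore(family).computeIfAbsent(row, r -> new TreeMap<>()).put(qualifier, value);
    }

    /** Reads one row; passing no families means "read every family". */
    public Map<String, Map<String, String>> get(String row, String... families) {
        Collection<String> wanted =
                families.length == 0 ? stores.keySet() : Arrays.asList(families);
        Map<String, Map<String, String>> result = new TreeMap<>();
        for (String family : wanted) {
            Map<String, Map<String, String>> store = requireStore(family);
            storesTouched++; // each requested family costs one store visit
            Map<String, String> cells = store.get(row);
            if (cells != null) {
                result.put(family, cells);
            }
        }
        return result;
    }

    public int storesTouched() {
        return storesTouched;
    }

    public void resetCounter() {
        storesTouched = 0;
    }

    public static void main(String[] args) {
        FamilyStoreModel table = new FamilyStoreModel("a", "p", "q", "r");
        table.put("row1", "a", "name", "strong entity");
        table.put("row1", "p", "item1", "weak entity");

        table.get("row1", "p");                     // single-family read
        System.out.println(table.storesTouched());  // prints 1

        table.resetCounter();
        table.get("row1");                          // read across all four families
        System.out.println(table.storesTouched());  // prints 4
    }
}
```

In the real Java client the same restriction is expressed by calling
Get.addFamily(...) on the Get before issuing it; the toy above only
counts store visits and says nothing about actual flush or compaction
behaviour.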

St.Ack

On Mon, Sep 12, 2011 at 12:05 AM, Stuti Awasthi <[email protected]> wrote:
> Hi,
>
> I am also looking for an answer to a similar question. In my scenario we will 
> have petabytes of data to handle. Currently I am working with a schema that 
> has 3-4 column families. What major issues can we face if we have 
> multiple column families?
>
> I have read that each column family is stored as a separate HFile on the 
> regionserver, and that searching by row-id and column family is useful 
> because the client goes only to the HFile for that specific column family.
> If we use a flat table structure instead, we will end up with more 
> tables and with data replication because of the dependencies between the data.
>
> Please suggest
>
>
> -----Original Message-----
> From: Imran M Yousuf [mailto:[email protected]]
> Sent: Saturday, September 10, 2011 6:55 AM
> To: [email protected]
> Subject: Re: Using multiple column families
>
> Hi J-D,
>
> Thanks for your feedback.
>
> (replies inline)
> On Sat, Sep 10, 2011 at 5:39 AM, Jean-Daniel Cryans <[email protected]> 
> wrote:
>> 20k rows? If this is your only use case, you don't need HBase :)
>>
>
> It's one of several others
>
>> If it's 20k rows times a gazillion columns per row, then I would
>> recommend flattening out the rows instead.
>>
>
> Well, our guess at the moment is that there would not be more than 500 cells 
> per family to start with.
>
>> If it's just one small table among others, then you probably won't be
>> bothered by the multiple families.
>>
>
> We actually have many other tables which are flattened out to a single column 
> family and this is one table for which we are using more than 1 column family.
>
> Thanks once again.
>
> Imran
>
>> J-D
>>
>> On Thu, Sep 8, 2011 at 10:07 PM, Imran M Yousuf <[email protected]> wrote:
>>> Hi,
>>>
>>> Firstly, I have read in the mailing list before that having more than
>>> 1 column family is not recommended. I am more interested to know
>>> whether it is a problem in my use case as well or not.
>>>
>>> I have a strong entity and it has 6 weak entities, each with a 1-to-many
>>> relationship to the strong entity. Furthermore, they are all
>>> loaded in a mutually exclusive manner, i.e. if A is the strong entity
>>> and its weak entities are P, Q, R, S, T, U, then no 2 weak
>>> entities are accessed at once. Moreover, their lifecycles are
>>> independent of each other. My current implementation has one
>>> column family for the strong entity and one for each weak entity.
>>> So for a given row I only load one column family at a time. The
>>> obvious advantages are that
>>> - deleting the strong entity automatically deletes the weak entities,
>>> as they are a single row;
>>> - deleting all weak entities of one kind for a specific strong entity
>>> is as simple as deleting all cells in a column family for a row.
>>> Our assumption (much higher than what we expect) is that we will not
>>> have more than 20k rows in that table. Under these circumstances,
>>> how bad is it to have 7 column families?
>>>
>>> We would be glad if you would kindly share thoughts and feedback on this 
>>> issue.
>>>
>>> Thank you,
>>>
>>> --
>>> Imran M Yousuf
>>> Entrepreneur & CEO
>>> Smart IT Engineering Ltd.
>>> Dhaka, Bangladesh
>>> Twitter: @imyousuf - http://twitter.com/imyousuf
>>> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
>>> Mobile: +880-1711402557
>>>
>>
>
>
>
>
