> Table 2 provides some actual CF/table numbers.  One of the crawl tables has
> 16 CFs and one of the Google Base tables had 29 CFs

What's Google doing in BigTable that enables so many CFs?

Is the cost in HBase the seek to each individual key in the CFs, or is
it the cost of loading each block into RAM (?), which could be
alleviated through bypassing the block cache and accessing the blocks
as if they're local?

On Mon, Jun 13, 2011 at 2:35 PM, Leif Wickland <[email protected]> wrote:
> Thanks for replying, J-D.
>
> My interpretation is that they try to keep that number low, from page 2:
>>
>> "It is our intent that the number of distinct column families in a
>> table be small (in the hundreds at most)"
>>
>
> Table 2 provides some actual CF/table numbers.  One of the crawl tables has
> 16 CFs and one of the Google Base tables had 29 CFs.
>
>
>> Could you just store that in the same family?
>>
>
> Yup.  I could.  There would be a little weirdness to it, but I think it's
> doable.  It seems like that's the consensus suggestion.
>
>
>> Row locking is rarely a good idea, it doesn't scale and they currently
>> aren't persisted anywhere except the RS memory (so if it dies...).
>> Using a single family might be better for you.
>
>
> Thanks for the pointer.
>
> Leif
>
