> Table 2 provides some actual CF/table numbers. One of the crawl tables has
> 16 CFs and one of the Google Base tables had 29 CFs
What's Google doing in BigTable that enables so many CFs? Is the cost in HBase the seek to each individual key in the CFs, or is it the cost of loading each block into RAM (?), which could be alleviated by bypassing the block cache and accessing the blocks as if they're local?

On Mon, Jun 13, 2011 at 2:35 PM, Leif Wickland <[email protected]> wrote:
> Thanks for replying, J-D.
>
> My interpretation is that they try to keep that number low, from page 2:
>>
>> "It is our intent that the number of distinct column families in a
>> table be small (in the hundreds at most)"
>>
>
> Table 2 provides some actual CF/table numbers. One of the crawl tables has
> 16 CFs and one of the Google Base tables had 29 CFs.
>
>
>> Could you just store that in the same family?
>>
>
> Yup. I could. There would be a little weirdness to it, but I think it's
> doable. It seems like that's the consensus suggestion.
>
>
>> Row locking is rarely a good idea, it doesn't scale and they currently
>> aren't persisted anywhere except the RS memory (so if it dies...).
>> Using a single family might be better for you.
>
>
> Thanks for the pointer.
>
> Leif
>
