On Wed, Mar 16, 2011 at 11:30 PM, Otis Gospodnetic <[email protected]> wrote:
> If I'm reading http://hbase.apache.org/book/schema.html#number.of.cfs correctly,
> the advice is not to have more than 2-3 CFs per table?
> And what happens if I have say 6 CFs per table?
>
> Again if I read the above page correctly, the problem is that uneven data
> distribution will mean that whenever 1 of my CFs needs to be flushed, the
> remaining 5 CFs will also get flushed at the same time, and this may (or will?)
> trigger compaction for all CFs' files creating a sudden IO hit?
>
> Is there a good solution for this problem?
> Should one then have 6 different tables, each with just 1 CF instead of having 1
> table with 6 CFs?
>
Just to say that the reason we do not handle more than 3-4 CFs well is that we haven't done the work to make it work nicely. As is, we do dumb stuff like the above-mentioned flush of all CFs when one is at its limit, even if the others are small. We also do stuff like serialize lookups across the CFs instead of running them in parallel when a query spans CFs (fixing this is one of the oldest issues in HBase).

St.Ack
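To make the flush behaviour described above concrete, here is a rough toy model in Python. It is not HBase code: `ToyRegion`, `FLUSH_LIMIT`, `simulate`, and the write sizes are all made up for illustration, and the only rule modelled is the one from this thread (when one CF reaches its limit, every CF is flushed).

```python
from collections import defaultdict

FLUSH_LIMIT = 100  # hypothetical per-CF limit, arbitrary units (not an HBase setting)


class ToyRegion:
    """Toy model: all column families flush whenever any one hits the limit."""

    def __init__(self, families):
        self.memstore = {cf: 0 for cf in families}
        self.flushed_files = defaultdict(list)  # cf -> sizes of the files written

    def write(self, cf, size):
        self.memstore[cf] += size
        if self.memstore[cf] >= FLUSH_LIMIT:
            self.flush_all()

    def flush_all(self):
        # Every CF holding data is flushed, however little it has buffered.
        for cf, buffered in self.memstore.items():
            if buffered > 0:
                self.flushed_files[cf].append(buffered)
                self.memstore[cf] = 0


def simulate(families, writes):
    """Replay (cf, size) writes and return the files flushed per CF."""
    region = ToyRegion(families)
    for cf, size in writes:
        region.write(cf, size)
    return dict(region.flushed_files)


# One busy CF and five quiet ones: each round writes 50 units to cf0
# and 1 unit to each of the others.
families = [f"cf{i}" for i in range(6)]
writes = []
for _ in range(4):
    writes.append(("cf0", 50))
    writes.extend((cf, 1) for cf in families[1:])

# One table with 6 CFs: the quiet CFs are dragged along by cf0's flushes.
shared = simulate(families, writes)

# Six tables with 1 CF each: every CF flushes only on its own limit.
separate = {cf: simulate([cf], [w for w in writes if w[0] == cf]) for cf in families}

print(shared["cf0"], shared["cf1"])      # [100, 100] [1, 2]
print(separate["cf0"], separate["cf1"])  # {'cf0': [100, 100]} {}
```

In this toy model the shared layout writes 12 files, 10 of them tiny, while the split layout writes only the 2 files for the busy CF. That is only the flush side of the story; it says nothing about the serialized cross-CF lookups or about the costs of running more tables.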
