Per column family flush is available in HBase 1.1 onwards.

Cheers

On Fri, Jun 23, 2017 at 2:18 AM, Alexander Ilyin <[email protected]> wrote:

> Brian, Ted, thank you for your answers.
>
> Ted, could you point out the HBase version where per column family flush
> first appeared?
>
>
> On Thu, Jun 22, 2017 at 4:06 PM, Ted Yu <[email protected]> wrote:
>
> > bq. HBase doesn't do well with more than 2-3 column families
> >
> > The above is out of date - we have per column family flush which would
> > reduce the number of small hfiles.
> >
> > bq. Why can't we just create several tables instead?
> >
> > Currently hbase doesn't provide transaction across region boundary. This
> > means with more than one table, burden is on application code to
> > achieve transaction where needed.
> > Since the multiple tables tend to have same row key design as you
> > mentioned, region servers carry more regions, increasing load on
> > assignment manager / balancer, etc.
> >
> > Cheers
> >
> > On Thu, Jun 22, 2017 at 5:44 AM, Alexander Ilyin <[email protected]>
> > wrote:
> >
> > > Hi,
> > >
> > > A general question regarding column families. It is said in the doc that
> > > HBase doesn't do well with more than 2-3 column families because flushing
> > > and compactions are done on a per region basis which should be addressed
> > > in the future: http://hbase.apache.org/book.html#number.of.cfs
> > >
> > > Is it still the case in new versions of HBase or there were some
> > > improvements on this?
> > >
> > > I also don't understand why using several column families might be useful
> > > even if data access is column scoped. Why can't we just create several
> > > tables instead? Row key is stored with every cell anyway and it's possible
> > > to filter by column when querying.
> > >
> > > In general, I don't see when it might make sense to have more than one
> > > column family in a table with current limitations.
> > >
> > > Thanks in advance.
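
For anyone reading the archive later: the "small hfiles" point in the thread can be illustrated with a toy model. The sketch below is not HBase code. The function name simulate, the parameters flush_size and lower_bound, and the byte counts are all made up for illustration, and the selection rule (flush only the families whose memstore has reached a lower bound, falling back to all families if none has) is only a simplified approximation of per column family flush.

```python
from collections import Counter


def simulate(writes, flush_size, lower_bound=None):
    """Toy model of memstore flushing for a single region.

    writes      -- iterable of (family, nbytes) pairs, applied in order
    flush_size  -- total memstore size of the region that triggers a flush
    lower_bound -- None models a per-region flush (every family is flushed);
                   a number models a per-column-family flush (only families
                   holding at least that many bytes are flushed)

    Returns the list of (family, nbytes) files written by flushes.
    All names and thresholds here are hypothetical, not HBase settings.
    """
    memstore = Counter()
    files = []
    for family, nbytes in writes:
        memstore[family] += nbytes
        if sum(memstore.values()) < flush_size:
            continue  # region memstore still below the flush trigger
        if lower_bound is None:
            selected = list(memstore)
        else:
            selected = [f for f, size in memstore.items() if size >= lower_bound]
            if not selected:
                # Assumed fallback: nothing is large enough, flush everything.
                selected = list(memstore)
        for f in selected:
            if memstore[f] > 0:
                files.append((f, memstore[f]))
                memstore[f] = 0
    return files


if __name__ == "__main__":
    # Skewed workload: family "d" receives 9x the data of family "m".
    writes = [w for _ in range(10) for w in (("d", 90), ("m", 10))]
    per_region = simulate(writes, flush_size=100)
    per_family = simulate(writes, flush_size=100, lower_bound=50)
    print("per-region flush files:", per_region)
    print("per-family flush files:", per_family)
```

With this skewed workload the per-region variant writes a tiny file for "m" every time "d" fills the memstore, while the per-family variant lets "m" accumulate before it is written out. Real behaviour depends on the configured flush policy and thresholds, which this model does not try to reproduce.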
