Brian, Ted, thank you for your answers.

Ted, could you point out the HBase version where per-column-family flush
first appeared?


On Thu, Jun 22, 2017 at 4:06 PM, Ted Yu <[email protected]> wrote:

> bq. HBase doesn't do well with more than 2-3 column families
>
> The above is out of date: we now have per-column-family flush, which
> reduces the number of small HFiles.
>
> bq. Why can't we just create several tables instead?
>
> Currently HBase doesn't provide transactions across region boundaries. This
> means that with more than one table, the burden is on application code to
> achieve transactions where needed.
> Since the multiple tables tend to have the same row key design, as you
> mentioned, region servers carry more regions, increasing the load on the
> assignment manager, balancer, etc.
>
> Cheers
>
> On Thu, Jun 22, 2017 at 5:44 AM, Alexander Ilyin <[email protected]>
> wrote:
>
> > Hi,
> >
> > A general question regarding column families. The docs say that HBase
> > doesn't do well with more than 2-3 column families because flushing and
> > compactions are done on a per-region basis, which should be addressed in
> > the future: http://hbase.apache.org/book.html#number.of.cfs
> >
> > Is this still the case in newer versions of HBase, or have there been
> > some improvements?
> >
> > I also don't understand why using several column families might be useful
> > even if data access is column-scoped. Why can't we just create several
> > tables instead? The row key is stored with every cell anyway, and it's
> > possible to filter by column when querying.
> >
> > In general, I don't see when it might make sense to have more than one
> > column family in a table, given the current limitations.
> >
> > Thanks in advance.
> >
>
