HBASE-3721 was integrated into trunk, not 0.90.x.
HBASE-3871 is under review.

So I would interpret my answer as tilting toward outputting HFiles that fit
within a single region.

If, after your effort, there are still some HFiles that don't fit, you can
try my patches.
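
To make "fits within a single region" concrete, here is a rough sketch in
plain Java. It uses no HBase classes, and the class and method names are made
up for illustration. It assumes you already have the table's region start
keys in sorted order (the first one being the empty key) and that row keys
compare as unsigned bytes. regionIndexFor() is the kind of lookup a
partitioner configured with region boundaries would perform, and
fitsInOneRegion() checks whether an HFile's first and last row keys land in
the same region, i.e. whether the file would need splitting at load time.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch only: hypothetical helper, not part of HBase or Hadoop.
final class RegionFitSketch {

    // Convenience for building example keys.
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Unsigned lexicographic comparison of two keys (requires Java 9+).
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // startKeys must be sorted and startKeys[0] must be the empty key.
    // Returns the index of the last region whose start key is <= rowKey.
    static int regionIndexFor(byte[][] startKeys, byte[] rowKey) {
        int lo = 0;
        int hi = startKeys.length - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1; // upper middle, so the loop terminates
            if (compare(startKeys[mid], rowKey) <= 0) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    // True when the whole key range [firstKey, lastKey] lies in one region.
    static boolean fitsInOneRegion(byte[][] startKeys, byte[] firstKey, byte[] lastKey) {
        return regionIndexFor(startKeys, firstKey) == regionIndexFor(startKeys, lastKey);
    }

    public static void main(String[] args) {
        // Three example regions: ["", "g"), ["g", "p"), ["p", end of table).
        byte[][] startKeys = { bytes(""), bytes("g"), bytes("p") };
        System.out.println(regionIndexFor(startKeys, bytes("m")));
        System.out.println(fitsInOneRegion(startKeys, bytes("h"), bytes("o")));
        System.out.println(fitsInOneRegion(startKeys, bytes("e"), bytes("k")));
    }
}
```

If every reducer only receives keys with the same region index, each HFile it
writes passes the fitsInOneRegion() check. Files that fail the check are the
ones that would have to be split during the load.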

Thanks

2011/5/25 Panayotis Antonopoulos <[email protected]>

>
> So your answer would be that it is better to have the best possible load
> balancing during the reduce phase than to take care to output HFiles that
> fit within a single region, because the splitting done by Incremental Load
> is rather fast?
>
> > Date: Wed, 25 May 2011 09:20:10 -0700
> > Subject: Re: HFiles that fit within a single region VS better load balancing at reduce phase
> > From: [email protected]
> > To: [email protected]
> >
> > LoadIncrementalHFiles would split an HFile if it doesn't fit within a
> > single region.
> >
> > Please refer to the following JIRAs, which speed up LoadIncrementalHFiles:
> > https://issues.apache.org/jira/browse/HBASE-3871
> > https://issues.apache.org/jira/browse/HBASE-3721
> >
> > Note: the parallelized splitting of HFile(s) by LoadIncrementalHFiles is
> > done on a single machine.
> >
> > Thanks
> >
> > 2011/5/25 Panayotis Antonopoulos <[email protected]>
> >
> > >
> > > Hello,
> > > I am currently working on an MR job that will output HFiles to be bulk
> > > loaded into an HBase table.
> > > According to the HBase site, for bulk loading to be efficient, each HFile
> > > produced by the MR job should fit within a single region.
> > > To achieve that, I use the TotalOrderPartitioner so that each reducer
> > > gets key/value pairs from a single region.
> > > However, this prevents me from partitioning the mappers' output into
> > > equal splits, which would give the best possible load balancing during
> > > the reduce phase.
> > >
> > > So I would like to ask how important it is to create HFiles that fit
> > > within a single region.
> > > If it makes bulk loading much faster, it is probably better to sacrifice
> > > load balancing.
> > > But is this the case?
> > > Has anyone tried both choices?
> > >
> > > Thank you in advance!
> > > Panagiotis.
> > >
>
>
