Hi there,

Scott will most probably correct me if I'm wrong, or provide more details,
but I think they are already creating the HFiles (via MR) on the servers
the regions belong to. So after loading them into HBase, everything should
be on the same node and data locality should be preserved. If a region
splits, the balancer might break this, but in general it should hold.
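
To make the locality point concrete, here is a rough sketch of my own (not
HBase code; the host names and the helper are made up for illustration). It
treats locality as the fraction of a region's HFile blocks that have a
replica on the hosting regionserver, counting blocks rather than bytes to
keep things simple:

```python
def locality_index(block_hosts, region_server):
    """Fraction of blocks with at least one replica on region_server.

    block_hosts: list of sets, one set of replica host names per block.
    Simplification: counts blocks, not bytes.
    """
    if not block_hosts:
        return 0.0
    local = sum(1 for hosts in block_hosts if region_server in hosts)
    return local / len(block_hosts)


# Hypothetical layout: HFiles written by a task running on rs1, so the
# first replica of every block landed on rs1.
blocks_written_on_rs1 = [
    {"rs1", "dn4", "dn7"},
    {"rs1", "dn2", "dn5"},
]

# Region still served by rs1: every block is local.
before_move = locality_index(blocks_written_on_rs1, "rs1")

# Balancer moved the region to rs3, which holds no replicas: nothing is local.
after_move = locality_index(blocks_written_on_rs1, "rs3")
```

If the region moves to a server that holds none of the replicas, the index
drops until a compaction rewrites the files on the new host.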

JM


2013/8/13 Elliott Clark <[email protected]>

> On Mon, Aug 12, 2013 at 9:58 PM, lars hofhansl <[email protected]> wrote:
> > For example we could add an RPC to the regionserver and have the
> > regionserver who would own the region copy the appropriate part of the file
> > (then the data would be local). Or even simpler, instead of actually
> > copying the files we could just copy in the reference files and let the
> > usual compactions take care of the reference files.
>
> That will already be taken care of in trunk.  The favored nodes will
> assign preference to data nodes.  Then since we queue compactions for
> anything with reference files everything should be created on the
> local server and two others.  Then if the balancer needs to it should
> have two other targets where most things are data local.
>
