Hi Nicolas,

Thank you for your explanation. I understand the issue now. As I suspected,
the client Java API is not privy to region operations. I'll look at
alternative solutions.

Gokul.

On 4 March 2015 at 14:05, Nicolas Liochon <[email protected]> wrote:

> It's going to be fairly difficult imho.
> What you need to look at is regions. Tables are split into regions. Regions
> are allocated to region servers (i.e. HBase nodes). Reads and writes are
> directed to the region server owning the region. Regions can move from one
> region server to another; that's the job of the load balancer. Regions can
> be split at any moment. In the HBase client API, you don't really see these
> regions: they're managed internally by HBase (my guess is that locations are
> available anyway, but I'm not sure). If you want locality, you need to run
> the user code on the region server owning the region you're
> reading/writing. But it could be a premature and costly optimization.
>
> Nicolas
>
> On Wed, Mar 4, 2015 at 6:46 AM, Gokul Balakrishnan <[email protected]>
> wrote:
>
> > Hello,
> >
> > I'm fairly new to HBase so would be grateful for any assistance.
> >
> > My project is as follows: use HBase as an underlying data store for an
> > analytics cluster (powered by Apache Spark).
> >
> > In doing this, I'm wondering how I may set about leveraging the locality
> > of the HBase data during processing (in other words, if the Spark instance
> > is running on a node that also houses HBase data, how to make use of the
> > local data first).
> >
> > Is there some form of metadata offered by the Java API which I could then
> > use to organise the data into (virtual) groups based on the locality to
> > be passed forward to Spark? It could be something that *identifies on which
> > node a particular row resides*. I found [1] but I'm not sure if this is
> > what I'm looking for. Could someone please point me in the right
> > direction?
> >
> > [1] https://issues.apache.org/jira/browse/HBASE-12361
> >
> > Thanks so much!
> > Gokul Balakrishnan.
> >
>
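
The region model described in this thread (a table is split into sorted key
ranges, and each range is served by one region server) can be sketched in plain
Java. This is only an illustration under stated assumptions: it does not call
HBase, the class RegionLocalitySketch and the hosts node-a/node-b are
hypothetical, and row keys are compared as ASCII strings rather than as the
byte arrays HBase actually uses. In a real cluster the start keys and hosts
would have to come from HBase itself (for instance via the client's
RegionLocator, if your version provides it).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not HBase code: models "regions are sorted key ranges,
// each hosted by one region server" with made-up data.
class RegionLocalitySketch {

    // Region start key -> host currently serving that region.
    // By convention the first region of a table starts at the empty key "".
    private final TreeMap<String, String> hostByStartKey = new TreeMap<>();

    void addRegion(String startKey, String host) {
        hostByStartKey.put(startKey, host);
    }

    // The region owning a row is the one with the greatest start key <= row key.
    String hostFor(String rowKey) {
        Map.Entry<String, String> region = hostByStartKey.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("no region covers row " + rowKey);
        }
        return region.getValue();
    }

    // Groups row keys by the node that hosts them, i.e. the "virtual groups"
    // one might hand to a scheduler that prefers local data.
    Map<String, List<String>> groupByHost(List<String> rowKeys) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String rowKey : rowKeys) {
            groups.computeIfAbsent(hostFor(rowKey), h -> new ArrayList<>()).add(rowKey);
        }
        return groups;
    }

    public static void main(String[] args) {
        RegionLocalitySketch table = new RegionLocalitySketch();
        table.addRegion("", "node-a");   // rows before "g"
        table.addRegion("g", "node-b");  // rows from "g" up to "p"
        table.addRegion("p", "node-a");  // rows from "p" onwards
        // prints {node-a=[apple, zebra], node-b=[kiwi, grape]}
        System.out.println(table.groupByHost(Arrays.asList("apple", "kiwi", "zebra", "grape")));
    }
}
```

Because regions can move or split at any moment, a mapping like this is only a
snapshot and would need to be refreshed before each job.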
