Hi Nicolas,

Thank you for your explanation. I understand the issue now, and it's as I suspected: the client Java API is not privy to the region operations. I'll look at alternative solutions.
Gokul.

On 4 March 2015 at 14:05, Nicolas Liochon <[email protected]> wrote:

> It's going to be fairly difficult imho.
> What you need to look at is region. Tables are split in regions. Regions
> are allocated to region server (i.e. an hbase node). Reads and writes are
> directed to the region server owning the region. Regions can move from one
> region server to another, that's the job of the load balancer. Regions can
> be split at any moment. In the HBase client API, you don't really see these
> regions: it's managed internally by HBase (my guess is that locations are
> available anyway, but I'm not sure). If you want locality, you need to run
> the user code on the region server owning the region you're
> reading/writing. But it could be a premature and costly optimization.
>
> Nicolas
>
> On Wed, Mar 4, 2015 at 6:46 AM, Gokul Balakrishnan <[email protected]>
> wrote:
>
> > Hello,
> >
> > I'm fairly new to HBase so would be grateful for any assistance.
> >
> > My project is as follows: use HBase as an underlying data store for an
> > analytics cluster (powered by Apache Spark).
> >
> > In doing this, I'm wondering how I may set about leveraging the locality
> > of the HBase data during processing (in other words, if the Spark
> > instance is running on a node that also houses HBase data, how to make
> > use of the local data first).
> >
> > Is there some form of metadata offered by the Java API which I could then
> > use to organise the data into (virtual) groups based on the locality to
> > be passed forward to Spark? It could be something that *identifies on
> > which node a particular row resides*. I found [1] but I'm not sure if
> > this is what I'm looking for. Could someone please point me in the right
> > direction?
> >
> > [1] https://issues.apache.org/jira/browse/HBASE-12361
> >
> > Thanks so much!
> >
> > Gokul Balakrishnan.
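
For anyone reading the archive later, here is a rough sketch of the idea discussed above: regions own contiguous row-key ranges, each region is assigned to one region server, and a row's node is whichever server owns the region covering that key. This is a toy model in plain Java, not the HBase client API. The class `RegionLookupSketch`, its methods, and the host names are all hypothetical, and it assumes ASCII row keys so that `String` ordering matches byte-wise key ordering. A real client would have to obtain the start keys and hosts from HBase itself (I believe the 1.0 client exposes a `RegionLocator` for this, but please verify against your version).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: illustrates row -> region -> host lookup, not HBase itself.
final class RegionLookupSketch {
    // Region start key -> host name of the region server that owns the region.
    // The first region of a table starts at the empty key.
    private final TreeMap<String, String> startKeyToHost = new TreeMap<>();

    // Record (or update) which host owns the region starting at startKey.
    // Calling this again for the same start key models a region move.
    void assign(String startKey, String host) {
        startKeyToHost.put(startKey, host);
    }

    // The owning region is the one with the greatest start key <= rowKey.
    String hostFor(String rowKey) {
        Map.Entry<String, String> region = startKeyToHost.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("No region covers row " + rowKey);
        }
        return region.getValue();
    }

    // Group row keys into per-host buckets, e.g. to hand each bucket
    // to a worker running on that host.
    Map<String, List<String>> groupByHost(List<String> rowKeys) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String rowKey : rowKeys) {
            groups.computeIfAbsent(hostFor(rowKey), h -> new ArrayList<>()).add(rowKey);
        }
        return groups;
    }

    public static void main(String[] args) {
        RegionLookupSketch regions = new RegionLookupSketch();
        regions.assign("", "node-a");   // region ["", "m")
        regions.assign("m", "node-b");  // region ["m", "t")
        regions.assign("t", "node-c");  // region ["t", end)
        System.out.println(regions.groupByHost(
                Arrays.asList("apple", "mango", "zebra", "kiwi")));
    }
}
```

Note that any such grouping is only a snapshot: as Nicolas says, the balancer can move a region and a region can split at any moment, so the mapping would need refreshing and the job must still work correctly when a "local" row turns out to be remote.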
