It sounds like you want to do much of what TableInputFormat already does for MapReduce programs. You can probably reuse code from that package to turn a Scan into input splits, each of which contains a region name and RegionServer location, and consume those splits from your custom coordinator.
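To make the coordinator side concrete, here is a minimal sketch of consuming such splits and handing each co-located datanode the regions on its own machine. It is plain Java with stand-in types: `RegionSplit` and `SplitAssigner` are hypothetical names, not HBase classes, and it assumes each Postgres-XC datanode can be identified by the same hostname as the RegionServer it shares a machine with.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an input split; NOT an HBase class.
final class RegionSplit {
    final String regionName;
    final String host;     // hostname of the RegionServer serving this region
    final String startRow; // first row key of the region ("" = start of table)
    final String stopRow;  // end row key of the region ("" = end of table)

    RegionSplit(String regionName, String host, String startRow, String stopRow) {
        this.regionName = regionName;
        this.host = host;
        this.startRow = startRow;
        this.stopRow = stopRow;
    }
}

// Hypothetical coordinator-side helper.
final class SplitAssigner {
    // Group splits by RegionServer host, so that the datanode on that host
    // (assumed to share the hostname) receives only its local regions.
    static Map<String, List<RegionSplit>> groupByHost(List<RegionSplit> splits) {
        Map<String, List<RegionSplit>> byHost = new TreeMap<>();
        for (RegionSplit split : splits) {
            byHost.computeIfAbsent(split.host, h -> new ArrayList<>()).add(split);
        }
        return byHost;
    }
}
```

The coordinator would then send each datanode only the list stored under its own hostname. Regions can move between RegionServers, so a real implementation would need to refresh this mapping rather than cache it indefinitely.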
-n

On Tuesday, February 3, 2015, Demai Ni <nid...@gmail.com> wrote:

> hi, Guys,
>
> I am looking for a way to read an HBase table through MPP (Postgres-XC),
> and hoping to get some suggestions to either validate or invalidate the
> approach.
>
> Kind of like Apache Drill, but through PostgreSQL. Long story about why
> Postgres, and how c/c++ will give me headaches for months to come. :-) I
> will leave it as is for now.
>
> The design is to have distributed Postgres-XC installed on the same HBase
> cluster, so Postgres' datanodes are on the same physical nodes as HBase's
> RegionServers. Connect to HBase from PostgreSQL through existing HBase
> client code.
>
> Step 1: At the Postgres coordinator node (like the Master of HBase), use
> HTable.getRegionLocations to get all regions of a particular table:
> NavigableMap<HRegionInfo, ServerName>
> Step 2: Iterate through the above NavigableMap to map each HBase
> ServerName to a PG-XC datanode. The goal is to let each Postgres datanode
> handle the regions on its own physical machine.
> Step 3: The Postgres coordinator node sends the execution plan to the
> Postgres datanodes, through an existing framework called foreign data
> wrapper.
> Step 4: Each Postgres datanode iterates through its assigned regions, and
> opens an HBase Client.Scan() with .setStartRow and .setStopRow so it will
> only read the assigned region. I was hoping to use HRegionInfo.regionId
> directly, but can't find such an API in Client.Scan.
> Step 5: The Postgres datanode further analyses the retrieved data.
>
> So in short, the architectural design is to leverage the Postgres
> optimizer to parse the SQL query, and use Postgres datanodes as HBase
> clients to read HBase regions directly in parallel, with the hope to
> 1) read each HRegion locally; 2) leverage existing HBase filters.
>
> On step 4 above, is there a way to talk to a RegionServer directly
> without communicating with HMaster?
>
> Similar ideas (Drill for one, how about HP Vertica?) have been brought up
> and discussed before. So before I head down the same road, can I pick
> your brain? Please shed some light, or prevent me from doing something
> stupid.
>
> Many thanks
>
> Demai
>
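On step 4 in the quoted message, it may help to spell out what "only read the assigned region" means for a start/stop row pair. The following is a rough sketch in plain Java, not HBase code: `RowRange` is a hypothetical name, and it relies on the usual HBase conventions that row keys are ordered as unsigned bytes, the start row is inclusive, the stop row is exclusive, and an empty key means the range is unbounded on that side.

```java
// Hypothetical helper; NOT part of the HBase client API.
final class RowRange {
    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // On a shared prefix, the shorter key sorts first.
        return a.length - b.length;
    }

    // True if row lies in [startRow, stopRow); an empty bound is open-ended.
    static boolean contains(byte[] startRow, byte[] stopRow, byte[] row) {
        boolean atOrAfterStart = startRow.length == 0 || compareRows(row, startRow) >= 0;
        boolean beforeStop = stopRow.length == 0 || compareRows(row, stopRow) < 0;
        return atOrAfterStart && beforeStop;
    }
}
```

A scan bounded by a region's start and end keys returns exactly the rows for which `contains` would be true, which is why the region's key range can stand in for a region id here.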