It sounds like you want to do much of what TableInputFormat
does for MapReduce programs. You can probably reuse code from that
package to turn a Scan into input splits, which contain the region name
and RegionServer location, and consume those from your custom coordinator.
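
As a rough illustration of the coordinator side, here is a minimal sketch in plain Java with no HBase dependency. RegionSplit and SplitAssigner are hypothetical names, standing in for whatever the coordinator keeps after reading the splits (key range plus host). The sketch assumes hosts are compared by bare host name (a ServerName also carries a port and start code, which would need stripping first), and the round-robin fallback for regions with no co-located datanode is just one possible policy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for one input split: a region's key range plus its host. */
final class RegionSplit {
    final String startRow; // inclusive; "" means start of table
    final String stopRow;  // exclusive; "" means end of table
    final String host;     // host name of the RegionServer serving the region

    RegionSplit(String startRow, String stopRow, String host) {
        this.startRow = startRow;
        this.stopRow = stopRow;
        this.host = host;
    }
}

/** Hypothetical coordinator helper: decides which datanode scans which region. */
final class SplitAssigner {
    /**
     * Assigns each split to the datanode running on the same host. Splits whose
     * host has no datanode are handed out round-robin (an assumed policy).
     */
    static Map<String, List<RegionSplit>> assign(List<RegionSplit> splits,
                                                 Set<String> dataNodeHosts) {
        if (dataNodeHosts.isEmpty()) {
            throw new IllegalArgumentException("at least one datanode is required");
        }
        // Sort the datanode hosts so the resulting plan is deterministic.
        List<String> nodes = new ArrayList<>(new TreeSet<>(dataNodeHosts));
        Map<String, List<RegionSplit>> plan = new LinkedHashMap<>();
        for (String node : nodes) {
            plan.put(node, new ArrayList<>());
        }
        int next = 0; // round-robin cursor for non-local splits
        for (RegionSplit split : splits) {
            String target = split.host;
            if (!plan.containsKey(target)) {
                target = nodes.get(next % nodes.size());
                next++;
            }
            plan.get(target).add(split);
        }
        return plan;
    }
}
```

Each datanode would then open one scan per assigned split, bounded by that split's startRow and stopRow. Regions can move or split between planning and execution, so locality here is a hint rather than a guarantee.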

-n

On Tuesday, February 3, 2015, Demai Ni <nid...@gmail.com> wrote:

> hi, Guys,
>
> I am looking for a way to read an HBase table through an MPP database
> (Postgres-XC), and I am hoping to get some suggestions that either
> validate or invalidate the approach.
>
> It is kind of like Apache Drill, but through PostgreSQL. It is a long story
> why Postgres, and how C/C++ will give me headaches for months to come. :-) I
> will leave it at that for now.
>
> The design is to install distributed Postgres-XC on the same cluster as
> HBase, so that the Postgres datanodes are on the same physical nodes as the
> HBase RegionServers, and to connect to HBase from PostgreSQL through the
> existing HBase client code.
>
> Step 1: At the Postgres coordinator node (like the HBase Master), use
> HTable.getRegionLocations to get all regions of a particular table:
> NavigableMap<HRegionInfo, ServerName>
> Step 2: Iterate through the above NavigableMap to map each HBase ServerName
> to a PG-XC datanode. The goal is to let each Postgres datanode handle the
> regions on its own physical machine.
> Step 3: The Postgres coordinator node sends the execution plan to the
> Postgres datanodes through an existing framework called foreign data wrapper.
> Step 4: Each Postgres datanode iterates through its assigned regions and
> opens an HBase Client.Scan() with .setStartRow and .setStopRow so that it
> reads only the assigned region. I was hoping to use HRegionInfo.regionId
> directly, but can't find such an API in Client.Scan.
> Step 5: The Postgres datanode further analyses the retrieved data.
>
> So in short, the architectural design is to leverage the Postgres optimizer
> to parse the SQL query, and to use the Postgres datanodes as HBase clients
> that read HBase regions directly in parallel, in the hope of 1) reading
> HRegions locally; 2) leveraging existing HBase filters.
>
> On step 4 above, is there a way to talk to a RegionServer directly without
> communicating with the HMaster?
>
> Similar ideas (Drill for one; how about HP Vertica?) have been brought up
> and discussed before. So before I head down the same road, can I pick your
> brains? Please shed some light, or prevent me from doing something
> stupid.
>
> Many thanks
>
> Demai
>
