Hi,

I am considering doing reduce-side joins, where one input would be read from HDFS and the other from an HBase table.

Is it somehow possible to use

TableMapReduceUtil.initTableMapperJob(table, scan, Mapper_HBase.class, ..., job);

and

MultipleInputs.addInputPath(job, path, ..., Mapper_HDFS.class)

at the same time for one job?
When I tried to use both, MultipleInputs seemed to take priority: Mapper_HBase was not executed. It does execute when I remove the MultipleInputs call.


And is there something equivalent to MultipleInputs for HBase tables, e.g. a MultipleTableInputs? I saw there was a request for this here:
https://issues.apache.org/jira/browse/HBASE-2965


A workaround would be to write the scan results to HDFS first and then do the reduce-side join using MultipleInputs, but I would like to avoid this additional I/O overhead.
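
To make the intended join concrete, here is a minimal plain-Java sketch of the reduce-side join logic itself. It does not use Hadoop or HBase at all: the class ReduceSideJoinSketch, the Tagged type, and the sample rows are hypothetical and only illustrate the idea that each mapper tags its records with their source and the reducer pairs up records that share a key. As a simplification, each side holds at most one value per key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceSideJoinSketch {

    /** A value tagged with the source it came from ("HDFS" or "HBASE"). */
    public static final class Tagged {
        final String source;
        final String value;

        Tagged(String source, String value) {
            this.source = source;
            this.value = value;
        }
    }

    /** Stand-in for the two mappers plus the shuffle: tag each value and group by join key. */
    public static Map<String, List<Tagged>> shuffle(Map<String, String> hdfsRows,
                                                    Map<String, String> hbaseRows) {
        Map<String, List<Tagged>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> e : hdfsRows.entrySet()) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                   .add(new Tagged("HDFS", e.getValue()));
        }
        for (Map.Entry<String, String> e : hbaseRows.entrySet()) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                   .add(new Tagged("HBASE", e.getValue()));
        }
        return grouped;
    }

    /** Stand-in for the reducer: inner join of the HDFS side and the HBase side per key. */
    public static List<String> join(Map<String, List<Tagged>> grouped) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<Tagged>> e : grouped.entrySet()) {
            List<String> fromHdfs = new ArrayList<>();
            List<String> fromHbase = new ArrayList<>();
            for (Tagged t : e.getValue()) {
                if ("HDFS".equals(t.source)) {
                    fromHdfs.add(t.value);
                } else {
                    fromHbase.add(t.value);
                }
            }
            // Keys present on only one side produce no output (inner join).
            for (String l : fromHdfs) {
                for (String r : fromHbase) {
                    out.add(e.getKey() + "\t" + l + "\t" + r);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> hdfs = new HashMap<>();
        hdfs.put("k1", "a");
        hdfs.put("k2", "b");
        Map<String, String> hbase = new HashMap<>();
        hbase.put("k1", "x");
        hbase.put("k3", "y");
        for (String line : join(shuffle(hdfs, hbase))) {
            System.out.println(line);
        }
    }
}
```

In the real job, the two tagging loops would be Mapper_HDFS and Mapper_HBase, and the question is how to feed both into the same shuffle.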

Thanks,
Christopher


