[ https://issues.apache.org/jira/browse/HBASE-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193027#comment-13193027 ]
Alexey Romanenko commented on HBASE-2965:
-----------------------------------------
It seems this has not been implemented yet, has it?
> Implement MultipleTableInputs which is analogous to MultipleInputs in Hadoop
> ----------------------------------------------------------------------------
>
> Key: HBASE-2965
> URL: https://issues.apache.org/jira/browse/HBASE-2965
> Project: HBase
> Issue Type: New Feature
> Components: mapred, mapreduce
> Reporter: Adam Warrington
> Assignee: Ophir Cohen
> Priority: Minor
>
> This feature would be helpful for doing reduce-side joins, or even passing
> similarly structured data from multiple tables through MapReduce. The API I
> envision would be very similar to the existing MultipleInputs, parts of
> which could be reused.
> MultipleTableInputs would have a public API like:
> class MultipleTableInputs {
>   public static void addInputTable(Job job, Table table, Scan scan,
>       Class<? extends TableInputFormatBase> inputFormatClass,
>       Class<? extends Mapper> mapperClass);
> }
> MultipleTableInputs would build a mapping of Tables to configured
> TableInputFormats the same way MultipleInputs builds a mapping between Paths
> and InputFormats. Since most people will probably use TableInputFormat.class
> as the input format class, the MultipleTableInputs implementation will have
> to replace the private scan and table members of TableInputFormatBase, which
> are configured from within setConf() when an instance of TableInputFormat is
> created. It would do so by calling setScan and setHTable with the scan and
> table that are passed into addInputTable above. The addInputTable() method
> of MultipleTableInputs would also set the input format for the job to
> DelegatingTableInputFormat, described below.
> A new class called DelegatingTableInputFormat would be analogous to
> DelegatingInputFormat, where getSplits() would return TaggedInputSplits (same
> TaggedInputSplit object that the Hadoop DelegatingInputFormat uses), which
> tag the split with its InputFormat and Mapper. These are created by looping
> through the HTable-to-InputFormat mappings, calling getSplits on each input
> format, and passing each split, its input format, and its mapper as
> constructor arguments to TaggedInputSplit.
> The createRecordReader() function in DelegatingTableInputFormat could have
> the same implementation as the Hadoop DelegatingInputFormat.
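
For reference, the delegation idea in the description can be sketched without
any HBase or Hadoop dependencies. Everything below is a hypothetical stand-in:
TableSplitSource plays the role of a configured per-table input format,
TaggedTableSplit plays the role of TaggedInputSplit, and the mapping is kept in
memory on an instance rather than in a Job. It only illustrates how splits
from several tables could be tagged with their input format and mapper; it is
not the proposed implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a configured per-table input format.
// This is NOT an HBase or Hadoop type.
interface TableSplitSource {
    List<String> getSplits(String tableName);
}

// Stand-in for a tagged split: the raw split plus the input format that
// produced it and the mapper that should consume it.
final class TaggedTableSplit {
    final String split;
    final String table;
    final TableSplitSource inputFormat;
    final String mapperName;

    TaggedTableSplit(String split, String table,
                     TableSplitSource inputFormat, String mapperName) {
        this.split = split;
        this.table = table;
        this.inputFormat = inputFormat;
        this.mapperName = mapperName;
    }
}

// Assumed, simplified analogue of the proposed MultipleTableInputs plus the
// getSplits() step of DelegatingTableInputFormat.
final class MultipleTableInputsSketch {
    private static final class Registration {
        final TableSplitSource inputFormat;
        final String mapperName;

        Registration(TableSplitSource inputFormat, String mapperName) {
            this.inputFormat = inputFormat;
            this.mapperName = mapperName;
        }
    }

    // Insertion-ordered, so splits come back in registration order.
    private final Map<String, Registration> inputs = new LinkedHashMap<>();

    // Records which input format and mapper belong to a table.
    void addInputTable(String table, TableSplitSource inputFormat,
                       String mapperName) {
        inputs.put(table, new Registration(inputFormat, mapperName));
    }

    // Loops over the table-to-input-format mapping, asks each input format
    // for its splits, and tags every split with its origin and mapper.
    List<TaggedTableSplit> getSplits() {
        List<TaggedTableSplit> tagged = new ArrayList<>();
        for (Map.Entry<String, Registration> e : inputs.entrySet()) {
            Registration r = e.getValue();
            for (String split : r.inputFormat.getSplits(e.getKey())) {
                tagged.add(new TaggedTableSplit(
                        split, e.getKey(), r.inputFormat, r.mapperName));
            }
        }
        return tagged;
    }
}
```

Registering two tables with different mapper names and then calling
getSplits() returns one combined list in which every split still carries the
mapper it should be routed to, which is the property a reduce-side join
across tables would rely on.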