[ 
https://issues.apache.org/jira/browse/ACCUMULO-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777678#comment-13777678
 ] 

Corey J. Nolet commented on ACCUMULO-391:
-----------------------------------------

I don't like forcing the user to follow a specific configuration order. I 
know they'll only need to configure it once, but it's a tedious 
trial-and-error process until they either pull down the codebase or figure out 
the right order in which to call the methods. Perhaps a clear warning during 
getInputSplits() or initialize() in the mappers would be enough for someone to 
see in the logs why their stuff failed. I agree with William: they'll only need 
to do this once in most cases.

On the other topic: the iterators, ranges, and columns are inherently tied to a 
table. In the case of a single-table input format, I can see why separate 
methods could be used. I like the idea of having a TableConfiguration object 
that has the iterators, ranges, and columns serialized within it. It would 
simplify the API immensely and ease the concern that each configuration must 
be in a valid state by the time the getInputSplits() method is called. Perhaps 
this could also be used in the MultiTableBatchScanner implementation.
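
A minimal sketch of the idea, assuming a hypothetical TableConfiguration class. 
None of these names or signatures come from the Accumulo codebase; ranges, 
columns, and iterators are plain strings here only to keep the example free of 
Accumulo dependencies. The point is that everything tied to a table travels 
together, so call order stops mattering:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not the Accumulo API. Each per-table setting lives on
// one object, so there is no required order between the setter calls.
final class TableConfiguration {
    private final String tableName;
    private final List<String> ranges = new ArrayList<>();
    private final List<String> columns = new ArrayList<>();
    private final List<String> iterators = new ArrayList<>();

    TableConfiguration(String tableName) {
        if (tableName == null || tableName.isEmpty()) {
            throw new IllegalArgumentException("table name is required");
        }
        this.tableName = tableName;
    }

    // Fluent setters: each returns this so calls can be chained in any order.
    TableConfiguration addRange(String range) {
        ranges.add(range);
        return this;
    }

    TableConfiguration addColumn(String column) {
        columns.add(column);
        return this;
    }

    TableConfiguration addIterator(String iterator) {
        iterators.add(iterator);
        return this;
    }

    String getTableName() {
        return tableName;
    }

    // Read-only views so callers cannot mutate the configuration by accident.
    List<String> getRanges() {
        return Collections.unmodifiableList(ranges);
    }

    List<String> getColumns() {
        return Collections.unmodifiableList(columns);
    }

    List<String> getIterators() {
        return Collections.unmodifiableList(iterators);
    }
}
```

For example, new TableConfiguration("t1").addIterator("it").addRange("r1") and 
the same calls in the opposite order produce equivalent configurations. A real 
version would still need to be serialized into the job configuration, which 
this sketch leaves out.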

That's a significant API change to introduce in 1.6.0. We could preserve 
backwards compatibility by having the current single-table setter methods 
hydrate a TableConfiguration object under the hood that would be treated as a 
"default table".
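
One way the "default table" approach might look, as a sketch under stated 
assumptions: InputConfig, resolveTables(), and the pared-down 
TableConfiguration below are all hypothetical names for illustration, not 
existing Accumulo classes or methods. The legacy-style setters fill in a 
lazily created default configuration, and validation happens once, at the 
point where getInputSplits() would run:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, pared-down table configuration; not the Accumulo API.
final class TableConfiguration {
    private String tableName; // may be set late by the legacy setters
    private final List<String> ranges = new ArrayList<>();

    TableConfiguration setTableName(String name) {
        this.tableName = name;
        return this;
    }

    TableConfiguration addRange(String range) {
        ranges.add(range);
        return this;
    }

    String getTableName() {
        return tableName;
    }

    List<String> getRanges() {
        return ranges;
    }
}

// Hypothetical holder for the input format's settings.
final class InputConfig {
    private TableConfiguration defaultTable; // hydrated by the legacy setters
    private final List<TableConfiguration> extraTables = new ArrayList<>();

    private TableConfiguration defaultConfig() {
        if (defaultTable == null) {
            defaultTable = new TableConfiguration();
        }
        return defaultTable;
    }

    // Legacy-style single-table setters; the call order does not matter.
    void setInputTableName(String name) {
        defaultConfig().setTableName(name);
    }

    void addRange(String range) {
        defaultConfig().addRange(range);
    }

    // New-style multi-table entry point.
    void addTable(TableConfiguration config) {
        extraTables.add(config);
    }

    // Validate once, where the splits would be computed, with a clear message.
    List<TableConfiguration> resolveTables() {
        List<TableConfiguration> all = new ArrayList<>();
        if (defaultTable != null) {
            all.add(defaultTable);
        }
        all.addAll(extraTables);
        for (TableConfiguration t : all) {
            if (t.getTableName() == null) {
                throw new IllegalStateException(
                    "A table configuration has no table name set");
            }
        }
        return all;
    }
}
```

Existing callers would keep using the single-table setters unchanged, while 
new callers could add further tables through addTable().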
                
> Multi-table Accumulo input format
> ---------------------------------
>
>                 Key: ACCUMULO-391
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-391
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: John Vines
>            Assignee: Corey J. Nolet
>            Priority: Minor
>              Labels: mapreduce,
>             Fix For: 1.6.0
>
>         Attachments: ACCUMULO-391.patch, multi-table-if.patch, 
> new-multitable-if.patch
>
>
> Just realized we had no MR input method which supports multiple Tables for an 
> input format. I would see it making the table the mapper's key and making the 
> Key/Value a tuple, or alternatively have the Table/Key be the key tuple and 
> stick with Values being the value.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
