[ https://issues.apache.org/jira/browse/ACCUMULO-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002705#comment-14002705 ]
Corey J. Nolet commented on ACCUMULO-2553:
------------------------------------------
[~kturner], I may also investigate having reducers that can process ranges
for multiple groups, which I suppose could cut down on the number of reducers
needed. The group gets passed into the reducer with the key (I now have a
GroupedKey class that encapsulates the group name and the key), so I know
which folder to write the file to. I'm also wondering if it would be worth
trying to make the number of sub-bins independent too, to help cut down on
hotspots.
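
A minimal sketch of the idea in the comment above, in plain Java with no Hadoop dependency. Everything here is hypothetical and for illustration only: the `GroupedKey` shown is a stand-in, not the actual class from the patch, and `outputDir`, `partition`, and `splitsByGroup` are assumed names. It reads "independent" as each group having its own split points (so its own number of sub-bins), and it treats a split point as the inclusive end of its bin. Bins are numbered across all groups and folded onto the available reducers, so one reducer can handle ranges from several groups.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupedKeySketch {

    /** Hypothetical stand-in for a GroupedKey: a group name plus a row key. */
    static final class GroupedKey implements Comparable<GroupedKey> {
        final String group;
        final String row;

        GroupedKey(String group, String row) {
            this.group = group;
            this.row = row;
        }

        /** Sorts by group first, then by row, so each group's keys stay contiguous. */
        @Override
        public int compareTo(GroupedKey other) {
            int c = group.compareTo(other.group);
            return c != 0 ? c : row.compareTo(other.row);
        }
    }

    /** Folder a reducer would write this key's file to: one folder per group. */
    static String outputDir(String baseDir, GroupedKey key) {
        return baseDir + "/" + key.group;
    }

    /**
     * Picks a reducer index for a key. Each group brings its own sorted split
     * points (assumption: a split point is the inclusive end of its bin), so the
     * number of sub-bins can differ per group. Bins are numbered globally in
     * sorted group order and then mapped onto numReducers with a modulo.
     */
    static int partition(GroupedKey key, Map<String, List<String>> splitsByGroup, int numReducers) {
        int offset = 0;
        for (Map.Entry<String, List<String>> e : new TreeMap<>(splitsByGroup).entrySet()) {
            List<String> splits = e.getValue();
            if (e.getKey().equals(key.group)) {
                int bin = 0;
                // Advance past every split point that sorts strictly before the row.
                while (bin < splits.size() && key.row.compareTo(splits.get(bin)) > 0) {
                    bin++;
                }
                return (offset + bin) % numReducers;
            }
            offset += splits.size() + 1; // n split points define n + 1 bins
        }
        throw new IllegalArgumentException("unknown group: " + key.group);
    }
}
```

The modulo is the simplest possible way to share reducers across groups; it does nothing to balance load, so a real job would likely need a smarter assignment of bins to reducers to avoid the hotspots mentioned above.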
> AccumuloFileOutputFormat should be able to support output for multiple tables.
> ------------------------------------------------------------------------------
>
> Key: ACCUMULO-2553
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2553
> Project: Accumulo
> Issue Type: New Feature
> Reporter: Corey J. Nolet
> Assignee: Corey J. Nolet
> Priority: Minor
>
> This may not necessarily be something that would require changes in the
> AccumuloFileOutputFormat itself. Perhaps the ability to use it with Hadoop's
> MultipleOutputs is really the solution.
> It would be useful if the user could specify multiple directories where
> RFiles should be placed and have a mechanism for populating the RFiles in the
> necessary directories based on a table name or group name.