[ 
https://issues.apache.org/jira/browse/ACCUMULO-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002705#comment-14002705
 ] 

Corey J. Nolet commented on ACCUMULO-2553:
------------------------------------------

[~kturner], I may also investigate having reducers that can process ranges
for multiple groups. I suppose that could cut down on the number of reducers
needed. The group gets passed into the reducer with the key (I now have a
GroupedKey class that encapsulates the group name and the key), so I know
which folder to write the file to. I'm also wondering whether it would be
worth making the number of sub-bins independent as well, to help cut down on
hotspots.
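
To illustrate the GroupedKey idea, here is a minimal sketch rather than the
actual class, which is not shown in this thread. The String key (standing in
for an Accumulo Key), the field and method names, and the one-folder-per-group
layout under a base directory are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the GroupedKey class mentioned above. The real
// class would presumably wrap an Accumulo Key; a String is used here so the
// sketch has no external dependencies.
final class GroupedKey implements Comparable<GroupedKey> {
    private final String group; // table or group name
    private final String key;   // simplified placeholder for the Accumulo key

    GroupedKey(String group, String key) {
        this.group = Objects.requireNonNull(group);
        this.key = Objects.requireNonNull(key);
    }

    String getGroup() { return group; }

    String getKey() { return key; }

    // Assumed layout: one sub-folder per group under a common base directory.
    String outputDir(String baseDir) {
        return baseDir + "/" + group;
    }

    // Order by group first, then by key, so that each group's keys are
    // contiguous and sorted when a reducer handles more than one group.
    @Override
    public int compareTo(GroupedKey other) {
        int byGroup = group.compareTo(other.group);
        return byGroup != 0 ? byGroup : key.compareTo(other.key);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof GroupedKey)) return false;
        GroupedKey other = (GroupedKey) o;
        return group.equals(other.group) && key.equals(other.key);
    }

    @Override
    public int hashCode() {
        return Objects.hash(group, key);
    }

    @Override
    public String toString() {
        return group + ":" + key;
    }
}

class GroupedKeyDemo {
    public static void main(String[] args) {
        List<GroupedKey> keys = new ArrayList<>();
        keys.add(new GroupedKey("tableB", "row1"));
        keys.add(new GroupedKey("tableA", "row2"));
        keys.add(new GroupedKey("tableA", "row1"));
        Collections.sort(keys);
        // Print each key with the folder its file would be written to.
        for (GroupedKey k : keys) {
            System.out.println(k + " -> " + k.outputDir("/bulk"));
        }
    }
}
```

Sorting by group before key is one possible design choice: a reducer that
receives ranges for several groups would then see each group's keys together
and could switch output folders only when the group changes.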

> AccumuloFileOutputFormat should be able to support output for multiple tables.
> ------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-2553
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2553
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Corey J. Nolet
>            Assignee: Corey J. Nolet
>            Priority: Minor
>
> This may not necessarily be something that would require changes in the 
> AccumuloFileOutputFormat itself. Perhaps the ability to use it with Hadoop's 
> MultipleOutputs is really the solution.
> It would be useful if the user could specify multiple directories where 
> RFiles should be placed and have a mechanism for populating the RFiles in the 
> necessary directories based on a table name or group name. 



