[ https://issues.apache.org/jira/browse/CASSANDRA-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586953#comment-13586953 ]
Illarion Kovalchuk commented on CASSANDRA-5251:
-----------------------------------------------
Well, in our case we have multiple column families that hold different aspects
of information about the same objects. We want to merge them in a single
map-reduce pass, so that the mapper receives data from all column families
(distinguishing them by context.getCurrentSplit()).
I think you're right that this may cause random I/O. If it does, could you
please suggest a workaround?
Thank you.
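
To make the intended merge concrete, here is a minimal sketch in plain Java. It does not use any Hadoop or Cassandra classes: TaggedRow, mergeByRowKey and the "profiles"/"activity" column family names are hypothetical stand-ins, and it simply assumes each row already carries the name of the column family it was read from (in the real job that tag would come from the input split).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MultiCfMergeSketch {
    /** Hypothetical stand-in for one row read from one column family. */
    static final class TaggedRow {
        final String columnFamily;
        final String rowKey;
        final Map<String, String> columns;

        TaggedRow(String columnFamily, String rowKey, Map<String, String> columns) {
            this.columnFamily = columnFamily;
            this.rowKey = rowKey;
            this.columns = columns;
        }
    }

    /**
     * Merges rows that share a row key into one view per object.
     * Column names are prefixed with the source column family so that
     * equally named columns from different families do not collide.
     */
    static Map<String, Map<String, String>> mergeByRowKey(List<TaggedRow> rows) {
        Map<String, Map<String, String>> merged = new TreeMap<>();
        for (TaggedRow row : rows) {
            Map<String, String> target =
                merged.computeIfAbsent(row.rowKey, k -> new TreeMap<>());
            for (Map.Entry<String, String> e : row.columns.entrySet()) {
                target.put(row.columnFamily + ":" + e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<TaggedRow> rows = new ArrayList<>();
        rows.add(new TaggedRow("profiles", "user1", Map.of("name", "Ann")));
        rows.add(new TaggedRow("activity", "user1", Map.of("last_login", "2013-02-26")));
        rows.add(new TaggedRow("profiles", "user2", Map.of("name", "Bob")));
        System.out.println(mergeByRowKey(rows));
    }
}
```

In an actual job the tagging would happen in the mapper and the merge in the reducer; the sketch collapses both into one method only to show the shape of the data.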
> Hadoop support should be able to work with multiple column families
> -------------------------------------------------------------------
>
> Key: CASSANDRA-5251
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5251
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop
> Affects Versions: 1.1.0, 1.1.11, 1.2.0, 2.0
> Reporter: Illarion Kovalchuk
> Priority: Minor
> Attachments: trunk-5251.txt
>
>
> This patch affects the API, so I also updated the Hadoop example in it. The
> main difference is that ColumnFamilyInputFormat now generates splits for all
> input column families, and ColumnFamilyOutputFormat no longer works with
> List<Mutation> but with List<Pair<String,Mutation>>, where Pair.left is the
> column family name.
> Thank you
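
As an illustration of the output shape described above, the following sketch groups a list of (column family name, mutation) pairs by column family. It is not taken from the patch: the Pair class here is a hypothetical minimal stand-in that only mirrors the Pair.left naming, mutations are represented by plain strings, and groupByColumnFamily is an assumed helper, not part of ColumnFamilyOutputFormat.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PairedMutationSketch {
    /** Hypothetical minimal pair; left holds the column family name. */
    static final class Pair<L, R> {
        final L left;
        final R right;

        Pair(L left, R right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Groups the right-hand values by Pair.left, keeping first-seen order. */
    static <M> Map<String, List<M>> groupByColumnFamily(List<Pair<String, M>> pairs) {
        Map<String, List<M>> grouped = new LinkedHashMap<>();
        for (Pair<String, M> p : pairs) {
            grouped.computeIfAbsent(p.left, k -> new ArrayList<>()).add(p.right);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Strings stand in for Mutation objects in this sketch.
        List<Pair<String, String>> out = new ArrayList<>();
        out.add(new Pair<>("profiles", "set name=Ann"));
        out.add(new Pair<>("activity", "set last_login=2013-02-26"));
        out.add(new Pair<>("profiles", "set city=Kyiv"));
        System.out.println(groupByColumnFamily(out));
    }
}
```

A reducer written against the changed API would emit one such pair per mutation, so a single job can write to several column families.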