[ https://issues.apache.org/jira/browse/CASSANDRA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002930#comment-13002930 ]
Matt Kennedy commented on CASSANDRA-2245:
-----------------------------------------
I've taken a crack at coding this up, but I'm not thrilled with the results. I
agree with Brandon that CASSANDRA-1600 is the best way to deal with this issue.
The get_indexed_slices method does not accept a key_range parameter, which a
MapReduce job needs in order to confine each query to its own input split. I'm
reviewing that discussion at the moment to see whether there is a way to get a
patch for something like this functionality out prior to 0.8 without breaking
the Thrift API.
> Enable map reduce to use indexes for ColumnFamilyInputFormat
> ------------------------------------------------------------
>
> Key: CASSANDRA-2245
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2245
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop
> Affects Versions: 0.7.2
> Environment: Cassandra 0.7 or later and Hadoop 0.20.1 or later
> Reporter: Matt Kennedy
> Priority: Minor
> Labels: hadoop
> Fix For: 0.8
>
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> Enable the ability to run a MapReduce job that takes a value in an indexed
> column as a parameter, and use that to select the data that the MapReduce job
> operates on. Right now, it looks like this isn't possible because
> org.apache.cassandra.hadoop.ColumnFamilyRecordReader will only fetch data
> with get_range_slices, not get_indexed_slices.
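For illustration only, the gap described above can be sketched with a small in-memory model. This is not Cassandra code: the class and method names (IndexedSplitSketch, rangeScan, indexedRangeScan) are hypothetical, rows are assumed to be ordered by key, and a single indexed column per row is assumed. The first method stands in for what a range scan over one input split returns today; the second stands in for the requested behavior, an index match confined to that same split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model; none of these names are Cassandra APIs.
class IndexedSplitSketch {
    // Row key -> value of the single indexed column, kept in key order (an assumption).
    private final TreeMap<String, String> rows = new TreeMap<>();

    public void put(String key, String indexedValue) {
        rows.put(key, indexedValue);
    }

    // Analogue of a plain range scan: every row key in [startKey, endKey],
    // regardless of the indexed column's value.
    public List<String> rangeScan(String startKey, String endKey) {
        return new ArrayList<>(rows.subMap(startKey, true, endKey, true).keySet());
    }

    // Analogue of the requested behavior: only row keys in [startKey, endKey]
    // whose indexed column equals the given value.
    public List<String> indexedRangeScan(String startKey, String endKey, String value) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> e : rows.subMap(startKey, true, endKey, true).entrySet()) {
            if (value.equals(e.getValue())) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }
}
```

In this model, each map task would call indexedRangeScan with the key bounds of its own split, so the splits together cover every matching row exactly once. A real cluster may assign splits by token range rather than by key order, which this sketch does not attempt to model.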