[
https://issues.apache.org/jira/browse/CASSANDRA-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871177#action_12871177
]
Jeremy Hanna commented on CASSANDRA-1124:
-----------------------------------------
I had been looking for uses of the NetworkTopology class, since it is not
deprecated and is still used in various places in the code. It is what the
deprecated mapred.FileInputFormat used.
As for using all JobTrackers or extending DNSToSwitchMapping, I wonder whether
there could be a cleaner approach if Hadoop exposed this better to external
projects. For example, we could create a Hadoop ticket to let external code
make cleaner use of locality, and then help that along. Since this isn't a
huge priority, it might be worthwhile to see if we can help build a bridge for
other projects at the same time.
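To make the DNSToSwitchMapping option a bit more concrete, here is a minimal
sketch of the kind of host-to-location mapping it implies. It is only an
illustration under stated assumptions: the class name RackAwareMappingSketch
and its register method are hypothetical, the datacenter and rack names are
made up, and the class does not depend on Hadoop or Cassandra. It only mirrors
the shape of Hadoop's resolve(List<String>) method and the "/default-rack"
fallback; a real version would implement the Hadoop interface and read the
topology from Cassandra's configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch: maps host names to "/datacenter/rack" style network
 * locations. It mimics the shape of Hadoop's DNSToSwitchMapping.resolve but
 * is deliberately free of any Hadoop or Cassandra dependency.
 */
public class RackAwareMappingSketch {
    // Fallback location for hosts we know nothing about.
    static final String DEFAULT_LOCATION = "/default-rack";

    // host -> "/datacenter/rack"; populated by hand here (assumption).
    private final Map<String, String> locationByHost = new HashMap<>();

    /** Records the datacenter and rack for one host. */
    public void register(String host, String datacenter, String rack) {
        locationByHost.put(host, "/" + datacenter + "/" + rack);
    }

    /** Returns one network location per input host, in the same order. */
    public List<String> resolve(List<String> hosts) {
        List<String> locations = new ArrayList<>(hosts.size());
        for (String host : hosts) {
            locations.add(locationByHost.getOrDefault(host, DEFAULT_LOCATION));
        }
        return locations;
    }

    public static void main(String[] args) {
        RackAwareMappingSketch mapping = new RackAwareMappingSketch();
        mapping.register("10.0.0.1", "DC1", "RAC1");
        mapping.register("10.0.0.2", "DC1", "RAC2");
        // The third host is unregistered and falls back to the default.
        System.out.println(
            mapping.resolve(Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.9")));
        // prints [/DC1/RAC1, /DC1/RAC2, /default-rack]
    }
}
```

The point of the sketch is that the scheduler only needs a path-like string
per host to reason about rack and datacenter proximity. How that string would
best be supplied from outside Hadoop is exactly the open question above.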
> Improve Cassandra to MapReduce locality sharing
> -----------------------------------------------
>
> Key: CASSANDRA-1124
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1124
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop
> Reporter: Jeremy Hanna
> Priority: Minor
>
> Currently, the hadoop integration only passes the data's local node
> information (ColumnFamilyRecordReader-RowIterator-getLocation). Hadoop can
> take advantage of full locality and it's possible that we have full locality
> configured in Cassandra.
> So this improvement is for adding the full locality of the data into the
> String in a way that hadoop can make use of it with its Job/Task Trackers.
> This will allow for jobs to be potentially on the same rack and/or datacenter
> if possible.