[ https://issues.apache.org/jira/browse/CRUNCH-231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Wills updated CRUNCH-231:
------------------------------
Attachment: mapred.patch
This is a patch I put together to determine whether such a thing was even
possible, and it turns out that, with some crazy reflection and Javassist
hacking, it is. We end up wrapping the Mapper and Reducer instances inside
DoFns, so they can be integrated as part of any Crunch pipeline (i.e., you can
mix and match existing Mappers and Reducers with new DoFns and other library
calls in the same pipeline and underlying MapReduce execution).
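A minimal sketch of the wrapping idea, for illustration only. It is not the code in mapred.patch and does not use the real Hadoop or Crunch classes: LegacyMapper, OutputCollector, Emitter, DoFn, MapperFn, and TokenCountMapper below are all hypothetical stand-ins defined inside the example, and the reflection and Javassist work the patch relies on is left out entirely. It only shows how a callback-style mapper might be adapted so that it looks like a one-input, many-output function.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class LegacyMapperSketch {

    // Hypothetical stand-in for an old-style collector callback.
    interface OutputCollector<K, V> {
        void collect(K key, V value);
    }

    // Hypothetical stand-in for a legacy mapper that emits via a collector.
    interface LegacyMapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, OutputCollector<K2, V2> out);
    }

    // Hypothetical stand-in for a pipeline emitter.
    interface Emitter<T> {
        void emit(T value);
    }

    // Hypothetical stand-in for a pipeline function: one input, zero or more outputs.
    abstract static class DoFn<S, T> {
        abstract void process(S input, Emitter<T> emitter);
    }

    // Adapter: exposes a legacy mapper as a DoFn over key/value pairs.
    static class MapperFn<K1, V1, K2, V2>
            extends DoFn<Map.Entry<K1, V1>, Map.Entry<K2, V2>> {
        private final LegacyMapper<K1, V1, K2, V2> mapper;

        MapperFn(LegacyMapper<K1, V1, K2, V2> mapper) {
            this.mapper = mapper;
        }

        @Override
        void process(Map.Entry<K1, V1> input, Emitter<Map.Entry<K2, V2>> emitter) {
            // Route every collect() call from the mapper into the emitter as a pair.
            mapper.map(input.getKey(), input.getValue(),
                    (k, v) -> emitter.emit(new SimpleEntry<>(k, v)));
        }
    }

    // An "existing" mapper we want to reuse unchanged: emits (token, 1) per word.
    static class TokenCountMapper implements LegacyMapper<Long, String, String, Integer> {
        @Override
        public void map(Long offset, String line, OutputCollector<String, Integer> out) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    out.collect(token, 1);
                }
            }
        }
    }

    // Drives the wrapped mapper over some lines, the way a pipeline stage would.
    static List<Map.Entry<String, Integer>> run(List<String> lines) {
        DoFn<Map.Entry<Long, String>, Map.Entry<String, Integer>> fn =
                new MapperFn<>(new TokenCountMapper());
        List<Map.Entry<String, Integer>> results = new ArrayList<>();
        long offset = 0;
        for (String line : lines) {
            fn.process(new SimpleEntry<>(Long.valueOf(offset), line), results::add);
            offset += line.length() + 1;
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b", "c")));
    }
}
```

Because the adapter is itself just another function in the pipeline's vocabulary, its output can in principle be fed to newly written functions in the same way as any other stage, which is the mix-and-match property described above.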
The patch supports both the old mapred.* APIs and the newer mapreduce.* APIs,
with the mapred APIs being a bit easier and cleaner to support. There's still
more integration testing that needs to be filled in, but I thought I would post
this to see if anyone wanted to weigh in before I took it much further.
> Support legacy Mappers and Reducers in Crunch pipelines
> -------------------------------------------------------
>
> Key: CRUNCH-231
> URL: https://issues.apache.org/jira/browse/CRUNCH-231
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Reporter: Josh Wills
> Assignee: Josh Wills
> Attachments: mapred.patch
>
>
> I've had a few requests for Crunch to support existing Mappers and Reducers
> using the underlying Java APIs as part of regular pipelines, so that users
> could evolve existing MapReduce jobs into Crunch pipelines gradually, instead
> of being forced to rewrite everything all at once in order to map it onto
> Crunch's model.