[ https://issues.apache.org/jira/browse/CRUNCH-441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071647#comment-14071647 ]
David Whiting commented on CRUNCH-441:
--------------------------------------

After two of us spent a full day on it, we determined the following:
- Tez DAGs map reasonably well to the graphs built by Crunch's MapReduce implementation, so there's no reason why this shouldn't be possible in theory.
- Tez's API is much lower level than we expected, meaning that the implementation might well be more complex than we anticipated. It does appear to have a slightly higher-level Map-Reduce-Reduce implementation which could make for an easier transition for Crunch, but it was difficult to find information about this.
- The Tez API does not seem to be particularly stable yet.

We probably won't have time to look at this again for a while, so if anyone wants to take the baton it'd be really great. There are presumably implementations around for similar things (such as in the Hive source and in a Cascading branch somewhere) that could be used for reference; otherwise maybe we'll take another look when the API and docs seem a bit more stable and complete.

> Crunch on Tez
> -------------
>
>                 Key: CRUNCH-441
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-441
>             Project: Crunch
>          Issue Type: Improvement
>            Reporter: David Whiting
>
> Tez is potentially a better drop-in replacement for MR than Spark on many
> existing Hadoop environments, because it doesn't require always-on resources
> and is less memory-hungry than Spark whilst still providing huge performance
> gains as can be seen in new versions of Hive.

--
This message was sent by Atlassian JIRA
(v6.2#6252)