[ 
https://issues.apache.org/jira/browse/CRUNCH-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700935#comment-13700935
 ] 

Tom White commented on CRUNCH-235:
----------------------------------

The TaskAttemptContext incompatibility is a problem, but it should only be a problem for 
Crunch itself, not for Crunch apps. This change would help a bit by removing 
the leakage of incompatible classes into Crunch apps. I agree that you'd have 
to select the right Crunch JAR, although that could be done at runtime, without 
any recompilation.

I did look to see whether the approach we recently used in Parquet to produce a 
single JAR would work in Crunch 
(https://github.com/Parquet/parquet-mr/pull/32#issuecomment-17283008), but as 
you point out, the Crunch code is too performance-critical to use the 
reflection-based approach there. In Parquet it's not a problem since it's only 
per-task or per-job calls to get Configuration that need to be done 
reflectively. In Crunch, OutputEmitter calls write() for every record, so it's 
not a good candidate for reflection.
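
To illustrate the distinction, here is a minimal sketch in plain Java. It is
not the Parquet or Crunch code: FakeTaskContext is a hypothetical stand-in
for a Hadoop context object, and the class and method names are assumptions
made up for this example. The configuration is read reflectively, which is
tolerable when it happens once per task or job, while the per-record write()
path stays a direct call.

```java
import java.lang.reflect.Method;

final class ReflectiveContextAccess {

    /** Hypothetical stand-in for a Hadoop context object (not a real Hadoop class). */
    public static final class FakeTaskContext {
        private final String configuration;
        private long recordsWritten = 0;

        public FakeTaskContext(String configuration) {
            this.configuration = configuration;
        }

        public String getConfiguration() {
            return configuration;
        }

        public void write(String key, String value) {
            recordsWritten++;
        }

        public long getRecordsWritten() {
            return recordsWritten;
        }
    }

    /**
     * Looks up getConfiguration() by name at runtime, so the caller is not
     * compiled against a specific class-or-interface shape of the context.
     * Acceptable for calls made once per task or per job.
     */
    static Object getConfiguration(Object context) {
        try {
            Method getter = context.getClass().getMethod("getConfiguration");
            return getter.invoke(context);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read configuration reflectively", e);
        }
    }

    /**
     * Per-record path: a direct call, with no reflective lookup or invoke
     * inside the loop.
     */
    static long writeAll(FakeTaskContext context, String[] values) {
        for (String value : values) {
            context.write(value, value);
        }
        return context.getRecordsWritten();
    }
}
```

Paying the reflective lookup and invoke on every record is the overhead that
rules this approach out for something like OutputEmitter.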
                
> Avoid exposing incompatible Hadoop classes in Crunch API
> --------------------------------------------------------
>
>                 Key: CRUNCH-235
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-235
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: CRUNCH-235.patch
>
>
> Between Hadoop 1 and 2, org.apache.hadoop.mapreduce.Counter changed from a 
> class to an interface. Therefore, exposing Counter in Crunch's API means that 
> Crunch programs may need to be recompiled when moving from a Hadoop 1 to a 
> Hadoop 2 cluster. It would be nice to avoid the need to recompile. 
> Note that Crunch itself has two artifacts - one for each major version of 
> Hadoop - and the change proposed here would not alter that; it would just 
> mean that the same Crunch program binary could be used with either Crunch 
> artifact.
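
One way to picture the idea in the issue description above is an app-facing
facade whose signatures contain no Hadoop types. The sketch below is
hypothetical: CounterFacade and InMemoryCounterFacade are names invented for
this example, not the Crunch API or the contents of the attached patch. In a
real build, each Hadoop-specific artifact would supply its own implementation
that delegates to that version's Counter.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical app-facing facade. Only JDK types appear in the signatures,
 * so code compiled against it does not link to a Hadoop Counter class or
 * interface directly.
 */
interface CounterFacade {
    void increment(String group, String name, long delta);

    long value(String group, String name);
}

/**
 * In-memory stand-in for a version-specific implementation; a real one
 * would delegate to the Hadoop counter for the running cluster.
 */
final class InMemoryCounterFacade implements CounterFacade {
    private final Map<String, Long> counts = new HashMap<>();

    @Override
    public void increment(String group, String name, long delta) {
        counts.merge(group + ":" + name, delta, Long::sum);
    }

    @Override
    public long value(String group, String name) {
        return counts.getOrDefault(group + ":" + name, 0L);
    }
}
```

Because application code only sees the facade, swapping the implementation
behind it does not require recompiling the application.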

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
