[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062412#comment-14062412
 ] 

Sangjin Lee commented on MAPREDUCE-5957:
----------------------------------------

Thanks for your comment, Jason.

I've been going back and forth between the two approaches on this. On the one 
hand, if we make the job classloader available early (but hold back setting the 
TCCL), this one-time change would be sufficient to cover future changes when 
another instance of custom class loading is needed. That's why I gravitated to 
that approach. It is true that if the custom class needs to load another class 
via TCCL in its constructor (or in a couple of other methods that get called by 
MRAppMaster), then it is a problem. It's a fairly uncommon scenario, but I can't 
say it should never happen.

Surrounding custom class loading with setting and unsetting of the job 
classloader (both as the configuration classloader and as TCCL) does solve that 
problem. And I can't think of a case where making it available as TCCL during 
that time period would cause a different type of problem (along the lines of 
MAPREDUCE-5751). Even if this thing invokes jetty initialization, for example, 
it would have loaded its own copy of jetty and things would stay consistent 
within that class namespace.
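
To make the set/unset wrapping concrete, here is a minimal sketch of the TCCL 
half of it. This is not the code in the patch: the class name 
JobClassLoaderScope and the helper callWithJobClassLoader are hypothetical, and 
an empty URLClassLoader stands in for the real job classloader. In the AM the 
same scope would presumably also install the loader on the configuration (via 
Configuration.setClassLoader) and restore it afterwards; that part is left out 
so the example runs without Hadoop on the classpath.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;

public class JobClassLoaderScope {

    // Hypothetical helper: runs `action` with `jobClassLoader` installed as the
    // thread context classloader (TCCL), then restores whatever TCCL was set
    // before, even if `action` throws.
    public static <T> T callWithJobClassLoader(ClassLoader jobClassLoader,
                                               Callable<T> action) throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(jobClassLoader);
        try {
            // Custom class loading (e.g. creating the output committer) goes here.
            return action.call();
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader original = Thread.currentThread().getContextClassLoader();
        // Stand-in for the job classloader; the real one is built from the job's jars.
        ClassLoader jobClassLoader = new URLClassLoader(new URL[0], original);

        // Inside the scope the TCCL is the job classloader.
        ClassLoader seen = callWithJobClassLoader(jobClassLoader,
                () -> Thread.currentThread().getContextClassLoader());
        System.out.println(seen == jobClassLoader);   // prints "true"

        // After the scope the original TCCL is back.
        System.out.println(
                Thread.currentThread().getContextClassLoader() == original);   // prints "true"
    }
}
```

Restoring in a finally block is what keeps the job classloader from leaking 
into the rest of MRAppMaster if the custom class fails to load.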

The main problem I have with this approach is that it's a bit more expensive to 
maintain. Every time new code is added to load a custom class in MRAppMaster, 
we *must* remember to wrap it with setting and unsetting the job classloader. 
It's perfectly doable, but it leaves room for mistakes.

I can bring up the second version of the patch that implements the other 
approach. Shall we discuss that a little more then? Let me know your thoughts.

> AM throws ClassNotFoundException with job classloader enabled if custom 
> output format/committer is used
> -------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5957
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: MAPREDUCE-5957.patch, MAPREDUCE-5957.patch
>
>
> With the job classloader enabled, the MR AM throws ClassNotFoundException if 
> a custom output format class is specified.
> {noformat}
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
>       at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
>       at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
>       at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
>       ... 8 more
> Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
>       at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
>       at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
>       ... 10 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
