[
https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062412#comment-14062412
]
Sangjin Lee commented on MAPREDUCE-5957:
----------------------------------------
Thanks for your comment, Jason.
I've been going back and forth between the two approaches on this. On the one
hand, if we make the job classloader available early (but hold back setting the
TCCL), this one-time change would be sufficient to cover future changes when
another instance of custom class loading is needed. That's why I gravitated to
that approach. It is true that if the custom class needs to load another class
via TCCL in its constructor (or in a couple of other methods that get called by
MRAppMaster), then it is a problem. It's a fairly uncommon scenario, but I can't
say it should never happen.
Surrounding custom class loading with setting and unsetting of the job
classloader (both as the configuration classloader and as TCCL) does solve that
problem. And I can't think of a case where making it available as TCCL during
that time period would cause a different type of problem (along the lines of
MAPREDUCE-5751). Even if the custom class invoked jetty initialization, for
example, it would have loaded its own copy of jetty and things would stay
consistent within that class namespace.
The main problem I have with this approach is that it's a bit more expensive to
maintain. Every time new code is added to load a custom class in MRAppMaster,
we *must* remember to wrap it with setting and unsetting the job classloader.
It's perfectly doable, but it leaves room for mistakes.
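To make the wrapping concrete, here is a minimal sketch of what "setting and
unsetting" around a custom class load could look like. It is not taken from
either version of the patch: the class name JobClassLoaderScope and the method
callWithClassLoader are hypothetical, and to stay self-contained it only
handles the TCCL. In MRAppMaster the same try/finally would presumably also
swap the configuration classloader via Configuration.setClassLoader.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only; not part of the patch.
final class JobClassLoaderScope {
    private JobClassLoaderScope() {}

    /**
     * Runs the action with the given loader installed as the thread context
     * classloader (TCCL) and restores the previous TCCL afterwards, even if
     * the action throws. A null loader means the job classloader is not
     * enabled, so the action runs unchanged.
     */
    static <T> T callWithClassLoader(ClassLoader jobClassLoader,
                                     Callable<T> action) throws Exception {
        if (jobClassLoader == null) {
            return action.call();
        }
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(jobClassLoader);
        // Assumption: real code would also set the loader on the job
        // Configuration here and restore it in the finally block.
        try {
            return action.call();
        } finally {
            current.setContextClassLoader(previous);
        }
    }
}
```

With something like this, each custom class load in MRAppMaster would be
written as a call through the helper. That does not remove the need to
remember the wrapping, but it keeps the restore logic in one place.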
I can bring up the second version of the patch that implements the other
approach. Shall we discuss that a little more then? Let me know your thoughts.
> AM throws ClassNotFoundException with job classloader enabled if custom
> output format/committer is used
> -------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5957
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.4.0
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Attachments: MAPREDUCE-5957.patch, MAPREDUCE-5957.patch
>
>
> With the job classloader enabled, the MR AM throws ClassNotFoundException if
> a custom output format class is specified.
> {noformat}
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
> at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
> ... 8 more
> Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
> at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
> ... 10 more
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.2#6252)