Sangjin Lee created MAPREDUCE-5146:
--------------------------------------
Summary: application classloader may be used too early to load classes
Key: MAPREDUCE-5146
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5146
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: task
Affects Versions: 2.0.3-alpha
Reporter: Sangjin Lee
At least in the case of YarnChild, the application classloader is set fairly
early, both in the Configuration and as the thread context classloader (TCCL).
As a result, the application classloader starts being asked to load classes
earlier than expected.
There is a fair amount of code that gets invoked between setting the
classloader and executing the mapper/reducer task.
For example, I saw that the application classloader was asked to load a DOM
parser class (com.sun.org.apache.xerces...) as part of initializing the
filesystem. Luckily, in most cases this would be delegated to the parent
classloader as the job classpath would not have those classes.
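To illustrate the delegation described above, here is a minimal sketch. It
assumes a plain URLClassLoader with an empty classpath as a stand-in for the
application classloader (it is not Hadoop's actual classloader), and the class
name ParentDelegationDemo is hypothetical. Because the stand-in has nothing on
its own classpath, a request for a JDK XML class falls through to the parent.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class; not part of Hadoop.
final class ParentDelegationDemo {
    private ParentDelegationDemo() {}

    /**
     * Loads className through a stand-in "application" classloader whose own
     * classpath is empty, so every request is resolved by the parent chain.
     */
    public static Class<?> loadThroughEmptyAppLoader(String className) {
        try (URLClassLoader appLoader =
                 new URLClassLoader(new URL[0], ClassLoader.getSystemClassLoader())) {
            return appLoader.loadClass(className);
        } catch (ClassNotFoundException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> viaApp =
            loadThroughEmptyAppLoader("javax.xml.parsers.DocumentBuilderFactory");
        // Same Class object as the one the JDK itself uses, since the parent defined it.
        System.out.println(viaApp == javax.xml.parsers.DocumentBuilderFactory.class);
        // The defining loader is not the stand-in application loader.
        System.out.println(viaApp.getClassLoader());
    }
}
```

If the stand-in loader did have its own copy of the requested class and did not
delegate to the parent first, the two Class objects would differ, which is the
situation the report is concerned about.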
However, in general, this behavior carries the risk of loading the same class
twice through two different classloaders, which can cause problems such as
ClassCastException. Those would turn into nasty bugs that are hard to diagnose
and fix.
It would be good to either set the application classloader as late as possible
or restrict it more clearly so that it loads only the mapper/reducer classes
and their dependencies.
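One way the "as late as possible" option could look for the TCCL is sketched
below. This is only an illustration under stated assumptions: the class
ScopedContextClassLoader and its callWith method are hypothetical names, not
existing Hadoop APIs, and the sketch does not cover the classloader stored in
Configuration. The idea is to install the application classloader only around
the user code and restore the previous one afterwards.

```java
import java.util.function.Supplier;

// Hypothetical helper; not part of Hadoop.
final class ScopedContextClassLoader {
    private ScopedContextClassLoader() {}

    /**
     * Runs body with loader installed as the thread context classloader,
     * then restores whatever classloader was set before, even on failure.
     */
    public static <T> T callWith(ClassLoader loader, Supplier<T> body) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return body.get();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }
}
```

With a helper like this, framework initialization (for example, filesystem
setup) would run under the original classloader, and only the call that runs
the mapper/reducer would be wrapped in callWith.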