Sangjin Lee created MAPREDUCE-5146:
--------------------------------------

             Summary: application classloader may be used too early to load classes
                 Key: MAPREDUCE-5146
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5146
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: task
    Affects Versions: 2.0.3-alpha
            Reporter: Sangjin Lee


At least in the case of YarnChild, the application classloader is set fairly
early, both in the Configuration and as the thread context classloader (TCCL).
As a result, the application classloader is used to load classes earlier than
expected.

There is a fair amount of code that gets invoked between setting the
classloader and executing the mapper/reducer task.

For example, I saw that the application classloader was asked to load a DOM
parser class (com.sun.org.apache.xerces...) as part of initializing the
filesystem. Luckily, in most cases the request is delegated to the parent
classloader because the job classpath does not contain those classes.
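
The delegation described above can be observed with a small Java sketch. This
is not Hadoop code: ParentDelegationSketch and delegatedToParent are
hypothetical names, and an empty URLClassLoader stands in for an application
classloader whose classpath lacks the requested class. The usage note refers to
the public JAXP class javax.xml.parsers.DocumentBuilderFactory rather than the
internal Xerces class mentioned above.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical illustration; not part of Hadoop.
class ParentDelegationSketch {

    /**
     * Asks a child loader with an empty classpath for className and reports
     * whether the class was actually defined by some other (parent) loader.
     */
    static boolean delegatedToParent(String className) {
        ClassLoader parent = ParentDelegationSketch.class.getClassLoader();
        // Stand-in for the application classloader: it has no URLs of its own.
        try (URLClassLoader appLoader = new URLClassLoader(new URL[0], parent)) {
            Class<?> loaded = Class.forName(className, false, appLoader);
            // The defining loader differs from the loader that was asked.
            return loaded.getClassLoader() != appLoader;
        } catch (ClassNotFoundException | IOException e) {
            throw new IllegalStateException("could not load " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            delegatedToParent("javax.xml.parsers.DocumentBuilderFactory"));
    }
}
```

The child loader is the one being asked, but the class it returns was defined
further up the chain. If the job classpath did contain a copy of the class, a
loader that consults its own classpath first would define its own version
instead.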

However, in general, this behavior carries the risk of loading the same class
twice, once by each classloader, which can cause problems such as
ClassCastException. Those would turn into nasty bugs that are hard to fix.
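
A minimal Java sketch of that failure mode follows. All names in it
(DuplicatePayload, IsolatingLoader, DuplicateClassSketch) are hypothetical, and
it assumes Java 9 or later (for InputStream.readAllBytes) and that the compiled
class file is readable as a resource from the parent loader. The loader
deliberately defines one class itself instead of delegating, so two distinct
Class objects with the same name end up in the JVM.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Constructor;

/** Trivial class that will be defined by two different loaders. */
class DuplicatePayload {}

/** Hypothetical loader that defines one named class itself (child-first). */
class IsolatingLoader extends ClassLoader {
    private final String target;

    IsolatingLoader(String target, ClassLoader parent) {
        super(parent);
        this.target = target;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        if (!name.equals(target)) {
            return super.loadClass(name, resolve); // usual parent-first path
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                String resource = name.replace('.', '/') + ".class";
                try (InputStream in = getParent().getResourceAsStream(resource)) {
                    if (in == null) {
                        throw new ClassNotFoundException(name);
                    }
                    byte[] bytes = in.readAllBytes();
                    c = defineClass(name, bytes, 0, bytes.length);
                } catch (IOException e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}

class DuplicateClassSketch {
    /** Returns true if casting the second copy to the first type fails. */
    static boolean castFails() {
        try {
            String name = DuplicatePayload.class.getName();
            ClassLoader parent = DuplicatePayload.class.getClassLoader();
            Class<?> copy = new IsolatingLoader(name, parent).loadClass(name);
            Constructor<?> ctor = copy.getDeclaredConstructor();
            ctor.setAccessible(true); // the copy lives in a different runtime package
            Object instance = ctor.newInstance();
            DuplicatePayload p = (DuplicatePayload) instance; // same name, different Class
            return false;
        } catch (ClassCastException e) {
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(castFails());
    }
}
```

The two classes share a name and bytecode, but the JVM treats them as unrelated
types because a class is identified by its name together with its defining
loader.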

It would be good to either set the application classloader as late as possible
or place clearer limits on it so that it loads only the mapper/reducer classes
and their dependencies.
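
As a rough illustration of the first option, the TCCL could be installed only
around the code that needs it and restored afterwards. This is a sketch in
plain Java, not a proposed patch: ScopedContextLoader and callWith are
hypothetical names, and it covers only the TCCL, not the classloader stored in
the Configuration.

```java
import java.util.function.Supplier;

// Hypothetical helper; not part of Hadoop.
class ScopedContextLoader {

    /**
     * Runs body with loader installed as the thread context classloader,
     * then restores the previous one even if body throws.
     */
    static <T> T callWith(ClassLoader loader, Supplier<T> body) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return body.get();
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Anonymous subclass used as a stand-in for the application classloader.
        ClassLoader appLoader = new ClassLoader(ClassLoader.getSystemClassLoader()) {};
        ClassLoader seen = callWith(appLoader,
            () -> Thread.currentThread().getContextClassLoader());
        System.out.println(seen == appLoader);
    }
}
```

With a helper of this shape, setup code such as filesystem initialization would
run under the original loader, and only the body passed to callWith would see
the application classloader.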

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
