Bumping this thread because I'm now more aware of what is actually
happening. If I understand correctly, when submitting jobs using RunJar
the classpath is extended using a new classloader. This classloader adds
the unpacked contents of the jar to the current thread's classpath
(the contextClassLoader). This brings two issues to mind:
1) In RunJar, when constructing the new URLClassLoader, would it not be
better to chain the *previous* contextClassLoader instead of using the
system classloader? (The latter is used when the classloader argument is
omitted in the URLClassLoader constructor, which is what RunJar does.)
This is truly a minor issue, since most of the time RunJar is used as a
result of invoking 'hadoop jar' from the command line, in which case the
previous thread contextClassLoader actually will be the system
classloader. I bring this up mainly to better understand the process.
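The chaining I have in mind would look roughly like this (just a sketch
with names of my own, not the actual RunJar source):

import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

class RunJarLoaderSketch {
  static void installJobClassLoader(List<URL> unpackedJarUrls) {
    // Roughly what RunJar does today: no parent argument, so the new
    // loader's parent defaults to the system classloader.
    //   ClassLoader loader = new URLClassLoader(unpackedJarUrls.toArray(new URL[0]));

    // Suggestion: chain the previous thread contextClassLoader instead.
    ClassLoader previous = Thread.currentThread().getContextClassLoader();
    ClassLoader loader =
        new URLClassLoader(unpackedJarUrls.toArray(new URL[0]), previous);

    Thread.currentThread().setContextClassLoader(loader);
  }
}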
2) To follow up on my previous findings on AbstractMapWritable: I think
the reason it cannot find classes is that it is loaded by a parent
classloader (the system classloader) instead of the new child
classloader set by RunJar. The classloader of AbstractMapWritable is not
this child classloader because it is already loaded (indirectly, through
Configuration) before the thread contextClassLoader is replaced in
RunJar, and it is therefore unable to find certain extracted classes. So
why does AbstractMapWritable use the classloader of its own class
[Class.forName(className)] instead of the current thread's
[Class.forName(className, true,
Thread.currentThread().getContextClassLoader())]? Is it not wiser to
always use the latter construction in general classloading code?
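In code, the difference I mean is roughly the following (again just a
sketch; the class and method names are mine, not the actual
AbstractMapWritable members):

class ForNameSketch {
  // Class.forName(className) resolves with the defining classloader of the
  // calling class; inside AbstractMapWritable that is the loader that loaded
  // AbstractMapWritable itself (the system loader in the RunJar scenario).
  static Class<?> resolveWithOwnLoader(String className)
      throws ClassNotFoundException {
    return Class.forName(className);
  }

  // The alternative: resolve with the thread's contextClassLoader, which
  // RunJar has pointed at the unpacked job jar.
  static Class<?> resolveWithContextLoader(String className)
      throws ClassNotFoundException {
    return Class.forName(className, true,
        Thread.currentThread().getContextClassLoader());
  }
}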
Ferdy.
On 09/09/2011 11:54 AM, Ferdy Galema wrote:
Sometimes when running hadoop jobs using the 'hadoop jar' command
there are issues with the classloader. I presume these are caused by
classes that are loaded BEFORE the command's main is invoked. For
example, when relying on MapWritable in the command, it is not
possible to use a class that is not in the default idToClassMap.
MapWritable.class is loaded before the user job is unpacked and
therefore its classloader will not be able to find custom classes.
(At least, classes that are only on the classpath of RunJar's new
classloader.)
I could not find any issues or discussion about this, so I assume it is
somewhat of an obscure issue (please correct me if I'm wrong). Anyway, I
would like to hear what you think of this and perhaps discuss a possible
solution, such as spawning the command in a new JVM.
MapWritable, or rather AbstractMapWritable, uses a
Class.forName(className) construction; maybe this can be changed so that
it uses the classloader of the current thread instead of that of its own
class. (Will this work?)
A workaround for now is to explicitly put the jar itself on the
classpath, i.e. 'env HADOOP_CLASSPATH=myJar hadoop jar myJar'.