Yeah, I agree it is a headache for the future. It is already a bit
problematic in that we have to build the jar before tests are run.
On Mar 6, 2008, at 5:30 AM, Dawid Weiss wrote:
As a side note -- Hadoop uses the simplest trick possible to figure
out the JAR location of the originating class -- it attempts to load
a resource named after the class' bytecode...
// (If you lift this method out on its own, it needs java.io.IOException,
//  java.net.URL, java.net.URLDecoder and java.util.Enumeration.)
private static String findContainingJar(Class my_class) {
  ClassLoader loader = my_class.getClassLoader();
  // Turn the class name into its bytecode resource name, e.g. "org/foo/Bar.class".
  String class_file = my_class.getName().replaceAll("\\.", "/") + ".class";
  try {
    for (Enumeration itr = loader.getResources(class_file); itr.hasMoreElements();) {
      URL url = (URL) itr.nextElement();
      if ("jar".equals(url.getProtocol())) {
        String toReturn = url.getPath();
        if (toReturn.startsWith("file:")) {
          toReturn = toReturn.substring("file:".length());
        }
        toReturn = URLDecoder.decode(toReturn, "UTF-8");
        // Strip the in-JAR part ("!/org/foo/Bar.class"), leaving only the JAR path.
        return toReturn.replaceAll("!.*$", "");
      }
    }
  } catch (IOException e) {
    throw new RuntimeException(e);
  }
  return null;
}
Note the "replaceAll" line -- it strips the in-JAR path from the JAR location, leaving just the JAR file (sketched below). I also looked at the job submitter and the isolation runner; they seem to work according to the intuition I presented earlier (the thread context class loader points at the invoked JAR plus all the JARs under its lib/ folder), so there should be no need to specify JARs explicitly. I even tend to think doing so is a headache for the future...
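To make that truncation concrete, here is a tiny standalone sketch on a made-up "jar:" URL path (the jar location and class name are invented for illustration, not taken from a real build):
public class JarPathSketch {
  public static void main(String[] args) {
    // A made-up path as url.getPath() would return it for a class inside a JAR.
    String path = "file:/home/user/mahout/dist/mahout-core.jar"
        + "!/org/apache/mahout/clustering/canopy/CanopyDriver.class";
    if (path.startsWith("file:")) {
      path = path.substring("file:".length());
    }
    // replaceAll("!.*$", "") drops everything from the '!' on, i.e. the in-JAR part.
    System.out.println(path.replaceAll("!.*$", ""));
    // prints: /home/user/mahout/dist/mahout-core.jar
  }
}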
Dawid
Dawid Weiss wrote:
I changed the mains to pass in the location of the jar, since the ANT task puts the jar in basedir/dist. I made a comment about it on Mahout-3. The Canopy driver should do the right thing now, I think. I also did the same thing w/ the k-means driver.
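Roughly, the change amounts to something like this (a sketch only -- the driver class, argument position, and path are made up here, not the actual Mahout code):
import org.apache.hadoop.mapred.JobConf;

public class MyDriver {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // Hypothetical: the first argument carries the jar that ANT built into basedir/dist.
    conf.setJar(args[0]);
    // ... set input/output paths, mapper/reducer, then submit with JobClient.runJob(conf).
  }
}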
I honestly don't think the JAR file must be specified as part of
the JobConf. This is a hint, but it's a hint used only in very
special cases (which I can't think of, to be honest). To my
understanding, the situation is like this:
- When you assemble a job JAR, you should package it with all required dependencies under the {jarfile.jar}/lib folder.
- All these classes are visible through the context class loader set up by Hadoop, so no special JAR tricks are required. When you submit a Hadoop job (remotely), you point at the JAR file with all the dependencies and Hadoop takes it from there.
- When you run the in-memory task tracker (for debugging or local runs), all the classes should be available on the normal classpath, and the context class loader (again) should resolve them successfully.
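If the explicit path really is unnecessary, the alternative would look roughly like this (a minimal sketch, assuming the setJarByClass hook on the old mapred JobConf; the class name is made up):
import org.apache.hadoop.mapred.JobConf;

public class NoExplicitJarSketch {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // Instead of passing an explicit jar path in, let Hadoop locate the jar that
    // contains the driver class -- internally this relies on the findContainingJar
    // trick quoted above.
    conf.setJarByClass(NoExplicitJarSketch.class);
    // ... configure and submit as usual.
  }
}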
Can you enlighten me as to when pointing at an explicit JAR file in the JobConf is actually required?
Dawid
--------------------------
Grant Ingersoll
http://www.lucenebootcamp.com
Next Training: April 7, 2008 at ApacheCon Europe in Amsterdam
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ