Hi Dawid,

I figured somebody who really understands class loaders would be able to
improve on my initial implementation. I don't have a small test case for
this at the moment, but you should be able to duplicate it easily by
creating a new DistanceMeasure in a test project and then calling the
CanopyClusteringJob as in the code fragment below. You can reuse some of
the Canopy test case code to populate your initial dataset.
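For illustration, the measure can be completely trivial -- something like the
sketch below in the test project is all it takes, since the point is just that
the class lives outside the Mahout jar (the package name is made up, and the
import path is my guess at where the measures live in trunk):

package com.example.clustering;

import org.apache.mahout.utils.ManhattanDistanceMeasure;

// Adds no behavior; it only needs to be a DistanceMeasure that the remote
// map task has to load from the test project's jar rather than the Mahout jar.
public class MyTestDistanceMeasure extends ManhattanDistanceMeasure {
}

Then pass MyTestDistanceMeasure.class.getName() into CanopyClusteringJob.runJob
in place of SystemLoadDistanceMeasure in the fragment below.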

BTW, the original code worked fine when running locally from Eclipse,
and I only saw the failures when running on a remote cluster. Evidently,
Eclipse's classpath environment is different from that of a deployed map
task.

Jeff

-----Original Message-----
From: Dawid Weiss [mailto:[EMAIL PROTECTED] 
Sent: Thursday, March 06, 2008 12:51 AM
To: [email protected]
Subject: Re: Class Loader Problem


Hi guys,

I just looked at the code and noticed you use a Class-relative class
loader:

Class cl = Class.forName(job.get(DISTANCE_MEASURE_KEY));

This is effectively an attempt to load the class using the caller's class
loader (the class loader is obtained via ClassLoader.getCallerClassLoader()).
Usually it makes more sense to use the thread's context class loader (they
may be different), so:

Thread.currentThread().getContextClassLoader().loadClass(...);
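
In the Canopy code this would look roughly like the sketch below (just an
illustration, not a patch; I'm assuming the existing DISTANCE_MEASURE_KEY
lookup and leaving the exception handling out):

// Load the configured measure through the thread's context class loader,
// falling back to this class's own loader if no context loader is set.
ClassLoader ccl = Thread.currentThread().getContextClassLoader();
if (ccl == null) {
  ccl = CanopyClusteringJob.class.getClassLoader();
}
Class<?> cl = ccl.loadClass(job.get(DISTANCE_MEASURE_KEY));
DistanceMeasure measure = (DistanceMeasure) cl.newInstance();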

I'm teaching classes today, but I'll review the code and see if I can fix
it. Jeff, would you by any chance have an assembled-and-ready test case or
example that causes this problem?

Dawid


Ted Dunning wrote:
> Hmmm...
> 
> Is there a more elegant way to go here?  Is there a way for the
> CanopyClusteringJob to infer which jar by looking at the class?  I think
> that Hadoop does something like this via the class loader.
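> 
> (If I remember right, Hadoop can already do this: JobConf.setJarByClass, or
> the JobConf(Class) constructor Jeff is already using, searches the classpath
> for the jar containing the given class. A sketch, assuming that API:)
> 
> // Infer the job jar from the measure class instead of passing a jar path.
> JobConf conf = new JobConf();
> conf.setJarByClass(SystemLoadDistanceMeasure.class);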
> 
> This current method looks ripe for very obscure bugs.
> 
> 
> On 3/5/08 4:49 PM, "Grant Ingersoll" <[EMAIL PROTECTED]> wrote:
> 
>> I changed the main()s to pass in the location of the jar, since the Ant
>> task puts the jar in basedir/dist.  I made a comment about it on
>> Mahout-3.  The Canopy driver should do the right thing?????  I also
>> did the same thing w/ the k-means.
>>
>>
>> On Mar 5, 2008, at 2:52 PM, Jeff Eastman wrote:
>>
>>> Here's my job driver; it works fine with ManhattanDistanceMeasure but
>>> not SystemLoadDistanceMeasure.
>>>
>>> Jeff
>>>
>>> public static void main(String[] args) {
>>>    String input = args[0];
>>>    String output = args[1];
>>>    int t1 = new Integer(args[2]);
>>>    int t2 = new Integer(args[3]);
>>>    JobConf conf = new JobConf(
>>>        com.collabnet.hadoop.systemload.access.DriverA.class);
>>>    Path outPath = new Path(output);
>>>    try {
>>>      FileSystem dfs = FileSystem.get(conf);
>>>      if (dfs.exists(outPath))
>>>        dfs.delete(outPath);
>>>      DriverA.runJob(input, output);
>>>      DriverP.runJob(input, output);
>>>      DriverC.runJob(output, output);
>>>      CanopyClusteringJob.runJob(output + "/combined", output,
>>>          SystemLoadDistanceMeasure.class.getName(), t1, t2,
>>>          "apache-mahout-0.1-dev.jar");
>>>      DriverS.runJob(output + "/clusters", output);
>>>    } catch (IOException e) {
>>>      e.printStackTrace();
>>>    }
>>>  }
>>>
>>> -----Original Message-----
>>> From: Ted Dunning [mailto:[EMAIL PROTECTED]
>>> Sent: Wednesday, March 05, 2008 11:44 AM
>>> To: [email protected]
>>> Subject: Re: Class Loader Problem
>>>
>>>
>>> Where is your code?
>>>
>>>
>>> On 3/5/08 11:28 AM, "Jeff Eastman" <[EMAIL PROTECTED]> wrote:
>>>
>>>> I'm wondering if you can see anything
>>>> wrong with my packaging or, perhaps, how the Canopy class is going
>>>> about instantiating it.
>>
> 
