Running locally and running in a true distributed environment are worlds
apart.  With local execution, the fact that the main program has a reference
to the class means that it has already been loaded in the same JVM, so the
mapper and reducer will be able to see it.  In distributed operation, the
mapper and reducer run in separate JVMs on other machines, so the fact that
the main program has a reference has no bearing on whether they will see the
class.
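
To make the difference concrete, here is a minimal sketch of resolving a
class by name through the thread's context class loader, which is the
approach Dawid suggests further down the thread.  The ContextClassLoading
class and its two methods are hypothetical names made up for illustration;
they are not part of Mahout or Hadoop, and this is not the code that is in
the Canopy classes today.

```java
// Hypothetical helper for illustration only; not part of Mahout or Hadoop.
final class ContextClassLoading {

  private ContextClassLoading() {}

  /** Resolves a class by name, preferring the thread's context class loader. */
  static Class<?> load(String className) {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    if (loader == null) {
      // The context loader may be unset; fall back to this helper's own loader.
      loader = ContextClassLoading.class.getClassLoader();
    }
    try {
      // initialize=true matches the behavior of the one-argument Class.forName.
      return Class.forName(className, true, loader);
    } catch (ClassNotFoundException e) {
      throw new IllegalArgumentException(
          "Cannot load " + className + " via " + loader, e);
    }
  }

  /** Creates an instance through the class's no-argument constructor. */
  static <T> T newInstance(String className, Class<T> expectedType) {
    try {
      Class<? extends T> cls = load(className).asSubclass(expectedType);
      return cls.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException(
          "Cannot instantiate " + className + " as " + expectedType.getName(), e);
    }
  }
}
```

A mapper could then call something like
ContextClassLoading.newInstance(job.get(DISTANCE_MEASURE_KEY),
DistanceMeasure.class) when it is configured.  Note that this only changes
which loader is asked; the jar containing the class still has to be shipped
with the job, or no loader on the task side will find it.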


On 3/6/08 9:21 AM, "Jeff Eastman" <[EMAIL PROTECTED]> wrote:

> Hi Dawid,
> 
> I figured somebody who really understands class loaders would be able to
> improve on my initial implementation. I don't have a small test case for
> this at the moment, but you should be able to duplicate it easily by
> creating a new DistanceMeasure in a test project and then calling the
> CanopyClusteringJob as in the code fragment below. You can reuse some of
> the Canopy test case code to populate your initial dataset.
> 
> BTW, the original code worked fine when running locally from Eclipse,
> and I only saw the failures when running on a remote cluster. Evidently,
> Eclipse's classpath environment is different than that of a deployed map
> task.
> 
> Jeff
> 
> -----Original Message-----
> From: Dawid Weiss [mailto:[EMAIL PROTECTED]
> Sent: Thursday, March 06, 2008 12:51 AM
> To: [email protected]
> Subject: Re: Class Loader Problem
> 
> 
> Hi guys,
> 
> I just looked at the code and noticed you use a class-relative
> class loader:
> 
> Class cl = Class.forName(job.get(DISTANCE_MEASURE_KEY));
> 
> This is effectively an attempt to load a class using the caller's class
> loader (the class loader is obtained via
> ClassLoader.getCallerClassLoader()). Usually it makes more sense to use the
> thread's context class loader (they may be different), so:
> 
> Thread.currentThread().getContextClassLoader().loadClass(...);
> 
> I teach classes today, but I'll review the code and see if I can fix it.
> Jeff, would you by any chance have an assembled-and-ready test case or
> example that causes this problem?
> 
> Dawid
> 
> 
> Ted Dunning wrote:
>> Hmmm...
>> 
>> Is there a more elegant way to go here?  Is there a way for the
>> CanopyClusteringJob to infer which jar by looking at the class?  I think
>> that Hadoop does something like this via the class loader.
>> 
>> This current method looks ripe for very obscure bugs.
>> 
>> 
>> On 3/5/08 4:49 PM, "Grant Ingersoll" <[EMAIL PROTECTED]> wrote:
>> 
>>> I changed the mains to pass in the location of the jar, since the Ant
>>> task puts the jar in basedir/dist.  I made a comment about it on
>>> Mahout-3.  The Canopy driver should do the right thing?????  I also
>>> did the same thing w/ the k-means.
>>> 
>>> 
>>> On Mar 5, 2008, at 2:52 PM, Jeff Eastman wrote:
>>> 
>>>> Here's my job driver; it works fine with ManhattanDistanceMeasure but
>>>> not SystemLoadDistanceMeasure.
>>>> 
>>>> Jeff
>>>> 
>>>> public static void main(String[] args) {
>>>>    String input = args[0];
>>>>    String output = args[1];
>>>>    int t1 = Integer.parseInt(args[2]);
>>>>    int t2 = Integer.parseInt(args[3]);
>>>>    JobConf conf = new JobConf(
>>>>        com.collabnet.hadoop.systemload.access.DriverA.class);
>>>>    Path outPath = new Path(output);
>>>>    try {
>>>>      FileSystem dfs = FileSystem.get(conf);
>>>>      if (dfs.exists(outPath))
>>>>        dfs.delete(outPath);
>>>>      DriverA.runJob(input, output);
>>>>      DriverP.runJob(input, output);
>>>>      DriverC.runJob(output, output);
>>>>      CanopyClusteringJob.runJob(output + "/combined", output,
>>>>          SystemLoadDistanceMeasure.class.getName(), t1, t2,
>>>>          "apache-mahout-0.1-dev.jar");
>>>>      DriverS.runJob(output + "/clusters", output);
>>>>    } catch (IOException e) {
>>>>      e.printStackTrace();
>>>>    }
>>>>  }
>>>> 
>>>> -----Original Message-----
>>>> From: Ted Dunning [mailto:[EMAIL PROTECTED]
>>>> Sent: Wednesday, March 05, 2008 11:44 AM
>>>> To: [email protected]
>>>> Subject: Re: Class Loader Problem
>>>> 
>>>> 
>>>> Where is your code?
>>>> 
>>>> 
>>>> On 3/5/08 11:28 AM, "Jeff Eastman" <[EMAIL PROTECTED]> wrote:
>>>> 
>>>>> I'm wondering if you can see anything
>>>>> wrong with my packaging or, perhaps, how the Canopy class is going about
>>>>> instantiating it.
>>> 
>> 
