There isn't a good solution for Mahout 0.5.

The code that calls setJarByClass has to pass a class that lives in
the unpacked classes at the top level of the job jar, NOT in one of
the jars under its lib/ directory. It's really easy to build a Hadoop
job with Mahout that violates that rule, because of all the static
methods that create jobs.
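
To make the rule concrete, here's a minimal sketch (MyDriver is a
hypothetical class, not something in our codebase):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class MyDriver {
      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "example");

        // Fine: MyDriver is compiled into the unpacked classes at the
        // top level of the job jar, so Hadoop ships the job jar itself.
        job.setJarByClass(MyDriver.class);

        // Broken: a class like org.apache.mahout.math.Vector is loaded
        // from a jar nested under the job jar's lib/ directory, so
        // setJarByClass resolves to that nested jar instead.
        // job.setJarByClass(org.apache.mahout.math.Vector.class);
      }
    }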

We seem to have a consensus to rework all the jobs as beans so that
this can be brought under control.
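
Something like the following, purely as a sketch of the bean idea
(none of these names exist yet):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // Hypothetical bean-style job: the caller configures it through
    // setters and supplies the class used for setJarByClass, rather
    // than a static factory method hardwiring a Mahout class that
    // lives in lib/.
    public class ExampleMahoutJob {

      private Class<?> jarClass = ExampleMahoutJob.class;

      public void setJarClass(Class<?> jarClass) {
        this.jarClass = jarClass;
      }

      public void run(Configuration conf) throws Exception {
        Job job = new Job(conf, "example");
        job.setJarByClass(jarClass);
        // ... set mapper, reducer, and input/output paths here ...
        job.waitForCompletion(true);
      }
    }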



On Sun, May 8, 2011 at 6:16 PM, Jake Mannix <[email protected]> wrote:
> On Sun, May 8, 2011 at 2:58 PM, Sean Owen <[email protected]> wrote:
>
>> If I recall the last discussion on this correctly --
>>
>> No, you don't want to put anything in Hadoop's lib/ directory. Even
>> if you can, that's not the "right" way.
>> You do want to use the job file, which should contain all dependencies.
>> However, it packages dependencies as jars-in-the-jar, which doesn't
>> work for Hadoop.
>>
>
> I thought that Hadoop was totally fine with jars inside of the jar,
> if they're in the lib/ directory?
>
>
>> I think if you modify the Maven build to just repackage all classes
>> into the main jar, it works. It works for me at least.
>>
>
> Clearly we're not expecting people to do this. I wasn't even running
> with special new classes; it wasn't finding *Vector*. If this doesn't
> work on a real cluster, then most of our codebase (which requires
> mahout-math) doesn't work.
>
>  -jake
>
