After working on it for a while, I realized that it was my mistake. It actually works well. But there are two things to make sure of: both CLASSPATH and -libjars.
For CLASSPATH, we need to make sure the dependent jars are included when launching the MapReduce task, while -libjars is also important: it must contain all jars needed by the mapper and reducer. (A minimal driver sketch follows after the quoted thread below.)

Victor

On Wed, Jan 20, 2010 at 2:35 PM, Victor Hsieh <[email protected]> wrote:
> Yes, it can be done that way, and then restarting the cluster (since
> tasktrackers need to know that new jars were added). But I'm maintaining
> a cluster for different users, so I'm looking for a solution that
> doesn't require a restart.
> Thanks,
> Victor
>
> On Wed, Jan 20, 2010 at 2:17 PM, Rekha Joshi <[email protected]> wrote:
>>
>> Not sure what error you get and if it is suggestive, but at times where
>> you place the -libjars option can make a difference. You can try adding
>> the jar to your HADOOP_CLASSPATH and then executing?
>>
>> Cheers,
>> /R
>>
>>
>> On 1/20/10 9:50 AM, "Victor Hsieh" <[email protected]> wrote:
>>
>> Hi,
>>
>> I was trying to run a MapReduce job with some extra jars but failed.
>> It seems that the jars specified on the command line with -libjars
>> were not shipped to the MapReduce workers.
>>
>> After digging into the code, I found that the deprecated API and the
>> current one differ in their -libjars behavior (also -files and
>> -archives). In the deprecated API, JobClient.runJob() copies the
>> -libjars to the DistributedCache (more precisely, GenericOptionsParser
>> parses -libjars, saving it as "tmpjars" in the configuration, and
>> JobClient then uploads tmpjars). However, in the current API, I didn't
>> see anything related (by grepping for tmpjars or similar in
>> hadoop-0.20.1/src/).
>>
>> Is there any helper function or something similar in the current API?
>> Or do I need to do it myself, like what JobClient does?
>>
>> Help appreciated.
>>
>> Victor
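For reference, here is a minimal sketch of what the working setup looks like. The driver extends Configured and implements Tool so that ToolRunner invokes GenericOptionsParser, which is what turns -libjars into the "tmpjars" configuration entry that gets shipped via the DistributedCache. The class name MyDriver, the job name, and the jar paths below are placeholders, not names from the thread:

    // Minimal sketch, assuming Hadoop 0.20.x (class/job names hypothetical).
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class MyDriver extends Configured implements Tool {
      public int run(String[] args) throws Exception {
        // getConf() already carries whatever GenericOptionsParser set,
        // including the "tmpjars" entry produced from -libjars.
        Job job = new Job(getConf(), "my-job");
        // ... set mapper, reducer, input/output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
      }

      public static void main(String[] args) throws Exception {
        // ToolRunner runs GenericOptionsParser over args before calling
        // run(), stripping and applying -libjars/-files/-archives.
        System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
      }
    }

Launched with both pieces in place (paths are examples):

    export HADOOP_CLASSPATH=/path/to/dep.jar
    hadoop jar myjob.jar MyDriver -libjars /path/to/dep.jar <in> <out>

HADOOP_CLASSPATH puts dep.jar on the client JVM's classpath so the driver itself can load it, while -libjars ships it to the tasktrackers for the mappers and reducers, matching the two points above.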
