Your problem may be related to:

http://issues.apache.org/jira/browse/HADOOP-1622

Runping


=================

Ted,

That means going the HADOOP_CLASSPATH route, i.e. creating a separate 
directory for those shared jars and then setting it once in 
hadoop-env.sh. I think this will work for me too. I am in the process of 
setting up a separate CONF_DIR anyway after my recent update, where I 
forgot to copy a couple of files into the new tree.
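
Something like the following in hadoop-env.sh is what I have in mind (a 
minimal sketch; the /opt/shared-jars path is just a placeholder for 
wherever the shared jars end up):

  # add every jar in the shared directory to the Hadoop classpath
  for jar in /opt/shared-jars/*.jar; do
    HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${jar}
  done
  export HADOOP_CLASSPATH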

I was following this: 
http://www.mail-archive.com/[EMAIL PROTECTED]/msg02860.html

I could not really find this on the Wiki, even though the above refers 
to a commit. Am I missing something?

Lars


Ted Dunning wrote:
> /lib is definitely the way to go.
>
> But adding gobs and gobs of stuff there makes jobs start slowly because
> you have to propagate a multi-megabyte blob to lots of worker nodes.
>
> I would consider adding universally used jars to the hadoop class path
> on every node, but I would also expect to face configuration management
> nightmares (small ones, though) from doing this.
>
>
> On 1/7/08 11:50 AM, "Lars George" <[EMAIL PROTECTED]> wrote:
>
>> Arun,
>>
>> Ah yes, I see it now in JobClient. OK, then how are the required aux
>> libs handled? I assume a /lib inside the job jar is the only way to
>> go?
>>
>> I saw the discussion on the Wiki about adding Hbase permanently to the
>> HADOOP_CLASSPATH, but then I also have to deploy the Lucene jar files,
>> Xerces etc. I guess it is better if I add everything non-Hadoop into
>> the job jar's lib directory?
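>>
>> What I picture is a job jar laid out roughly like this (contents are
>> illustrative, listed with jar tf):
>>
>>   $ jar tf myjob.jar
>>   com/mycompany/MyJob.class
>>   lib/hbase.jar
>>   lib/lucene-core.jar
>>   lib/xerces.jar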
>>
>> Thanks again for the help,
>> Lars
>>
>>
>> Arun C Murthy wrote:
>>> On Mon, Jan 07, 2008 at 08:24:36AM -0800, Lars George wrote:
>>>> Hi,
>>>>
>>>> Maybe someone here can help me with a rather noob question. Where
>>>> do I have to put my custom jar to run it as a map/reduce job?
>>>> Anywhere, and then specifying the HADOOP_CLASSPATH variable in
>>>> hadoop-env.sh?
>>>>
>>> Once you have your jar and submit it for your job via the *hadoop
>>> jar* command, the framework takes care of distributing the software
>>> to the nodes on which your maps/reduces are scheduled:
>>> $ hadoop jar <custom_jar> <custom_args>
>>>
>>> The detail is that the framework copies your jar from the submission
>>> node to HDFS and then copies it onto the execution nodes.
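>>>
>>> For example (jar and class names are illustrative):
>>> $ hadoop jar myjob.jar org.mycompany.MyJob input output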
>>>
>>> Does
>>> http://lucene.apache.org/hadoop/docs/r0.15.1/mapred_tutorial.html#Usage
>>> help?
>>>
>>> Arun
>>>
>>>> Also, since I am using the Hadoop API already from our server code,
>>>> it seems natural to launch jobs from within our code. Are there any
>>>> issues with that? I assume I have to copy the jar files first and
>>>> make them available as per my question above, but then I am ready to
>>>> start it from my own code?
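>>>>
>>>> Roughly what I have in mind (names are illustrative; a hypothetical
>>>> Launcher class on our side builds a JobConf that points at the job
>>>> jar via setJar, or via the JobConf(Class) constructor, and then
>>>> calls JobClient.runJob; the jars under $HADOOP_HOME/lib would need
>>>> to be on the classpath as well):
>>>>
>>>>   $ java -cp myapp.jar:$HADOOP_HOME/hadoop-0.15.1-core.jar:$HADOOP_HOME/conf \
>>>>       com.mycompany.Launcher input output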
>>>>
>>>> I have read most Wiki entries, and while the actual workings are
>>>> described quite nicely, I could not find an answer to the questions
>>>> above. The demos are already in place and can be started as is,
>>>> without the need of making them available.
>>>>
>>>> Again, I apologize for being a noobie.
>>>>
>>>> Lars
>
