You can also pass it to most jobs with $ nutch <job>
-Dhadoop.tmp.dir=bla args. This can even be automated with some shell
scripting.
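For what it's worth, a minimal sketch of that per-job approach might look like the loop below. The site names and crawl paths are made up for illustration; the -D flag is passed as described above.

```shell
# Hypothetical sketch: one crawl per site, each with its own
# hadoop.tmp.dir so the jobs don't collide under /tmp/hadoop-username.
# The commands are only echoed here; drop the "echo" to actually run them.
for site in site-a site-b; do
  echo bin/nutch crawl "urls/$site" -Dhadoop.tmp.dir="/tmp/hadoop-$site"
done
```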
On Fri, 2 Mar 2012 00:49:36 -0500, Jeremy Villalobos
<jeremyvillalo...@gmail.com> wrote:
It is a small number of crawlers, so I copied a runtime for each, and
therefore each has its own configuration files.
Jeremy
On Thu, Mar 1, 2012 at 10:57 PM, remi tassing wrote:
How did you define that property so it's different for each job?
Remi
On Friday, March 2, 2012, Jeremy Villalobos wrote:
> That is what I was looking for, thank you.
>
> This property was added to:
> $NUTCH_DIR/runtime/local/conf/nutch-site.xml
>
> Jeremy
>
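As a sketch, the property Jeremy mentions would look something like this in nutch-site.xml (the directory value is just an example; each runtime would use a different one):

```xml
<!-- in $NUTCH_DIR/runtime/local/conf/nutch-site.xml -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-crawl1</value>
  <description>Per-runtime temp dir so concurrent crawls do not collide.</description>
</property>
```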
> On Thu, Mar 1, 2012 at 7:01 PM, Markus Jelsma wrote:
>
>> you can either:
>>
>> 1. run on hadoop
>> 2. not run multiple concurrent jobs on a local machine
>> 3. set a hadoop.tmp.dir per job
>> 4. merge all crawls to a single crawl
>>
>>
>> On Thu, 1 Mar 2012 16:26:00 -0500, Jeremy Villalobos <
>> jeremyvillalo...@gmail.com [4]> wrote:
>>
>>> Hello:
>>>
>>> I am running multiple small crawls on one machine. I notice that they
>>> are conflicting because they all access
>>>
>>> /tmp/hadoop-username/mapred
>>>
>>> How do I change the location of this folder?
>>>
>>> Do I have to use Hadoop to run multiple crawlers, each specific to a
>>> site?
>>>
>>> thanks
>>>
>>> Jeremy
>>>
>>
>> --
>> Markus Jelsma - CTO - Openindex
>> http://www.linkedin.com/in/markus17 [5]
>> 050-8536600 / 06-50258350
>>
>
Links:
------
[1] mailto:tassingr...@gmail.com
[2] mailto:jeremyvillalo...@gmail.com
[3] mailto:markus.jel...@openindex.io
[4] mailto:jeremyvillalo...@gmail.com
[5] http://www.linkedin.com/in/markus17