According to the code in bin/nutch, if you have a .job file in your NUTCH_HOME
then you are running in deploy mode. If there is no .job file then you are
running in local mode, so you do not need to rebuild Nutch each time you change
conf files.
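
To illustrate, the check can be sketched roughly as below. This is a simplified
sketch and not the actual code from bin/nutch: the function name nutch_mode is
made up for the example, and it assumes that any *.job file sitting directly
under NUTCH_HOME is what triggers deploy mode.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nutch): guesses which mode bin/nutch
# would pick for a directory, assuming a .job file there means deploy mode.
nutch_mode() {
  local home="${1:-$NUTCH_HOME}"
  local f
  for f in "$home"/*.job; do
    # If the glob matched nothing, $f is the literal pattern and -f fails.
    if [ -f "$f" ]; then
      echo "deploy"
      return 0
    fi
  done
  echo "local"
}
```

Calling nutch_mode /proj/nutch-2.1 prints "deploy" if a .job file is found in
that directory and "local" otherwise.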

Alex.

-----Original Message-----
From: Christopher Gross <cogr...@gmail.com>
To: user <user@nutch.apache.org>
Sent: Tue, Oct 2, 2012 5:31 am
Subject: Re: Building Nutch 2.0


Well, I'm not using the "deploy" directory (and I can't get Hadoop
to work, so the .job file shouldn't matter).

I just don't see how changing the configurations (like the agent name
string) would warrant rebuilding the project.  I can understand it if
you're switching between storage mechanisms (MySQL db vs HBase)
because it is only including what is necessary (though it would be
better to just have it all there in some cases), but for just a simple
change I don't quite get it.

Lewis -- every time I change something minor like "http.agent.name"
in a config file, will I have to rebuild?

-- Chris


On Mon, Oct 1, 2012 at 4:49 PM,  <alx...@aim.com> wrote:
> It seems to me that if you run nutch in deploy mode and make changes to
> config files then you need to rebuild .job file again unless you specify
> config_dir option in hadoop command.
>
> Alex.
>
>
> -----Original Message-----
> From: Christopher Gross <cogr...@gmail.com>
> To: user <user@nutch.apache.org>
> Sent: Mon, Oct 1, 2012 1:22 pm
> Subject: Re: Building Nutch 2.0
>
>
> I have my 1.3 set up in a /proj/nutch/ directory that has the bin,
> conf, lib, logs, ..etc.., with NUTCH_HOME pointing there.  I don't
> quite see what the difference would be for 2.x as long as NUTCH_HOME
> pointed to the right place.
>
> Is there documentation anywhere on how to do a deployment?
>
> -- Chris
>
>
> On Mon, Oct 1, 2012 at 3:59 PM, Lewis John Mcgibbney
> <lewis.mcgibb...@gmail.com> wrote:
>> Hi Chris,
>>
>> On Mon, Oct 1, 2012 at 8:52 PM, Christopher Gross <cogr...@gmail.com> wrote:
>>> OK, I added the port being used by hbase to iptables, and now I'm farther.
>>>
>>> I'm getting:
>>> 12/10/01 19:44:17 ERROR fetcher.FetcherJob: Fetcher: No agents listed
>>> in 'http.agent.name' property.
>>>
>>> But I do have an entry there, and it matches the first in the
>>> robots.agents as well.
>>
>> This can only mean that you have not recompiled this stuff into the
>> runtime/local directory.
>>
>>>
>>> How should I have this laid out?  Should I be running out of the
>>> 'runtime' dir, or is it fine that I've pulled all those files out and
>>> into a /proj/nutch-2.1/ directory (so there's a bin, conf, lib,
>>> ..etc.. in there, with NUTCH_HOME pointing to that dir).
>>
>> OK so you are running locally. I can't say whether it's OK to copy the
>> directories and their content elsewhere as I've never done it; however,
>> I would avoid it unless absolutely necessary. In terms of the directory
>> layout Nutch 2.x is identical to 1.x.
>>
>> It really helps if you make explicit which back end you intend to use
>> as the config may alter accordingly.
>
>

 
