According to the code in bin/nutch, if you have a .job file in your NUTCH_HOME, then you are running in deploy mode. If there is no .job file, then you are running in local mode, so you do not need to rebuild Nutch each time you change the conf files.
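
The check described above can be sketched roughly as follows. This is only an illustration, not the actual text of bin/nutch: the function name nutch_mode is hypothetical, and the exact file pattern the real script looks for may differ between Nutch versions.

```shell
#!/bin/sh
# Sketch only: mimics the local-vs-deploy decision described above.
# nutch_mode is a hypothetical helper, not a function from bin/nutch.
# Assumption: any *.job file directly under the given directory means deploy mode.
nutch_mode() {
  home="$1"
  for f in "$home"/*.job; do
    # If the glob matched nothing, $f is the literal pattern and -f fails.
    if [ -f "$f" ]; then
      echo "deploy"
      return 0
    fi
  done
  echo "local"
}

# Report the mode for NUTCH_HOME, falling back to the current directory.
nutch_mode "${NUTCH_HOME:-.}"
```

Running this against the directory you start Nutch from should show which of the two cases applies to your setup.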
Alex.

-----Original Message-----
From: Christopher Gross <cogr...@gmail.com>
To: user <user@nutch.apache.org>
Sent: Tue, Oct 2, 2012 5:31 am
Subject: Re: Building Nutch 2.0

Well, I'm not using the "deploy" directory (and I can't get Hadoop to
work, so the .job file shouldn't matter). I just don't see how
changing the configuration (like the agent name string) would warrant
rebuilding the project. I can understand it if you're switching
between storage mechanisms (MySQL db vs HBase) because it is only
including what is necessary (though it would be better to just have it
all there in some cases), but for just a simple change I don't quite
get it.

Lewis -- every time I change something minor like "http.agent.name"
in a config file, will I have to rebuild?

-- Chris

On Mon, Oct 1, 2012 at 4:49 PM, <alx...@aim.com> wrote:
> It seems to me that if you run nutch in deploy mode and make changes to
> config files then you need to rebuild .job file again unless you specify config_dir option in hadoop command.
>
> Alex.
>
>
> -----Original Message-----
> From: Christopher Gross <cogr...@gmail.com>
> To: user <user@nutch.apache.org>
> Sent: Mon, Oct 1, 2012 1:22 pm
> Subject: Re: Building Nutch 2.0
>
>
> I have my 1.3 set up in a /proj/nutch/ directory that has the bin,
> conf, lib, logs, ..etc.., with NUTCH_HOME pointing there. I don't
> quite see what the difference would be for 2.x as long as NUTCH_HOME
> pointed to the right place.
>
> Is there documentation anywhere on how to do a deployment?
>
> -- Chris
>
>
> On Mon, Oct 1, 2012 at 3:59 PM, Lewis John Mcgibbney
> <lewis.mcgibb...@gmail.com> wrote:
>> Hi Chris,
>>
>> On Mon, Oct 1, 2012 at 8:52 PM, Christopher Gross <cogr...@gmail.com> wrote:
>>> OK, I added the port being used by hbase to iptables, and now I'm farther.
>>>
>>> I'm getting:
>>> 12/10/01 19:44:17 ERROR fetcher.FetcherJob: Fetcher: No agents listed
>>> in 'http.agent.name' property.
>>>
>>> But I do have an entry there, and it matches the first in the
>>> robots.agents as well.
>>
>> This can only mean that you have not recompiled this stuff into the
>> runtime/local directory.
>>
>>>
>>> How should I have this laid out? Should I be running out of the
>>> 'runtime' dir, or is it fine that I've pulled all those files out and
>>> into a /proj/nutch-2.1/ directory (so there's a bin, conf, lib,
>>> ..etc.. in there, with NUTCH_HOME pointing to that dir).
>>
>> OK so you are running locally. I can't say whether it's OK to copy the
>> directories and their content elsewhere as I've never done it; however,
>> I would avoid it unless absolutely necessary. In terms of the directory
>> layout Nutch 2.x is identical to 1.x.
>>
>> It really helps if you make explicit which back end you intend to use
>> as the config may alter accordingly.
>
>