Hi:

Have you looked at the searcher.dir property
(<name>searcher.dir</name>) in the nutch-default.xml config file?
You need to change it to point at the location in DFS where your crawl
directory lives. I think it will be something like /user/nutch. You can
find it by trying the following:

bin/hadoop dfs

and

bin/hadoop dfs -ls

Do you see anything there? (DFS was previously called NDFS.)
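
In case it helps, here is a rough sketch of what the override might
look like. I am assuming you put it in conf/nutch-site.xml (which
overrides nutch-default.xml), inside that file's root element. The
path is only a placeholder built from my /user/nutch guess and the
"myindexTargetFolder" name from the tutorial; use whatever
bin/hadoop dfs -ls actually shows.

```xml
<!-- Sketch only: the value below is a hypothetical DFS path, not a
     known default. Replace it with the crawl directory that
     "bin/hadoop dfs -ls" reports on your box. -->
<property>
  <name>searcher.dir</name>
  <value>/user/nutch/myindexTargetFolder</value>
</property>
```

Since you keep ROOT/WEB-INF/classes in sync with nutch/conf/, the
change has to reach that copy too, and the webapp probably needs a
restart before it picks it up.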

I am not sure this will help...

On 2/7/06, Bernd Fehling <[EMAIL PROTECTED]> wrote:
> For those of you who, like me, are reinventing the wheel by
> getting nutch-0.8-dev with MapReduce running on a single box,
> here are some updates.
> This is about revision #374443.
>
> The DmozParser class mentioned in "quick tutorial for nutch
> 0.8 and later" seems to be in "org.apache.nutch.tools.DmozParser"
> and not "org.apache.nutch.crawl.DmozParser".
>
> Against all odds I managed to get a single web page fetched,
> as both the log from my web server and the tasktracker log
> show.
>
> Set all named properties in file nutch-default.xml containing
> the substring "verbose" to "true" to get more info from the
> log files.
>
> As far as I could figure out, there will be no index under
> "/tmp/nutch/mapred/local/index/" directory.
> I think it will be included in a file named "/tmp/nutch/ndfs/name/edits".
>
> The user interface is running and I keep the ROOT/WEB-INF/classes
> in sync with nutch/conf/ directory. The footer.html file
> is missing in each language directory. So copy it from e.g.
> include/footer.html to en/include/footer.html.
>
> What I didn't manage is getting access to the index from the
> user interface. How does the user interface know that I
> named my index "myindexTargetFolder" as in the tutorial?
> Mystery...
> Maybe a property to set somewhere...
>
> Regards,
> Bernd
>
>
> Bernd Fehling schrieb:
> > Went through the tutorial for nutch 0.8.
> > No further error messages.
> > All seems to run fine but where is the index?
> >
> > Used a single URL to start with but searching for
> > any term from that site gives no results.
> > I guess there is no index at all?
> >
> > Where to find a crawler log file?
> >
> > Bernd
> >
> > Stefan Groschupf schrieb:
> >
> >>> Is it just a simple text file with one URL per line?
> >>
> >>
> >> Yes
> >>
> >
> >
>
