This is a new feature in version 0.7. Previously, the URL listing was a
single file, but it is now a directory that contains the URL file. It's most
probably documented in the release notes, but the change hasn't made it into
the tutorials just yet. If you check the mailing list archive, there are a
couple of threads on this topic.
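
As a rough sketch of the layout, assuming you run it from the top of the Nutch checkout: the names myurls and urls are simply the ones Earl used in the message quoted below (he notes the file name doesn't seem to matter), and the seed URL is only a placeholder I picked for illustration, so substitute your own start page.

```shell
# Seed directory name taken from the thread; the file name inside is arbitrary.
SEED_DIR=myurls
mkdir -p "$SEED_DIR"

# Placeholder seed URL (an assumption for illustration; replace with your own).
printf '%s\n' 'http://lucene.apache.org/nutch/' > "$SEED_DIR/urls"

# Pass the directory, not the file, to the crawl command quoted in the thread.
# The crawl only runs when bin/nutch is present, i.e. inside a Nutch checkout.
if [ -x bin/nutch ]; then
    bin/nutch crawl "$SEED_DIR" -dir crawl.test -depth 3
else
    echo "bin/nutch not found; run this from the Nutch directory" >&2
fi
```

The only difference from the trunk tutorial is that the first argument to crawl is the directory rather than the urls file itself.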

Fredrik

On 9/7/05, Earl Cahill <[EMAIL PROTECTED]> wrote:
> 
> Though my last email was more about documenting the
> whole setup process, it looks like the error I
> mentioned was fixed by creating a directory and
> putting a urls file in that directory. It also looks
> like the name of the file doesn't matter. So I made a
> myurls directory, put a urls file in there and then
> ran
> 
> bin/nutch crawl myurls -dir crawl.test -depth 3
> 
> But, yeah, would like to put such steps in a tutorial.
> 
> 
> It looks like the front page got hit, and that's about
> it, so there is more to do.
> 
> Earl
> 
> --- Earl Cahill <[EMAIL PROTECTED]> wrote:
> 
> > howdy,
> >
> > I have been looking around for a nutch/mapred tutorial
> > and haven't had much luck. I found this one
> >
> > http://lucene.apache.org/nutch/tutorial.html
> >
> > which did help me get a crawl going on trunk, but no
> > such luck in branches/mapred. I set the urls file and
> > the filter in the same way that I did for trunk and I
> > get
> >
> > 050907 013817 parsing file:/home/nutch/nutch/branches/mapred/conf/nutch-site.xml
> > java.io.IOException: No input files in: [Ljava.io.File;@32b0bad7
> >     at org.apache.nutch.mapred.InputFormatBase.listFiles(InputFormatBase.java:74)
> >     at org.apache.nutch.mapred.InputFormatBase.getSplits(InputFormatBase.java:84)
> >     at org.apache.nutch.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:59)
> >
> > Guess I am wondering if a detailed tutorial for mapred
> > exists. Seems like Doug was saying that it didn't. I
> > would be up for walking through getting a crawl going
> > and documenting my steps, but won't dive in if one
> > already exists. Also wondering if I would/could put
> > my doc on the wiki.
> >
> > Thanks,
> > Earl
> >
> 
> 
