Though my last email was more about documenting the whole setup process, it looks like the error I mentioned was fixed by creating a directory and putting a urls file in that directory. It also looks like the name of the file doesn't matter. So I made a myurls directory, put a urls file in there, and then ran

  bin/nutch crawl myurls -dir crawl.test -depth 3

But, yeah, I would like to put such steps in a tutorial. It looks like only the front page got hit, so there is more to do.

Earl

--- Earl Cahill <[EMAIL PROTECTED]> wrote:

> howdy,
>
> I have been looking around for a nutch/mapred tutorial
> and haven't had much luck. I found this one
>
> http://lucene.apache.org/nutch/tutorial.html
>
> which did help me get a crawl going on trunk, but no
> such luck in branches/mapred. I set the urls file and
> the filter in the same way that I did for trunk and I
> get
>
> 050907 013817 parsing file:/home/nutch/nutch/branches/mapred/conf/nutch-site.xml
> java.io.IOException: No input files in: [Ljava.io.File;@32b0bad7
>     at org.apache.nutch.mapred.InputFormatBase.listFiles(InputFormatBase.java:74)
>     at org.apache.nutch.mapred.InputFormatBase.getSplits(InputFormatBase.java:84)
>     at org.apache.nutch.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:59)
>
> Guess I am wondering if a detailed tutorial for mapred
> exists. Seems like doug was saying that it didn't. I
> would be up for walking through getting a crawl going
> and documenting my steps, but won't dive in if one
> already exists. Also wondering if I would/could put
> my doc on the wiki.
>
> Thanks,
> Earl
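
For anyone following along, the workaround described at the top of this message (a seed directory with a urls file in it, passed to the crawl command) can be sketched roughly as below. The seed URL is only a placeholder I am assuming for illustration, and the guard around bin/nutch is my own addition so the script does nothing harmful outside a Nutch checkout; the directory name and crawl arguments are the ones from the message.

```shell
#!/bin/sh
# Rough sketch of the steps described above, not an official procedure.
# Assumption: http://www.example.com/ is a placeholder seed URL.
set -e

SEED_DIR=myurls

# In branches/mapred the crawl argument is a directory holding the seed file.
mkdir -p "$SEED_DIR"

# The file name did not seem to matter; "urls" is used here.
printf '%s\n' 'http://www.example.com/' > "$SEED_DIR/urls"

# Only attempt the crawl when run from the top of a Nutch checkout.
if [ -x bin/nutch ]; then
    bin/nutch crawl "$SEED_DIR" -dir crawl.test -depth 3
else
    echo "bin/nutch not found; run this from the branches/mapred checkout" >&2
fi
```

Run from the top of the branches/mapred checkout, this should amount to the same command as above. Whether the crawl gets past the front page presumably still depends on the URL filter configuration.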
