howdy,

I have been looking around for a nutch/mapred tutorial
and haven't had much luck.  I found this one:

http://lucene.apache.org/nutch/tutorial.html

which helped me get a crawl going on trunk, but no
such luck in branches/mapred.  I set up the urls file and
the URL filter the same way I did for trunk, and I
get:

050907 013817 parsing file:/home/nutch/nutch/branches/mapred/conf/nutch-site.xml
java.io.IOException: No input files in: [Ljava.io.File;@32b0bad7
        at org.apache.nutch.mapred.InputFormatBase.listFiles(InputFormatBase.java:74)
        at org.apache.nutch.mapred.InputFormatBase.getSplits(InputFormatBase.java:84)
        at org.apache.nutch.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:59)
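
For what it's worth, the [Ljava.io.File;@32b0bad7 part looks like Java's
default string form for a File[] array rather than a path, so the message
doesn't show which directories were actually searched.  A small sketch of
the difference (the class name and the "urls" path are made up for
illustration, they are not taken from Nutch):

```java
import java.io.File;
import java.util.Arrays;

public class ArrayToStringDemo {
    // Default Object.toString() form of an array: a type tag plus an identity hash.
    static String defaultForm(File[] dirs) {
        return String.valueOf((Object) dirs);
    }

    // Readable form that lists each path in the array.
    static String readableForm(File[] dirs) {
        return Arrays.toString(dirs);
    }

    public static void main(String[] args) {
        File[] dirs = { new File("urls") }; // hypothetical input directory
        System.out.println("No input files in: " + defaultForm(dirs));
        System.out.println("No input files in: " + readableForm(dirs));
    }
}
```

The first line printed has the same shape as the message in my trace; the
second shows the paths themselves.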

I guess I am wondering whether a detailed tutorial for mapred
exists.  It seems like Doug was saying that one doesn't.  I
would be up for walking through getting a crawl going
and documenting my steps, but won't dive in if one
already exists.  I'm also wondering whether I could put
my doc on the wiki.

Thanks,
Earl
