On Nov 24, 2011, at 3:21 AM, Julien Nioche wrote:

>>
>> OK, nm. This *is* different behavior from 1.3 apparently, but I figured out
>> how to make it work in 1.4 (instead of editing the global, top-level
>> conf/nutch-default.xml,
>> I needed to edit runtime/local/conf/nutch-default.xml). Crawling is
>> forging ahead.
>>
>
> yep, I think this is documented on the Wiki. It is partially why I
> suggested that we deliver the content of runtime/local as our binary
> release for next time. Most people use Nutch in local mode so this would
> make their lives easier, as for the advanced users (read pseudo or real
> distributed) they need to recompile the job file anyway and I'd expect them
> to use the src release
+1, I'll be happy to edit build.xml and make that happen for 1.5.

In the meanwhile, time to figure out why I still can't get it to crawl the PDFs... :(

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

