> Resources such as the URL filter and normalizer rule files
> are usually defined as pure files without path and are located
> on the classpath. So it should work if
> C:/server/nutch/conf/
> is in the classpath and the resources are simply named "regex-urlfilter.txt"
> resp. "regex-normalize.xml".
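For anyone hitting the same problem: a quick way to confirm that the two files are really visible on the classpath is to ask the class loader for them by plain filename. The sketch below uses only standard JDK calls and no Nutch API; the class name ClasspathResourceCheck and the method locate are made up for illustration, and it assumes the check runs with the same classpath as the application that starts the crawl.

```java
import java.net.URL;

public class ClasspathResourceCheck {

    /**
     * Looks up a resource by its plain name on the classpath.
     * Returns its URL, or null if the class loader cannot see it.
     */
    static URL locate(String resourceName) {
        // Prefer the context class loader; fall back to the loader of this class.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathResourceCheck.class.getClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        // File names as given in the quoted advice; no directory prefix.
        String[] names = {"regex-urlfilter.txt", "regex-normalize.xml"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> "
                    + (url != null ? url : "NOT FOUND on classpath"));
        }
    }
}
```

If a file is reported as not found, the directory that contains it (here C:/server/nutch/conf/) is probably missing from the classpath of the running JVM.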
Thanks for the information. It works now that the files are on the classpath and are referenced by filename only, and I can start a crawl cycle from my Java application.

One question, though: is there a way to get more detailed information out of the crawl process than what the logging provides? I am thinking of something like the URLs already crawled, the ones waiting to be crawled, the current status, etc. Programmatically I can only infer which stage the process is in (injecting, fetching, etc.), but no details. The Injector, Generator and Fetcher classes do not seem to contain any useful methods for that purpose.

Any hints?

Regards,
Max

