I'm getting the error below. Judging by the rootUrlFile value, Nutch is
trying to read a file literally named "urls.txt -dir crawled" rather than
recognizing -dir as a separate parameter. Any ideas? This is running on
Solaris 9, in case that makes a difference.

If I simply run "bin/nutch crawl urls.txt" (no -dir option), the problem
doesn't occur.
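For what it's worth, the symptom looks like the whole argument list is being
collapsed into a single string somewhere between the shell and Java. The
snippet below is only a hypothetical reproduction (it is not the actual
bin/nutch script): it shows how quoting arguments as "$*" in a wrapper
script joins them into one word, producing exactly the kind of value seen
in the rootUrlFile log line, whereas "$@" keeps them separate.

```shell
#!/bin/sh
# Hypothetical illustration -- not the real bin/nutch wrapper.

broken_wrapper() {
  # "$*" joins all arguments into ONE word, so the program
  # would see a single argument: urls.txt -dir crawled
  set -- "$*"
  echo "argc=$#  arg1=$1"
}

fixed_wrapper() {
  # "$@" preserves each argument as a separate word
  echo "argc=$#  arg1=$1"
}

broken_wrapper urls.txt -dir crawled
fixed_wrapper urls.txt -dir crawled
```

If bin/nutch (or the Solaris /bin/sh it runs under) quotes its arguments the
"broken" way, that would explain why CrawlTool treats the full command line
as one root URL file name.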

Thanks!


bash-2.05# bin/nutch crawl urls.txt -dir crawled
060414 155951 parsing
file:/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/conf/nutch-default.xml
060414 155951 parsing
file:/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/conf/crawl-tool.xml
060414 155951 parsing
file:/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/conf/nutch-site.xml
060414 155951 No FS indicated, using default:local
060414 155951 crawl started in: crawl-20060414155951
060414 155951 rootUrlFile = urls.txt -dir crawled
060414 155951 threads = 10
060414 155951 depth = 5
060414 155952 Created webdb at
LocalFS,/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/crawl-20060414155951/db
Exception in thread "main" java.io.FileNotFoundException: urls.txt -dir
crawled (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:106)
at java.io.FileReader.<init>(FileReader.java:55)
at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general