Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by NeilMcAndrew:
http://wiki.apache.org/nutch/ErrorMessages

------------------------------------------------------------------------------
- = Error messages, reasons and solutions =
  Please feel free to add error messages, reasons and solutions!

@@ -10, +9 @@
   * Fetching
   * Updating
   * Searching
+
+
+
+ == FileNotFoundException: 1 ==
+
+ delay 1 fails
+ crawltest and subdirectories are created; also ant compiles no probs; ROOT.war is installed and runs; urls file exists. Adding ./ or full path as x below changes nothing. Server runs squid on 80 and real Apache 1.3 on 81. Catalina is on 8080 and is up and running.
+
+
+ /x/nutch/nutch-0.7 # bin/nutch crawl /x/nutch/nutch-0.7/urls -dir /x/nutch/nutch-0.7/crawl.test -threads 2 -delay 1 -depth 3
+ run java in /usr/local/java/j2sdk1.4.2
+ 050827 032536 parsing file:/x/nutch/nutch-0.7/conf/nutch-default.xml
+ 050827 032536 parsing file:/x/nutch/nutch-0.7/conf/crawl-tool.xml
+ 050827 032536 parsing file:/x/nutch/nutch-0.7/conf/nutch-site.xml
+ 050827 032537 No FS indicated, using default:local
+ 050827 032537 crawl started in: /x/nutch/nutch-0.7/crawl.test
+ 050827 032537 rootUrlFile = 1
+ 050827 032537 threads = 2
+ 050827 032537 depth = 3
+ 050827 032537 Created webdb at LocalFS,/x/nutch/nutch-0.7/crawl.test/db
+ Exception in thread "main" java.io.FileNotFoundException: 1 (No such file or directory)
+         at java.io.FileInputStream.open(Native Method)
+         at java.io.FileInputStream.<init>(FileInputStream.java:106)
+         at java.io.FileReader.<init>(FileReader.java:55)
+         at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
+         at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
+         at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
+
+
+ crawl test exists
+
+ ls -R crawl.test/
+ crawl.test/:
+ . .. db
+
+ crawl.test/db:
+ . .. dbreadlock dbwritelock webdb
+
+ crawl.test/db/webdb:
+ . .. linksByMD5 linksByURL pagesByMD5 pagesByURL
+
+ crawl.test/db/webdb/linksByMD5:
+ . .. data index
+
+ crawl.test/db/webdb/linksByURL:
+ . .. data index
+
+ crawl.test/db/webdb/pagesByMD5:
+ . .. data index
+
+ crawl.test/db/webdb/pagesByURL:
+ . .. data index
+
+ -------------
+ export NUTCH_JAVA_HOME is set and working..
+
+
+ It always fails with above error, while omitting the delay tag seems to work :\ ...
+ I tried putting the -delay tag at several places above, it always fails
+
+ nutch 0.7
+ Apache Tomcat/5.0.19 jdsk 1.4.2-b28 Sun Microsystems Inc. Linux (Suse 8.2 1.5 years old but updated) Linux Kernel 2.4.21 i386
+
+
+
+ == Errors Fetching ==
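
A note on the FileNotFoundException report above: the log line `rootUrlFile = 1` suggests that the value given after `-delay` ended up being used as the root URL file, which would explain why Java then tries to open a file named `1`. The sketch below is only a model of that behaviour, not Nutch's actual code. It assumes a parser that knows `-dir`, `-threads` and `-depth` and treats every other token as the root URL file; the function name `parse_crawl_args` and the dictionary keys are hypothetical.

```python
def parse_crawl_args(args):
    """Hypothetical model of a crawl argument parser (an assumption, not Nutch code).

    Known flags consume the token that follows them. Any other token,
    including an unrecognised flag and its value, is stored as the root
    URL file, so the last such token wins.
    """
    opts = {"root_url_file": None, "dir": None, "threads": None, "depth": None}
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-dir":
            opts["dir"] = args[i + 1]
            i += 2
        elif arg == "-threads":
            opts["threads"] = int(args[i + 1])
            i += 2
        elif arg == "-depth":
            opts["depth"] = int(args[i + 1])
            i += 2
        else:
            # Fallback branch: "-delay" and then "1" both land here.
            opts["root_url_file"] = arg
            i += 1
    return opts


# Arguments taken from the command line in the report.
with_delay = parse_crawl_args([
    "/x/nutch/nutch-0.7/urls", "-dir", "/x/nutch/nutch-0.7/crawl.test",
    "-threads", "2", "-delay", "1", "-depth", "3",
])
without_delay = parse_crawl_args([
    "/x/nutch/nutch-0.7/urls", "-dir", "/x/nutch/nutch-0.7/crawl.test",
    "-threads", "2", "-depth", "3",
])

print(with_delay["root_url_file"])     # prints 1
print(without_delay["root_url_file"])  # prints /x/nutch/nutch-0.7/urls
```

Under this assumed model the reported values for `threads` (2) and `depth` (3) are also reproduced, and dropping `-delay 1` leaves the urls file as the root URL file, which matches the observation that omitting the delay option seems to work.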
