Hi Florian,

Where is your urls file located? If you created the urls file in the conf
folder, then you have to call:

bin/nutch crawl conf/urls -dir crawlresults/ -depth 2 -topN 1000
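
If you are not sure which path is right, a quick check before crawling can
help. This is only a rough sketch: the helper name seed_file_ok and the
SEED_FILE variable are made up for illustration, and conf/urls is just my
guess at where your file lives. It should be run from the nutch directory.

```shell
# Hypothetical helper: succeeds only if the seed URL file exists
# as a regular file and is not empty.
seed_file_ok() {
  [ -f "$1" ] && [ -s "$1" ]
}

# Assumed location of the seed list; change this to wherever your urls file is.
SEED_FILE="${SEED_FILE:-conf/urls}"

if seed_file_ok "$SEED_FILE"; then
  # The path is passed exactly as checked, relative to the current directory.
  bin/nutch crawl "$SEED_FILE" -dir crawlresults/ -depth 2
else
  echo "seed file not found or empty: $SEED_FILE (current directory: $(pwd))" >&2
fi
```

If the check fails, the message shows the directory the relative path was
resolved against, which is usually enough to spot the mistake.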

Good luck

Detlev



I am running cygwin (I know), with jdk1.5.0 and tomcat 4.1
From cygwin I run:

bin/nutch crawl urls -dir crawlresults/ -depth 2 - topN 1000

results:

run java in C:/program files/java/jdk1.5.0/
060223 123010 parsing file:/c:/cygwin/home/falieson/nutch/conf/nutch-default.xml
060223 123010 parsing file:/c:/cygwin/home/falieson/nutch/conf/crawl-tool.xml
060223 123010 parsing file:/c:/cygwin/home/falieson/nutch/conf/nutch-site.xml
060223 123010 No FS indicated, using default:local
060223 123010 rootUrlFile = 10000
060223 123010 thread = 10
060223 123010 depth = 2
060223 123011 Created webdb at LocalFS,C:\cygwin\home\falieson\crawlresults\db
Exception in thread "main" java.io.FileNotFoundException: 10000 <the system cannot find the file specified>
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:106)
    at java.io.FileReader.<init>(FileReader.java:55)
    at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
    at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
    at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)


~~~
bin/nutch crawl urls -dir crawled -depth 3

results:

run java in C:/program files/java/jdk1.5.0/
060223 123832 parsing file:/C:/cygwin/home/falieson/nutch/conf/nutch-default.xml
060223 123832 parsing file:/C:/cygwin/home/falieson/nutch/conf/crawl-tool.xml
060223 123832 parsing file:/C:/cygwin/home/falieson/nutch/conf/nutch-site.xml
060223 123832 No FS indicated, using default:local
060223 123832 crawl started in: crawled
060223 123832 rootUrlFile = urls
060223 123832 threads = 10
060223 123832 depth = 3
060223 123832 Created webdb at LocalFS,C:\cygwin\home\falieson\crawled\db
Exception in thread "main" java.io.FileNotFoundException: urls (The system cannot find the file specified)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:106)
    at java.io.FileReader.<init>(FileReader.java:55)
    at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
    at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
    at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)


~~
TIA
--
Best Regards,
Florian Mettetal



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general