Hi Florian,
Where is your urls file located? If you created urls in the conf folder,
then you have to call:
bin/nutch crawl conf/urls -dir crawlresults/ -depth 2 - topN 1000
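Both stack traces below end in a FileNotFoundException for the root URL file, so it may help to confirm that the path you pass resolves from the directory you start the crawl in. A minimal sketch, assuming a plain POSIX shell under cygwin; the helper name check_seed_file is made up for illustration and is not part of Nutch:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): verify that the seed URL file
# exists relative to the current directory and is not empty.
check_seed_file() {
    seed="$1"
    if [ ! -f "$seed" ]; then
        # Report the directory we looked in, since relative paths
        # depend on where the crawl command is started.
        echo "seed file not found: $seed (looked in $(pwd))" >&2
        return 1
    fi
    if [ ! -s "$seed" ]; then
        echo "seed file is empty: $seed" >&2
        return 1
    fi
    return 0
}

# Usage, run from the Nutch install directory:
#   check_seed_file conf/urls && bin/nutch crawl conf/urls -dir crawlresults/ -depth 2
```

Note that in your first run the log reports rootUrlFile = 10000, so the trailing number appears to have been taken as the URL file rather than your urls path. The usage line above leaves that option out for this reason.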
Good luck
Detlev
I am running cygwin (I know), with jdk1.5.0 and tomcat 4.1
From cygwin I run:
bin/nutch crawl urls -dir crawlresults/ -depth 2 - topN 1000
results:
run java in C:/program files/java/jdk1.5.0/
060223 123010 parsing file:/c:/cygwin/home/falieson/nutch/conf/nutch-default.xml
060223 123010 parsing file:/c:/cygwin/home/falieson/nutch/conf/crawl-tool.xml
060223 123010 parsing file:/c:/cygwin/home/falieson/nutch/conf/nutch-site.xml
060223 123010 No FS indicated, using default:local
060223 123010 rootUrlFile = 10000
060223 123010 thread = 10
060223 123010 depth = 2
060223 123011 Created webdb at LocalFS,C:\cygwin\home\falieson\crawlresults\db
Exception in thread "main" java.io.FileNotFoundException: 10000 <the system cannot find the file specified>
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:106)
at java.io.FileReader.<init>(FileReader.java:55)
at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
~~~
bin/nutch crawl urls -dir crawled -depth 3
results:
run java in C:/program files/java/jdk1.5.0/
060223 123832 parsing file:/C:/cygwin/home/falieson/nutch/conf/nutch-default.xml
060223 123832 parsing file:/C:/cygwin/home/falieson/nutch/conf/crawl-tool.xml
060223 123832 parsing file:/C:/cygwin/home/falieson/nutch/conf/nutch-site.xml
060223 123832 No FS indicated, using default:local
060223 123832 crawl started in: crawled
060223 123832 rootUrlFile = urls
060223 123832 threads = 10
060223 123832 depth = 3
060223 123832 Created webdb at LocalFS,C:\cygwin\home\falieson\crawled\db
Exception in thread "main" java.io.FileNotFoundException: urls (The system cannot find the file specified)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:106)
at java.io.FileReader.<init>(FileReader.java:55)
at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
~~
TIA
--
Best Regards,
Florian Mettetal
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general