Here you go:

bin/nutch crawl urls -dir crawl.test -depth 3 >& crawl.log

The filename "urls" above is the text file that you created. It can be anywhere; just make sure you give the correct path to it on the command line (e.g. bin/nutch crawl /home/xxx/urls.txt ...). A relative path like "urls" is resolved against the directory you run the command from. The same goes for "crawl.test", which is your crawl directory.

Regards

On 7/24/05, blackwater dev <[EMAIL PROTECTED]> wrote:
> I am a nutch newbie and I have created a simple urls file with one
> domain. I have tried putting it in a few places but am getting
> errors. Where should it go? I am running the crawl command from the
> tutorial.
>
> Thanks!
>
>
> expr: syntax error
> 050724 081642 No NutchFileSystem indicated, so defaulting to local fs.
> 050724 081642 loading file:/Users/e/nutch-0.6/conf/nutch-default.xml
> 050724 081643 loading file:/Users/e/nutch-0.6/conf/crawl-tool.xml
> 050724 081643 loading file:/Users/e/nutch-0.6/conf/nutch-site.xml
> 050724 081643 crawl started in: crawl.test
> 050724 081643 rootUrlFile = urls
> 050724 081643 threads = 10
> 050724 081643 depth = 3
> 050724 081643 Created webdb at LocalFS,/Users/e/nutch-0.6/crawl.test/db
> Exception in thread "main" java.io.FileNotFoundException: urls (No
> such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:106)
>     at java.io.FileReader.<init>(FileReader.java:55)
>     at net.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:359)
>     at net.nutch.db.WebDBInjector.main(WebDBInjector.java:510)
>     at net.nutch.tools.CrawlTool.main(CrawlTool.java:121)
>

--
Best Regards
Zaheed Haque
Phone : +46 735 000006
E.mail: [EMAIL PROTECTED]
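
P.S. A rough sketch of the whole sequence, in case it helps. The seed URL and both paths are placeholders I made up, so adjust them to your setup. It writes a one-line urls file, checks that the file really exists at the path that will be passed to nutch, and only then starts the crawl. It assumes you run it from the nutch install directory (the one containing bin/nutch) and skips the crawl otherwise.

```shell
# Placeholder paths (assumptions): absolute, so they work from any directory.
SEED_FILE="$PWD/urls"           # plain text file, one URL per line
CRAWL_DIR="$PWD/crawl.test"     # crawl directory passed to -dir

# Write a one-line seed file; www.example.com is a placeholder domain.
printf '%s\n' 'http://www.example.com/' > "$SEED_FILE"

if [ ! -f "$SEED_FILE" ]; then
  # A missing seed file is what produces the FileNotFoundException.
  echo "seed file not found: $SEED_FILE" >&2
elif [ -x bin/nutch ]; then
  # "> crawl.log 2>&1" is the portable spelling of ">& crawl.log".
  bin/nutch crawl "$SEED_FILE" -dir "$CRAWL_DIR" -depth 3 > crawl.log 2>&1
else
  echo "bin/nutch not found here; cd to the nutch directory first" >&2
fi
```

Passing the full path means it no longer matters which directory the urls file lives in.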
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
