Here you go:

bin/nutch crawl urls -dir crawl.test -depth 3 >& crawl.log

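The "urls" argument above is the text file that you created. It can live
anywhere; just make sure the path to it on the command line is correct
(e.g. bin/nutch crawl /home/xxx/urls.txt ...). A relative name such as
"urls" is looked up in the directory you run the command from, which is
why you get the FileNotFoundException below. The same goes for
"crawl.test", which is your crawl directory.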
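As a rough sketch of the above, something like the following should avoid the path problem. The URLS_FILE variable, the urls.txt name and the example.com seed URL are placeholders of mine, not anything Nutch requires; the crawl options are the same ones as in the command above, and the script assumes you run it from the Nutch install directory.

```shell
#!/bin/sh
# Sketch only: URLS_FILE, urls.txt and the seed URL are illustrative placeholders.
# Use an absolute path so the file is found regardless of the current directory.
URLS_FILE="${URLS_FILE:-$PWD/urls.txt}"

# Create a one-line seed file if none exists yet (one URL per line).
if [ ! -f "$URLS_FILE" ]; then
  printf '%s\n' 'http://www.example.com/' > "$URLS_FILE"
fi

# Run the crawl only when bin/nutch is present in the current directory.
if [ -x bin/nutch ]; then
  # Same options as the command above; stdout and stderr both go to crawl.log.
  bin/nutch crawl "$URLS_FILE" -dir crawl.test -depth 3 > crawl.log 2>&1
else
  echo "bin/nutch not found; run this from the Nutch install directory" >&2
fi
```

If the crawl still fails, the "rootUrlFile = ..." line in crawl.log shows which path was actually used.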

Regards

On 7/24/05, blackwater dev <[EMAIL PROTECTED]> wrote:
> I am a nutch newbie and I have created a simple urls file with one
> domain.  I have tried putting it in a few places but am getting
> errors.  Where should it go?  I am running the crawl command from the
> tutorial.
> 
> Thanks!
> 
> 
> expr: syntax error
> 050724 081642 No NutchFileSystem indicated, so defaulting to local fs.
> 050724 081642 loading file:/Users/e/nutch-0.6/conf/nutch-default.xml
> 050724 081643 loading file:/Users/e/nutch-0.6/conf/crawl-tool.xml
> 050724 081643 loading file:/Users/e/nutch-0.6/conf/nutch-site.xml
> 050724 081643 crawl started in: crawl.test
> 050724 081643 rootUrlFile = urls
> 050724 081643 threads = 10
> 050724 081643 depth = 3
> 050724 081643 Created webdb at LocalFS,/Users/e/nutch-0.6/crawl.test/db
> Exception in thread "main" java.io.FileNotFoundException: urls (No
> such file or directory)
>        at java.io.FileInputStream.open(Native Method)
>        at java.io.FileInputStream.<init>(FileInputStream.java:106)
>        at java.io.FileReader.<init>(FileReader.java:55)
>        at net.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:359)
>        at net.nutch.db.WebDBInjector.main(WebDBInjector.java:510)
>        at net.nutch.tools.CrawlTool.main(CrawlTool.java:121)
> 


-- 
Best Regards
Zaheed Haque
Phone : +46 735 000006
E.mail: [EMAIL PROTECTED]

