I've narrowed down the problem: bin/nutch is not parsing
my command line correctly.  You'll notice the error says
file not found "urls.txt -dir crawl.test -d 2" rather
than just "urls.txt", so the whole argument list is being
treated as a single file name.  Also, even though I
specified depth 2, it thinks the depth is 5.  If I just
run "bin/nutch crawl urls.txt" it will run, but without
the parameters I want.  Perhaps it is a problem with the
shell.  I also got the error "IFS: cannot unset", so I
had commented that out.
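
One way a shell script can produce this symptom is by
forwarding its arguments as a single joined word (what
"$*" does) instead of as separate words (what "$@" does).
The sketch below only demonstrates that difference.  It is
not taken from bin/nutch, and the function names
count_args, forward_joined and forward_separate are
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reports how many arguments it received.
count_args() {
    echo "$#"
}

# Forwards all arguments as ONE word, joined by the first
# character of IFS (a space when IFS is unset or default).
forward_joined() {
    count_args "$*"
}

# Forwards every argument as its own word.
forward_separate() {
    count_args "$@"
}

forward_joined urls.txt -dir crawl.test -d 2     # prints 1
forward_separate urls.txt -dir crawl.test -d 2   # prints 5
```

If the wrapper behaves like the first case, the Java side
would see one argument, which would match the rootUrlFile
line in the log.  Whether bin/nutch actually does this on
my shell is an assumption I have not verified.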

--- Stefan Groschupf <[EMAIL PROTECTED]> wrote:

> Try moving urls.txt into a folder called urls and
> provide the folder instead of the text file itself.
> Does this help?
> Stefan
> 
> On 12.01.2006 at 21:29, Mike Markzon wrote:
> 
> > I've tried 0.7 and the nightly build of 0.8.  Neither
> > is working for me.  I'm just trying to follow the
> > tutorials.  Here's what I'm getting with 0.7 when I
> > try to crawl (FileNotFoundException).
> >
> > $ ls urls.txt
> > urls.txt
> > $ bin/nutch crawl urls.txt -dir crawl.test -d 2
> > 060112 122459 parsing file:/apps/user/vignette/nutch-0.7/conf/nutch-default.xml
> > 060112 122459 parsing file:/apps/user/vignette/nutch-0.7/conf/crawl-tool.xml
> > 060112 122459 parsing file:/apps/user/vignette/nutch-0.7/conf/nutch-site.xml
> > 060112 122459 No FS indicated, using default:local
> > 060112 122459 crawl started in: crawl-20060112122459
> > 060112 122459 rootUrlFile = urls.txt -dir crawl.test -d 2
> > 060112 122459 threads = 10
> > 060112 122459 depth = 5
> > 060112 122459 Created webdb at LocalFS,/apps/user/vignette/nutch-0.7/crawl-20060112122459/db
> > Exception in thread "main" java.io.FileNotFoundException: urls.txt -dir crawl.test -d 2 (No such file or directory)
> >         at java.io.FileInputStream.open(Native Method)
> >         at java.io.FileInputStream.<init>(FileInputStream.java:106)
> >         at java.io.FileReader.<init>(FileReader.java:55)
> >         at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
> >         at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
> >         at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
> > $
> >
> > If I follow the tutorial at
> > http://wiki.media-style.com/display/nutchDocu/Home
> > every time I execute a command I get a Usage statement
> > and the command doesn't do anything.
> > $ bin/nutch admin db/ -create
> > Usage: java org.apache.nutch.tools.WebDBAdminTool (-local | -ndfs <namenode:port>) db [-create] [-textdump dumpPrefix] [-scoredump] [-top k]
> >
> > Any ideas?  Thanks!  Also thanks to those who answered
> > my first question about using a server besides Tomcat.
> > -Mike
> >
> > __________________________________________________
> > Do You Yahoo!?
> > Tired of spam?  Yahoo! Mail has the best spam protection around
> > http://mail.yahoo.com
> >
>
> ---------------------------------------------------------------
> company:        http://www.media-style.com
> forum:        http://www.text-mining.org
> blog:            http://www.find23.net
> 
> 
> 

