I've narrowed down the problem.  Nutch is not parsing my
command line correctly.  You'll notice the error says file
not found "urls.txt -dir crawl.test -d 2" rather than just
urls.txt.  Also, even though I specified depth 2, it
thinks the depth is 5.  If I just run "bin/nutch crawl
urls.txt" it will run, but without the parameters I
want.  Perhaps it is a problem with the shell.  I also got
the error "IFS: cannot unset", so I commented that line
out.
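
To illustrate why I suspect the shell, here is a minimal sketch of
how an empty IFS changes word splitting.  It assumes (I have not
confirmed this) that the script sets IFS to empty somewhere before
the "unset IFS" line I commented out.  The names count_args and args
are made up for the example and are not taken from bin/nutch.

```shell
#!/bin/sh
# Hypothetical helper: report how many arguments were received.
count_args() { echo "$#"; }

# Hypothetical stand-in for the arguments passed to the crawl command.
args="urls.txt -dir crawl.test -d 2"

# Default IFS (space, tab, newline): the unquoted expansion is split into words.
default_count=$(count_args $args)

# Empty IFS: field splitting is disabled, so the same expansion stays one word.
saved_ifs=$IFS
IFS=
empty_count=$(count_args $args)
IFS=$saved_ifs

echo "default IFS: $default_count argument(s)"
echo "empty IFS:   $empty_count argument(s)"
```

With the default IFS this reports 5 arguments; with IFS empty it
reports 1, which would match the single "urls.txt -dir crawl.test
-d 2" file name in the exception.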

--- Stefan Groschupf <[EMAIL PROTECTED]> wrote:

> Try moving urls.txt into a folder called urls and
> provide the folder instead of the text file itself.
> Does this help?
> Stefan
> 
> On 12.01.2006 at 21:29, Mike Markzon wrote:
> 
> > I've tried 0.7 and the nightly build of 0.8.  Neither
> > is working for me.  I'm just trying to follow the
> > tutorials.  Here's what I'm getting with 0.7 when I
> > try to crawl (FileNotFoundException).
> >
> > $ ls urls.txt
> > urls.txt
> > $ bin/nutch crawl urls.txt -dir crawl.test -d 2
> > 060112 122459 parsing file:/apps/user/vignette/nutch-0.7/conf/nutch-default.xml
> > 060112 122459 parsing file:/apps/user/vignette/nutch-0.7/conf/crawl-tool.xml
> > 060112 122459 parsing file:/apps/user/vignette/nutch-0.7/conf/nutch-site.xml
> > 060112 122459 No FS indicated, using default:local
> > 060112 122459 crawl started in: crawl-20060112122459
> > 060112 122459 rootUrlFile = urls.txt -dir crawl.test -d 2
> > 060112 122459 threads = 10
> > 060112 122459 depth = 5
> > 060112 122459 Created webdb at LocalFS,/apps/user/vignette/nutch-0.7/crawl-20060112122459/db
> > Exception in thread "main" java.io.FileNotFoundException: urls.txt -dir crawl.test -d 2 (No such file or directory)
> >         at java.io.FileInputStream.open(Native Method)
> >         at java.io.FileInputStream.<init>(FileInputStream.java:106)
> >         at java.io.FileReader.<init>(FileReader.java:55)
> >         at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
> >         at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
> >         at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
> > $
> >
> > If I follow the tutorial at
> > http://wiki.media-style.com/display/nutchDocu/Home
> > every time I execute a command I get a Usage statement
> > and the command doesn't do anything.
> > $ bin/nutch admin db/ -create
> > Usage: java org.apache.nutch.tools.WebDBAdminTool (-local | -ndfs <namenode:port>) db [-create] [-textdump dumpPrefix] [-scoredump] [-top k]
> >
> > Any ideas?  Thanks!  Also thanks to those who answered
> > my first question about using a server besides Tomcat.
> > -Mike
> >
> 
> ---------------------------------------------------------------
> company:        http://www.media-style.com
> forum:        http://www.text-mining.org
> blog:            http://www.find23.net
> 
> 
> 


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general