Hi 

  You get that error when you follow the earlier Nutch 0.7
tutorial on a Nutch 0.8-dev build.

  Use the tutorial for 0.8-dev instead:
http://wiki.media-style.com/display/nutchDocu/quick+tutorial+for+nutch+0.8+and+later
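
  As far as I can tell from your log (rootUrlDir = urls, Injector: urlDir: urls),
0.8 treats the urls argument as a directory of seed files, whereas the 0.7
tutorial has you create a single flat file. A rough sketch of converting the
old layout, assuming your flat file is called urls. The names urls.flat and
seed.txt and the example.com URL are placeholders of mine, not anything Nutch
requires:

```shell
# Assumption: a 0.7-style flat file named "urls" may exist in the current directory.
# Move it aside so a directory of the same name can be created.
if [ -f urls ]; then
  mv urls urls.flat
fi

# Create the seed directory that the crawl command will be pointed at.
mkdir -p urls

# Reuse the old flat file as a seed file, or write a placeholder seed URL.
if [ -f urls.flat ]; then
  mv urls.flat urls/seed.txt
elif [ ! -e urls/seed.txt ]; then
  printf '%s\n' 'http://example.com/' > urls/seed.txt
fi
```

After that, the crawl command from your mail can be rerun unchanged, since it already passes urls as the first argument.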

  Or add the following property to nutch-site.xml:

<property>
  <name>mapred.input.dir</name>
  <value>C:/cygwin/usr/local/src/nutch-nightly/conf</value>
  <description>Input directory for the map-reduce job.</description>
</property>


P

>Hi all,

>Having some problems getting nutch to run on XP/Cygwin.
>This is re nutch-2006-01-17

>Intranet crawl........

>When I do this (after making urls file, etc.):

>       bin/nutch crawl urls -dir cdir -depth 2 >&log
        
>I get this in the log:
        
>060117 114833 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/nutch-default.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/crawl-tool.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/mapred-default.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/nutch-site.xml
>060117 114834 crawl started in: cdir
>060117 114834 rootUrlDir = urls
>060117 114834 threads = 10
>060117 114834 depth = 2
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/nutch-default.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/crawl-tool.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/nutch-site.xml
>060117 114834 Injector: starting
>060117 114834 Injector: crawlDb: cdir\crawldb
>060117 114834 Injector: urlDir: urls
>060117 114834 Injector: Converting injected urls to crawl db entries.
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/nutch-default.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/crawl-tool.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/mapred-default.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/mapred-default.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/nutch-site.xml
>060117 114834 Running job: job_krj0e1
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/nutch-default.xml
>060117 114834 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/mapred-default.xml
>060117 114835 parsing \tmp\nutch\mapred\local\localRunner\job_krj0e1.xml
>060117 114835 parsing file:/C:/cygwin/usr/local/src/nutch-nightly/conf/nutch-site.xml
>java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml , \tmp\nutch\mapred\local\localRunner\job_krj0e1.xml , nutch-site.xml
>        at org.apache.nutch.mapred.InputFormatBase.listFiles(InputFormatBase.java:85)
>        at org.apache.nutch.mapred.InputFormatBase.getSplits(InputFormatBase.java:95)
>        at org.apache.nutch.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:63)
>060117 114835  map 0%
>java.io.IOException: Job failed!
>        at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:308)
>        at org.apache.nutch.crawl.Injector.inject(Injector.java:102)
>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:105)
>Exception in thread "main"

>I see that:

>        nutch-site.xml is empty
>        mapred-default is empty


>Whole Web setup............................

>When I do this: (after mkdirs)

>        bin/nutch admin db -create

>I get this at the prompt:

>        Exception in thread "main" java.lang.NoClassDefFoundError: admin

>I don't speak Java, so I'm not sure what it's saying.


>Please help.

>TIA.





_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
