On 8/9/07, Kai_testing Middleton <[EMAIL PROTECTED]> wrote:
> Hmm:
>
> $ bin/nutch inject crawl/crawldb /usr/tmp2/urls.txt
> Injector: starting
> Injector: crawlDb: crawl/crawldb
> Injector: urlDir: /usr/tmp2/urls.txt
> Injector: Converting injected urls to crawl db entries.
> Injector: java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:166)
>         at org.apache.nutch.crawl.Injector.run(Injector.java:196)
>         at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
>         at org.apache.nutch.crawl.Injector.main(Injector.java:186)
>
> $ cat /var/tmp/nutch-trunk/hadoop.log
> 2007-08-09 10:23:23,504 INFO  crawl.Injector - Injector: starting
> 2007-08-09 10:23:23,505 INFO  crawl.Injector - Injector: crawlDb: crawl/crawldb
> 2007-08-09 10:23:23,505 INFO  crawl.Injector - Injector: urlDir: /usr/tmp2/urls.txt
> 2007-08-09 10:23:23,976 INFO  crawl.Injector - Injector: Converting injected urls to crawl db entries.
> 2007-08-09 10:23:25,035 INFO  plugin.PluginRepository - Plugins: looking in: /usr/tmp2/nutch_trunk/plugins
> 2007-08-09 10:23:25,038 WARN  mapred.LocalJobRunner - job_48xttw
> java.lang.NullPointerException
>         at org.apache.nutch.plugin.PluginManifestParser.parsePluginFolder(PluginManifestParser.java:87)
>         at org.apache.nutch.plugin.PluginRepository.<init>(PluginRepository.java:71)
>         at org.apache.nutch.plugin.PluginRepository.get(PluginRepository.java:95)
>         at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:116)
>         at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:59)
>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:82)
>         at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:82)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:170)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:126)
> 2007-08-09 10:23:25,946 FATAL crawl.Injector - Injector: java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:166)
>         at org.apache.nutch.crawl.Injector.run(Injector.java:196)
>         at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
>         at org.apache.nutch.crawl.Injector.main(Injector.java:186)
>
> So, Doğacan, I tried the steps you suggested:
>    1) Try doing an "ant clean;ant" and try again.
>    2) Check if your classpath is clean
>    3) Try to run the inject command by itself: bin/nutch inject <crawldb> <urldir>
>
> I did "ant clean; ant".  In fact I even tried "svn up -r HEAD" yesterday.  My 
> CLASSPATH is not set (it's empty) - do I need it to be set?  JAVA_HOME is set 
> properly.  NUTCH_HOME is set to /usr/tmp2/nutch_trunk as appropriate, and 
> that's where I ran the above inject command from.  No crawl directory gets 
> created, though now I'm seeing hadoop.log in the correct place, as we see 
> above.  Using df I see I have plenty of disk space.
>
> Any other ideas?  Should I add some logging code and rebuild?  Maybe I'll try 
> this with a stock 0.9 of nutch and see what happens.

Can you check your plugin.folders setting? You probably have a path
there that doesn't exist. (Note that having
value="plugins,some_non_existing_folder" would not work either: all
of the paths listed there have to be valid.)
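
[For anyone hitting the same trace later: the NullPointerException in
PluginManifestParser.parsePluginFolder is consistent with
java.io.File.listFiles() returning null rather than an empty array when a
listed folder does not exist, so a single bad entry in plugin.folders can
take down the whole inject job. A minimal sketch of a clean setting in
conf/nutch-site.xml; the single "plugins" value below is the stock default,
shown only as an illustration:]

```xml
<!-- Every path listed in plugin.folders must actually exist; a relative
     entry such as "plugins" is resolved via the classpath. Do not leave
     placeholder or deleted directories in this comma-separated list. -->
<property>
  <name>plugin.folders</name>
  <value>plugins</value>
  <description>Directories where Nutch looks for plugins.</description>
</property>
```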

>
> --Kai Middleton


-- 
Doğacan Güney
