Hi,
check whether your "Working directory" (Run -> Run Configurations -> Tab
Arguments -> Working Directory) points to the Nutch base directory (where
your conf/nutch-site.xml is located).
Regards
Hannes
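
One quick way to verify the suggestion above is to print the working directory from inside the same Eclipse run configuration and check that conf/nutch-site.xml is visible from there. The following is only a minimal sketch using the Java standard library; the class name WorkingDirCheck and the helper looksLikeNutchBase are hypothetical names chosen for illustration and are not part of Nutch.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WorkingDirCheck {

    // Hypothetical helper: returns true if the given directory contains
    // conf/nutch-site.xml, i.e. it looks like the Nutch base directory.
    static boolean looksLikeNutchBase(Path dir) {
        return Files.isRegularFile(dir.resolve("conf").resolve("nutch-site.xml"));
    }

    public static void main(String[] args) {
        // "user.dir" is the directory the JVM was started in; in Eclipse this
        // is the "Working directory" of the run configuration.
        Path cwd = Paths.get(System.getProperty("user.dir")).toAbsolutePath();
        System.out.println("Working directory: " + cwd);
        System.out.println("conf/nutch-site.xml found: " + looksLikeNutchBase(cwd));
    }
}
```

If the second line prints false, the run configuration is probably not starting in the Nutch base directory, which would be consistent with the "plugin.folders is not defined" error quoted below.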


On Mon, Oct 4, 2010 at 11:02 AM, Marseld Dedgjonaj <
marseld.dedgjo...@ikubinfo.com> wrote:

> Hello,
> Thanks for your answer. I tried it, but I got the error below.
> Maybe there is a problem reading the "conf" folder, although it looks OK to me.
> If I run the crawl from the Linux script, it works.
> Thanks
>
>
> This is the error message:
>
> 10/10/04 10:46:40 INFO crawl.Crawl: crawl started in: crawl
> 10/10/04 10:46:40 INFO crawl.Crawl: rootUrlDir = my_urls
> 10/10/04 10:46:40 INFO crawl.Crawl: threads = 5
> 10/10/04 10:46:40 INFO crawl.Crawl: depth = 3
> 10/10/04 10:46:40 INFO crawl.Crawl: indexer=lucene
> 10/10/04 10:46:40 INFO crawl.Crawl: topN = 50
> 10/10/04 10:46:40 INFO crawl.Injector: Injector: starting at 2010-10-04 10:46:40
> 10/10/04 10:46:40 INFO crawl.Injector: Injector: crawlDb: crawl/crawldb
> 10/10/04 10:46:40 INFO crawl.Injector: Injector: urlDir: my_urls
> 10/10/04 10:46:40 INFO crawl.Injector: Injector: Converting injected urls to crawl db entries.
> 10/10/04 10:46:40 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
> 10/10/04 10:46:40 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 10/10/04 10:46:41 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> 10/10/04 10:46:41 INFO mapred.FileInputFormat: Total input paths to process : 1
> 10/10/04 10:46:42 INFO mapred.JobClient: Running job: job_local_0001
> 10/10/04 10:46:42 INFO mapred.FileInputFormat: Total input paths to process : 1
> 10/10/04 10:46:42 INFO mapred.MapTask: numReduceTasks: 1
> 10/10/04 10:46:42 INFO mapred.MapTask: io.sort.mb = 100
> 10/10/04 10:46:43 INFO mapred.JobClient:  map 0% reduce 0%
> 10/10/04 10:46:43 INFO mapred.MapTask: data buffer = 79691776/99614720
> 10/10/04 10:46:43 INFO mapred.MapTask: record buffer = 262144/327680
> 10/10/04 10:46:43 WARN mapred.LocalJobRunner: job_local_0001
> java.lang.RuntimeException: Error in configuring object
>    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: java.lang.reflect.InvocationTargetException
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>    ... 5 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>    ... 10 more
> Caused by: java.lang.reflect.InvocationTargetException
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>    ... 13 more
> Caused by: java.lang.IllegalArgumentException: plugin.folders is not defined
>    at org.apache.nutch.plugin.PluginManifestParser.parsePluginFolder(PluginManifestParser.java:78)
>    at org.apache.nutch.plugin.PluginRepository.<init>(PluginRepository.java:72)
>    at org.apache.nutch.plugin.PluginRepository.get(PluginRepository.java:95)
>    at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:117)
>    at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:70)
>    ... 18 more
> 10/10/04 10:46:44 INFO mapred.JobClient: Job complete: job_local_0001
> 10/10/04 10:46:44 INFO mapred.JobClient: Counters: 0
> 10/10/04 10:46:48 INFO mapred.LocalJobRunner: file:/home/administrator/workspace/nutch-1.2/my_urls/urls:0+443
>
>
>
> -----Original Message-----
> From: Ahmad Al-Amri [mailto:amri...@yahoo.com]
> Sent: Sunday, October 03, 2010 11:34 AM
> To: user@nutch.apache.org
> Subject: Re: Run crawl from java code
>
> Hello;
>
> Open Nutch in Eclipse.
>
> Go to Run -> Debug Configurations, right-click "Java Application" and
> choose New.
>
> -- Set the main class:
> org.apache.nutch.crawl.Crawl
>
> -- And the program arguments:
> urls -dir crawloutput -threads 5 -depth 3  -topN 50
>
> Then set your breakpoints and start a debug session with this configuration.
>
> Good Luck :)
>
>
>
>
>
> ________________________________
> From: Marseld Dedgjonaj <marseld.dedgjo...@ikubinfo.com>
> To: user@nutch.apache.org
> Sent: Sat, October 2, 2010 4:51:28 PM
> Subject: Run crawl from java code
>
> Hi,
>
> I have set up Nutch 1.2 as an Eclipse project.
>
> I need to run the crawl from Java code so that I can step through it in the
> debugger.
>
> This is the Linux script that I run for the crawl:
>
>
>
> bin/nutch inject /home/administrator/nutch/albanian_crawl/crawldb my_urls
>
> bin/nutch generate /home/administrator/nutch/albanian_crawl/crawldb \
>     /home/administrator/nutch/albanian_crawl/segments
>
> segment=`ls -d /home/administrator/nutch/albanian_crawl/segments/2* | tail -1`
>
> bin/nutch fetch $segment
>
> bin/nutch updatedb /home/administrator/nutch/albanian_crawl/crawldb $segment
>
> bin/nutch mergesegs /home/administrator/nutch/albanian_crawl/segments \
>     /home/administrator/nutch/albanian_crawl/segments/*
>
> bin/nutch invertlinks /home/administrator/nutch/albanian_crawl/linkdb \
>     /home/administrator/nutch/albanian_crawl/segments/*
>
> bin/nutch index /home/administrator/nutch/albanian_crawl/indexes \
>     /home/administrator/nutch/albanian_crawl/crawldb \
>     /home/administrator/nutch/albanian_crawl/linkdb \
>     /home/administrator/nutch/albanian_crawl/segments/*
>
> bin/nutch dedup /home/administrator/nutch/albanian_crawl/indexes
>
>
>
> Can anybody help me translate it into Java?
>
>
>
>
>
> Thanks in advance,
>
> Marseld.
>
>
>
>
>
>
>
