
> We do not push any Nutch related stuff to the Sonatype Nexus Maven
> Repository, so you can't pull it and depend upon it in any way.
>

We do, and it is synced with Central. See:

- http://wiki.apache.org/nutch/NutchMavenSupport
- http://search.maven.org/#artifactdetails|org.apache.nutch|nutch|1.3|jar
- https://repository.apache.org/content/groups/public/org/apache/nutch/nutch/1.3/
- http://mvnrepository.com/artifact/org.apache.nutch/nutch/1.3
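Pulling the released jar from Central is then a normal dependency; a minimal POM fragment (coordinates taken from the links above):

```xml
<dependency>
  <groupId>org.apache.nutch</groupId>
  <artifactId>nutch</artifactId>
  <version>1.3</version>
</dependency>
```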

 Julien


> On Thu, Sep 15, 2011 at 4:06 PM, Luis Cappa Banda <[email protected]> wrote:
>
> > Hello.
> >
> > I've downloaded the Nutch 1.3 version via Subversion and modified some
> > classes a little. My intention is to build Maven artifacts from this
> > "hacked" Nutch version and use them as a dependency in another Maven
> > project. Both projects (the customized Nutch and the other project) live
> > inside a parent project that orchestrates compilation by modules. All
> > configuration apparently looks good and compiles correctly.
> >
> > When I launch a crawling process with the Solr indexing option, the
> > following error appears:
> >
> > 2011-09-15 16:57:07,137 0    [main] INFO  es.desa.empleate.infojobs.CrawlingProperties  - Loading property file...
> > 2011-09-15 16:57:07,144 7    [main] INFO  es.desa.empleate.infojobs.CrawlingProperties  - Property file loaded!
> > 2011-09-15 16:57:07,145 8    [main] INFO  es.desa.empleate.infojobs.CrawlingProperties  - Retrieving property 'URLS_DIR'
> > 2011-09-15 16:57:07,145 8    [main] INFO  es.desa.empleate.infojobs.CrawlingProperties  - Retrieving property 'SOLR_SERVER'
> > 2011-09-15 16:57:07,145 8    [main] INFO  es.desa.empleate.infojobs.CrawlingProperties  - Retrieving property 'DEPTH'
> > 2011-09-15 16:57:07,145 8    [main] INFO  es.desa.empleate.infojobs.CrawlingProperties  - Retrieving property 'THREADS'
> > 2011-09-15 16:57:08,259 1122 [main] INFO  es.desa.empleate.infojobs.CrawlingProcess  - > Crawling process started...
> > 2011-09-15 16:57:09,653 2516 [main] INFO  org.apache.nutch.crawl.Crawl  - crawl started in: crawl-20110915165709
> > 2011-09-15 16:57:09,653 2516 [main] INFO  org.apache.nutch.crawl.Crawl  - rootUrlDir =urls
> > 2011-09-15 16:57:09,653 2516 [main] INFO  org.apache.nutch.crawl.Crawl  - threads = 10
> > 2011-09-15 16:57:09,653 2516 [main] INFO  org.apache.nutch.crawl.Crawl  - depth = 3
> > 2011-09-15 16:57:09,653 2516 [main] INFO  org.apache.nutch.crawl.Crawl  - solrUrl=http://localhost:8080/server_infojobs
> > 2011-09-15 16:57:10,090 2953 [main] INFO  org.apache.nutch.crawl.Injector - Injector: starting at 2011-09-15 16:57:10
> > 2011-09-15 16:57:10,090 2953 [main] INFO  org.apache.nutch.crawl.Injector - Injector: crawlDb: crawl-20110915165709/crawldb
> > 2011-09-15 16:57:10,090 2953 [main] INFO  org.apache.nutch.crawl.Injector - Injector: urlDir: /home/lcappa/Escritorio/workspaces/Tomcats/Tomcat2/apache-tomcat-6.0.29/urls
> > 2011-09-15 16:57:10,236 3099 [main] INFO  org.apache.nutch.crawl.Injector - Injector: Converting injected urls to crawl db entries.
> > 2011-09-15 16:57:10,258 3121 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics  - Initializing JVM Metrics with processName=JobTracker, sessionId=
> > 2011-09-15 16:57:10,328 3191 [main] WARN  org.apache.hadoop.mapred.JobClient  - No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> > 2011-09-15 16:57:10,344 3207 [main] INFO  org.apache.hadoop.mapred.FileInputFormat  - Total input paths to process : 1
> > 2011-09-15 16:57:10,567 3430 [Thread-10] INFO  org.apache.hadoop.mapred.FileInputFormat  - Total input paths to process : 1
> > 2011-09-15 16:57:10,584 3447 [main] INFO  org.apache.hadoop.mapred.JobClient  - Running job: job_local_0001
> > 2011-09-15 16:57:10,642 3505 [Thread-10] INFO  org.apache.hadoop.mapred.MapTask  - numReduceTasks: 1
> > 2011-09-15 16:57:10,648 3511 [Thread-10] INFO  org.apache.hadoop.mapred.MapTask  - io.sort.mb = 100
> > 2011-09-15 16:57:10,772 3635 [Thread-10] INFO  org.apache.hadoop.mapred.MapTask  - data buffer = 79691776/99614720
> > 2011-09-15 16:57:10,772 3635 [Thread-10] INFO  org.apache.hadoop.mapred.MapTask  - record buffer = 262144/327680
> > 2011-09-15 16:57:10,794 3657 [Thread-10] WARN  org.apache.hadoop.mapred.LocalJobRunner  - job_local_0001
> > java.lang.RuntimeException: Error in configuring object
> >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> >    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> >    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> >    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
> >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> >    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > Caused by: java.lang.reflect.InvocationTargetException
> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >    at java.lang.reflect.Method.invoke(Method.java:597)
> >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> >    ... 5 more
> > Caused by: java.lang.RuntimeException: Error in configuring object
> >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> >    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> >    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> >    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> >    ... 10 more
> > Caused by: java.lang.reflect.InvocationTargetException
> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >    at java.lang.reflect.Method.invoke(Method.java:597)
> >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> >    ... 13 more
> > Caused by: java.lang.IllegalArgumentException: plugin.folders is not defined
> >    at org.apache.nutch.plugin.PluginManifestParser.parsePluginFolder(PluginManifestParser.java:78)
> >    at org.apache.nutch.plugin.PluginRepository.<init>(PluginRepository.java:71)
> >    at org.apache.nutch.plugin.PluginRepository.get(PluginRepository.java:99)
> >    at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:117)
> >    at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:70)
> >    ... 18 more
> > 2011-09-15 16:57:11,587 4450 [main] INFO  org.apache.hadoop.mapred.JobClient  -  map 0% reduce 0%
> > 2011-09-15 16:57:11,590 4453 [main] INFO  org.apache.hadoop.mapred.JobClient  - Job complete: job_local_0001
> > 2011-09-15 16:57:11,591 4454 [main] INFO  org.apache.hadoop.mapred.JobClient  - Counters: 0
> > 2011-09-15 16:57:11,591 4454 [main] ERROR es.desa.empleate.infojobs.CrawlingProcess  - > INFOJOBS CRAWLING ERROR: Job failed!
> > 2011-09-15 16:57:11,591 4454 [main] INFO  es.desa.empleate.infojobs.CrawlingProcess  -  > Crawling process finished.
> >
> > Looking at the error, I think that I also need to include the Nutch .job
> > artifact. The question is: is that so? If so, how can I include it with
> > Maven? Any recommendation?
> >
> > Thank you very much.
> >
>
>
>
> --
> *Lewis*
>
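
For what it's worth, the root cause in the stack trace above is `plugin.folders is not defined`: when Nutch runs embedded in another project, the configuration on the runtime classpath never says where the plugin directory lives. One workaround, as a sketch, is a `nutch-site.xml` on the embedding project's classpath (the path below is a placeholder for your own checkout):

```xml
<!-- nutch-site.xml, placed on the runtime classpath of the embedding project -->
<configuration>
  <property>
    <name>plugin.folders</name>
    <!-- placeholder path: point this at the plugins dir of your Nutch checkout -->
    <value>/path/to/nutch-1.3/plugins</value>
  </property>
</configuration>
```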



--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
