[ https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091747#comment-13091747 ]
Julien Nioche commented on NUTCH-937:
-------------------------------------
@Radim : Nutch is based on the Apache distribution of Hadoop and 1.4 already
works with it. No one suggested that it should be based on something different.
The point here is that if we can get it to work on other distributions by
simply adding a default parameter then it is probably worth doing.
@Ferdy : I don't agree that embedding the property within nutch-site.xml has
no effect; it does work. You probably have a different issue.
> When Nutch is run on Hadoop > 0.20.2 (or CDH) it will not find plugins
> because MapReduce will not unpack the plugins/ directory from the job's jar
> (due to MAPREDUCE-967)
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: NUTCH-937
> URL: https://issues.apache.org/jira/browse/NUTCH-937
> Project: Nutch
> Issue Type: Bug
> Components: build
> Affects Versions: 1.2
> Environment: hadoop 0.21 or cloudera hadoop 0.20.2+737
> Reporter: Claudio Martella
> Assignee: Markus Jelsma
> Fix For: 1.4, 2.0
>
>
> Jobs running on Hadoop 0.21 or Cloudera CDH 0.20.2+737 will fail because
> of missing plugins, e.g.:
> 10/10/28 12:22:21 WARN mapred.JobClient: Use GenericOptionsParser for
> parsing the arguments. Applications should implement Tool for the same.
> 10/10/28 12:22:22 INFO mapred.FileInputFormat: Total input paths to
> process : 1
> 10/10/28 12:22:23 INFO mapred.JobClient: Running job: job_201010271826_0002
> 10/10/28 12:22:24 INFO mapred.JobClient: map 0% reduce 0%
> 10/10/28 12:22:39 INFO mapred.JobClient: Task Id :
> attempt_201010271826_0002_m_000000_0, Status : FAILED
> java.lang.RuntimeException: Error in configuring object
> at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:379)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:317)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
> at org.apache.hadoop.mapred.Child.main(Child.java:211)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
> at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> ... 17 more
> Caused by: java.lang.RuntimeException: x point
> org.apache.nutch.net.URLNormalizer not found.
> at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:122)
> at
> org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:70)
> ... 22 more
> 10/10/28 12:22:40 INFO mapred.JobClient: Task Id :
> attempt_201010271826_0002_m_000001_0, Status : FAILED
> java.lang.RuntimeException: Error in configuring object
> at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:379)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:317)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
> at org.apache.hadoop.mapred.Child.main(Child.java:211)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
> at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> ... 17 more
> Caused by: java.lang.RuntimeException: x point
> org.apache.nutch.net.URLNormalizer not found.
> at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:122)
> at
> org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:70)
> ... 22 more
> The bug is due to MAPREDUCE-967 (part of Hadoop 0.21 and CDH 0.20.2+737),
> which modifies the way MapReduce unpacks the job's jar. The old behaviour was
> to unpack the whole jar; now only classes/ and lib/ are unpacked. As a result,
> Nutch is missing the plugins/ directory.
> A workaround is to force unpacking of the plugins/ directory by setting the
> 'mapreduce.job.jar.unpack.pattern' configuration property to
> "(?:classes/|lib/|plugins/).*"
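
To illustrate why the workaround pattern helps, here is a minimal, self-contained
Java sketch that checks a few jar entry names against two patterns. It makes
several assumptions: the "default" pattern is inferred from the description
above (only classes/ and lib/ are unpacked) rather than copied from any
particular Hadoop release; the entry names are hypothetical; and full-string
matching via matches() is assumed to be how the pattern is applied. The class
and method names are made up for this example and are not part of Nutch or
Hadoop.

```java
import java.util.regex.Pattern;

public class UnpackPatternDemo {
    // Assumed default: inferred from the issue description, not taken from Hadoop source.
    static final Pattern DEFAULT_PATTERN =
        Pattern.compile("(?:classes/|lib/).*");

    // Workaround pattern quoted in the issue description.
    static final Pattern WORKAROUND_PATTERN =
        Pattern.compile("(?:classes/|lib/|plugins/).*");

    // Returns true when the whole entry name matches the pattern
    // (assumption: the unpack pattern is applied as a full match).
    static boolean wouldUnpack(Pattern pattern, String entryName) {
        return pattern.matcher(entryName).matches();
    }

    public static void main(String[] args) {
        // Hypothetical entry names of the kind found in a Nutch job file.
        String[] entries = {
            "classes/org/apache/nutch/crawl/Injector.class",
            "lib/example-dependency.jar",
            "plugins/urlnormalizer-basic/plugin.xml",
            "META-INF/MANIFEST.MF"
        };
        for (String entry : entries) {
            System.out.printf("%-48s default=%-5b workaround=%b%n",
                entry,
                wouldUnpack(DEFAULT_PATTERN, entry),
                wouldUnpack(WORKAROUND_PATTERN, entry));
        }
    }
}
```

Under these assumptions, entries below plugins/ match only the workaround
pattern, which is consistent with the "x point ... not found" failure in the
trace above: the plugin descriptors are never unpacked on the task node.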