[ 
https://issues.apache.org/jira/browse/NUTCH-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated NUTCH-2354:
---------------------------------
    Description: 
This Wednesday we experienced trouble running the 1.12 injector on Hadoop 
2.7.3. We operated 2.7.2 before and had no trouble running a job.

{code}
2017-01-18 15:36:53,005 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
        at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:216)
        at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:100)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
        at org.apache.nutch.crawl.Injector.inject(Injector.java:383)
        at org.apache.nutch.crawl.Injector.run(Injector.java:467)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.nutch.crawl.Injector.main(Injector.java:441)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}

Our processes retried injecting for a few minutes until we manually shut them 
down. Meanwhile, on HDFS, our CrawlDB was gone. Thanks to snapshots and/or 
backups we could restore it, so enable those if you haven't done so yet.

These freak Hadoop errors can be notoriously difficult to debug, but it seems 
we are in luck: recompiling Nutch against Hadoop 2.7.3 instead of 2.4.0 seems 
to resolve it. You are also in luck if your job file uses the old 
org.apache.hadoop.mapred.* API; only jobs using the 
org.apache.hadoop.mapreduce.* API seem to fail.
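
As a rough diagnostic for this kind of error, the sketch below reports whether a named type is a class or an interface on the current classpath, and which jar it was loaded from. It is only an illustration: the class name HadoopClassCheck and its two helper methods are hypothetical and not part of Nutch or Hadoop, and it does not fix anything by itself. It assumes you run it with the same classpath the failing job sees.

```java
import java.security.CodeSource;

public class HadoopClassCheck {

    /** Returns "interface", "class", or "missing" for the named type on the current classpath. */
    static String kindOf(String className) {
        try {
            // initialize=false: inspect the type without running its static initializers
            Class<?> c = Class.forName(className, false, HadoopClassCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    /** Returns the jar or directory the type was loaded from, or "unknown" if not available. */
    static String originOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, HadoopClassCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "unknown" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the stack trace; pass another name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.Counter";
        System.out.println(name + " is: " + kindOf(name));
        System.out.println("loaded from: " + originOf(name));
    }
}
```

If the reported jar is not the Hadoop version you expect, or the reported kind differs from what the job was compiled against, that points at a mismatch between compile-time and runtime dependencies.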


> Upgrade Hadoop dependencies to 2.7.3
> ------------------------------------
>
>                 Key: NUTCH-2354
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2354
>             Project: Nutch
>          Issue Type: Bug
>          Components: injector
>    Affects Versions: 1.12
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Blocker
>             Fix For: 1.13
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
