Hello Sebastian,

I am not sure what will happen when it is compiled with 2.7.3 but run on
2.7.2. Since the other way around caused trouble (which usually doesn't
happen), we should assume this might not work well either. Unfortunately I
cannot test it; both our Hadoop clusters have already been upgraded.

Everyone would either have to recompile Nutch themselves (see the sketch
below) or upgrade their Hadoop cluster. The latter is mostly a good thing:
2.7.2 and 2.7.3 fixed long-standing issues for Nutch.
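
For reference, recompiling roughly means pinning the cluster's Hadoop version
in ivy/ivy.xml and rebuilding the job file. A minimal sketch (the exact
artifact list, rev values and conf mappings depend on your Nutch checkout):

  <dependency org="org.apache.hadoop" name="hadoop-common"
              rev="2.7.2" conf="*->default"/>
  <dependency org="org.apache.hadoop" name="hadoop-mapreduce-client-core"
              rev="2.7.2" conf="*->default"/>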

The question is: what do we do?

Thanks,
Markus
 
-----Original message-----
> From: Sebastian Nagel <[email protected]>
> Sent: Saturday 21st January 2017 19:57
> To: [email protected]
> Subject: Re: CrawlDB data-loss and unable to inject 1.12 on Hadoop 2.7.3
> 
> Hi Markus,
> 
> after having once faced failing jobs due to dependency issues,
> I started to compile the Nutch.job with the same Hadoop version
> as the cluster. It takes a little extra time to change the ivy.xml
> and occasionally to resolve a conflicting dependency, but fixing
> broken data in the cluster costs you much more.
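> 
> After editing ivy.xml the job file is rebuilt with ant; on a stock 1.x
> checkout (an assumption, details may differ per version) that is:
> 
>   ant clean runtime
> 
> which leaves the deployable .job file under runtime/deploy/.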
> 
> 
> > Reference issue: https://issues.apache.org/jira/browse/NUTCH-2354
> 
> What about the opposite: running a Nutch.job compiled with 2.7.3 on a 2.7.2 
> Hadoop?
> Nothing against upgrading, but if in doubt it would be good to know.
> 
> 
> Thanks,
> Sebastian
> 
> 
> On 01/20/2017 02:23 PM, Markus Jelsma wrote:
> > Hello,
> > 
> > This Wednesday we experienced trouble running the 1.12 injector on Hadoop 
> > 2.7.3. We operated 2.7.2 before and had no trouble running jobs.
> > 
> > 2017-01-18 15:36:53,005 FATAL [main] org.apache.hadoop.mapred.YarnChild: 
> > Error running child : java.lang.IncompatibleClassChangeError: Found 
> > interface org.apache.hadoop.mapreduce.Counter, but class was expected
> >     at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:216)
> >     at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:100)
> >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:422)
> >     at 
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> >     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> > Exception in thread "main" java.lang.IncompatibleClassChangeError: Found 
> > interface org.apache.hadoop.mapreduce.Counter, but class was expected
> >         at org.apache.nutch.crawl.Injector.inject(Injector.java:383)
> >         at org.apache.nutch.crawl.Injector.run(Injector.java:467)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >         at org.apache.nutch.crawl.Injector.main(Injector.java:441)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at 
> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >         at 
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >         at java.lang.reflect.Method.invoke(Method.java:498)
> >         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> > 
> > Our processes retried injecting for a few minutes until we shut them down 
> > manually. Meanwhile, on HDFS, our CrawlDB was gone. Thanks to snapshots 
> > and/or backups we could restore it, so enable those if you haven't done so 
> > yet.
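> > 
> > For anyone who hasn't: enabling snapshots on the CrawlDB directory takes 
> > two commands with the stock HDFS tooling (the path here is hypothetical):
> > 
> >   hdfs dfsadmin -allowSnapshot /user/nutch/crawl
> >   hdfs dfs -createSnapshot /user/nutch/crawl before-inject
> > 
> > A snapshot taken before each inject makes this kind of loss recoverable.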
> > 
> > These freak Hadoop errors can be notoriously difficult to debug, but it 
> > seems we are in luck: recompile Nutch with Hadoop 2.7.3 instead of 2.4.0. 
> > You are also in luck if your job file uses the old 
> > org.apache.hadoop.mapred.* API; only jobs using the 
> > org.apache.hadoop.mapreduce.* API seem to fail, as illustrated below.
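> > 
> > To illustrate the failure mode (a hypothetical mapper, not the actual 
> > Nutch source): code like the following, compiled against a Hadoop where 
> > org.apache.hadoop.mapreduce.Counter was a class, reaches the counter via 
> > an invokevirtual instruction; on a cluster where Counter is an interface 
> > the JVM needs invokeinterface and throws IncompatibleClassChangeError at 
> > the call site instead of running the map.
> > 
> >   import java.io.IOException;
> >   import org.apache.hadoop.io.Text;
> >   import org.apache.hadoop.mapreduce.Mapper;
> > 
> >   public class InjectLikeMapper extends Mapper<Text, Text, Text, Text> {
> >     @Override
> >     protected void map(Text key, Text value, Context context)
> >         throws IOException, InterruptedException {
> >       // getCounter() returns org.apache.hadoop.mapreduce.Counter; this
> >       // call is the binary-compatibility boundary between versions.
> >       context.getCounter("injector", "urls_injected").increment(1);
> >       context.write(key, value);
> >     }
> >   }
> > 
> > Recompiling against the cluster's Hadoop regenerates the correct call 
> > sites, which is why rebuilding the job file fixes it.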
> > 
> > Reference issue: https://issues.apache.org/jira/browse/NUTCH-2354
> > 
> > Regards,
> > Markus
> > 
> 
> 
