Hello,

I have created a custom Nutch Solr indexer, packaged it as a jar file, and put it under NUTCH_HOME/lib. It runs successfully in local mode, but in deploy mode it fails with the error below. The same jar is included both in the job file and under lib/.

java.lang.RuntimeException: java.io.IOException: WritableName can't load class: org.apache.nutch.parse.ParseData
        at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1673)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1613)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1486)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1470)
        at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:50)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: WritableName can't load class: org.apache.nutch.parse.ParseData
        at org.apache.hadoop.io.WritableName.getClass(WritableName.java:73)
        at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1671)
        ... 13 more
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.parse.ParseData
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
        at org.apache.hadoop.io.WritableName.getClass(WritableName.java:71)
        ... 14 more

This error is triggered by the following line of code:

    FileInputFormat.addInputPath(job, new Path(segment, ParseData.DIR_NAME));


If I instead add the CrawlDatum directory:

    FileInputFormat.addInputPath(job, new Path(segment, CrawlDatum.PARSE_DIR_NAME));

I get exactly the same error, with CrawlDatum as the class that cannot be loaded.

Also, I have done exactly the same thing with Nutch 2.x, and there it runs successfully in both local and deploy modes.
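For what it's worth, the bottom of the trace shows the lookup failing inside Class.forName, i.e. the class simply isn't visible to the task JVM's classloader at runtime. This small standalone probe (just a diagnostic sketch I wrote, not Nutch code; the class name is copied from the error) reproduces that lookup:

```java
// Standalone classpath probe: tries to load a class the same way
// WritableName ultimately does (via Class.forName) and reports the result.
public class ClasspathCheck {
    // Returns true if the current classloader can resolve the class.
    static boolean canLoad(String name) {
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.nutch.parse.ParseData";
        System.out.println(name + (canLoad(name) ? " is on the classpath" : " is MISSING"));
    }
}
```

If this prints MISSING when run with the classpath the tasks actually see, the jar is not reaching the task JVMs, regardless of what sits in the job file or lib/.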


Any ideas how to fix this issue?

Thanks.
Alex.
