Hello,
I have created a custom Nutch Solr indexer as a jar file and put it under
nutch_home/lib. It runs successfully in local mode, but in deploy mode it fails
with the following error, even though the same jar is included in the job file
and in lib/:
java.lang.RuntimeException: java.io.IOException: WritableName can't load class: org.apache.nutch.parse.ParseData
    at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1673)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1613)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1486)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1470)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:50)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: WritableName can't load class: org.apache.nutch.parse.ParseData
    at org.apache.hadoop.io.WritableName.getClass(WritableName.java:73)
    at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1671)
    ... 13 more
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.parse.ParseData
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
    at org.apache.hadoop.io.WritableName.getClass(WritableName.java:71)
    ... 14 more
This is caused by this line of code:
    FileInputFormat.addInputPath(job, new Path(segment, ParseData.DIR_NAME));
If I instead use CrawlDatum, as in:
    FileInputFormat.addInputPath(job, new Path(segment, CrawlDatum.PARSE_DIR_NAME));
I get the exact same error, with the missing class this time being CrawlDatum.
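To illustrate what is happening: the value class name (org.apache.nutch.parse.ParseData) is stored in the SequenceFile header, and WritableName.getClass() resolves it via Class.forName() inside the task JVM. The following standalone sketch (my own illustration, not Hadoop code) reproduces the failure mode when the jar carrying the class is absent from the task's classpath:

```java
public class WritableNameSketch {
    // Mimics what Hadoop's WritableName.getClass() effectively does:
    // resolve the class name recorded in the SequenceFile header using
    // the current classloader.
    static String probe(String className) {
        try {
            Class.forName(className);
            return "loaded";
        } catch (ClassNotFoundException e) {
            // Hadoop wraps this in "IOException: WritableName can't load class"
            return "WritableName can't load class: " + className;
        }
    }

    public static void main(String[] args) {
        // A class on the classpath resolves fine (as in local mode):
        System.out.println(probe("java.lang.String"));
        // On the cluster the Nutch jar is not visible to the task JVM,
        // which mirrors the failure in the stack trace above:
        System.out.println(probe("org.apache.nutch.parse.ParseData"));
    }
}
```

So the question seems to be why the task JVMs on the cluster do not see the Nutch classes even though the jar is in the job file's lib/ directory.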
Also, I have done exactly the same thing in Nutch 2.x, and there it runs
successfully in both local and deploy modes.
Any ideas how to fix this issue?
Thanks.
Alex.