Hi all,

I'm having trouble with ObjectWritable. I'm trying to implement simple indexing with Lucene & Hadoop, and for that I'm taking inspiration from the Nutch code. In Nutch's Indexer.java, at line 245, I read:
output.collect(key, new ObjectWritable(doc));

(where doc is a Lucene Document: Document doc = new Document(); at line 199 of the same file). So I tried to do the same, but I get an error, as if ObjectWritable couldn't handle the Document type:

/opt/java/bin/java -Didea.launcher.port=7539 -Didea.launcher.bin.path=/opt/idea-6180/bin -Dfile.encoding=UTF-8 -classpath /opt/jdk1.5.0_12/jre/lib/charsets.jar:/opt/jdk1.5.0_12/jre/lib/jce.jar:/opt/jdk1.5.0_12/jre/lib/jsse.jar:/opt/jdk1.5.0_12/jre/lib/plugin.jar:/opt/jdk1.5.0_12/jre/lib/deploy.jar:/opt/jdk1.5.0_12/jre/lib/javaws.jar:/opt/jdk1.5.0_12/jre/lib/rt.jar:/opt/jdk1.5.0_12/jre/lib/ext/localedata.jar:/opt/jdk1.5.0_12/jre/lib/ext/dnsns.jar:/opt/jdk1.5.0_12/jre/lib/ext/sunpkcs11.jar:/opt/jdk1.5.0_12/jre/lib/ext/sunjce_provider.jar:/home/samuel/IdeaProjects/LuceneScratchPad/classes/test/LuceneScratchPad:/home/samuel/IdeaProjects/LuceneScratchPad/classes/production/LuceneScratchPad:/home/samuel/IdeaProjects/LuceneScratchPad/lib/log4j/log4j-1.2.14.jar:/home/samuel/IdeaProjects/LuceneScratchPad/lib/lucene/lucene-core-2.2.0.jar:/home/samuel/hadoop-0.13.1/hadoop-0.13.1-core.jar:/home/samuel/IdeaProjects/LuceneScratchPad/lib/commons-logging-1.1/commons-logging-1.1.jar:/home/samuel/commons-cli-1.1/commons-cli-1.1.jar:/home/samuel/IdeaProjects/LuceneScratchPad/lib/hadoop/conf:/home/samuel/IdeaProjects/LuceneScratchPad/lib/commons-httpclient-3.0.1.jar:/opt/idea-6180/lib/idea_rt.jar com.intellij.rt.execution.application.AppMain com.lingway.proto.lucene.EntryPointHadoop
INFO  apache.hadoop.mapred.FileInputFormat - Total input paths to process : 9
INFO  apache.hadoop.mapred.JobClient - Running job: job_myhhdn
INFO  apache.hadoop.mapred.MapTask - numReduceTasks: 1
WARN  apache.hadoop.mapred.LocalJobRunner - job_myhhdn
java.io.IOException: Can't write: indexed,tokenized<content:[EMAIL PROTECTED]> as class org.apache.lucene.document.Field
    at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:157)
    at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:65)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:365)
    at com.lingway.proto.lucene.MapIndexer.map(MapIndexer.java:35)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:186)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:131)
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
    at com.lingway.proto.lucene.EntryPointHadoop.main(EntryPointHadoop.java:36)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)

Process finished with exit code 1

However, in my code, if I instantiate the ObjectWritable separately, its creation doesn't cause any trouble; the problem only appears when I pass it to the OutputCollector...

Any idea why the Nutch code doesn't behave the same way in my project? (I can't afford the time to get Nutch itself running; I'm at the very end of my internship, so I'm quite in a hurry :( )

Thanks in advance,

Sam