Hi all
I'm in trouble with ObjectWritable. I'm trying to implement a simple
indexing job with Lucene & Hadoop, taking inspiration from the Nutch
code. In Nutch's Indexer.java, at line 245, I read:
output.collect(key, new ObjectWritable(doc));
(where doc is a Lucene Document, created at line 199 of the same file:
Document doc = new Document();)
So I tried to do the same, but I get an error, as if ObjectWritable
couldn't handle the Document type:
/opt/java/bin/java -Didea.launcher.port=7539
-Didea.launcher.bin.path=/opt/idea-6180/bin -Dfile.encoding=UTF-8
-classpath
/opt/jdk1.5.0_12/jre/lib/charsets.jar:/opt/jdk1.5.0_12/jre/lib/jce.jar:/opt/jdk1.5.0_12/jre/lib/jsse.jar:/opt/jdk1.5.0_12/jre/lib/plugin.jar:/opt/jdk1.5.0_12/jre/lib/deploy.jar:/opt/jdk1.5.0_12/jre/lib/javaws.jar:/opt/jdk1.5.0_12/jre/lib/rt.jar:/opt/jdk1.5.0_12/jre/lib/ext/localedata.jar:/opt/jdk1.5.0_12/jre/lib/ext/dnsns.jar:/opt/jdk1.5.0_12/jre/lib/ext/sunpkcs11.jar:/opt/jdk1.5.0_12/jre/lib/ext/sunjce_provider.jar:/home/samuel/IdeaProjects/LuceneScratchPad/classes/test/LuceneScratchPad:/home/samuel/IdeaProjects/LuceneScratchPad/classes/production/LuceneScratchPad:/home/samuel/IdeaProjects/LuceneScratchPad/lib/log4j/log4j-1.2.14.jar:/home/samuel/IdeaProjects/LuceneScratchPad/lib/lucene/lucene-core-2.2.0.jar:/home/samuel/hadoop-0.13.1/hadoop-0.13.1-core.jar:/home/samuel/IdeaProjects/LuceneScratchPad/lib/commons-logging-1.1/commons-logging-1.1.jar:/home/samuel/commons-cli-1.1/commons-cli-1.1.jar:/home/samuel/IdeaProjects/LuceneScratchPad/lib/hadoop/conf:/home/samuel/IdeaProjects/LuceneScratchPad/lib/commons-httpclient-3.0.1.jar:/opt/idea-6180/lib/idea_rt.jar
com.intellij.rt.execution.application.AppMain
com.lingway.proto.lucene.EntryPointHadoop
INFO apache.hadoop.mapred.FileInputFormat - Total input paths to
process : 9
INFO apache.hadoop.mapred.JobClient - Running job: job_myhhdn
INFO apache.hadoop.mapred.MapTask - numReduceTasks: 1
WARN apache.hadoop.mapred.LocalJobRunner - job_myhhdn
java.io.IOException: Can't write:
indexed,tokenized<content:[EMAIL PROTECTED]>
as class org.apache.lucene.document.Field
at
org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:157)
at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:65)
at
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:365)
at com.lingway.proto.lucene.MapIndexer.map(MapIndexer.java:35)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:186)
at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:131)
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
at
com.lingway.proto.lucene.EntryPointHadoop.main(EntryPointHadoop.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)
Process finished with exit code 1
However, in my code, if I separate out the instantiation of the
ObjectWritable, its creation doesn't cause any trouble; the error only
occurs when I pass it to the OutputCollector...
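As a possible workaround, I'm considering serializing the Document to bytes myself (Document appears to implement java.io.Serializable in Lucene 2.2 -- please correct me if not) and carrying those bytes in a BytesWritable instead of an ObjectWritable. Here is a rough sketch of the round-trip; FakeDoc is a stand-in class of my own, used only so the snippet compiles without the Lucene jar, and I'm not at all sure this is the right approach:

```java
import java.io.*;

// Stand-in for org.apache.lucene.document.Document; the real class would
// go through the same code path if it is java.io.Serializable.
class FakeDoc implements Serializable {
    final String content;
    FakeDoc(String content) { this.content = content; }
}

public class SerializableWrapperSketch {

    // Serialize any Serializable object to a byte array. In the real job
    // these bytes would be wrapped in a hadoop BytesWritable before being
    // handed to the OutputCollector.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(obj);
        oos.close();
        return bos.toByteArray();
    }

    // Deserialize on the reduce side (or in a custom RecordWriter).
    static Object fromBytes(byte[] bytes)
            throws IOException, ClassNotFoundException {
        ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes));
        return ois.readObject();
    }

    public static void main(String[] args) throws Exception {
        FakeDoc doc = new FakeDoc("hello");
        FakeDoc back = (FakeDoc) fromBytes(toBytes(doc));
        System.out.println(back.content); // prints "hello"
    }
}
```

Would something like this be reasonable, or is there a cleaner way to get a Document through the shuffle?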
Any idea why the Nutch code doesn't behave the same way in my project?
(I can't afford the time to get Nutch itself running; I'm at the very
end of my internship, so I'm in quite a hurry :( )
Thanks in advance,
Sam