For the time being, I've given up on using object serialization to do
what I want. Instead, I'm going to marshal and unmarshal the fields of
my class myself. I've implemented write() and readFields() methods in
the classes that I want to read and write. (See my definition of Sample
below.)
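
The hand-written marshalling described above can be exercised outside Hadoop with a round trip through a byte array. This is a minimal sketch, not the poster's code. The local Writable interface is a stand-in for org.apache.hadoop.io.Writable so the example compiles without Hadoop. PointAddress and RoundTrip are hypothetical names, since the real Address class is not shown in this message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in for org.apache.hadoop.io.Writable (assumption: same two methods).
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical field type with two long coordinates.
class PointAddress implements Writable {
    long x;
    long y;

    PointAddress() {}                       // no-argument constructor for readFields() use
    PointAddress(long x, long y) { this.x = x; this.y = y; }

    // Write the fields in a fixed order...
    public void write(DataOutput out) throws IOException {
        out.writeLong(x);
        out.writeLong(y);
    }

    // ...and read them back in exactly the same order.
    public void readFields(DataInput in) throws IOException {
        x = in.readLong();
        y = in.readLong();
    }
}

class RoundTrip {
    // Serialize any Writable into a byte array.
    static byte[] toBytes(Writable w) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            w.write(out);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Populate an existing Writable from a byte array.
    static void fromBytes(Writable w, byte[] bytes) {
        try {
            w.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new PointAddress(3, 4));
        PointAddress copy = new PointAddress();
        fromBytes(copy, bytes);
        System.out.println(copy.x + " " + copy.y);
    }
}
```

A check like this catches mismatched field order between write() and readFields() before a job is ever submitted.
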
Unfortunately, Hadoop throws the following exception when my program starts:
Job started: Wed Oct 10 18:04:06 EDT 2007
07/10/10 18:04:06 INFO mapred.InputFormatBase: Total input paths to process : 1
07/10/10 18:04:06 INFO mapred.JobClient: Running job: job_nlx1k6
07/10/10 18:04:06 WARN mapred.LocalJobRunner: job_nlx1k6
java.lang.ExceptionInInitializerError
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:315)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:326)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:339)
        at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:411)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:273)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:115)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:126)
Caused by: java.lang.RuntimeException: java.lang.InstantiationException: net.intelresearch.cvmHadoop.Sample
        at org.apache.hadoop.io.WritableComparator.newKey(WritableComparator.java:74)
        at org.apache.hadoop.io.WritableComparator.<init>(WritableComparator.java:62)
        at net.intelresearch.cvmHadoop.Sample$Comparator.<init>(Unknown Source)
        at net.intelresearch.cvmHadoop.Sample.<clinit>(Unknown Source)
        ... 9 more
Caused by: java.lang.InstantiationException: net.intelresearch.cvmHadoop.Sample
        at java.lang.Class.newInstance0(Class.java:340)
        at java.lang.Class.newInstance(Class.java:308)
        at org.apache.hadoop.io.WritableComparator.newKey(WritableComparator.java:72)
        ... 12 more
java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
        at net.intelresearch.cvmHadoop.KeyedByLocationalCode$Driver.main(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:143)
        at net.intelresearch.cvmHadoop.Usage.main(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
If I'm only trying to use the Writable interface (not
WritableComparable), what is the purpose of a WritableComparator?
Values are not sorted, only keys, so it seems that there is no need to
define a comparator for them. Just to be on the safe side, I did
implement one and registered it with WritableComparator.define() in
Sample's static initializer. What am I missing here?
Thanks again for the help.
-steve
---
package net.intelresearch.cvmHadoop;

import java.io.*;
import org.apache.hadoop.io.*;

public class Sample implements Writable {
    Address address;
    SampleValue value; // sampled value at this point

    public Sample(Address a, SampleValue v) {
        address = a;
        value = v;
    }

    public SampleValue getValue() { return value; }
    public Address getAddress() { return address; }

    public String toString() {
        return (address.toString() + " " + value.toString());
    }

    public void write(DataOutput out) throws IOException {
        address.write(out);
        value.write(out);
    }

    public void readFields(DataInput in) throws IOException {
        address = new Address();
        address.readFields(in);
        value = new SampleValue();
        value.readFields(in);
    }

    public static class Comparator extends WritableComparator {
        public Comparator() {
            super(Sample.class);
        }

        // Just order by Address for now
        public int compare(Sample a, Sample b) {
            return a.getAddress().compareTo(b.getAddress());
        }
    }

    // register this comparator
    static {
        WritableComparator.define(Sample.class, new Comparator());
    }
}
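
Reading the trace above, the innermost cause is Class.newInstance(), called from WritableComparator.newKey(). That call is reached through Sample$Comparator.<init> from Sample's static initializer, which runs as soon as the job loads the map output value class. Class.newInstance() throws InstantiationException when a class has no no-argument constructor, and Sample as posted declares only Sample(Address, SampleValue). The sketch below reproduces that behaviour in plain Java, with no Hadoop on the classpath. The class names are hypothetical stand-ins, and whether a no-argument constructor on Sample is the whole fix for this job is an assumption.

```java
// Mirrors Sample as posted: only a constructor that takes arguments.
class WithoutDefaultCtor {
    final int v;
    WithoutDefaultCtor(int v) { this.v = v; }
}

// Same shape, plus the no-argument constructor that reflection needs.
class WithDefaultCtor {
    int v;
    WithDefaultCtor() {}
    WithDefaultCtor(int v) { this.v = v; }
}

class NewInstanceDemo {
    // Returns true if cls.newInstance() succeeds and false if it throws
    // InstantiationException, the exception shown in the trace.
    @SuppressWarnings("deprecation") // Class.newInstance() is the call in the trace
    static boolean canInstantiate(Class<?> cls) {
        try {
            cls.newInstance();
            return true;
        } catch (InstantiationException e) {
            return false;
        } catch (IllegalAccessException e) {
            // Not expected here: everything lives in the same package.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate(WithoutDefaultCtor.class)); // prints false
        System.out.println(canInstantiate(WithDefaultCtor.class));    // prints true
    }
}
```

If the same holds for Sample, removing the static initializer (values are not sorted) or adding a no-argument constructor would each avoid the failing newInstance() call.
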
On 10/10/07, Matt Kent <[EMAIL PROTECTED]> wrote:
> You're right, Serializable should be sufficient. I was thinking of a
> case where you'd sometimes want to write them out as values, but other
> times combine them inside Sample.
>
> On 10/10/07, Steve Schlosser <[EMAIL PROTECTED]> wrote:
> > Is this true? The fact that SampleValue and Address implement
> > Serializable should be sufficient to write them out to the stream.
> > They are not ever written out as keys or values themselves.
> >
> > -steve
> >
> > On 10/10/07, Matt Kent <[EMAIL PROTECTED]> wrote:
> > > I believe in this case you'll want to make Sample and Address writable as
> > > well.
> > >
> > > On 10/10/07, Steve Schlosser <[EMAIL PROTECTED]> wrote:
> > > > Hello all
> > > >
> > > > Is there a best practice for using my own classes as keys and values?
> > > >
> > > > My first attempt at doing this was successful - I built a
> > > > BigIntegerWritable class using IntWritable as a template. It was easy
> > > > because BigInteger has methods converting to and from byte arrays,
> > > > which I could then write into the DataOutput or read from the
> > > > DataInput.
> > > >
> > > > It seems like I should be able to use object serialization to write
> > > > to/read from the DataOutput/Input objects and make my own classes
> > > > implement the Writable interface. It seems like I should be able to
> > > > do something like this:
> > > >
> > > > import java.io.*;
> > > >
> > > > import org.apache.hadoop.io.*;
> > > >
> > > > public class Sample implements Writable {
> > > >
> > > > Address address;
> > > > SampleValue value; // sampled value at this point
> > > >
> > > > public Sample(Address a, SampleValue v) {
> > > > address = a;
> > > > value = v;
> > > > }
> > > >
> > > > public SampleValue getValue() { return value;}
> > > > public Address getAddress() { return address; }
> > > >
> > > > public String toString () {
> > > > return (address.toString() + " " + value.toString());
> > > > }
> > > >
> > > > [...]
> > > >
> > > > public void readFields(DataInput in) throws IOException {
> > > > ObjectInputStream oin = new
> > > > ObjectInputStream((DataInputBuffer)in);
> > > >
> > > > try {
> > > > address = (Address)oin.readObject();
> > > > value = (SampleValue)oin.readObject();
> > > > } catch (ClassNotFoundException e) {
> > > > throw new IOException(e.toString());
> > > > }
> > > >
> > > > }
> > > >
> > > > public void write(DataOutput out) throws IOException {
> > > > ObjectOutputStream oout = new
> > > > ObjectOutputStream((DataOutputBuffer)out);
> > > >
> > > > oout.writeObject(address);
> > > > oout.writeObject(value);
> > > > }
> > > > }
> > > >
> > > > This code compiles, but throws exceptions at runtime, complaining that
> > > > WritableComparator can not access a member of class Sample with
> > > > modifiers "". Can someone tell me what this exception is talking
> > > > about?
> > > >
> > > > Do I need to implement a WritableComparator for each class that I want
> > > > to implement Writable?
> > > >
> > > > Thanks again for the help.
> > > >
> > > > -steve
> > > >
> > >
> >
>