Sounds to me like your custom object isn't serializing properly.
You might want to read up on how to do it correctly here:
http://developer.yahoo.com/hadoop/tutorial/module5.html#types
FYI - here's an example of a custom type I wrote, which I'm able to
read/write successfully to/from a sequence file:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class UserStateRecordWritable implements Writable {

    private Text recordType;
    private BytesWritable recordData;

    public UserStateRecordWritable() {
        recordType = new Text();
        recordData = new BytesWritable();
    }

    // Deserialize the fields in exactly the order write() emitted them.
    public void readFields(DataInput in) throws IOException {
        recordType.readFields(in);
        recordData.readFields(in);
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        recordType.write(out);
        recordData.write(out);
    }

    public void set(Text newRecordType, BytesWritable newRecordData) {
        recordType.set(newRecordType);
        recordData.set(newRecordData);
    }

    public Text getRecordType() {
        return recordType;
    }

    public BytesWritable getRecordData() {
        return recordData;
    }

    public String copyRecordType() {
        return recordType.toString();
    }

    public byte[] copyRecordData() {
        return TraitWeightUtils.getBytes(recordData);
    }
}
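The key contract above is that readFields() must consume exactly the bytes that write() produced, in the same order, into a reusable instance. Here's a minimal, Hadoop-free sketch of that round-trip so you can see the mechanics in isolation (UserRecord and its field layout are made up for illustration; it just mimics the Writable pattern with plain java.io streams):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for a Writable, so the round-trip can be
// demonstrated without Hadoop on the classpath.
class UserRecord {
    String recordType = "";
    byte[] recordData = new byte[0];

    // Mirrors Writable.write(): serialize fields in a fixed order.
    void write(DataOutputStream out) throws IOException {
        out.writeUTF(recordType);
        out.writeInt(recordData.length);
        out.write(recordData);
    }

    // Mirrors Writable.readFields(): read back in the same order.
    void readFields(DataInputStream in) throws IOException {
        recordType = in.readUTF();
        recordData = new byte[in.readInt()];
        in.readFully(recordData);
    }
}

public class RoundTripDemo {
    public static void main(String[] args) throws IOException {
        UserRecord original = new UserRecord();
        original.recordType = "login";
        original.recordData = new byte[] {1, 2, 3};

        // Serialize to an in-memory buffer...
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buf));

        // ...then read it back into a fresh, empty instance.
        UserRecord copy = new UserRecord();
        copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())));

        System.out.println(copy.recordType + " / "
                + copy.recordData.length + " bytes");
        // prints: login / 3 bytes
    }
}
```

If the bytes read don't line up with the bytes written (wrong order, missing field, variable-length data without a length prefix), you get exactly the kind of deserialization failure you're seeing.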
HTH,
DR
On 01/14/2011 07:57 AM, Joan wrote:
Hi,
I'm trying to write (K,V) pairs where K is a Text object and V is a
CustomObject, but it doesn't work.
I'm configuring the first job's output as a sequence file, so the job has:
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(CustomObject.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(CustomObject.class);
SequenceFileOutputFormat.setOutputPath(job, new Path("myPath"));
And I get the following output (this is the file part-r-00000):
K CustomObject@2b237512
K CustomObject@24db06de
...
When this job finishes I run a second job whose input is a
SequenceFileInputFormat, but it fails.
The second job's configuration is:
job.setInputFormatClass(SequenceFileInputFormat.class);
SequenceFileInputFormat.addInputPath(job, new Path("myPath"));
But I get an error:
java.io.IOException: hdfs://localhost:30000/user/hadoop/out/part-r-00000 not a SequenceFile
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1523)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1483)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1451)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1432)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:60)
Can someone help me? I don't understand it: I don't know how to save my
object in the first M/R job or how to read it back in the second.
Thanks
Joan