Dear list,
I have created a sequence file like this:
seqWriter = SequenceFile.createWriter(fs, getConf(), new Path(hdfsPath),
        IntWritable.class, BytesWritable.class,
        SequenceFile.CompressionType.NONE);
seqWriter.append(new IntWritable(index++), new BytesWritable(buf));
(where buf is a byte array.)
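For reference, the complete writer code, slightly simplified, looks roughly like this (the class and variable names here are not my real ones, it is just a sketch of what I do):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;

public class SeqFileWriterSketch {
    // Writes each byte[] record as (running index, raw bytes) into one SequenceFile.
    public static void write(Configuration conf, String hdfsPath, byte[][] records)
            throws IOException {
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Writer seqWriter = SequenceFile.createWriter(fs, conf,
                new Path(hdfsPath), IntWritable.class, BytesWritable.class,
                SequenceFile.CompressionType.NONE);
        try {
            int index = 0;
            for (byte[] buf : records) {
                seqWriter.append(new IntWritable(index++), new BytesWritable(buf));
            }
        } finally {
            seqWriter.close();
        }
    }
}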
Now, when reading the same sequence file in a MapReduce job, I declare the
mapper like this:
public static class NoOfMovesMapper
        extends Mapper<IntWritable, BytesWritable, IntWritable, IntWritable> {
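    // Sketch of the map method (the real move-counting logic is omitted and the
    // value below is just a compilable placeholder); the only point here is that
    // I expect to receive IntWritable keys and BytesWritable values.
    @Override
    protected void map(IntWritable key, BytesWritable value, Context context)
            throws IOException, InterruptedException {
        // value.getBytes() / value.getLength() hold the raw bytes written above
        int noOfMoves = value.getLength();   // placeholder for the real computation
        context.write(key, new IntWritable(noOfMoves));
    }
}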
and add the SequenceFile as input like this:
SequenceFileAsBinaryInputFormat.addInputPath(jobConf, new Path(args[i]));
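For completeness, the rest of the job setup in the run() method looks roughly like this (reconstructed from memory, so it is a sketch rather than the exact code; I do not set anything else explicitly):

Job jobConf = new Job(getConf(), "no-of-moves");
jobConf.setJarByClass(NoOfMoves.class);
jobConf.setMapperClass(NoOfMovesMapper.class);
jobConf.setOutputKeyClass(IntWritable.class);
jobConf.setOutputValueClass(IntWritable.class);
for (int i = 0; i < args.length - 1; i++) {
    SequenceFileAsBinaryInputFormat.addInputPath(jobConf, new Path(args[i]));
}
FileOutputFormat.setOutputPath(jobConf, new Path(args[args.length - 1]));
return jobConf.waitForCompletion(true) ? 0 : 1;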
This job fails with:
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
        at org.gostats.hadoop.NoOfMoves$NoOfMovesMapper.map(NoOfMoves.java:1)
I have to specify the mapper as
extends Mapper<LongWritable, Text, IntWritable, IntWritable> {
to read the sequence file. But then the number of records, and therefore the
number of map invocations, is much larger than I would expect. I thought I
would get exactly one map invocation per record in the sequence file.
What am I doing wrong? Where is my mistake?
Thanks in advance,
Jens