On 8/26/10 7:47 PM, newpant wrote:
Hi, did you call JobConf.setInputFormat(KeyValueTextInputFormat.class) to set
the input format class? The default input format class is TextInputFormat,
whose key type is LongWritable; the key stores the byte offset of each line
in the file.

If your reducer emits a different key or value type than your mapper outputs,
you need to call setMapOutputKeyClass and setMapOutputValueClass.

2010/8/27 Mark<[email protected]>

When I configure my job to use a KeyValueTextInputFormat, doesn't that
imply that the key and value passed to my mapper will both be Text?

I have it set up like this, and I am using the default Mapper.class, i.e. the
IdentityMapper:
- KeyValueTextInputFormat.addInputPath(job, new Path(otherArgs[0]));

but I keep receiving this error:
- java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text

I would expect this error if I were using FileInputFormat, because that
returns the key as a LongWritable and the value as Text, but I am unsure
why it's happening here.

Also, on the same note, when I supply FileInputFormat or
KeyValueTextInputFormat, does that implicitly set job.setMapOutputKeyClass
and job.setMapOutputValueClass? When are these used?

Thanks for the clarification





No, I didn't set that, and when I did, everything worked as expected. I thought that if I used:

KeyValueTextInputFormat.addInputPath(job, new Path(otherArgs[0]))


it would set that for me, or at least know that the input would be Text/Text. I'm guessing that is wrong.
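
For what it's worth, the reason that call cannot pick the input format is plain Java: addInputPath is a static method declared on FileInputFormat, so writing KeyValueTextInputFormat.addInputPath(...) resolves to the same inherited method and carries no information about which subclass was named. In the newer org.apache.hadoop.mapreduce API the input format is chosen with Job.setInputFormatClass. The sketch below models this without Hadoop; FakeJob, BaseInputFormat, KeyValueInputFormat and the path "/data/in" are hypothetical stand-ins, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

public class StaticInheritanceSketch {
    // Hypothetical stand-in for a job configuration; not a Hadoop class.
    static class FakeJob {
        String inputFormat = "TextInputFormat"; // the default named in the thread
        final List<String> inputPaths = new ArrayList<>();
    }

    // Stand-in for the base class that owns the static helper.
    static class BaseInputFormat {
        static void addInputPath(FakeJob job, String path) {
            job.inputPaths.add(path); // records the path and nothing else
        }
    }

    // Stand-in for a subclass; it inherits the static method unchanged.
    static class KeyValueInputFormat extends BaseInputFormat { }

    public static void main(String[] args) {
        FakeJob job = new FakeJob();

        // Calling through the subclass name still runs BaseInputFormat.addInputPath.
        KeyValueInputFormat.addInputPath(job, "/data/in");
        System.out.println(job.inputPaths);   // prints [/data/in]
        System.out.println(job.inputFormat);  // prints TextInputFormat

        // The format changes only when it is set explicitly.
        job.inputFormat = "KeyValueTextInputFormat";
        System.out.println(job.inputFormat);  // prints KeyValueTextInputFormat
    }
}
```

So the path gets registered either way, and the input format stays at its default until it is set explicitly.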

If your reducer emits a different key or value type than your mapper outputs,
you need to call setMapOutputKeyClass and setMapOutputValueClass.

When would this ever come up? Does it just cast to the appropriate classes then?
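
A hedged sketch of one case where it comes up: the mapper emits keys of one type (say, a word length) while the reducer writes keys of another (say, a label string). As far as I know, Hadoop does not cast here. When no map output class is set it falls back to the job's final output class, and the map-side collector compares the runtime class of each key against that expectation and fails on a mismatch. The code below is a simplified plain-Java model of that behaviour, not Hadoop's own code; TypeConfig and collect are hypothetical names, and Long, String and Integer stand in for Writable types.

```java
public class MapOutputTypeSketch {
    // Hypothetical, simplified model of a job's key type settings; not Hadoop code.
    static class TypeConfig {
        Class<?> outputKeyClass = Long.class; // final (reducer) output key type
        Class<?> mapOutputKeyClass = null;    // unset unless configured explicitly

        // Falls back to the final output key class when no map output class is set.
        Class<?> effectiveMapOutputKeyClass() {
            return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
        }
    }

    // Models a collector that checks the key's runtime class instead of casting.
    static void collect(TypeConfig conf, Object key) {
        Class<?> expected = conf.effectiveMapOutputKeyClass();
        if (key.getClass() != expected) {
            throw new IllegalStateException("Type mismatch in key from map: expected "
                    + expected.getName() + ", received " + key.getClass().getName());
        }
    }

    public static void main(String[] args) {
        TypeConfig conf = new TypeConfig();
        conf.outputKeyClass = String.class; // the reducer writes String keys

        // The mapper emits Integer keys, but only the final output type is configured.
        try {
            collect(conf, Integer.valueOf(5));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // Declaring the intermediate type separately resolves the mismatch.
        conf.mapOutputKeyClass = Integer.class;
        collect(conf, Integer.valueOf(5));
        System.out.println("accepted");
    }
}
```

When the mapper and reducer use the same types, setting only the final output classes is enough, which is why the map output setters often go unused.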

Thanks
