Hi everyone,
I need some help running my MapReduce job from Pig.
I wrote a MapReduce job that takes an Avro file as input; the driver is set up like this:
job.setJarByClass(Main.class);
job.setJobName("MapReduceJob");
job.setMapperClass(Mapper.class);       // my own Mapper class, shown below
job.setReducerClass(Reducer.class);     // my own Reducer class
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(DocumentRepresentation.class);
job.setOutputKeyClass(LongWritable.class);
job.setInputFormatClass(AvroKeyInputFormat.class);  // org.apache.avro.mapreduce
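The rest of the driver is the usual wiring, roughly like this (a trimmed sketch; Record stands for my generated Avro record class, and the paths come from the command line):

// assumed/typical setup for AvroKeyInputFormat; Record = generated Avro class
AvroJob.setInputKeySchema(job, Record.getClassSchema());
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));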
If I run this job directly with the hadoop jar command, everything works fine and
the output of the MapReduce job is as expected.
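That is, an invocation along these lines (paths illustrative):

hadoop jar mr-job-0.0.1.jar com.mycompany.hadoop.Main mr-input mr-result ...more parameters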
Now I need this MapReduce job to run inside a Pig script via the MAPREDUCE
operator. To test it, I used this script:
...register statements
A = LOAD 'mr-input' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
B = MAPREDUCE 'mr-job-0.0.1.jar'
        STORE A INTO 'mr-tmp' USING org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '...')
        LOAD 'mr-result' AS (prefix: chararray, result: chararray)
        `com.mycompany.hadoop.Main mr-tmp mr-result ...more parameters`;
When I run the script, all the mappers fail with a ClassCastException: LongWritable cannot be cast to AvroWrapper.
The mapper definition looks like this:
public class Mapper extends org.apache.hadoop.mapreduce.Mapper<
        AvroWrapper<Record>, NullWritable, IntWritable, DocumentRepresentation> {
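    // map() signature matching the type parameters above; body trimmed
    @Override
    protected void map(AvroWrapper<Record> key, NullWritable value, Context context)
            throws IOException, InterruptedException {
        // reads the Avro record and emits (IntWritable, DocumentRepresentation) pairs
    }
}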
Any idea how to fix it?
Jonas