Hi Frank,

I have already encountered that, so I am aware of it. That part seems to be OK.
Jakub

On 7 August 2014 15:29, Frank Maritato <[email protected]> wrote:

> Hi,
>
> Make sure you are importing the hadoop2 version of the Avro jar. If you
> are using Maven, the dependency would look like this:
>
> <dependency>
>     <groupId>org.apache.avro</groupId>
>     <artifactId>avro-mapred</artifactId>
>     <classifier>hadoop2</classifier>
>     <version>1.7.4</version>
> </dependency>
>
> Hope this helps.
> --
> Frank Maritato
>
> On 8/7/14, 3:23 AM, "Jakub Stransky" <[email protected]> wrote:
>
> > I am trying to run an MR job from an Oozie workflow with an Avro data file
> > as both input and output. The mapper emits Text and IntWritable. I am
> > using the new MR API (mapreduce). My workflow definition is the following:
> >
> > <workflow-app xmlns="uri:oozie:workflow:0.5" name="map-reduce-wf">
> >     <global>
> >         <job-tracker>${jobTracker}</job-tracker>
> >         <name-node>${nameNode}</name-node>
> >         <configuration>
> >             <property>
> >                 <name>mapreduce.job.queuename</name>
> >                 <value>${queueName}</value>
> >             </property>
> >         </configuration>
> >     </global>
> >
> >     <start to="mr-node"/>
> >
> >     <action name="mr-node">
> >         <map-reduce>
> >             <prepare>
> >                 <delete path="${nameNode}/${outputDir}"/>
> >             </prepare>
> >             <configuration>
> >                 <!-- BEGIN: SNIPPET TO ADD IN ORDER TO MAKE USE OF NEW HADOOP API -->
> >                 <property>
> >                     <name>mapred.reducer.new-api</name>
> >                     <value>true</value>
> >                 </property>
> >                 <property>
> >                     <name>mapred.mapper.new-api</name>
> >                     <value>true</value>
> >                 </property>
> >                 <!-- END: SNIPPET -->
> >                 <property>
> >                     <name>mapreduce.map.class</name>
> >                     <value>com.ncr.bigdata.mr.avro.AvroPifDriver$PifMapper</value>
> >                 </property>
> >                 <property>
> >                     <name>mapreduce.reduce.class</name>
> >                     <value>com.ncr.bigdata.mr.avro.AvroPifDriver$PifReducer</value>
> >                 </property>
> >                 <property>
> >                     <name>mapred.map.tasks</name>
> >                     <value>1</value>
> >                 </property>
> >                 <property>
> >                     <name>mapred.input.dir</name>
> >                     <value>${nameNode}/${inputDir}</value>
> >                 </property>
> >                 <property>
> >                     <name>mapred.output.dir</name>
> >                     <value>${nameNode}/${outputDir}</value>
> >                 </property>
> >                 <property>
> >                     <name>mapred.input.format.class</name>
> >                     <value>org.apache.avro.mapreduce.AvroKeyInputFormat</value>
> >                 </property>
> >                 <property>
> >                     <name>avro.schema.input.key</name>
> >                     <value>{"type":"record","name":"SampleRecord","namespace":"org.co.sample.etl.domain","fields":[{"name":"requiredName","type":"string"},{"name":"optionalName","type":["null","string"]},{"name":"dataItemLong","type":"long"},{"name":"dataItemInt","type":"int"},{"name":"startTime","type":"long"},{"name":"endTime","type":"long"}]}</value>
> >                 </property>
> >
> >                 <property>
> >                     <name>mapred.output.format.class</name>
> >                     <value>org.apache.avro.mapreduce.AvroKeyValueOutputFormat</value>
> >                 </property>
> >                 <property>
> >                     <name>mapred.output.key.class</name>
> >                     <value>org.apache.avro.mapred.AvroKey</value>
> >                 </property>
> >                 <property>
> >                     <name>mapred.output.value.class</name>
> >                     <value>org.apache.avro.mapred.AvroValue</value>
> >                 </property>
> >
> >                 <property>
> >                     <name>avro.schema.output.key</name>
> >                     <value>string</value>
> >                 </property>
> >                 <property>
> >                     <name>avro.schema.output.value</name>
> >                     <value>int</value>
> >                 </property>
> >             </configuration>
> >         </map-reduce>
> >         <ok to="end"/>
> >         <error to="fail"/>
> >     </action>
> >     <kill name="fail">
> >         <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
> >     </kill>
> >     <end name="end"/>
> > </workflow-app>
> >
> > My mapper looks like the following:
> >
> > import java.io.IOException;
> >
> > import org.apache.avro.mapred.AvroKey;
> > import org.apache.hadoop.io.IntWritable;
> > import org.apache.hadoop.io.NullWritable;
> > import org.apache.hadoop.io.Text;
> > import org.apache.hadoop.mapreduce.Job;
> > import org.apache.hadoop.mapreduce.Mapper;
> >
> > public static class PifMapper
> >         extends Mapper<AvroKey<PosData>, NullWritable, Text, IntWritable> {
> >
> >     @Override
> >     public void map(AvroKey<PosData> key, NullWritable value, Context context)
> >             throws IOException, InterruptedException {
> >         ...
> >     }
> > }
> >
> > I am getting the following error:
> >
> > 140807041959771-oozie-oozi-W@mr-node] Launcher exception:
> > mapred.input.format.class is incompatible with new map API mode.
> > java.io.IOException: mapred.input.format.class is incompatible with new map API mode.
> >     at org.apache.hadoop.mapreduce.Job.ensureNotSet(Job.java:1172)
> >     at org.apache.hadoop.mapreduce.Job.setUseNewAPI(Job.java:1198)
> >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1261)
> >     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
> >     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:415)
> >     ...
> >
> > I am using Hadoop 2.2.0 (HDP 2.0), Oozie 4.0.0, and Avro 1.7.4.
> >
> > MapReduce jobs submitted via a driver class work fine, and
> > org.apache.avro.mapreduce.AvroKeyInputFormat should be an implementation
> > of the new mapreduce API as well.
> >
> > To make sure there is no lib clash, I removed the shared lib from Oozie;
> > all libs are included in the workflow lib dir.
> >
> > Any hints?
> >
> > --
> > Jakub Stransky

--
Jakub Stransky
cz.linkedin.com/in/jakubstransky
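The stack trace above points at the likely cause: org.apache.hadoop.mapreduce.Job.setUseNewAPI() calls ensureNotSet() on the old-API property names, so once mapred.mapper.new-api and mapred.reducer.new-api are true, any configuration that still sets mapred.input.format.class (or the other mapred.* class properties) is rejected. Below is a minimal sketch of the same settings under the new-API property names from Hadoop 2's MRJobConfig, assuming Oozie passes these <configuration> properties through to the Job unchanged; the class values are copied from the workflow above:

    <!-- New-API (MRJobConfig) property names; the old mapred.* class
         properties would be removed in favour of these. -->
    <property>
        <name>mapreduce.job.map.class</name>
        <value>com.ncr.bigdata.mr.avro.AvroPifDriver$PifMapper</value>
    </property>
    <property>
        <name>mapreduce.job.reduce.class</name>
        <value>com.ncr.bigdata.mr.avro.AvroPifDriver$PifReducer</value>
    </property>
    <!-- replaces mapred.input.format.class -->
    <property>
        <name>mapreduce.job.inputformat.class</name>
        <value>org.apache.avro.mapreduce.AvroKeyInputFormat</value>
    </property>
    <!-- replaces mapred.output.format.class -->
    <property>
        <name>mapreduce.job.outputformat.class</name>
        <value>org.apache.avro.mapreduce.AvroKeyValueOutputFormat</value>
    </property>
    <!-- replace mapred.output.key.class / mapred.output.value.class -->
    <property>
        <name>mapreduce.job.output.key.class</name>
        <value>org.apache.avro.mapred.AvroKey</value>
    </property>
    <property>
        <name>mapreduce.job.output.value.class</name>
        <value>org.apache.avro.mapred.AvroValue</value>
    </property>

With these in place, the old-API names (mapred.input.format.class, mapred.output.format.class, mapred.output.key.class, mapred.output.value.class) would be dropped from the workflow entirely, since ensureNotSet() fails on any of them while the new-api flags are set.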
