[ 
https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880743#action_12880743
 ] 

Iván de Prado commented on AVRO-493:
------------------------------------

Writing a custom DeserializerComparator is needed if this patch is to be 
useful in many projects. Otherwise you would need a different Avro schema, 
with a different sort order, for each kind of grouping you want to do in the 
reducer. I'm failing to create a custom DeserializerComparator:

{code:java}
  public static class CustomComparator
      extends DeserializerComparator<AvroWrapper<GenericRecord>> {

    public CustomComparator() throws IOException {
      super(new AvroKeySerialization().getDeserializer(AvroWrapper.class));
    }

    @Override
    public int compare(AvroWrapper<GenericRecord> o1,
                       AvroWrapper<GenericRecord> o2) {
      return o1.datum().get("word").toString().charAt(1)
          - o2.datum().get("word").toString().charAt(1);
    }
  }
 {code}

It raises the following exception:

{noformat}
Caused by: java.lang.NullPointerException
        at org.apache.avro.mapred.AvroJob.getMapOutputSchema(AvroJob.java:98)
        at org.apache.avro.mapred.AvroKeySerialization.getDeserializer(AvroKeySerialization.java:55)
        ....
{noformat}

The problem is this line in AvroKeySerialization.getDeserializer():

{code:java}
    Schema schema = AvroJob.getMapOutputSchema(getConf());
{code}

It looks for the datum schema in the job configuration but, unsurprisingly, 
it is not there.
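
To make the failure mode concrete, here is a minimal stand-alone sketch of one possible direction: do not touch the configuration in the constructor, and resolve the schema lazily once a configuration has been handed to the comparator. It deliberately uses no Avro or Hadoop classes. A plain Map stands in for the job configuration and for the record, and the names SCHEMA_KEY, SecondCharComparator and setConf are hypothetical, chosen only for illustration. I have not tried this against the patch.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

final class LazySchemaComparatorSketch {

    // Hypothetical configuration key; NOT the key the patch actually uses.
    static final String SCHEMA_KEY = "sketch.map.output.schema";

    /** Compares records by the character at index 1 of their "word" field. */
    static final class SecondCharComparator implements Comparator<Map<String, Object>> {
        private Map<String, String> conf; // stand-in for the job configuration
        private String schema;            // resolved on first use, not in the constructor

        // Hypothetical setter: assumes something injects the configuration
        // after the comparator has been constructed.
        void setConf(Map<String, String> conf) {
            this.conf = conf;
            this.schema = null; // force a fresh lookup
        }

        // Fails with a descriptive message instead of a bare NullPointerException.
        private String schema() {
            if (schema == null) {
                if (conf == null) {
                    throw new IllegalStateException("configuration has not been set");
                }
                schema = conf.get(SCHEMA_KEY);
                if (schema == null) {
                    throw new IllegalStateException("no schema under " + SCHEMA_KEY);
                }
            }
            return schema;
        }

        @Override
        public int compare(Map<String, Object> a, Map<String, Object> b) {
            schema(); // a real comparator would build its deserializer from this
            return Character.compare(secondChar(a), secondChar(b));
        }

        // Words shorter than two characters sort first instead of throwing.
        private static char secondChar(Map<String, Object> record) {
            String word = String.valueOf(record.get("word"));
            return word.length() > 1 ? word.charAt(1) : '\0';
        }
    }

    public static void main(String[] args) {
        SecondCharComparator cmp = new SecondCharComparator();
        Map<String, String> conf = new HashMap<>();
        conf.put(SCHEMA_KEY, "\"string\"");
        cmp.setConf(conf);

        Map<String, Object> apple = new HashMap<>();
        apple.put("word", "apple");
        Map<String, Object> azure = new HashMap<>();
        azure.put("word", "azure");
        System.out.println(cmp.compare(apple, azure) < 0);
    }
}
```

Whether the framework would actually pass the job configuration to a comparator in this way is an assumption here, so this only shows the shape of the idea, not a working fix.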

Any ideas or workarounds for creating custom comparators for Avro?

> hadoop mapreduce support for avro data
> --------------------------------------
>
>                 Key: AVRO-493
>                 URL: https://issues.apache.org/jira/browse/AVRO-493
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.4.0
>
>         Attachments: AVRO-493.patch, AVRO-493.patch
>
>
> Avro should provide support for using Hadoop MapReduce over Avro data files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
