[
https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880786#action_12880786
]
Iván de Prado commented on AVRO-493:
------------------------------------
Yes, I've tried, but it doesn't work. I was able to dodge this issue by
creating my own AvroWrapperDeserializer class that receives the schema in the
constructor, but that doesn't seem too elegant.
In any case, I found a bigger issue: it seems impossible to use a custom group
comparator with the current AvroReducer implementation, because the reducer
receives the datum as the key. From the AvroReducer:
{code:java}
public void reduce(AvroWrapper<IN> wrapper, Iterator<NullWritable> ignore,
                   OutputCollector<AvroWrapper<OUT>,NullWritable> output,
                   Reporter reporter) throws IOException {
  if (this.out == null) {
    this.out = output;
    this.reporter = reporter;
  }
  reduce(wrapper.datum());
}
{code}
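If you use your own group comparator, the child reducer will only receive the
first datum of each group: the remaining datums of the group are keys too, and
the values iterator, which holds only NullWritable, is ignored. Any ideas
about how to solve that?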
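To make the problem concrete, here is a minimal sketch in plain Java that only simulates the grouping step. It does not use Hadoop or Avro; GroupingSketch, KeyOnlyReducer, runReducePhase and datumsSeen are hypothetical names made up for this illustration, and the "prefix:suffix" string keys are an assumed stand-in for real datums.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

public class GroupingSketch {

    /** Hypothetical stand-in for a reducer whose data travels in the key only. */
    public interface KeyOnlyReducer<K> {
        void reduce(K key, Iterator<Object> ignoredValues);
    }

    /**
     * Simulates the grouping step (an assumption, not Hadoop code): consecutive
     * sorted keys that compare equal under groupComparator form one group, and
     * reduce() is called once per group with the first key of that group.
     */
    public static <K> void runReducePhase(List<K> sortedKeys,
                                          Comparator<K> groupComparator,
                                          KeyOnlyReducer<K> reducer) {
        int start = 0;
        while (start < sortedKeys.size()) {
            K first = sortedKeys.get(start);
            int end = start;
            while (end < sortedKeys.size()
                    && groupComparator.compare(first, sortedKeys.get(end)) == 0) {
                end++;
            }
            // The values carry no data, like NullWritable in the snippet above.
            List<Object> placeholders = Collections.<Object>nCopies(end - start, null);
            reducer.reduce(first, placeholders.iterator());
            start = end;
        }
    }

    /** Collects every datum a reducer that ignores its values gets to see. */
    public static List<String> datumsSeen(List<String> sortedKeys,
                                          Comparator<String> groupComparator) {
        List<String> seen = new ArrayList<>();
        runReducePhase(sortedKeys, groupComparator, (key, ignored) -> seen.add(key));
        return seen;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a:1", "a:2", "b:1", "b:2", "b:3");
        // Group comparator that only looks at the part before ':'.
        Comparator<String> byPrefix =
                Comparator.comparing((String s) -> s.substring(0, s.indexOf(':')));

        // Grouping by the full key: every datum reaches the reducer.
        System.out.println(datumsSeen(keys, Comparator.<String>naturalOrder()));
        // Grouping by prefix: only the first datum of each group is seen.
        System.out.println(datumsSeen(keys, byPrefix));
    }
}
```

With the full-key comparator the sketch prints [a:1, a:2, b:1, b:2, b:3]; with the prefix comparator it prints [a:1, b:1], which is the behaviour described above. The placeholder iterator still has one entry per grouped record, so the count survives but the datums do not.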
> hadoop mapreduce support for avro data
> --------------------------------------
>
> Key: AVRO-493
> URL: https://issues.apache.org/jira/browse/AVRO-493
> Project: Avro
> Issue Type: New Feature
> Components: java
> Reporter: Doug Cutting
> Assignee: Doug Cutting
> Fix For: 1.4.0
>
> Attachments: AVRO-493.patch, AVRO-493.patch
>
>
> Avro should provide support for using Hadoop MapReduce over Avro data files.