[
https://issues.apache.org/jira/browse/AVRO-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12911082#action_12911082
]
Ron Bodkin commented on AVRO-669:
---------------------------------
This change does break the word count test:
[junit] java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.util.Utf8
[junit]     at org.apache.avro.mapred.TestWordCount$MapImpl.map(TestWordCount.java:43)
[junit]     at org.apache.avro.mapred.HadoopMapper.map(HadoopMapper.java:80)
[junit]     at org.apache.avro.mapred.HadoopMapper.map(HadoopMapper.java:34)
[junit]     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
[junit]     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
[junit]     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
[junit]     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
This was because ReflectDatumReader.readString returns a String, whereas
GenericDatumReader.readString returns a Utf8.
I've tested making ReflectDatumReader return a Utf8, but there are a number of
places in the code that expect it to return a String - e.g., a protocol test
and TestDataFileReflect both break.
What do you think is the right way to handle this inconsistency?
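One way user code can tolerate the inconsistency in the meantime is to program against CharSequence, which both java.lang.String and org.apache.avro.util.Utf8 implement. The following is a minimal, self-contained sketch of that idea (the nested Utf8 here is a stand-in for the Avro class, not the real implementation):

```java
import java.nio.charset.StandardCharsets;

public class CharSeqMapper {
    // Stand-in for org.apache.avro.util.Utf8: a CharSequence backed by UTF-8 bytes.
    static final class Utf8 implements CharSequence {
        private final byte[] bytes;
        Utf8(String s) { this.bytes = s.getBytes(StandardCharsets.UTF_8); }
        public int length() { return toString().length(); }
        public char charAt(int i) { return toString().charAt(i); }
        public CharSequence subSequence(int a, int b) { return toString().subSequence(a, b); }
        @Override public String toString() { return new String(bytes, StandardCharsets.UTF_8); }
    }

    // A map-side helper typed to CharSequence works no matter which
    // concrete string representation the datum reader produced.
    static String[] tokenize(CharSequence line) {
        return line.toString().split("\\s+");
    }

    public static void main(String[] args) {
        String fromReflect = "hello avro world";         // what ReflectDatumReader yields
        Utf8 fromGeneric = new Utf8("hello avro world"); // what GenericDatumReader yields
        System.out.println(tokenize(fromReflect).length);  // 3
        System.out.println(tokenize(fromGeneric).length);  // 3
    }
}
```

This only sidesteps the cast in user code; it does not resolve which return type the readers themselves should standardize on.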
> Avro Mapreduce Doesn't Work With Reflect Schemas
> ------------------------------------------------
>
> Key: AVRO-669
> URL: https://issues.apache.org/jira/browse/AVRO-669
> Project: Avro
> Issue Type: Bug
> Reporter: Ron Bodkin
> Assignee: Doug Cutting
> Fix For: 1.5.0
>
> Attachments: AVRO-669.patch, AVRO-669.patch.2
>
>
> I'm trying to get the Avro trunk code (from Subversion) to work with a simple
> example of a reflection-defined schema, using a class I created. I use a
> ReflectDatumWriter to write a set of records to a file, e.g.,
> DatumWriter<Record> writer = new ReflectDatumWriter<Record>(Record.class);
> DataFileWriter<Record> file = new DataFileWriter<Record>(writer);
> However, when I try to read that data in using an AvroMapper it fails with an
> exception as shown below. It turns out that the mapreduce implementation
> hard-codes a dependence on SpecificDatum readers and writers.
> I've tested switching to use ReflectDatum instead in five places to try to
> get it to work for an end-to-end reflect data example:
> AvroFileInputFormat
> AvroFileOutputFormat
> AvroSerialization (getDeserializer and getSerializer)
> AvroKeyComparator
> However, switching to use reflection for AvroKeyComparator doesn't work:
> java.lang.UnsupportedOperationException
>     at org.apache.avro.reflect.ReflectData.compare(ReflectData.java:427)
>     at org.apache.avro.mapred.AvroKeyComparator.compare(AvroKeyComparator.java:46)
> It should be possible to implement compare on reflect data, just like
> GenericData's implementation, but using the field name (or, better yet, a
> cached java.lang.reflect.Field)...
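A rough sketch of that compare idea (hypothetical, not the actual Avro patch): order two reflect records field-by-field through cached java.lang.reflect.Field handles instead of IndexedRecord positions. WordCount here is an invented record type standing in for the reflect-schema class.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class ReflectCompare {
    private final List<Field> fields = new ArrayList<Field>();

    public ReflectCompare(Class<?> recordClass) {
        // Cache the Field handles once per class; getDeclaredFields() follows
        // declaration order in practice, though the JLS does not guarantee it.
        for (Field f : recordClass.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers())) continue;
            f.setAccessible(true);
            fields.add(f);
        }
    }

    @SuppressWarnings("unchecked")
    public int compare(Object a, Object b) throws IllegalAccessException {
        for (Field f : fields) {
            Comparable<Object> va = (Comparable<Object>) f.get(a);
            int c = va.compareTo(f.get(b));
            if (c != 0) return c;  // the first differing field decides the order
        }
        return 0;
    }

    // Hypothetical record type, standing in for the user's reflect-schema class.
    static class WordCount {
        String word; int count;
        WordCount(String w, int c) { word = w; count = c; }
    }

    public static void main(String[] args) throws Exception {
        ReflectCompare cmp = new ReflectCompare(WordCount.class);
        System.out.println(cmp.compare(new WordCount("avro", 1), new WordCount("hadoop", 1)) < 0);  // true
        System.out.println(cmp.compare(new WordCount("avro", 2), new WordCount("avro", 2)));        // 0
    }
}
```

A production version would also need to handle null fields, nested records, and non-Comparable field types, and would derive field order from the schema rather than relying on getDeclaredFields().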
> Original exception:
> java.lang.ClassCastException: tba.mr.sample.avro.Record cannot be cast to org.apache.avro.generic.IndexedRecord
>     at org.apache.avro.generic.GenericDatumReader.setField(GenericDatumReader.java:152)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:142)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:114)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:105)
>     at org.apache.avro.file.DataFileStream.next(DataFileStream.java:198)
>     at org.apache.avro.mapred.AvroRecordReader.next(AvroRecordReader.java:63)
>     at org.apache.avro.mapred.AvroRecordReader.next(AvroRecordReader.java:33)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.Child.main(Child.java:170)