[ https://issues.apache.org/jira/browse/CRUNCH-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053202#comment-14053202 ]
Gabriel Reid commented on CRUNCH-433:
-------------------------------------

{quote}+1 for the corrected patch, with one request: that BaseAvroTableType be package-scoped instead of public if at all possible.{quote}

Sounds like a good plan. The reason it's public is to use it specifically in AvroTableFileSource, but I think it's easy enough to get around that.

{quote}do we need to add a classifier line to the avro-mapred dependencies in the POM for this stuff to work properly on MR1 vs. MR2?{quote}

I don't think so, but I'm not sure I'm totally following what you mean. The only new thing being done here from avro-mapred is making use of the org.apache.avro.hadoop.io.AvroKeyValue class (basically only for schema creation), so I don't think there's anything that would change there in terms of needing classifiers (or am I missing something?)

> Add support for reading specific/reflect data from an Avro MR file
> ------------------------------------------------------------------
>
>                 Key: CRUNCH-433
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-433
>             Project: Crunch
>          Issue Type: New Feature
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>         Attachments: CRUNCH-433.patch
>
>
> An Avro Key/Value file written via raw MapReduce contains records that follow
> the schema generated by the org.apache.avro.hadoop.io.AvroKeyValue class.
> If these files contain specific or reflection-based records, there is
> currently no easy way to read them in as specific or reflection records.
> Using the basic public Crunch APIs, they can only be read as generic records
> (that also contain generic records).
> A method should be added to the Avros class which allows specifying specific
> PTypes to be used for reading the underlying data types within a raw MR
> output file.
> Link to related discussion that inspired this ticket on the user list:
> http://s.apache.org/es

--
This message was sent by Atlassian JIRA
(v6.2#6252)