[ https://issues.apache.org/jira/browse/CRUNCH-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Wills updated CRUNCH-495:
------------------------------
    Attachment: CRUNCH-495.patch

Patch with the fix (it just appends an "_" to the name of the generated record) and a test case that fails without the patch.

> Fix case class/SpecificRecord interactions in Scrunch
> -----------------------------------------------------
>
>                 Key: CRUNCH-495
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-495
>             Project: Crunch
>          Issue Type: Bug
>          Components: Scrunch
>    Affects Versions: 0.11.0
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>             Fix For: 0.12.0
>
>         Attachments: CRUNCH-495.patch
>
>
> So this is a fun one: as part of the work for 0.11, I wrote a way to serialize
> Scala case classes as Avro generic records. However, if AvroMode.SPECIFIC is
> enabled on a MapReduce job (e.g., if you were doing a join between one PTable
> that contained specific record instances and a different PTable that
> contained instances of a case class), the SpecificData object in Avro gets
> confused when it sees the Avro schema I generate for the case class: because
> the name of the Avro schema is identical to the name of the case class on the
> JVM, Avro thinks that the record is an actual instance of a SpecificRecord.
>
> The solution I came up with is to slightly modify the name of the generated
> Avro generic schema that corresponds to the case class, so that it no longer
> matches the name of the case class exactly and Avro doesn't get confused.
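
For readers who want to see the name collision in isolation, here is a minimal sketch in plain Java. It does not use Avro or Crunch, and it is not the code from the patch. It assumes that specific-record resolution amounts to looking up a JVM class by the schema's full name. The names SchemaNameCollision, Person, and lookupClass are hypothetical and exist only for this illustration.

```java
import java.util.Optional;

// Illustration only: imitates a name-based class lookup, which is an
// assumption about how the schema name gets mistaken for a specific record.
class SchemaNameCollision {

    // Hypothetical stand-in for a Scala case class as seen on the JVM.
    static class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    // Try to resolve a JVM class from a schema's full name.
    // An empty result means "no matching class, treat the record as generic".
    static Optional<Class<?>> lookupClass(String schemaFullName) {
        try {
            return Optional.of(Class.forName(schemaFullName));
        } catch (ClassNotFoundException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Schema named exactly like the class: the lookup finds the class.
        String collidingName = Person.class.getName();
        // Schema name with "_" appended, as the ticket describes.
        String renamedName = collidingName + "_";

        System.out.println(collidingName + " resolves: "
                + lookupClass(collidingName).isPresent());
        System.out.println(renamedName + " resolves: "
                + lookupClass(renamedName).isPresent());
    }
}
```

With the suffix in place the lookup finds no class, so under the assumption above the record would be handled as a generic record rather than as a SpecificRecord.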