[ 
https://issues.apache.org/jira/browse/CRUNCH-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768743#comment-13768743
 ] 

Brian Dougan commented on CRUNCH-266:
-------------------------------------

While developing the patch, I found a related issue and went ahead and fixed 
it.  

The deepCopy method passes an object to SpecificDatumReader and expects the 
reader to populate it.  In certain cases (for instance, during this classloader 
issue), the reader does not populate the supplied object; it creates a new 
GenericData.Record object and returns that instead.  In that case, rather than 
throwing a ClassCastException, the code was just returning an empty object.  I 
fixed it by always returning the object returned by SpecificDatumReader.  This 
is technically non-passive (a behavior change), since the old code would 
silently return an empty object with no error, but in this case it seems the 
better thing to do.
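
To illustrate the difference, here is a minimal, self-contained sketch.  It 
does not use Avro or the actual Crunch code: Reader is a hypothetical stand-in 
for a datum reader whose read method may either fill the supplied object or 
ignore it and return a different one, and Person, copyIgnoringResult, and 
copyUsingResult are made-up names.  The explicit Class.cast is also an 
assumption, used here only to make the type failure visible at the copy site.

```java
import java.util.function.Supplier;

public class DeepCopySketch {

    /** Hypothetical stand-in for a datum reader; NOT the Avro interface. */
    interface Reader<T> {
        // May populate and return `reuse`, or ignore it and return a new object.
        Object read(T reuse, byte[] data);
    }

    /** Hypothetical record type used only for this illustration. */
    static final class Person {
        String name;
    }

    // Old behaviour: assume the reader populated `target` and ignore the
    // value it returned. If the reader did not reuse `target`, the caller
    // silently gets an empty object.
    static <T> T copyIgnoringResult(Reader<T> reader, byte[] data, Supplier<T> factory) {
        T target = factory.get();
        reader.read(target, data);
        return target;
    }

    // Patched behaviour: always return what the reader returned. If it is
    // not of the expected type, the cast throws ClassCastException instead
    // of hiding the problem.
    static <T> T copyUsingResult(Reader<T> reader, byte[] data, Class<T> type, Supplier<T> factory) {
        Object result = reader.read(factory.get(), data);
        return type.cast(result);
    }

    public static void main(String[] args) {
        byte[] data = "ada".getBytes();

        // A reader that behaves as expected and fills the supplied object.
        Reader<Person> reusing = (reuse, d) -> {
            reuse.name = new String(d);
            return reuse;
        };
        // A reader that ignores the supplied object and returns another type,
        // loosely mimicking the GenericData.Record case described above.
        Reader<Person> ignoring = (reuse, d) -> new StringBuilder(new String(d));

        System.out.println(copyUsingResult(reusing, data, Person.class, Person::new).name);
        System.out.println(copyIgnoringResult(ignoring, data, Person::new).name);
        try {
            copyUsingResult(ignoring, data, Person.class, Person::new);
        } catch (ClassCastException e) {
            System.out.println("surfaced: " + e.getClass().getSimpleName());
        }
    }
}
```

With the old pattern the misbehaving reader yields an object whose fields were 
never set; with the patched pattern the same reader produces a visible failure.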
                
> AvroSpecificDeepCopier needs to use constructor on SpecificDatumReader that 
> takes a class.
> ------------------------------------------------------------------------------------------
>
>                 Key: CRUNCH-266
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-266
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Brian Dougan
>            Assignee: Josh Wills
>            Priority: Minor
>         Attachments: CRUNCH-266.patch
>
>
> As per https://issues.apache.org/jira/browse/AVRO-1240, when the avro jar is 
> in a parent classloader of the classloader that contains SpecificData 
> classes, a ClassCastException can occur if you don't use the 
> SpecificDatumReader constructor that takes a class (and accounts for the 
> classloader).
> Since standard hadoop commands seem to use parent classloaders, and avro is 
> included in the hadoop parent classloader, this issue can be seen if you run 
> a command from a jar including SpecificData classes that attempts to use them 
> from the hadoop command (such as materializing a PCollection of avro objects). 
>  
> It looks like all that is needed is to update AvroSpecificDeepCopier to call 
> a different SpecificDatumReader constructor:
> * public SpecificDatumReader(Class<T> c) 
> To add to the confusion, since avro is an included hadoop dependency, and 
> avro itself had a bug until 1.7.4, this fix will only work if the avro in 
> hadoop has been updated to 1.7.4 (or you are running a hadoop version that 
> already ships it).  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
