Nira Amit created AVRO-2005:
-------------------------------

             Summary: No ClassCastException when failing to load custom type 
and GenericData$Record is returned
                 Key: AVRO-2005
                 URL: https://issues.apache.org/jira/browse/AVRO-2005
             Project: Avro
          Issue Type: Improvement
          Components: java
    Affects Versions: 1.8.1, 1.7.7
         Environment: Ubuntu
            Reporter: Nira Amit


I'm loading data from an Avro file and attempting to create custom objects from 
it. If my schema specifies a class that doesn't exist, the library silently 
defaults to creating org.apache.avro.generic.GenericData$Record instances 
instead. The problem is that this behavior leads to an unexpected 
ClassCastException when I try to access the datum field of 
MyCustomClassAvroKey (which is an AvroKey<MyCustomClass>). All the assignments, 
such as this.mCurrentRecord = this.mAvroFileReader.next(this.mCurrentRecord) in 
org.apache.avro.mapreduce.AvroRecordReaderBase, succeed because of type 
erasure.
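The mechanics can be reproduced with plain generics alone, outside Avro. Here is a minimal sketch (the Key class below is a hypothetical stand-in for AvroKey, not actual Avro code): the erased cast inside the generic class is a no-op, so the wrongly-typed datum gets stored without complaint, and the ClassCastException only surfaces at the call site where the compiler inserts a cast to the declared type.
{code}
public class ErasureDemo {
    // Hypothetical stand-in for AvroKey<T>: stores whatever datum it is given.
    static class Key<T> {
        private Object datum;
        void set(Object d) { datum = d; }
        @SuppressWarnings("unchecked")
        T datum() { return (T) datum; } // erased to a cast to Object: a no-op
    }

    // Returns true iff reading the datum as its declared type throws CCE.
    static boolean castFails() {
        Key<String> key = new Key<>();
        // The "reader" hands back the wrong runtime type (like
        // GenericData$Record instead of MyCustomClass); erasure lets
        // this assignment succeed.
        key.set(Integer.valueOf(42));
        Object ok = key.datum(); // fine: no cast to String is inserted here
        try {
            String s = key.datum(); // compiler-inserted cast throws only here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("castFails=" + castFails());
    }
}
{code}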
So my question is: why doesn't the library just fail when it can't load the 
requested class? E.g. in org.apache.avro.specific.SpecificData#getClass(Schema 
schema):
{code}
try {
  c = ClassUtils.forName(getClassLoader(), getClassName(schema));
} catch (ClassNotFoundException e) {
  c = NO_CLASS; // why not just fail?
}
{code}
Is this a design choice? It violates the type-safety guarantee and causes very 
confusing and unexpected behavior.
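For comparison, a fail-fast lookup would surface the problem at load time instead of at the later cast. A self-contained sketch of what I mean (classForSchema is a hypothetical helper for illustration, not Avro's actual code; Avro itself would presumably throw its own AvroRuntimeException rather than RuntimeException):
{code}
public class FailFastDemo {
    // Hypothetical fail-fast lookup: throw immediately instead of
    // silently falling back to a generic record class.
    static Class<?> classForSchema(ClassLoader loader, String className) {
        try {
            return Class.forName(className, true, loader);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException("Unable to load class " + className, e);
        }
    }

    public static void main(String[] args) {
        ClassLoader cl = FailFastDemo.class.getClassLoader();
        // A class that exists resolves normally.
        System.out.println(classForSchema(cl, "java.lang.String").getName());
        // A class that doesn't exist fails right here, at load time.
        try {
            classForSchema(cl, "com.example.DoesNotExist");
        } catch (RuntimeException e) {
            System.out.println("failed fast: " + e.getMessage());
        }
    }
}
{code}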

You can find a test-app that reproduces this problem in this GitHub repository: 
https://github.com/homosepian/spark-avro-kryo
I ran into it while trying to load custom types into a Spark RDD.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
