Nira Amit created AVRO-2005:
-------------------------------
Summary: No ClassCastException when failing to load custom type and GenericData$Record is returned
Key: AVRO-2005
URL: https://issues.apache.org/jira/browse/AVRO-2005
Project: Avro
Issue Type: Improvement
Components: java
Affects Versions: 1.8.1, 1.7.7
Environment: Ubuntu
Reporter: Nira Amit
I'm loading data from an Avro file and attempting to create custom objects from
it. If my schema specifies a class that doesn't exist on the classpath, the
library silently falls back to creating
org.apache.avro.generic.GenericData$Record instances instead. The problem is
that this behavior causes an unexpected ClassCastException when I try to access
the datum field of MyCustomClassAvroKey (which is an AvroKey<MyCustomClass>).
Assignments such as this.mCurrentRecord =
this.mAvroFileReader.next(this.mCurrentRecord) in
org.apache.avro.mapreduce.AvroRecordReaderBase succeed because of type erasure,
so the failure only surfaces later, far from its actual cause.
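The erasure behavior described above can be illustrated with a minimal, Avro-free sketch. The CustomRecord class here is a hypothetical stand-in for the user's custom type, not an Avro class:

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    // Hypothetical stand-in for the custom class named in the schema.
    static class CustomRecord {
        int value;
        CustomRecord(int value) { this.value = value; }
    }

    // Returns true if the ClassCastException surfaces at the access site
    // rather than at the assignment that actually went wrong.
    static boolean failsOnlyAtAccess() {
        List<CustomRecord> records = new ArrayList<>();

        // Like the assignment in AvroRecordReaderBase, inserting through a
        // raw-typed reference succeeds: generic type parameters are erased
        // at runtime, so nothing is checked here.
        @SuppressWarnings({"rawtypes", "unchecked"})
        List raw = records;
        raw.add("not a CustomRecord");

        try {
            // The compiler inserts a checkcast here, and only here does the
            // wrong runtime type become visible.
            CustomRecord r = records.get(0);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsOnlyAtAccess());
    }
}
```

The assignment that actually introduced the wrong type succeeds without complaint; the exception appears only at the later read, which is exactly the confusing behavior reported here.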
So my question is: why doesn't the library just fail when it can't load the
requested class? E.g. in org.apache.avro.specific.SpecificData#getClass(Schema
schema):
{code}
try {
  c = ClassUtils.forName(getClassLoader(), getClassName(schema));
} catch (ClassNotFoundException e) {
  c = NO_CLASS; // why not just fail?
}
{code}
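To make the suggestion concrete, here is a hypothetical sketch contrasting the current fallback policy with a fail-fast variant, using plain Class.forName in place of Avro's ClassUtils/getClassName helpers (the method names lookupWithFallback and lookupFailFast are invented for illustration):

```java
public class FailFastLookup {
    // Current behavior: swallow the error and return a sentinel,
    // deferring the failure to some later, unrelated access site.
    static Class<?> lookupWithFallback(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            return null; // stands in for NO_CLASS
        }
    }

    // Suggested behavior: fail immediately, so the missing class is
    // reported at the point where the schema is resolved.
    static Class<?> lookupFailFast(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(
                "Cannot load class for schema: " + name, e);
        }
    }
}
```

With the fail-fast variant, the stack trace would point directly at the schema whose class is missing instead of at an unrelated cast much later.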
Is this a deliberate design choice? It undermines the type-safety guarantee of
the generic API and causes very confusing, hard-to-debug behavior.
You can find a test-app that reproduces this problem in this GitHub repository:
https://github.com/homosepian/spark-avro-kryo
I ran into it while trying to load custom types into a Spark RDD.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)