As far as I can tell, that works for the top-level class, but not for others. 
When Avro's org.apache.avro.specific.SpecificData#getClass(Schema) attempts to 
look up the Java class for a "record" field of the top-level class:

c = ClassUtils.forName(getClassLoader(), getClassName(schema));


it doesn't find it, because the class name from the Parquet schema doesn't 
match the name of the Java class. As a result, instead of an instance of my 
Java class being passed to the put(int field$, Object value$) method of the
generated Avro SpecificRecord subclass, a GenericData$Record is passed. Then, a
ClassCastException is thrown when the value$ is cast to my Java class.
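
To illustrate the failure mode, here is a minimal sketch of a name-based class lookup. It is a simplified model, not Avro's actual code: RecordClassLookup, fullName and lookup are hypothetical names, and java.util.ArrayList merely stands in for a generated record class. The point is that a record name without the right namespace does not resolve to a class, at which point a reader would fall back to a generic record.

```java
import java.util.Optional;

public class RecordClassLookup {

    // Hypothetical helper: joins namespace and name the way a schema's
    // full name is formed (no namespace means the bare name is used).
    static String fullName(String namespace, String name) {
        return (namespace == null || namespace.isEmpty()) ? name : namespace + "." + name;
    }

    // Simplified stand-in for the lookup: returns empty when no class with
    // that full name is visible to the class loader.
    static Optional<Class<?>> lookup(ClassLoader loader, String namespace, String name) {
        try {
            return Optional.of(Class.forName(fullName(namespace, name), false, loader));
        } catch (ClassNotFoundException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = RecordClassLookup.class.getClassLoader();

        // Namespace matches the Java package: the class is found.
        System.out.println(lookup(loader, "java.util", "ArrayList").isPresent());

        // Same record name without a namespace: nothing is found, so a
        // generic record would be created instead of the specific class.
        System.out.println(lookup(loader, null, "ArrayList").isPresent());
    }
}
```

Under these assumptions, only the first lookup finds a class, which matches what I see for the nested "record" fields.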

-Shannon



On 4/27/17, 11:39 AM, "Ryan Blue" <[email protected]> wrote:

>Shannon, you can edit the Avro schema and add those namespaces. Then you
>set that as your read schema for Parquet and it will correctly read the
>data. The Avro schemas don't have to match, they just have to be compatible.
>
>rb
>
>On Thu, Apr 27, 2017 at 9:37 AM, Shannon Carey <[email protected]> wrote:
>
>> I'm not sure whether I should be asking Parquet people or Avro people
>> about this.
>>
>> I'm reading a Parquet file via Avro. The Parquet file was produced by
>> Spark. The Avro schema that I generated from the file (by deserializing it
>> as a GenericData record & retrieving its schema) uses "record" types that
>> have no "namespace" value. Therefore, when generating Java classes from the
>> Avro schema in order to deserialize the Parquet file to strongly typed
>> objects, the generated Java classes are created in the default package. As
>> you may know, it's basically impossible to interact with Java classes that
>> are defined in the unnamed package.
>>
>> Has anyone else run into this situation? And is there any way to work
>> around it? It seems like I should be able to specify how the types in the
>> Parquet file should map to an Avro namespace/package name… not only for
>> preventing classes in the unnamed package but also for avoiding class name
>> conflicts.
>>
>> Thanks!
>>
>
>
>
>-- 
>Ryan Blue
>Software Engineer
>Netflix
