As far as I can tell, that works for the top-level class, but not for others. When Avro's org.apache.avro.specific.SpecificData#getClass(Schema) attempts to look up the Java class for a "record" field of the top-level class:

    c = ClassUtils.forName(getClassLoader(), getClassName(schema));

it doesn't find it, because the class name from the Parquet schema doesn't match the name of the Java class. As a result, instead of an instance of my Java class being passed to the put(int field$, Object value$) method of the generated Avro SpecificRecord subclass, a GenericData$Record is passed. Then, a ClassCastException is thrown when the value$ is cast to my Java class.

-Shannon

On 4/27/17, 11:39 AM, "Ryan Blue" <[email protected]> wrote:

>Shannon, you can edit the Avro schema and add those namespaces. Then you
>set that as your read schema for Parquet and it will correctly read the
>data. The Avro schemas don't have to match, they just have to be compatible.
>
>rb
>
>On Thu, Apr 27, 2017 at 9:37 AM, Shannon Carey <[email protected]> wrote:
>
>> I'm not sure whether I should be asking Parquet people or Avro people
>> about this.
>>
>> I'm reading a Parquet file via Avro. The Parquet file was produced by
>> Spark. The Avro schema that I generated from the file (by deserializing it
>> as a GenericData record & retrieving its schema) uses "record" types that
>> have no "namespace" value. Therefore, when generating Java classes from the
>> Avro schema in order to deserialize the Parquet file to strongly typed
>> objects, the generated Java classes are created in the default package. As
>> you may know, it's basically impossible to interact with Java classes that
>> are defined in the unnamed package.
>>
>> Has anyone else run into this situation? And is there any way to work
>> around it? It seems like I should be able to specify how the types in the
>> Parquet file should map to an Avro namespace/package name… not only for
>> preventing classes in the unnamed package but also for avoiding class name
>> conflicts.
>>
>> Thanks!
>>
>
>
>
>--
>Ryan Blue
>Software Engineer
>Netflix
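
For anyone following the thread, here is a minimal sketch of the name mismatch described at the top of this message. It is not Avro's code: ClassLookupSketch, fullName and lookup are hypothetical names, and java.util.ArrayList merely stands in for a generated class that lives in a package. The point is only that a lookup by name succeeds when the record's namespace matches the Java package and fails when the namespace is missing.

```java
import java.util.Optional;

public class ClassLookupSketch {

    /** Joins namespace and name the way a named record type is addressed; hypothetical helper. */
    static String fullName(String namespace, String name) {
        return (namespace == null || namespace.isEmpty()) ? name : namespace + "." + name;
    }

    /** Looks a class up by name and returns empty instead of throwing when it is missing. */
    static Optional<Class<?>> lookup(String className) {
        try {
            return Optional.of(Class.forName(className));
        } catch (ClassNotFoundException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Namespace matches the Java package: the class is found.
        System.out.println(lookup(fullName("java.util", "ArrayList")).isPresent());
        // No namespace (as in the schema taken from the file): nothing is found.
        System.out.println(lookup(fullName(null, "ArrayList")).isPresent());
    }
}
```

In the situation above, the empty result corresponds to the case where a generic record is handed to put() instead of an instance of the generated class.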
