Make sure you're setting the Parquet read schema.
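
For anyone following along, here is a minimal sketch of the schema edit suggested earlier in the thread: adding a namespace to each record in the Avro schema JSON before using it as the read schema. The class name `NamespaceSketch`, the method `addNamespace`, and the package `com.example` are hypothetical and belong to no library. It assumes no record in the schema already declares a namespace and that the text `"type":"record"` does not occur inside a doc string. A real JSON parser would be more robust than a regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Avro or Parquet.
final class NamespaceSketch {
    // Matches the "type": "record" pair that appears in every Avro record definition.
    private static final Pattern RECORD_TYPE =
        Pattern.compile("\"type\"\\s*:\\s*\"record\"");

    // Returns schemaJson with a "namespace" attribute inserted into each record
    // definition. Assumes that no record already declares a namespace.
    static String addNamespace(String schemaJson, String namespace) {
        // $0 re-emits the matched text; the appended attribute is quoted so that
        // any '$' or '\' in the namespace is treated literally.
        String replacement =
            "$0" + Matcher.quoteReplacement(",\"namespace\":\"" + namespace + "\"");
        return RECORD_TYPE.matcher(schemaJson).replaceAll(replacement);
    }
}
```

With parquet-avro on the classpath, the edited JSON could then be parsed with `new Schema.Parser().parse(json)` and handed to `AvroReadSupport.setAvroReadSchema(conf, schema)` before the reader is built, so that nested records resolve to the generated classes as well.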

On Thu, Apr 27, 2017 at 10:35 AM, Shannon Carey <[email protected]> wrote:

> As far as I can tell, that works for the top-level class, but not for
> others. When Avro's org.apache.avro.specific.SpecificData#getClass(Schema)
> attempts to look up the Java class for a "record" field of the top-level
> class:
>
> c = ClassUtils.forName(getClassLoader(), getClassName(schema));
>
>
> it doesn't find it, because the class name from the Parquet schema doesn't
> match the name of the Java class. As a result, instead of an instance of my
> Java class being passed to the put(int field$, Object value$) of the
> generated Avro SpecificRecord subclass, a GenericData$Record is passed.
> Then, a ClassCastException is thrown when the value$ is cast to my Java
> class.
>
> -Shannon
>
>
>
> On 4/27/17, 11:39 AM, "Ryan Blue" <[email protected]> wrote:
>
> >Shannon, you can edit the Avro schema and add those namespaces. Then you
> >set that as your read schema for Parquet and it will correctly read the
> >data. The Avro schemas don't have to match, they just have to be
> >compatible.
> >
> >rb
> >
> >On Thu, Apr 27, 2017 at 9:37 AM, Shannon Carey <[email protected]> wrote:
> >
> >> I'm not sure whether I should be asking Parquet people or Avro people
> >> about this.
> >>
> >> I'm reading a Parquet file via Avro. The Parquet file was produced by
> >> Spark. The Avro schema that I generated from the file (by deserializing it
> >> as a GenericData record & retrieving its schema) uses "record" types that
> >> have no "namespace" value. Therefore, when generating Java classes from the
> >> Avro schema in order to deserialize the Parquet file to strongly typed
> >> objects, the generated Java classes are created in the default package. As
> >> you may know, it's basically impossible to interact with Java classes that
> >> are defined in the unnamed package.
> >>
> >> Has anyone else run into this situation? And is there any way to work
> >> around it? It seems like I should be able to specify how the types in the
> >> Parquet file should map to an Avro namespace/package name… not only for
> >> preventing classes in the unnamed package but also for avoiding class name
> >> conflicts.
> >>
> >> Thanks!
> >>
> >
> >
> >
> >--
> >Ryan Blue
> >Software Engineer
> >Netflix
>



-- 
Ryan Blue
Software Engineer
Netflix
