Hi Liang,

Does the new built-in AvroStorage work for you? I don't use Avro myself, so
I cannot test it, but it looks like that restriction has been removed in the
new AvroStorage. Here is the relevant code:

https://github.com/apache/pig/blob/trunk/src/org/apache/pig/impl/util/avro/AvroTupleWrapper.java#L132
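
For reference, here is a minimal standalone sketch of the piggybank check
quoted below, showing why a five-type union is rejected. It is only a model:
to run without Avro on the classpath it uses plain strings in place of
org.apache.avro.Schema, and the class name UnionCheckSketch and the
List<String> parameter are my own assumptions, not piggybank code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the piggybank rule; not the real AvroStorageUtils.
public class UnionCheckSketch {

    /**
     * Models the quoted rule: a union is accepted if it has at most one
     * branch, or exactly two branches of which one is "null".
     * Branch types are plain strings here (an assumption for illustration).
     */
    public static boolean isAcceptableUnion(List<String> branchTypes) {
        if (branchTypes.size() <= 1) {
            return true;
        } else if (branchTypes.size() > 2) {
            return false; // more than two branches is always rejected
        } else {
            // exactly two branches: one of them must be null
            return "null".equals(branchTypes.get(0))
                || "null".equals(branchTypes.get(1));
        }
    }

    public static void main(String[] args) {
        // A nullable union such as ["null", "TypeA"] passes the check.
        System.out.println(isAcceptableUnion(Arrays.asList("null", "TypeA")));
        // The five-type union from the question fails it.
        System.out.println(isAcceptableUnion(
            Arrays.asList("TypeA", "TypeB", "TypeC", "TypeD", "TypeE")));
    }
}
```

The first call prints true and the second prints false, which corresponds to
the "generic unions" error in the stack trace below.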

Thanks,
Cheolsoo


On Tue, Mar 25, 2014 at 9:17 AM, Liliang Li <[email protected]> wrote:

> Hi:
>
> I have a record with a field of union type:
>
> union {TypeA, TypeB, TypeC, TypeD, TypeE} mydata;
>
> I have the serialized data in Avro format. However, when I try to use
> piggybank.jar's AvroStorage function to load the Avro data, it gives me the
> following error:
>
> Caused by: java.io.IOException: We don't accept schema containing
> generic unions.
>     at
> org.apache.pig.piggybank.storage.avro.AvroSchema2Pig.convert(AvroSchema2Pig.java:54)
>     at
> org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:384)
>     at
> org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:174)
>     ... 23 more
>
> So, after reading the piggybank source code here
>
> https://github.com/triplel/pig/blob/branch-0.12/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
>
>     /** determine whether a union is a nullable union;
>     * note that this function doesn't check containing
>     * types of the input union recursively. */
>     public static boolean isAcceptableUnion(Schema in) {
>         if (!in.getType().equals(Schema.Type.UNION))
>             return false;
>
>         List<Schema> types = in.getTypes();
>         if (types.size() <= 1) {
>             return true;
>         } else if (types.size() > 2) {
>             return false; /* contains more than 2 types */
>         } else {
>             /* one of two types is NULL */
>             return types.get(0).getType().equals(Schema.Type.NULL)
>                 || types.get(1).getType().equals(Schema.Type.NULL);
>         }
>     }
>
> Basically, piggybank's AvroStorage uses the function isAcceptableUnion(Schema
> in), which rejects any union with more than two types (and any two-type
> union unless one of the types is NULL).
>
> My question is:
>
> Does anyone know of a workaround for reading Avro data with arbitrary union
> types in Pig?
>
>
> Any comments will be greatly appreciated.
>
>
> Liang
>
