At the moment, there is no quick and easy solution short of patching Pig... feel free
to file a ticket.

Short of that, what you can do is load each set of files with its own
schema separately, and then union the resulting relations. Given that there
might be a lot of different relations and schemas involved, you could
probably generate such a script automatically... but yeah, the long-term
approach is to patch AvroStorage.
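
Roughly, that workaround looks like the Pig script below. This is just a
minimal sketch: the paths, relation names, and the piggybank jar location
are placeholders, and it leans on UNION ONSCHEMA so that fields missing
from the older files come back as null.

-- register the AvroStorage loader from piggybank (jar path depends on your install;
-- you may also need to REGISTER the avro and json-simple jars)
REGISTER /path/to/piggybank.jar;
DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();

-- load each schema variant from its own location instead of one big glob
old_events = LOAD '/data/events/v1' USING AvroStorage();
new_events = LOAD '/data/events/v2' USING AvroStorage();

-- UNION ONSCHEMA merges by field name; fields that exist only in the
-- newer schema are null for tuples coming from the older files
all_events = UNION ONSCHEMA old_events, new_events;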

2012/3/21 Markus Resch <[email protected]>

> Hi guys,
>
> Thanks again for your awesome hint about sqoop.
>
> I have another question: the data I'm working with is stored as Avro
> files in Hadoop. When I glob them, everything works just fine. But when
> I add something to the schema of a single data file while the other
> files keep the old schema, everything breaks:
>
> "currently we assume all avro files under the same "location"
>     * share the same schema and will throw exception if not."
>
> (e.g. I add a new data field.) The behavior I would expect: if I glob
> several files with slightly different schemas, the result of the LOAD
> would either be the intersection of the fields common to all schemas,
> or the atoms of the missing fields would be nulled.
>
> How could I handle this properly?
>
> Thanks
>
> Markus
>
>
>
