At the moment there is no quick and easy solution short of patching Pig... feel free to file a ticket.
Short of that, what you can do is load each relation with a different schema separately, and then union them. Given that there might be a lot of different relations and schemas involved, you could probably write a script to generate this for you... but yeah, the long-term approach is to patch AvroStorage. (A rough sketch of the workaround is below, after your quoted message.)

2012/3/21 Markus Resch <[email protected]>

> Hi guys,
>
> Thanks again for your awesome hint about sqoop.
>
> I have another question: the data I'm working with is stored as Avro
> files in Hadoop. When I glob them, everything works just fine. But when
> I add something to the schema of a single data file while the others
> stay unchanged, everything gets wrecked:
>
> "currently we assume all avro files under the same "location"
> * share the same schema and will throw exception if not."
>
> (This happens, e.g., when I add a new data field.) The behavior I would
> expect: when globbing several files with slightly different schemas, the
> LOAD should either return only the intersection of fields common to all
> schemas, or the fields missing from some files should come back as null.
>
> How could I handle this properly?
>
> Thanks
>
> Markus
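To make the workaround concrete, here is a rough, untested Pig sketch. The paths, the two-version directory layout, and the field names (id, ts, new_field) are made up for illustration; adjust them to your actual Avro schemas. UNION ONSCHEMA needs Pig 0.8 or later.

-- Load each schema variant from its own glob, so AvroStorage sees one schema per LOAD.
old_events = LOAD '/data/events/v1/*.avro'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage();
new_events = LOAD '/data/events/v2/*.avro'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage();

-- Pad the old relation with a typed null for the field that only exists in the
-- newer schema, then merge the relations by field name.
old_padded = FOREACH old_events GENERATE id, ts, (chararray)null AS new_field;
merged = UNION ONSCHEMA old_padded, new_events;

If you end up with many schema versions, this same LOAD / FOREACH / UNION ONSCHEMA pattern is what a small generator script would emit, which is what I meant above.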
