A question about this: does Avro have clear-cut rules for how to essentially merge two arbitrary JSON schemas?
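
To make the question concrete, here is a minimal Python sketch of one possible merge rule: the "null the missing fields" behavior Markus describes below. Nothing in this thread says Avro or AvroStorage defines such a rule. The function names `merge_record_schemas` and `make_nullable` are hypothetical, and the sketch assumes both inputs are flat record schemas already parsed from JSON into dicts. It uses plain dicts only, no Avro library.

```python
def make_nullable(field):
    """Return a copy of an Avro field whose type also accepts null.

    "null" is placed first in the union so that a default of null is valid.
    Any default the field already had is overwritten (an assumption of
    this sketch, not an Avro rule).
    """
    declared = field["type"]
    branches = declared if isinstance(declared, list) else [declared]
    union = ["null"] + [b for b in branches if b != "null"]
    nullable = dict(field)
    nullable["type"] = union if len(union) > 1 else "null"
    nullable["default"] = None  # serialized as JSON null
    return nullable


def merge_record_schemas(a, b):
    """Union the fields of two record schemas (hypothetical merge rule).

    Fields present in both schemas must have identical types; fields
    present in only one schema become nullable with a null default.
    """
    fields_a = {f["name"]: f for f in a["fields"]}
    fields_b = {f["name"]: f for f in b["fields"]}
    # Keep a's field order, then append fields that only b has.
    names = list(fields_a) + [n for n in fields_b if n not in fields_a]

    merged = []
    for name in names:
        fa, fb = fields_a.get(name), fields_b.get(name)
        if fa is not None and fb is not None:
            if fa["type"] != fb["type"]:
                raise ValueError("conflicting types for field %r" % name)
            merged.append(dict(fa))
        else:
            merged.append(make_nullable(fa if fa is not None else fb))
    return {"type": "record", "name": a["name"], "fields": merged}


if __name__ == "__main__":
    old = {"type": "record", "name": "Event",
           "fields": [{"name": "id", "type": "long"}]}
    new = {"type": "record", "name": "Event",
           "fields": [{"name": "id", "type": "long"},
                      {"name": "source", "type": "string"}]}
    print(merge_record_schemas(old, new))
```

Fields that appear in both inputs with different types are rejected rather than reconciled, and that is exactly the case where a clear-cut rule would be needed.
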

2012/3/21 Jonathan Coveney <[email protected]>

> ATM, there is no quick and easy solution short of patching Pig... feel
> free to make a ticket.
>
> Short of that, what you can do is load each relation with a different
> schema separately, and then do a union of them. Given that there might be
> a lot of different relations and schemas involved, you could probably make
> a script to do this for you... but yeah, the long term approach is to
> patch AvroStorage.
>
>
> 2012/3/21 Markus Resch <[email protected]>
>
>> Hi guys,
>>
>> Thanks again for your awesome hint about sqoop.
>>
>> I have another question: The data I'm working with is stored as Avro
>> files in Hadoop. When I try to glob them, everything works just
>> perfectly. But when I add something to the schema of a single data file
>> while the others remain unchanged, everything gets wrecked:
>>
>> "currently we assume all avro files under the same "location"
>> * share the same schema and will throw exception if not."
>>
>> (e.g. I add a new data field) Expected behavior for me would be: If I'm
>> globbing several files with slightly different schemas, the LOAD would
>> either return the intersection of the fields common to all of the
>> schemas, or null the atoms of the missing fields.
>>
>> How could I handle this properly?
>>
>> Thanks
>>
>> Markus
