For the Avro people: does AvroStorage support column pruning?

2012/4/5 Stan Rosenberg <stan.rosenb...@gmail.com>
> AFAIK, by default AvroStorage enforces that all input files have
> exactly the same schema. I've submitted a patch to improve
> this somewhat by allowing different input schemas so long as a union
> schema can be derived; e.g., say schema 1 contains field 'foo' which
> is not in schema 2, and schema 2 contains 'bar' which is not in schema 1,
> then the resulting schema will have both fields, etc.
> (The patch is here: https://issues.apache.org/jira/browse/PIG-2579.)
>
> In your case, you seem to have different schemas where the difference
> is actually in the fields which are never used inside pig.
> That's an entirely new use case, afaik. The union schema is one
> workaround. However, it might be better to specify these unused
> fields and preclude them from validation; perhaps running validation only
> against those fields which are specified in the pig script.
>
> Best,
>
> stan
>
> On Thu, Apr 5, 2012 at 8:58 AM, Philipp <philipp.p...@metrigo.de> wrote:
> > Hi list,
> >
> > if I run pig over several avro files with different schemas I get a schema
> > mismatch message, even if the schema has only changed marginally in a field
> > that I'm not even using in that particular pig job.
> > I'm wondering if it would be possible to resolve the mismatch, e.g. as
> > suggested in:
> > https://avro.apache.org/docs/current/spec.html#Schema+Resolution
> >
> > Regards, Philipp
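
To make the two ideas in Stan's reply concrete, here is a rough sketch in plain Python. It is not the PIG-2579 patch and does not use AvroStorage or any Avro library; schemas are just parsed-JSON dicts, and the names `union_record_schemas`, `compatible_on`, and the `Event` record are made up for illustration. It only handles top-level record fields and treats two field types as compatible only if they are identical, which is stricter than Avro's own schema resolution rules.

```python
# Sketch only: schemas are plain dicts (parsed Avro JSON), top-level fields only.

def union_record_schemas(schema_a, schema_b):
    """Return a record schema containing every field found in either input."""
    merged = {}
    for schema in (schema_a, schema_b):
        for field in schema["fields"]:
            name = field["name"]
            # Assumption: a field present in both schemas must have the same type.
            if name in merged and merged[name]["type"] != field["type"]:
                raise ValueError(f"conflicting types for field {name!r}")
            merged.setdefault(name, field)
    return {"type": "record", "name": schema_a["name"],
            "fields": list(merged.values())}


def compatible_on(schemas, used_fields):
    """True if every schema declares each used field with an identical type."""
    for name in used_fields:
        seen_types = []
        for schema in schemas:
            by_name = {f["name"]: f["type"] for f in schema["fields"]}
            if name not in by_name:
                return False
            seen_types.append(by_name[name])
        if any(t != seen_types[0] for t in seen_types):
            return False
    return True


# Hypothetical schemas matching the 'foo' / 'bar' example in the reply.
schema_1 = {"type": "record", "name": "Event",
            "fields": [{"name": "id", "type": "long"},
                       {"name": "foo", "type": "string"}]}
schema_2 = {"type": "record", "name": "Event",
            "fields": [{"name": "id", "type": "long"},
                       {"name": "bar", "type": "int"}]}

merged = union_record_schemas(schema_1, schema_2)
print([f["name"] for f in merged["fields"]])          # ['id', 'foo', 'bar']
print(compatible_on([schema_1, schema_2], ["id"]))    # True
```

The second function corresponds to the "validate only the fields the script uses" suggestion: a job that touches only `id` would pass the check even though the two files disagree on `foo` and `bar`.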