For the Avro people, does AvroStorage support column pruning?

2012/4/5 Stan Rosenberg <stan.rosenb...@gmail.com>

> AFAIK, by default AvroStorage enforces that all input files have
> exactly the same schema.  I've submitted a patch to improve
> this somewhat by allowing different input schemas so long as a union
> schema can be derived; e.g., if schema 1 contains field 'foo' which
> is not in schema 2, and schema 2 contains 'bar' which is not in
> schema 1, then the resulting schema will have both fields, etc.
> (The patch is here: https://issues.apache.org/jira/browse/PIG-2579.)
>
> In your case, you seem to have different schemas where the difference
> is actually in fields which are never used inside pig.
> That's an entirely new use case, afaik.  The union schema is one
> workaround.  However, it might be better to specify these unused
> fields and exclude them from validation; perhaps running validation
> only against those fields which are specified in the pig script.
>
> Best,
>
> stan
>
> On Thu, Apr 5, 2012 at 8:58 AM, Philipp <philipp.p...@metrigo.de> wrote:
> > Hi list,
> >
> > if I run pig over several avro files with different schemas I get a
> > schema mismatch message, even if the schema has only changed
> > marginally in a field that I'm not even using in that particular pig
> > job.
> > I'm wondering if it would be possible to resolve the mismatch, e.g. as
> > suggested in:
> > https://avro.apache.org/docs/current/spec.html#Schema+Resolution
> >
> > Regards, Philipp
> >
> >
>
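
To make the union-schema idea from Stan's message concrete, here is a
minimal Python sketch. It is only an illustration, not the logic of the
PIG-2579 patch: the function name merge_record_schemas and the two
"Event" schemas are hypothetical, schemas are handled as plain
parsed-JSON dicts rather than through the Avro library, and a field that
appears in both schemas with different types is simply rejected.

```python
def merge_record_schemas(a, b):
    """Merge two Avro record schemas (parsed JSON dicts) into one
    record schema that contains the fields of both.

    Hypothetical helper for illustration only; it does not reproduce
    what AvroStorage or the PIG-2579 patch actually does.
    """
    merged = {}
    for schema in (a, b):
        for field in schema["fields"]:
            name = field["name"]
            # A field present in both inputs must have the same type.
            if name in merged and merged[name]["type"] != field["type"]:
                raise ValueError("conflicting types for field %r" % name)
            merged.setdefault(name, field)
    return {"type": "record", "name": a["name"],
            "fields": list(merged.values())}


# Schema 1 has 'foo' but not 'bar'; schema 2 has 'bar' but not 'foo'.
schema1 = {"type": "record", "name": "Event", "fields": [
    {"name": "id", "type": "long"},
    {"name": "foo", "type": "string"},
]}
schema2 = {"type": "record", "name": "Event", "fields": [
    {"name": "id", "type": "long"},
    {"name": "bar", "type": "int"},
]}

union = merge_record_schemas(schema1, schema2)
print([f["name"] for f in union["fields"]])  # ['id', 'foo', 'bar']
```

A real implementation would also have to decide how records that lack
'foo' or 'bar' are read, for example by making those fields nullable
with a default; that part is left out here.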
