Hi guys,

I use Pig to process some clickstream data. I needed to track a new field,
so I added it to my Avro schema and changed my Pig script accordingly. The
script works fine on the new files (which have the new column), but it
breaks on my old files, which do not have that column in their schema
(Avro stores the schema in the data files themselves). I expected Pig to
treat the field as null when it does not exist in a file. Instead, I now
have to maintain separate scripts for the old and new files. Is there a
workaround for this? I expect to add new columns frequently, and I don't
want to maintain a separate script for each window in which the schema is
constant.
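
To make the behaviour I was hoping for concrete, here is a minimal sketch in
plain Python (not Pig, and not the Avro library). The field names and the
helper read_with_new_schema are made up for illustration. As far as I
understand, this mirrors Avro's schema resolution rule: a field that the
reader schema declares with a default is filled in when the writer schema
lacks it.

```python
# Sketch of the behaviour I was hoping for: fields missing from an old
# record are filled from the defaults declared in the newer schema.
# All field names here are hypothetical.

NEW_SCHEMA_FIELDS = [
    {"name": "user_id", "type": "string"},
    {"name": "url", "type": "string"},
    # The new optional column: a union with null and a null default.
    {"name": "referrer", "type": ["null", "string"], "default": None},
]


def read_with_new_schema(record, fields=NEW_SCHEMA_FIELDS):
    """Project an already-decoded record onto the newer field list."""
    out = {}
    for field in fields:
        name = field["name"]
        if name in record:
            # The file that produced this record already has the column.
            out[name] = record[name]
        elif "default" in field:
            # Old file: fall back to the default from the new schema.
            out[name] = field["default"]
        else:
            raise KeyError(f"field {name!r} is missing and has no default")
    return out


old_record = {"user_id": "u1", "url": "/home"}
new_record = {"user_id": "u2", "url": "/cart", "referrer": "/home"}

old_row = read_with_new_schema(old_record)
new_row = read_with_new_schema(new_record)
print(old_row)
print(new_row)
```

With something like this, a single script could treat old_row and new_row
the same way, with referrer simply being null for the old files.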

Thanks,
