It looks like AvroStorage uses the JSON encoding instead of the binary encoding when writing files.
This is based on reading the Avro spec (http://avro.apache.org/docs/current/spec.html#Encodings) and the Pig AvroStorage docs (https://pig.apache.org/docs/r0.14.0/func.html#AvroStorage), and on looking at the schema of the Avro files I've written with Pig. As far as I can tell, you can identify a JSON-encoded schema by the union types that appear everywhere (e.g. ["int","null"] instead of just "int").

It seems strange that the JSON encoding would be used. From the Avro docs: "Most applications will use the binary encoding, as it is smaller and faster. But, for debugging and web-based applications, the JSON encoding may sometimes be appropriate."
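
A more direct check than reading the schema might be to look at the first bytes of the output. Per the Avro spec, an object container file starts with the four bytes 'O', 'b', 'j', 1, and the schema in its header is stored as JSON. Below is a minimal sketch in Python; the function names are hypothetical, and it only tells you whether a file is a container file, not how the records inside are encoded:

```python
# Magic bytes that open every Avro object container file (per the Avro spec).
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro_container(first_bytes: bytes) -> bool:
    """Return True if the given leading bytes match the Avro container magic."""
    return first_bytes[:4] == AVRO_MAGIC


def file_looks_like_avro_container(path: str) -> bool:
    """Read the first four bytes of a file and check them against the magic."""
    with open(path, "rb") as f:
        return looks_like_avro_container(f.read(4))
```

Calling file_looks_like_avro_container on one of the part files written by Pig (the path depends on your job) would show whether it is a container file at all.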