I want to confirm the bug report that Alex Hanna made to this mailing list in message #2233 on 2013-03-06 (also entered into JIRA as https://issues.apache.org/jira/browse/AVRO-1271, but apparently punted to this mailing list).
There is a bug in org.apache.avro.mapred.AvroAsTextInputFormat. In my case, it looks like a faulty regular expression which is trying to do something with smart quotes. If an Avro datum destined to be a string value in JSON contains a smart quote, the datum string gets weirdly duplicated and uppercased in the JSON output. For example, if the Avro datum is

    [before“after

the output JSON will contain the following (which is not legal JSON and cannot be parsed):

    [before\u[BEFORE“AFTERafter

-- Martin
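
P.S. For anyone who wants to check their own output for this symptom, here is a minimal sketch. It does not touch Avro at all. It relies only on the JSON rule that a \u escape inside a string must be followed by four hexadecimal digits, which is the rule the output above breaks. The class and method names (JsonEscapeCheck, hasMalformedUnicodeEscape) are hypothetical names of my own, and the scan covers only this one kind of error, not full JSON validation.

```java
class JsonEscapeCheck {

    // True for the characters allowed in the four-digit part of the escape.
    static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9')
            || (c >= 'a' && c <= 'f')
            || (c >= 'A' && c <= 'F');
    }

    // Returns true if the text contains a backslash followed by 'u' that is
    // not followed by exactly four hex digits. An escaped backslash is
    // skipped as a pair, so it is never mistaken for the start of an escape.
    static boolean hasMalformedUnicodeEscape(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) != '\\') {
                continue;
            }
            if (i + 1 >= text.length()) {
                return false; // dangling backslash: outside the scope of this check
            }
            if (text.charAt(i + 1) == 'u') {
                if (i + 6 > text.length()) {
                    return true; // fewer than four characters remain
                }
                for (int j = i + 2; j < i + 6; j++) {
                    if (!isHexDigit(text.charAt(j))) {
                        return true;
                    }
                }
                i += 5; // skip past the whole six-character escape
            } else {
                i += 1; // skip the escaped character
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The output reported above, and what a correct escape would look like.
        String reported = "[before\\u[BEFORE\u201cAFTERafter";
        String wellFormed = "[before\\u201cafter";
        System.out.println(hasMalformedUnicodeEscape(reported));
        System.out.println(hasMalformedUnicodeEscape(wellFormed));
    }
}
```

Running each output line through hasMalformedUnicodeEscape should be enough to find the records hit by this bug, since the broken escape is followed by '[' rather than a hex digit.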
