This came up in the context of using Hive: how should I handle mapping “original” to “safe” field names?
Currently, a non-alphanumeric field name in Avro leads to an error when it is used with Hive. That’s fine, but while researching it I saw that this is a generally unresolved issue. Avro implementations almost always process the field name as UTF-8. That keeps it at parity with JSON, which is nice. However, there was talk about possibly restricting field names to alphanumeric characters [w/ underscore perhaps]. Apologies, I don’t have the bug numbers.

Looks like we have aliases: https://issues.apache.org/jira/browse/AVRO-600

Do I just pop the “original” field name in as an alias and use the “safe” (alphanumeric+underscore) one as the primary name?

-Charles
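For concreteness, here is a minimal sketch of the mapping being asked about, written in plain Python with only the standard library. The helper names `to_safe_name` and `make_field`, the record name `Example`, and the sample field names are all hypothetical. It assumes "safe" means ASCII letters, digits, and underscore with no leading digit, and it only builds the schema JSON; whether a given Avro implementation or Hive accepts a non-alphanumeric string in `aliases` is not verified here.

```python
import json
import re


def to_safe_name(original):
    """Map an arbitrary field name to an alphanumeric+underscore name (assumed rule)."""
    # Replace every character outside [A-Za-z0-9_] with an underscore.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", original)
    # Assumption: a safe name must not be empty or start with a digit.
    if not safe or safe[0].isdigit():
        safe = "_" + safe
    return safe


def make_field(original, avro_type):
    """Build a field entry: safe name as primary, original name kept as an alias."""
    safe = to_safe_name(original)
    field = {"name": safe, "type": avro_type}
    # Only add the alias when the name actually changed.
    if safe != original:
        field["aliases"] = [original]
    return field


# Hypothetical record with one unsafe and one already-safe field name.
schema = {
    "type": "record",
    "name": "Example",
    "fields": [make_field("user-id", "string"), make_field("count", "long")],
}

print(json.dumps(schema, indent=2))
```

Note that this sketch does not handle collisions: two different original names (say `user-id` and `user.id`) would map to the same safe name, so a real mapping would need some disambiguation step.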
