This came up in the context of using Hive: how should I handle mapping 
“original” to “safe” field names?

Currently, a non-alphanumeric field name in Avro leads to an error when using 
it with Hive. That’s fine, but while researching it, I saw that field naming 
is a generally unresolved issue in Avro itself.

Avro implementations almost always process the field name as UTF-8, which 
keeps it in parity with JSON. That’s nice.
There was, however, talk of restricting field names to alphanumeric 
characters (perhaps plus underscore). Apologies, I don’t have the bug numbers.

Looks like we have aliases:
https://issues.apache.org/jira/browse/AVRO-600

Do I just pop the “original” field name in as an alias and use the “safe” 
(alphanumeric+underscore) one as the primary name?
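
To make the question concrete, here is a rough sketch in plain Python (no Avro 
library involved) of what I have in mind. The helper names safe_field_name and 
field_with_alias, and the sample record, are made up for illustration. I have 
not verified that every Avro implementation, or Hive, accepts a 
non-alphanumeric string as an alias, so treat that part as an assumption.

```python
import json
import re


def safe_field_name(original):
    """Hypothetical helper: map any field name to alphanumeric+underscore."""
    # Replace every character outside [A-Za-z0-9_] with an underscore.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", original)
    # Prefix an underscore if the result does not start with a letter or "_".
    if not re.match(r"[A-Za-z_]", safe):
        safe = "_" + safe
    return safe


def field_with_alias(original, avro_type):
    """Hypothetical helper: safe primary name, original kept as an alias."""
    safe = safe_field_name(original)
    field = {"name": safe, "type": avro_type}
    # Only add an alias when the name actually had to change.
    if safe != original:
        field["aliases"] = [original]
    return field


# Sample record schema; "user-id" stands in for an "original" field name.
schema = {
    "type": "record",
    "name": "Example",
    "fields": [
        field_with_alias("user-id", "string"),
        field_with_alias("count", "long"),
    ],
}

print(json.dumps(schema, indent=2))
```

Here "user-id" becomes the primary name "user_id" with "user-id" as its alias, 
while "count" is left alone. Note that two different original names can 
collapse to the same safe name (for example "user-id" and "user.id"), so a 
real mapping would need some way to detect and resolve collisions.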

-Charles
