RyanSkraba commented on a change in pull request #14858:
URL: https://github.com/apache/beam/pull/14858#discussion_r644606333
##########
File path:
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
##########
@@ -906,6 +906,23 @@ private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundE
.map(x -> getFieldSchema(x.getType(), x.getName(), namespace))
.collect(Collectors.toList()));
break;
+
+ case "NVARCHAR":
+ case "VARCHAR":
+ case "LONGNVARCHAR":
+ case "LONGVARCHAR":
+ baseType = org.apache.avro.Schema.create(Type.STRING);
Review comment:
Super interesting conversation from 2017! This could go either way --
the `"char"` and `"varchar"` logical types that Hive adds are not part of the
Avro specification and should be ignored by implementations that don't
understand them. A developer could implement their own "custom" logical types
that understand them and do any necessary truncation, but AFAIK, nobody does.
I have a small preference for aligning with the de facto behaviour of Spark
and Hive, since this information might be useful as data is sent downstream,
and it's easy to ignore. But I'd be OK either way; it could be added in a
later PR as a new feature.
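
To make the two options concrete, here is a minimal sketch of what each would emit as schema JSON. `HiveStyleStringSchema` and `schemaJson` are hypothetical names, not part of `AvroUtils`, and the `"logicalType"` / `"maxLength"` layout is my assumption about how Hive annotates these types, so please check it against Hive's output before relying on it.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of Beam's AvroUtils. */
final class HiveStyleStringSchema {

  /**
   * Returns Avro schema JSON for one of the JDBC string type names in the diff.
   * keepLength=false mirrors the change under review (a plain Avro string).
   * keepLength=true adds Hive-style annotations (assumed attribute names).
   */
  static String schemaJson(String jdbcTypeName, int maxLength, boolean keepLength) {
    switch (jdbcTypeName.toUpperCase(Locale.ROOT)) {
      case "NVARCHAR":
      case "VARCHAR":
      case "LONGNVARCHAR":
      case "LONGVARCHAR":
        if (!keepLength) {
          return "{\"type\":\"string\"}";
        }
        // Readers that don't know "varchar" fall back to the underlying string type.
        return "{\"type\":\"string\",\"logicalType\":\"varchar\",\"maxLength\":"
            + maxLength + "}";
      default:
        throw new IllegalArgumentException("Not a variable-length string type: " + jdbcTypeName);
    }
  }

  public static void main(String[] args) {
    System.out.println(schemaJson("VARCHAR", 10, false));
    System.out.println(schemaJson("VARCHAR", 10, true));
  }
}
```

Both forms read back as a plain string in an implementation that doesn't recognise the annotation, which is why keeping the length costs little for downstream consumers.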