[GitHub] [beam] RyanSkraba commented on a change in pull request #14858: [BEAM-12385] Handle VARCHAR and Date-time JDBC specific logical types in AvroUtils.

GitBox Thu, 03 Jun 2021 01:45:21 -0700


RyanSkraba commented on a change in pull request #14858:
URL: https://github.com/apache/beam/pull/14858#discussion_r644606333




##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
##########
@@ -906,6 +906,23 @@ private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundE
                         .map(x -> getFieldSchema(x.getType(), x.getName(), namespace))
                         .collect(Collectors.toList()));
             break;
+
+          case "NVARCHAR":
+          case "VARCHAR":
+          case "LONGNVARCHAR":
+          case "LONGVARCHAR":
+            baseType = org.apache.avro.Schema.create(Type.STRING);

Review comment:
Super interesting conversation from 2017!  This could go either way -- the
`"char"` and `"varchar"` logical types that Hive adds are not part of the Avro
specification, and implementations that don't understand them should ignore
them.  A developer could implement their own "custom" logical types that
understand them and do any necessary truncation, but AFAIK, nobody does.

I have a small preference for aligning with the de facto behaviour of Spark
and Hive, since the length information might be useful as data is sent
downstream, and it's easy to ignore.  But I'd be OK either way; it could be
added in a later PR as a new feature.
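
To make the "aligning with Hive" option concrete, here is a rough sketch of the
schema JSON such a field would carry. It deliberately avoids the Avro API and
only builds the JSON text. The class name, the `stringSchemaJson` helper, and
the `maxLength` argument are hypothetical, and the `CHAR`/`NCHAR` cases are my
assumption (they are not in this diff). The `logicalType`/`maxLength` property
names follow what I understand Hive writes, so treat them as an assumption too.

```java
import java.util.Locale;

// Hypothetical sketch, not the code in this PR: builds the JSON for an Avro
// string schema annotated the way Hive is assumed to annotate char/varchar.
final class VarcharSchemaSketch {

  // jdbcTypeName: a JDBC type name as matched in the diff (e.g. "VARCHAR").
  // maxLength: the declared column length (hypothetical parameter).
  static String stringSchemaJson(String jdbcTypeName, int maxLength) {
    String logicalType;
    switch (jdbcTypeName.toUpperCase(Locale.ROOT)) {
      case "NVARCHAR":
      case "VARCHAR":
      case "LONGNVARCHAR":
      case "LONGVARCHAR":
        logicalType = "varchar";
        break;
      case "NCHAR": // assumption: fixed-length types are not part of this diff
      case "CHAR":
        logicalType = "char";
        break;
      default:
        throw new IllegalArgumentException("Not a string type: " + jdbcTypeName);
    }
    // Readers that don't know these logical types see a plain Avro string.
    return "{\"type\":\"string\",\"logicalType\":\"" + logicalType
        + "\",\"maxLength\":" + maxLength + "}";
  }
}
```

For example, `VarcharSchemaSketch.stringSchemaJson("VARCHAR", 255)` returns
`{"type":"string","logicalType":"varchar","maxLength":255}`, which a reader
that ignores unknown logical types would treat as an ordinary string.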




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
