[
https://issues.apache.org/jira/browse/BEAM-12385?focusedWorklogId=605540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-605540
]
ASF GitHub Bot logged work on BEAM-12385:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jun/21 20:47
Start Date: 02/Jun/21 20:47
Worklog Time Spent: 10m
Work Description: iemejia commented on a change in pull request #14858:
URL: https://github.com/apache/beam/pull/14858#discussion_r644308777
##########
File path:
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
##########
@@ -906,6 +906,23 @@ private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundE
.map(x -> getFieldSchema(x.getType(), x.getName(), namespace))
.collect(Collectors.toList()));
break;
+
+ case "NVARCHAR":
+ case "VARCHAR":
+ case "LONGNVARCHAR":
+ case "LONGVARCHAR":
+ baseType = org.apache.avro.Schema.create(Type.STRING);
Review comment:
These would be better as 'proper' logical types that carry the associated size.
We should probably align the Avro schema representation with the way other
systems (Hive / Spark) represent these types in Avro. For reference, see
http://apache-avro.679487.n3.nabble.com/Standardizing-char-and-varchar-logical-types-td4038622.html
or, from the source,
https://github.com/apache/hive/blob/5d268834a5f5278ea76399f8af0d0ab043ae0b45/serde/src/test/resources/avro-struct.avsc#L11
Keeping the full information in the internal representation is OK, but we
should not lose the maxLength (size) in the type information.
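
To make the suggestion concrete, here is a rough sketch (not code from the PR) of the Avro schema JSON that a sized string type could map to. It assumes the Hive-style convention of a `string` type annotated with `"logicalType": "varchar"` and a `maxLength` property; the class and method names (`VarcharAvroSketch`, `sizedStringSchemaJson`) are hypothetical and only for illustration, and it uses plain string formatting rather than the Avro API so that it stays self-contained.

```java
import java.util.Locale;

// Illustrative only: renders the schema JSON by hand instead of using Avro.
class VarcharAvroSketch {

    /**
     * Hypothetical helper: maps a JDBC string type name and its declared size
     * to Avro schema JSON that keeps the size as "maxLength".
     * Assumption: all four variable-length JDBC types map to "varchar".
     */
    static String sizedStringSchemaJson(String jdbcTypeName, int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive: " + maxLength);
        }
        switch (jdbcTypeName.toUpperCase(Locale.ROOT)) {
            case "NVARCHAR":
            case "VARCHAR":
            case "LONGNVARCHAR":
            case "LONGVARCHAR":
                // Base type stays "string"; the size travels as a schema property.
                return String.format(
                    Locale.ROOT,
                    "{\"type\":\"string\",\"logicalType\":\"varchar\",\"maxLength\":%d}",
                    maxLength);
            default:
                throw new IllegalArgumentException("Unhandled type: " + jdbcTypeName);
        }
    }

    public static void main(String[] args) {
        // Prints the schema JSON for a VARCHAR(50) column.
        System.out.println(sizedStringSchemaJson("VARCHAR", 50));
    }
}
```

With a representation along these lines, a reader of the Avro schema can still recover the declared column size, which a bare `Type.STRING` would drop.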
Apart from this, the rest of the PR looks pretty good. Thanks for working on
this, @anantdamle!
Issue Time Tracking
-------------------
Worklog Id: (was: 605540)
Time Spent: 2h 40m (was: 2.5h)
> AvroUtils exception when converting JDBC Row to GenericRecord
> -------------------------------------------------------------
>
> Key: BEAM-12385
> URL: https://issues.apache.org/jira/browse/BEAM-12385
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Anant Damle
> Assignee: Anant Damle
> Priority: P2
> Fix For: 2.31.0
>
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> `{{AvroUtils.toAvroSchema()}}` and `{{AvroUtils.toGenericRecord()}}` throw an
> exception for JDBC-specific logical types such as `{{DATE}}`, `{{NVARCHAR}}`,
> `{{VARCHAR}}`, `{{LONGVARCHAR}}`, etc.
>
> {code:java}
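> pipeline
>     .apply(JdbcIO.readRows()...)
>     .apply(MapElements.into(TypeDescriptors.strings()).via(
>         (Row row) -> {
>           // This statement throws Exception "Unhandled Logical Type"
>           org.apache.avro.Schema schema =
>               AvroUtils.toAvroSchema(row.getSchema());
>           // Return a value so the lambda is a valid mapping function.
>           return schema.toString();
>         }));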
> {code}
>
>
> This can be handled by adding additional case-statements in AvroUtils: