[ https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=86145&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-86145 ]
ASF GitHub Bot logged work on BEAM-3437:
----------------------------------------
Author: ASF GitHub Bot
Created on: 30/Mar/18 20:15
Start Date: 30/Mar/18 20:15
Worklog Time Spent: 10m
Work Description: akedin commented on a change in pull request #4964:
[BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r178369253
##########
File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/utils/CalciteUtils.java
##########
@@ -38,149 +37,152 @@
*/
public class CalciteUtils {
private static final long UNLIMITED_ARRAY_SIZE = -1L;
- private static final BiMap<SqlTypeCoder, SqlTypeName> BEAM_TO_CALCITE_TYPE_MAPPING =
- ImmutableBiMap.<SqlTypeCoder, SqlTypeName>builder()
- .put(SqlTypeCoders.TINYINT, SqlTypeName.TINYINT)
- .put(SqlTypeCoders.SMALLINT, SqlTypeName.SMALLINT)
- .put(SqlTypeCoders.INTEGER, SqlTypeName.INTEGER)
- .put(SqlTypeCoders.BIGINT, SqlTypeName.BIGINT)
+ private static final BiMap<Schema.FieldType, SqlTypeName> BEAM_TO_CALCITE_TYPE_MAPPING =
+ ImmutableBiMap.<Schema.FieldType, SqlTypeName>builder()
+ .put(FieldType.BYTE, SqlTypeName.TINYINT)
+ .put(FieldType.INT16, SqlTypeName.SMALLINT)
+ .put(FieldType.INT32, SqlTypeName.INTEGER)
+ .put(FieldType.INT64, SqlTypeName.BIGINT)
- .put(SqlTypeCoders.FLOAT, SqlTypeName.FLOAT)
- .put(SqlTypeCoders.DOUBLE, SqlTypeName.DOUBLE)
+ .put(FieldType.FLOAT, SqlTypeName.FLOAT)
+ .put(FieldType.DOUBLE, SqlTypeName.DOUBLE)
- .put(SqlTypeCoders.DECIMAL, SqlTypeName.DECIMAL)
+ .put(FieldType.DECIMAL, SqlTypeName.DECIMAL)
- .put(SqlTypeCoders.CHAR, SqlTypeName.CHAR)
- .put(SqlTypeCoders.VARCHAR, SqlTypeName.VARCHAR)
+ .put(FieldType.STRING, SqlTypeName.VARCHAR)
- .put(SqlTypeCoders.DATE, SqlTypeName.DATE)
- .put(SqlTypeCoders.TIME, SqlTypeName.TIME)
- .put(SqlTypeCoders.TIMESTAMP, SqlTypeName.TIMESTAMP)
+ .put(FieldType.DATETIME, SqlTypeName.TIMESTAMP)
- .put(SqlTypeCoders.BOOLEAN, SqlTypeName.BOOLEAN)
- .build();
+ .put(FieldType.BOOLEAN, SqlTypeName.BOOLEAN)
- private static final BiMap<SqlTypeName, SqlTypeCoder> CALCITE_TO_BEAM_TYPE_MAPPING =
+ .put(FieldType.ARRAY, SqlTypeName.ARRAY)
+ .put(FieldType.ROW, SqlTypeName.ROW)
Review comment:
Not sure if this will work, but I would try changing this map to
`BiMap<FieldTypeDescriptor, SqlTypeName>`, so that the entries can be written
like this:
```java
BEAM_TO_CALCITE_MAPPING =
...
.put(FieldType.DECIMAL.typeDescriptor(), SqlTypeName.DECIMAL)
.put(FieldType.STRING.withMetadata(CHAR_METADATA), SqlTypeName.CHAR)
.put(FieldType.STRING.withMetadata(VARCHAR_METADATA), SqlTypeName.VARCHAR)
.put(FieldType.DATETIME.withMetadata(TIME_METADATA), SqlTypeName.TIME)
.put(FieldType.DATETIME.withMetadata(TIMESTAMP_METADATA), SqlTypeName.TIMESTAMP)
...
```
Then you probably don't need to look up the metadata separately each time,
and you get a 1-1 mapping in both directions.
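To make the 1-1 idea concrete, here is a minimal sketch that uses only
`java.util` so it stands alone. Everything in it is hypothetical: `Descriptor`,
the string type names, and the `put`/`toSqlType`/`toDescriptor` helpers are
stand-ins, not Beam's `FieldType` or Calcite's `SqlTypeName`, and two plain
maps take the place of Guava's `BiMap`. The point is only that keying on
(base type, metadata) lets CHAR and VARCHAR, or TIME and TIMESTAMP, each get
their own entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Sketch only: every name here is hypothetical, not Beam or Calcite API. */
final class TypeMappingSketch {

  /** Stand-in for a field type plus the metadata that disambiguates it. */
  static final class Descriptor {
    final String baseType; // e.g. "STRING", "DATETIME"
    final String metadata; // e.g. "CHAR", "VARCHAR"; empty when not needed

    Descriptor(String baseType, String metadata) {
      this.baseType = baseType;
      this.metadata = metadata;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Descriptor)) {
        return false;
      }
      Descriptor other = (Descriptor) o;
      return baseType.equals(other.baseType) && metadata.equals(other.metadata);
    }

    @Override
    public int hashCode() {
      return Objects.hash(baseType, metadata);
    }
  }

  // Two plain maps kept in sync; an assumption standing in for a BiMap.
  private static final Map<Descriptor, String> BEAM_TO_SQL = new HashMap<>();
  private static final Map<String, Descriptor> SQL_TO_BEAM = new HashMap<>();

  /** Registers a pair and rejects anything that would break the 1-1 property. */
  private static void put(Descriptor descriptor, String sqlType) {
    if (BEAM_TO_SQL.containsKey(descriptor) || SQL_TO_BEAM.containsKey(sqlType)) {
      throw new IllegalArgumentException("Mapping must be one-to-one: " + sqlType);
    }
    BEAM_TO_SQL.put(descriptor, sqlType);
    SQL_TO_BEAM.put(sqlType, descriptor);
  }

  static {
    put(new Descriptor("DECIMAL", ""), "DECIMAL");
    put(new Descriptor("STRING", "CHAR"), "CHAR");
    put(new Descriptor("STRING", "VARCHAR"), "VARCHAR");
    put(new Descriptor("DATETIME", "TIME"), "TIME");
    put(new Descriptor("DATETIME", "TIMESTAMP"), "TIMESTAMP");
  }

  /** Forward lookup: no separate metadata lookup is needed. */
  static String toSqlType(Descriptor descriptor) {
    return BEAM_TO_SQL.get(descriptor);
  }

  /** Reverse lookup: returns the base type and its metadata together. */
  static Descriptor toDescriptor(String sqlType) {
    return SQL_TO_BEAM.get(sqlType);
  }
}
```

With this shape, `toSqlType(new Descriptor("STRING", "CHAR"))` and
`toDescriptor("TIME")` are each a single lookup, which is the property the
suggested `BiMap<FieldTypeDescriptor, SqlTypeName>` is after.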
Issue Time Tracking
-------------------
Worklog Id: (was: 86145)
Time Spent: 1h 50m (was: 1h 40m)
> Support schema in PCollections
> ------------------------------
>
> Key: BEAM-3437
> URL: https://issues.apache.org/jira/browse/BEAM-3437
> Project: Beam
> Issue Type: Wish
> Components: beam-model
> Reporter: Jean-Baptiste Onofré
> Assignee: Jean-Baptiste Onofré
> Priority: Major
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> As discussed with some people in the team, it would be great to add schema
> support in {{PCollections}}. It will allow us:
> 1. To expect some data type in {{PTransforms}}
> 2. To improve some runners with additional features (I'm thinking about the
> Spark runner with data frames, for instance).
> A technical draft document has been created:
> https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit?disco=AAAABhykQIs&ts=5a203b46&usp=comment_email_document
> I also started a PoC on a branch; I will update this Jira with a "discussion"
> PR.