[ https://issues.apache.org/jira/browse/FLINK-12557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Danny Chan updated FLINK-12557:
-------------------------------
Description:
The *with* option in table DDL defines the properties that a specific
connector needs to create a TableSource/Sink. The properties structure for the
SqlClient config YAML is defined in [Improvements to the Unified SQL Connector
API|https://docs.google.com/document/d/1Yaxp1UJUFW-peGLt8EIidwKIZEWrrA-pznWLuvaH39Y/edit#heading=h.41fd6rs7b3cf].
In this design, the properties fall into 4 categories:
# Top level properties: name, type(source, sink), update-mode ...
# Connector specific properties: connector.type, connector.path ...
# Format properties: format.type, format.fields.1.name ...
# Table schema properties: (can be omitted for DDL)
This properties structure is reasonable for YAML, but it is not concise
enough for developers. There is therefore also a utility class named
[DescriptorProperties|https://github.com/apache/flink/blob/b3604f7bee7456b8533e9ea222a833a2624e36c2/flink-table/flink-table-common/src/main/java/org/apache/flink/table/descriptors/DescriptorProperties.java#L67]
that reconstructs data structures (such as TableSchema) from the flat key-value strings.
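
To illustrate what reconstructing a structure from flat keys involves, here is a minimal sketch. It does not use DescriptorProperties itself; the class FlatKeyReader and its method readIndexed are hypothetical names made up for this example, and the key shape "prefix.index.attribute" is assumed from the format.fields.1.name example above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a Flink class): rebuilds an ordered list from flat indexed keys. */
final class FlatKeyReader {

    /** Collects the values of keys shaped like "<prefix>.<index>.<attribute>", ordered by index. */
    static List<String> readIndexed(Map<String, String> props, String prefix, String attribute) {
        Pattern keyPattern = Pattern.compile(
                Pattern.quote(prefix) + "\\.(\\d+)\\." + Pattern.quote(attribute));
        // TreeMap keeps the entries sorted by their numeric index.
        TreeMap<Integer, String> byIndex = new TreeMap<>();
        for (Map.Entry<String, String> entry : props.entrySet()) {
            Matcher matcher = keyPattern.matcher(entry.getKey());
            if (matcher.matches()) {
                byIndex.put(Integer.parseInt(matcher.group(1)), entry.getValue());
            }
        }
        return new ArrayList<>(byIndex.values());
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("format.type", "json");
        props.put("format.fields.0.name", "intField");
        props.put("format.fields.1.name", "stringField");
        // Prints the field names in index order.
        System.out.println(readIndexed(props, "format.fields", "name"));
    }
}
```

Every nested structure needs this kind of index parsing and ordering, which is the complexity the flat key layout pushes onto developers.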
To reduce complexity and keep the keys consistent between the DDL with
properties and the TableFactory properties, I propose simplifying the DDL with
property keys as follows (corresponding to the 4 categories above):
# Top level properties: keep the same as in the YAML, e.g. connector,
update-mode
# Connector specific properties: start with a prefix named after the connector
type, e.g. for the kafka connector, the properties are defined as kafka.k1 = v1,
kafka.k2 = v2
# Format properties: format.type is simplified to format, and the other keys
take the format name as prefix, e.g. format = 'json', json.line-delimiter = "\n"
# Table schema properties: omitted.
Here is a demo of the create table DDL:
{code:java}
CREATE TABLE Kafka10SourceTable (
  intField INTEGER,
  stringField VARCHAR(128) COMMENT 'User IP address',
  longField BIGINT,
  rowTimeField TIMESTAMP,
  WATERMARK wm01 FOR 'longField' AS BOUNDED WITH DELAY '60' SECOND
)
COMMENT 'Kafka Source Table of topic user_ip_address'
WITH (
  connector = 'kafka',
  kafka.property-version = '1',
  kafka.version = '0.10',
  kafka.topic = 'test-kafka-topic',
  kafka.startup-mode = 'latest-offset',
  kafka.specific-offset = 'offset',
  format = 'json',
  json.property-version = '1',
  json.version = '1',
  json.derive-schema = 'true'
)
{code}
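
The key mapping proposed above could be sketched roughly as follows. This is only an illustration under assumptions: DdlKeyMapper and toDescriptorKeys are hypothetical names, not existing Flink classes, and the rule that kafka.topic expands to connector.topic is inferred from the connector.type / connector.path pattern of the YAML keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not a Flink class): expands simplified DDL keys into descriptor-style keys. */
final class DdlKeyMapper {

    static Map<String, String> toDescriptorKeys(Map<String, String> ddl) {
        String connector = ddl.get("connector"); // e.g. "kafka"
        String format = ddl.get("format");       // e.g. "json"
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : ddl.entrySet()) {
            String key = entry.getKey();
            String value = entry.getValue();
            if (key.equals("connector")) {
                out.put("connector.type", value);
            } else if (key.equals("format")) {
                out.put("format.type", value);
            } else if (connector != null && key.startsWith(connector + ".")) {
                // Assumed rule: kafka.topic -> connector.topic
                out.put("connector." + key.substring(connector.length() + 1), value);
            } else if (format != null && key.startsWith(format + ".")) {
                // Assumed rule: json.derive-schema -> format.derive-schema
                out.put("format." + key.substring(format.length() + 1), value);
            } else {
                // Top level keys such as update-mode are kept unchanged.
                out.put(key, value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> ddl = new LinkedHashMap<>();
        ddl.put("connector", "kafka");
        ddl.put("kafka.topic", "test-kafka-topic");
        ddl.put("format", "json");
        ddl.put("json.derive-schema", "true");
        toDescriptorKeys(ddl).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

With a translation step of this kind, the DDL keys stay short while a TableFactory could still receive the same flat keys as it does from the YAML config.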
was:
The *with* option in table DDL defines the properties that a specific
connector needs to create a TableSource/Sink. The properties structure for the
SqlClient config YAML is defined in [Improvements to the Unified SQL Connector
API|https://docs.google.com/document/d/1Yaxp1UJUFW-peGLt8EIidwKIZEWrrA-pznWLuvaH39Y/edit#heading=h.41fd6rs7b3cf].
In this design, the properties fall into 4 categories:
# Top level properties: name, type(source, sink), update-mode ...
# Connector specific properties: connector.type, connector.path ...
# Format properties: format.type, format.fields.1.name ...
# Table schema properties: (can be omitted for DDL)
This properties structure is reasonable for YAML, but it is not concise
enough for developers. There is therefore also a utility class named
[DescriptorProperties|https://github.com/apache/flink/blob/b3604f7bee7456b8533e9ea222a833a2624e36c2/flink-table/flink-table-common/src/main/java/org/apache/flink/table/descriptors/DescriptorProperties.java#L67]
that reconstructs data structures (such as TableSchema) from the flat key-value strings.
To reduce complexity and keep the keys consistent between the DDL with
properties and the TableFactory properties, I propose simplifying the DDL with
property keys as follows (corresponding to the 4 categories above):
# Top level properties: keep the same as in the YAML, e.g. type, update-mode
# Connector specific properties: start with a prefix named after the connector
type, e.g. for the kafka connector, the properties are defined as kafka.k1 = v1,
kafka.k2 = v2
# Format properties: format.type is simplified to format, and the other keys
stay the same, e.g. format = 'json', format.line-delimiter = "\n"
# Table schema properties: omitted.
Here is a demo of the create table DDL:
{code:java}
CREATE TABLE Kafka10SourceTable (
  intField INTEGER,
  stringField VARCHAR(128) COMMENT 'User IP address',
  longField BIGINT,
  rowTimeField TIMESTAMP,
  WATERMARK wm01 FOR 'longField' AS BOUNDED WITH DELAY '60' SECOND
)
COMMENT 'Kafka Source Table of topic user_ip_address'
WITH (
  type = 'kafka',
  property-version = '1',
  version = '0.10',
  kafka.topic = 'test-kafka-topic',
  kafka.startup-mode = 'latest-offset',
  kafka.specific-offset = 'offset',
  format = 'json',
  format.property-version = '1',
  format.version = '1',
  format.derive-schema = 'true'
)
{code}
> Unify create table DDL with clause and connector descriptor keys
> ----------------------------------------------------------------
>
> Key: FLINK-12557
> URL: https://issues.apache.org/jira/browse/FLINK-12557
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / API
> Affects Versions: 1.8.0
> Reporter: Danny Chan
> Assignee: Danny Chan
> Priority: Major
> Fix For: 1.9.0
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)