[
https://issues.apache.org/jira/browse/FLINK-21634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298067#comment-17298067
]
Jark Wu edited comment on FLINK-21634 at 3/9/21, 1:26 PM:
----------------------------------------------------------
Hi [~twalthr], thanks for the great feedback!
1) make COLUMN options
We can make COLUMN optional, but I would like to be cautious in the first
version for better evolution in the future. Because we may support more
keywords in the future, e.g. {{ADD WATERMARK}}, {{ADD PRIMARY KEY}}, etc...
Having the {{COLUMN}} keyword in syntax would be clearer and readable.
2) FIRST and LAST
I think this is a good point. But I think we don't need {{LAST}}, because it is
default, MySQL and Iceberg don't have this keyword too. We also not introdced
{{PERSIST}} keyword for metadata column because persist is default.
Besides, I think we can also add {{AFTER column_name}} which is also proposed
in MySQL and Iceberg.
{code:sql}
ALTER TABLE table_name
ADD COLUMN column_name <column_definition> [FIRST | AFTER column_name]
{code}
3) same syntax as in a regular CREATE TABLE DDL schema part
I'm also fine to support computed column and metadata column. Based on the
above syntax, the {{<column_definition>}} is shared with CREATE TABLE DDL:
{code:sql}
<column_definition>::
{ <physical_column_definition> | <metadata_column_definition> |
<computed_column_definition> } [COMMENT column_comment]
<physical_column_definition>::
column_type
<metadata_column_definition>::
column_type METADATA [ FROM metadata_key ] [ VIRTUAL ]
<computed_column_definition>::
AS computed_column_expression
{code}
Regarding supporting add multiple columns, from my point of view, this should
be another ALTER TABLE syntax, because {{ADD COLUMN}} should only add a single
column. What do you think about introducing another {{ADD COLUMNS}} syntax?
{code:sql}
ALTER TABLE table_name
ADD COLUMNS (column_name <column_definition> [, ...]) [FIRST | AFTER
column_name]
{code}
4) ALTER COLUMN and MODIFY COLUMN
I think {{MODIFY COLUMN}} is a good idea. Oracle and MySQL have {{MODIFY
COLUMN}} as a more powerful syntax than {{ALTER COLUMN}}, but still keep
{{ALTER COLUMN}}.
We can follow the {{MODIFY COLUMN}} syntax in MySQL and Oracle:
{code:sql}
ALTER TABLE table_name
MODIFY COLUMN column_name <column_definition> [FIRST | AFTER column_name]
{code}
As MODIFY COLUMN is a superset of ALTER COLUMN, I think we don't need to
introduce {{ALTER COLUMN}} for now. What do you think?
was (Author: jark):
Hi [~twalthr], thanks for the great feedback!
1) make COLUMN options
We can make COLUMN optional, but I would like to be cautious in the first
version for better evolution in the future. Because we may support more
keywords in the future, e.g. {{ADD WATERMARK}}, {{ADD PRIMARY KEY}}, etc...
Having the {{COLUMN}} keyword in syntax would be clearer and readable.
2) FIRST and LAST
I think this is a good point. But I think we don't need {{LAST}}, because it is
default, MySQL and Iceberg don't have this keyword too. We also not introdced
{{PERSIST}} keyword for metadata column because persist is default.
Besides, I think we can also add {{AFTER column_name}} which is also proposed
in MySQL and Iceberg.
{code:sql}
ALTER TABLE table_name
ADD COLUMN column_name <column_definition> [FIRST | AFTER column_name]
{code}
3) same syntax as in a regular CREATE TABLE DDL schema part
I'm also fine to support computed column and metadata column. Based on the
above syntax, the {{<column_definition>}} is shared with CREATE TABLE DDL:
{code:sql}
<column_definition>::
{ <physical_column_definition> | <metadata_column_definition> |
<computed_column_definition> } [COMMENT column_comment]
<physical_column_definition>::
column_type
<metadata_column_definition>::
column_type METADATA [ FROM metadata_key ] [ VIRTUAL ]
<computed_column_definition>::
AS computed_column_expression
{code}
Regarding supporting add multiple columns, from my point of view, this should
be another ALTER TABLE syntax, because {{ADD COLUMN}} should only add a single
column. What do you think about introducing another {{ADD COLUMNS}} syntax?
{code:sql}
ALTER TABLE table_name
ADD COLUMNS (<column_definition> [, ...]) [FIRST | AFTER column_name]
{code}
4) ALTER COLUMN and MODIFY COLUMN
I think {{MODIFY COLUMN}} is a good idea. Oracle and MySQL have {{MODIFY
COLUMN}} as a more powerful syntax than {{ALTER COLUMN}}, but still keep
{{ALTER COLUMN}}.
We can follow the {{MODIFY COLUMN}} syntax in MySQL and Oracle:
{code:sql}
ALTER TABLE table_name
MODIFY COLUMN column_name <column_definition> [FIRST | AFTER column_name]
{code}
As MODIFY COLUMN is a superset of ALTER COLUMN, I think we don't need to
introduce {{ALTER COLUMN}} for now. What do you think?
> ALTER TABLE statement enhancement
> ---------------------------------
>
> Key: FLINK-21634
> URL: https://issues.apache.org/jira/browse/FLINK-21634
> Project: Flink
> Issue Type: New Feature
> Components: Table SQL / API, Table SQL / Client
> Reporter: Jark Wu
> Priority: Major
>
> We already introduced ALTER TABLE statement in FLIP-69 [1], but only support
> to rename table name and change table options. One useful feature of ALTER
> TABLE statement is modifying schema. This is also heavily required by
> integration with data lakes (e.g. iceberg).
> Therefore, I propose to support following ALTER TABLE statements:
> *Add Column*
> {code:sql}
> ALTER TABLE <table_name>
> ADD COLUMN <column_name> <data_type> [COMMENT column_comment]
> {code}
> This follows SQL standard 2011 Section 11.10. And Iceberg[2], Trino[3],
> MySQL[4] also are the same.
> *Drop Column*
> {code:sql}
> ALTER TABLE <table_name> DROP COLUMN <column_name>
> {code}
> This follows SQL standard 2011 Section 11.10. And Iceberg[2], Trino[3],
> MySQL[4] also are the same.
> *Alter Column*
> {code:sql}
> ALTER TABLE <table_name> ALTER COLUMN <column_name>
> SET DATA TYPE <data_type> [COMMENT column_comment]
> {code}
> This follows SQL standard 2011 Section 11.10. Same to PG [5], and similar to
> Iceberg[2], Trino[3], MySQL[4].
> *Rename Column*
> {code:sql}
> ALTER TABLE <table_name> REANME COLUMN <old_name> TO <new_name>
> {code}
> This is not listed in SQL standard, but is also very useful. Follows the
> syntax of Iceberg[2], Trino[3], MySQL[4].
> *Reset Options*
> {code:sql}
> ALTER TABLE <table_name> RESET (key1, key2, ...)
> {code}
> Out of SQL standard, but is useful. Has been discussed in FLINK-17845. Use
> {{RESET}} to keep align with {{SET key=value}} and {{RESET key}} proposed in
> FLIP-163. And PG[5] also uses the {{RESET}} keyword.
> For example:
> {code:sql}
> -- add a new column
> ALTER TABLE mytable ADD COLUMN new_column STRING COMMENT 'new_column docs';
> -- drop an old column
> ALTER TABLE prod.db.sample DROP COLUMN legacy_name;
> -- rename column name
> ALTER TABLE prod.db.sample RENAME COLUMN `data` TO payload;
> -- alter table type
> ALTER TABLE prod.db.sample ALTER COLUMN measurement
> SET DATA TYPE double COMMENT 'unit is bytes per second';
> -- reset options
> ALTER TALBE kafka_table RESET ('scan.startup.mode',
> 'scan.startup.specific-offsets');
> {code}
> Note: we don't need to introduce new interfaces, because all the alter table
> operation will be forward to catalog through {{Catalog#alterTable(tablePath,
> newTable, ignoreIfNotExists)}}.
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/alter/#alter-table
> [2]: http://iceberg.apache.org/spark-ddl/#alter-table-alter-column
> [3]: https://trino.io/docs/current/sql/alter-table.html
> [4]: https://dev.mysql.com/doc/refman/8.0/en/alter-table.html
> [5]: https://www.postgresql.org/docs/9.1/sql-altertable.html
--
This message was sent by Atlassian Jira
(v8.3.4#803005)