Hi Jark,

Thank you for your comments.

Ad. 1 My intention was to throw an exception if a user tries to use the
ENFORCE mode. SQL standard states this is the default mode, so if we
ever want to support the ENFORCE/NOT ENFORCE mode we need to make sure
users now always provide the NOT ENFORCE flag. Does it make sense to
you? If it does, I will clarify the FLIP on this topic.

Ad. 2 This is the behavior of majority of RDBM systems. I think this is
quite useful from the usability perspective. A column is nullable by
default. It is also a good practice to always define PRIMARY KEY for a
table. Therefore we would force users to add additional annotations.
E.g. instead of: CREATE TABLE tableName(id INT PRIMARY KEY); user would
always have to type CREATE TABLE tableName(id INT NOT NULL PRIMARY KEY);

I am not totally against it, if this is the preferred way.

Ad. 3 The only reason why I didn't want to add a full UNIQUE support is
that we would have to decide how do we handle null for the UNIQUE
constraint. Some databases allow multiple rows to have a null in a
column with unique constraint, whereas other allow only one. That means
if you have a table CREATE TABLE tableName (id INTEGER UNIQUE); In some
databases it is legal to have rows (null; null; null; null; ....) while
in others you can have at most (null). Do you think we should still
discuss this as well?

Ad. 4 I don't have a strong opinion on that. Actually SQL standard
describes a Table descriptor. Some excerpts from the standard:

— The name of the base table.
— An indication of whether the table is a regular persistent base table,
a system-versioned table, a global
temporary table, a created local temporary table, or a declared local
temporary table.

...

— An indication of whether the table is insertable-into or not.

— The column descriptor of each column in the table.

— The descriptor of each table constraint specified for the table.

In this context I thought primary key belongs to the Table rather than
schema, where schema is as far as I understand just a list of Table
column descriptors. Tbh the TableSchema does not fit very well into that
description of the descriptor. If you have a good reasons why you think
it would be better to have it in the TableSchema, I would be fine with
it. We should though have a clear separation what belongs to the Table
schema and what does not (e.g. partitions are in the Catalog table now).

Best,

Dawid

On 20/11/2019 04:32, Jark Wu wrote:
> Hi Dawid,
>
> Thanks for driving this. Primary key is a very important and useful feature
> to Flink Table API / SQL.
>
> I'm +1 to the general design.
>
> And have some thoughts as following:
>
> 1. > The design says Flink only support NOT ENFORCED mode. But the DDL
> and KeyConstraint#primaryKey(..) can pass in ENFORCED mode.
>     What will we do when user specify an ENFORCED mode?  Shall we forbid
> it?
>
> 2. > "When creating a table, creating a primary key constraint will alter
> the columns nullability."
>     I think we should force the columns to be declared NOT NULL, instead of
> change the columns nullability implicitly.
>     Otherwise, it may be confused what is the nullbility of the primary key
> columns.
>
> 3. > "Support for Unique key is not part of the FLIP. It is just mentioned
> to show how can we extend the primary key concept with more constraints in
> the future."
>     I think we can include Unique Key as part of the FLIP, as it is already
> stated clearly. We can put the implemenation task of unique key in a lower
> priority.
>
> 4. > Method for retrieving primary key constraint is in CatalogBaseTable.
>     In SQL standard, primary key is declared in line or out of line. If it
> is out of line, it is still in the schema part, i.e. besides column
> definitions and in the parentheses.
>     So I think maybe primary key should belong to `TableSchema` becuase it
> is a part of schema.
>
>     CREATE TABLE xxx (
>        a int,
>        b varchar,
>        c bigint,
>        primary key (a, b)
>     ) with (...)
>
>
> Best,
> Jark
>
>
> On Tue, 19 Nov 2019 at 23:18, Dawid Wysakowicz <[email protected]>
> wrote:
>
>> Hi,
>>
>> I wanted to bring up the topic of primary key constraints that we could
>> leverage for runtime optimizations. Please have a look at the proposal
>> and I would especially draw your attention to the topic of nullability
>> of columns that are part of a primary key. Some time in the future we
>> will also be able to leverage it for upserting streaming tables.
>>
>> Here is the proposal:
>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP+87%3A+Primary+key+constraints+in+Table+API
>>
>> Best,
>>
>> Dawid
>>
>>
>>

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to