Sorry, I forgot to remind you that Timo's concern about the changes to the
CompiledPlan looks like is still not covered in the FLIP.

Best,
Jark

On Wed, 1 Mar 2023 at 19:28, Jark Wu <imj...@gmail.com> wrote:

> Hi Kui,
>
> Thank you for the great proposal, I think this is already in a good shape.
>
> Just a kind reminder, according to the community guidelines[1],
> if there are unresponsive reviewers, a typical reasonable time
> to wait for responses is one week, but be pragmatic about it.
>
> Regarding the FLIP, I have some comments below:
>
> 1. IIRC, this is the first time we introduce the framework-level connector
> options that the option is not recognized and handled by connectors.
> The FLIP should cover how framework filters the watermark related options
> to avoid discover connector factory failed, and what happens if the
> connector
> already supported the conflict options.
>
> 2. I'm not sure about the usage scenarios of event gap emit strategy. Do
> you have any specific use case of this strategy? I'm confused why no one
> requested this strategy before no matter in DataStream or SQL, but maybe
> I missed something. I'm not against to add this option, but just want to
> be
> careful when adding new API because it's hard to remove in the future.
>
>
> 3. Adding a "Public Interface"[2] section to summarize the
> proposed APIs and options would be better for developers to
> know the impact. Currently, the APIs are scattered in the long
> design sections.
>
> Best,
> Jark
>
>
> [1]:
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> [2]: https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template
>
> On Wed, 1 Mar 2023 at 16:56, Kui Yuan <catye...@gmail.com> wrote:
>
>> Hi all,
>>
>> Thanks for all discussions!
>>
>> Anyone else have questions or suggestions? if not, I will start a vote
>> thread later.
>>
>> Best
>> Kui Yuan
>>
>> kui yuan <catye...@gmail.com> 于2023年2月27日周一 20:21写道:
>>
>> > Hi Timo,
>> >
>> > Thanks for your advice. I totally agree with your suggestion of naming
>> > convention, I will rename these options and update the flip later,
>> thanks
>> > very much.
>> >
>> > In our internal implementation we had put these options inside the
>> > `FactoryUtil`, just as you expect.  We have also taken into account the
>> > changes to the CompiledPlan and we have packaged these options
>> > appropriately to minimize intrusiveness and ensure the compatibility to
>> the
>> > `WatermarkPushDownSpec`.
>> >
>> > > A hint to the implementation: I would suggest that we add those
>> options
>> > > to `FactoryUtil`. All cross-connector options should end up there.
>> >
>> >
>> > > Please also consider the changes to the CompiledPlan in your FLIP.
>> This
>> > > change has implications on the JSON format as watermark strategy of
>> > > ExecNode becomes more complex, see WatermarkPushDownSpec
>> >
>> > Best
>> > Kui Yuan
>> >
>> > Timo Walther <twal...@apache.org> 于2023年2月27日周一 18:05写道:
>> >
>> >> Hi Kui Yuan,
>> >>
>> >> thanks for working on this FLIP. Let me also give some comments about
>> >> the proposed changes.
>> >>
>> >> I support the direction of this FLIP about handling these
>> >> watermark-specific properties through options and /*+OPTIONS(...) */
>> >> hints.
>> >>
>> >> Regarding naming, I would like to keep the options in sync with
>> existing
>> >> options:
>> >>
>> >>  > 'watermark.emit.strategy'='ON_EVENT'
>> >>
>> >> Let's use lower case (e.g. `on-event`) that matches with properties
>> like
>> >> sink.partitioner [1] or sink.delivery-guarantee [2].
>> >>
>> >>  > 'source.idle-timeout'='1min'
>> >>
>> >> According to FLIP-122 [3], we want to prefix all scan-source related
>> >> properties with `scan.*`. This clearly includes idle-timeout and
>> >> actually also watermark strategies which don't apply for lookup
>> sources.
>> >>
>> >> Summarizing the comments above, we should use the following options:
>> >>
>> >> 'scan.watermark.emit.strategy'='on-event',
>> >> 'scan.watermark.emit.on-event.gap'='10000',
>> >> 'scan.idle-timeout'='1min',
>> >> 'scan.watermark.alignment.group'='alignment-group-1',
>> >> 'scan.watermark.alignment.max-drift'='1min',
>> >> 'scan.watermark.alignment.update-interval'='1s'
>> >>
>> >> I know that this makes the keys even longer, but given that those
>> >> options are for power users this should be acceptable. It also clearly
>> >> indicates which options are for sinks, scans, and lookups. This
>> >> potentially also helps in allow lists.
>> >>
>> >> A hint to the implementation: I would suggest that we add those options
>> >> to `FactoryUtil`. All cross-connector options should end up there.
>> >>
>> >> Please also consider the changes to the CompiledPlan in your FLIP. This
>> >> change has implications on the JSON format as watermark strategy of
>> >> ExecNode becomes more complex, see WatermarkPushDownSpec [4].
>> >>
>> >> Regards,
>> >> Timo
>> >>
>> >>
>> >> [1]
>> >>
>> >>
>> https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/connectors/kafka.html#sink-partitioner
>> >> [2]
>> >>
>> >>
>> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/kafka/#sink-delivery-guarantee
>> >> [3]
>> >>
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>> >> [4]
>> >>
>> >>
>> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/abilities/source/WatermarkPushDownSpec.java
>> >>
>> >>
>> >> On 24.02.23 04:55, kui yuan wrote:
>> >> > Hi all,
>> >> >
>> >> > I have updated the flip according to the discussion, and we will
>> extend
>> >> the
>> >> > watermark-related features with both table options and 'OPTIONS'
>> hint,
>> >> like
>> >> > this:
>> >> >
>> >> > ```
>> >> > -- configure in table options
>> >> > CREATE TABLE user_actions (
>> >> >    ...
>> >> >    user_action_time TIMESTAMP(3),
>> >> >    WATERMARK FOR user_action_time AS user_action_time - INTERVAL '5'
>> >> SECOND
>> >> > ) WITH (
>> >> >    'watermark.emit.strategy'='ON_PERIODIC',
>> >> >    ...
>> >> > );
>> >> >
>> >> > -- use 'OPTIONS' hint
>> >> > select ... from source_table /*+ OPTIONS('watermark.emit.strategy'=
>> >> > 'ON_PERIODIC') */
>> >> > ```
>> >> >
>> >> > Does everybody have any other questions?
>> >> >
>> >> > Best
>> >> > Kui Yuan
>> >> >
>> >> > kui yuan <catye...@gmail.com> 于2023年2月23日周四 20:05写道:
>> >> >
>> >> >> Hi all,
>> >> >>
>> >> >> Thanks for all suggestions.
>> >> >>
>> >> >> We will extend the watermark-related features in SQL layer with
>> dynamic
>> >> >> table options and 'OPTIONS' hint, just as everyone expects. I will
>> >> modify
>> >> >> Flip-296 as discussed.
>> >> >>
>> >> >> @Martijn As far as I know, there is no hint interface in the table
>> API,
>> >> >> so we can't use hint in table API directly. if we need to extend the
>> >> hint
>> >> >> interface in the table API, maybe we need another flip. However, if
>> we
>> >> >> extend the watermark-related features in the dynamic table options,
>> >> maybe
>> >> >> we are able to use them indirectly in the table API like this[1]:
>> >> >>
>> >> >> ```
>> >> >> // register a table named "Orders"
>> >> >> tableEnv.executeSql("CREATE TABLE Orders (`user` BIGINT, product
>> >> STRING,
>> >> >> amount INT) WITH ('watermark.emit.strategy'='ON_EVENT'...)");
>> >> >> ```
>> >> >>
>> >> >> [1]
>> >> >>
>> >>
>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/
>> >> >>
>> >> >> Best
>> >> >> Kui Yuan
>> >> >>
>> >> >> Yun Tang <myas...@live.com> 于2023年2月23日周四 17:46写道:
>> >> >>
>> >> >>> Thanks for the warm discussions!
>> >> >>>
>> >> >>> I had an offline discussion with Kui about the replies. I think I
>> >> could
>> >> >>> give some explanations on the original intention to introduce
>> another
>> >> >>> WATERMARK_PARAMS. If we take a look at the current datastream API,
>> the
>> >> >>> watermark strategy does not belong to any specific connector. And
>> we
>> >> >>> thought the dynamic table options were more like the configurations
>> >> within
>> >> >>> some specific connector.
>> >> >>>
>> >> >>>  From the review comments, I think most people feel good to make it
>> >> part
>> >> >>> of the dynamic table options. I think this is fine if we give more
>> >> clear
>> >> >>> scope definition of the dynamic table options here. And I also
>> agree
>> >> with
>> >> >>> Jingsong's concern about adding SQL syntax which is the most
>> >> concerning
>> >> >>> part before launching this discussion.
>> >> >>>
>> >> >>> For Martijn's concern, if we accept to make the watermark-related
>> >> options
>> >> >>> part of dynamic table options, the problem becomes another topic:
>> how
>> >> to
>> >> >>> support the dynamic table options in table API, which is deserved
>> to
>> >> create
>> >> >>> another FLIP.
>> >> >>>
>> >> >>> Best
>> >> >>> Yun Tang
>> >> >>> ________________________________
>> >> >>> From: Martijn Visser <martijnvis...@apache.org>
>> >> >>> Sent: Thursday, February 23, 2023 17:14
>> >> >>> To: dev@flink.apache.org <dev@flink.apache.org>
>> >> >>> Subject: Re: [DISCUSS] FLIP-296: Watermark options for table API &
>> SQL
>> >> >>>
>> >> >>> Hi,
>> >> >>>
>> >> >>> While I can understand that there's a desire to first focus on
>> solving
>> >> >>> this
>> >> >>> problem for SQL, I do wonder if we should ignore the Table API at
>> this
>> >> >>> point. If we could include the syntax for the Table API, it
>> >> potentially
>> >> >>> could also be implemented by another contributor without needing to
>> >> create
>> >> >>> another FLIP. If we don't design it right now, my concern is that
>> this
>> >> >>> will
>> >> >>> increase sparsity for the Table API which ultimately hurts
>> adoption.
>> >> >>>
>> >> >>> With regards to the syntax, I have a preference to solve this via
>> the
>> >> >>> connector options (e.g. like you can currently specify things as
>> >> >>> scan.startup.specific-offsets or scan.bounded.mode for the Kafka
>> >> >>> connector). You could still use the dynamic table options to
>> >> override/add
>> >> >>> them.
>> >> >>>
>> >> >>> Best regards,
>> >> >>>
>> >> >>> Martijn
>> >> >>>
>> >> >>> On Thu, Feb 23, 2023 at 7:21 AM Shammon FY <zjur...@gmail.com>
>> wrote:
>> >> >>>
>> >> >>>> Hi kui
>> >> >>>>
>> >> >>>> Thanks for your answer and +1 to yuxia too
>> >> >>>>
>> >> >>>>> we should not bind the watermark-related options to a connector
>> to
>> >> >>> ensure
>> >> >>>> semantic clarity.
>> >> >>>>
>> >> >>>> In my opinion, adding watermark-related options to a connector is
>> >> much
>> >> >>> more
>> >> >>>> clear. Currently users can define simple watermark strategy in
>> DDL,
>> >> >>> adding
>> >> >>>> more configuration items in connector options is easy to
>> understand
>> >> >>>>
>> >> >>>> Best,
>> >> >>>> Shammon
>> >> >>>>
>> >> >>>>
>> >> >>>> On Thu, Feb 23, 2023 at 10:52 AM Jingsong Li <
>> jingsongl...@gmail.com
>> >> >
>> >> >>>> wrote:
>> >> >>>>
>> >> >>>>> Thanks for your proposal.
>> >> >>>>>
>> >> >>>>> +1 to yuxia, consider watermark-related hints as option hints.
>> >> >>>>>
>> >> >>>>> Personally, I am cautious about adding SQL syntax,
>> WATERMARK_PARAMS
>> >> is
>> >> >>>>> also SQL syntax to some extent.
>> >> >>>>>
>> >> >>>>> We can use OPTIONS to meet this requirement if possible.
>> >> >>>>>
>> >> >>>>> Best,
>> >> >>>>> Jingsong
>> >> >>>>>
>> >> >>>>> On Thu, Feb 23, 2023 at 10:41 AM yuxia <
>> luoyu...@alumni.sjtu.edu.cn
>> >> >
>> >> >>>>> wrote:
>> >> >>>>>>
>> >> >>>>>> Hi, Yuan Kui.
>> >> >>>>>> Thanks for driving it.
>> >> >>>>>> IMO, the 'OPTIONS' hint may be not only specific to the
>> connector
>> >> >>>>> options. Just as a reference, we also have `sink.parallelism`[1]
>> as
>> >> a
>> >> >>>>> connector options. It enables
>> >> >>>>>> user to specific the writer's parallelism dynamically per-query.
>> >> >>>>>>
>> >> >>>>>> Personally, I perfer to consider watermark-related hints as
>> option
>> >> >>>>> hints. So, user can define a default watermark strategy for the
>> >> table,
>> >> >>>> and
>> >> >>>>> if user dosen't needed to changes it, they need to do nothing in
>> >> their
>> >> >>>>> query instead of specific it ervery time.
>> >> >>>>>>
>> >> >>>>>> [1]
>> >> >>>>>
>> >> >>>>
>> >> >>>
>> >>
>> https://nightlies.apache.org/flink/flink-docs-master/zh/docs/connectors/table/filesystem/#sink-parallelism
>> >> >>>>>>
>> >> >>>>>> Best regards,
>> >> >>>>>> Yuxia
>> >> >>>>>>
>> >> >>>>>> Best regards,
>> >> >>>>>> Yuxia
>> >> >>>>>>
>> >> >>>>>> ----- 原始邮件 -----
>> >> >>>>>> 发件人: "kui yuan" <catye...@gmail.com>
>> >> >>>>>> 收件人: "dev" <dev@flink.apache.org>
>> >> >>>>>> 抄送: "Jark Wu" <imj...@gmail.com>
>> >> >>>>>> 发送时间: 星期三, 2023年 2 月 22日 下午 10:08:11
>> >> >>>>>> 主题: Re: [DISCUSS] FLIP-296: Watermark options for table API &
>> SQL
>> >> >>>>>>
>> >> >>>>>> Hi all,
>> >> >>>>>>
>> >> >>>>>> Thanks for the lively discussion and I will respond to these
>> >> >>> questions
>> >> >>>>> one
>> >> >>>>>> by one. However, there are also some common questions and I will
>> >> >>> answer
>> >> >>>>>> together.
>> >> >>>>>>
>> >> >>>>>> @郑 Thanks for your reply. The features mentioned in this flip
>> are
>> >> >>> only
>> >> >>>>> for
>> >> >>>>>> those source connectors that implement the
>> >> SupportsWatermarkPushDown
>> >> >>>>>> interface, generating watermarks in other graph locations is
>> not in
>> >> >>> the
>> >> >>>>>> scope of this discussion. Perhaps another flip can be proposed
>> >> >>> later to
>> >> >>>>>> implement this feature.
>> >> >>>>>>
>> >> >>>>>> @Shammon Thanks for your reply. In Flip-296, a rejected
>> alternative
>> >> >>> is
>> >> >>>>>> adding watermark related options in the connector options,we
>> >> believe
>> >> >>>> that
>> >> >>>>>> we should not bind the watermark-related options to a connector
>> to
>> >> >>>> ensure
>> >> >>>>>> semantic clarity.
>> >> >>>>>>
>> >> >>>>>>> What will happen if we add watermark related options in `the
>> >> >>>> connector
>> >> >>>>>>> options`? Will the connector ignore these options or throw an
>> >> >>>>> exception?
>> >> >>>>>>> How can we support this?
>> >> >>>>>>
>> >> >>>>>> If user defines different watermark configurations for one
>> table in
>> >> >>> two
>> >> >>>>>> places, I tend to prefer the first place would prevail, but  we
>> can
>> >> >>>> also
>> >> >>>>>> throw exception or just print logs to prompt the user, which are
>> >> >>>>>> implementation details.
>> >> >>>>>>
>> >> >>>>>>> If one table is used by two operators with different watermark
>> >> >>>> params,
>> >> >>>>>>> what will happen?
>> >> >>>>>>
>> >> >>>>>> @Martijn Thanks for your reply.  I'm sorry that we are not
>> >> >>> particularly
>> >> >>>>>> accurate, this hint is mainly for SQL,  not table API.
>> >> >>>>>>
>> >> >>>>>>> While the FLIP talks about watermark options for Table API &
>> SQL,
>> >> >>> I
>> >> >>>>> only
>> >> >>>>>>> see proposed syntax for SQL, not for the Table API. What is
>> your
>> >> >>>>> proposal
>> >> >>>>>>> for the Table API
>> >> >>>>>>
>> >> >>>>>> @Jane Thanks for your reply. For the first question, If the user
>> >> >>> uses
>> >> >>>>> this
>> >> >>>>>> hint on those sourse that does not implement the
>> >> >>>>> SupportsWatermarkPushDown
>> >> >>>>>> interface, it will be completely invalid. The task will run as
>> >> >>> normal
>> >> >>>> as
>> >> >>>>> if
>> >> >>>>>> the hint had not been used.
>> >> >>>>>>
>> >> >>>>>>> What's the behavior if there are multiple table sources, among
>> >> >>> which
>> >> >>>>>>> some do not support `SupportsWatermarkPushDown`?
>> >> >>>>>>
>> >> >>>>>> @Jane feedback that 'WATERMARK_PARAMS' is difficult to remember,
>> >> >>>> perhaps
>> >> >>>>>> the naming issue can be put to the end of the discussion,
>> because
>> >> >>> more
>> >> >>>>>> people like @Martijn @Shuo are considering whether these
>> >> >>> configurations
>> >> >>>>>> should be put into the DDL or the 'OPTIONS' hint. Here's what I
>> >> >>>>>> think, Putting these configs into DDL or putting them into
>> >> 'OPTIONS'
>> >> >>>> hint
>> >> >>>>>> is actually the same thing, because the 'OPTIONS' hint is mainly
>> >> >>> used
>> >> >>>> to
>> >> >>>>>> configure the properties of conenctor. The reason why I want to
>> use
>> >> >>> a
>> >> >>>> new
>> >> >>>>>> hint is to make sure the semantics clear, in my opinion the
>> >> >>>> configuration
>> >> >>>>>> of watermark should not be mixed up with connector. However, a
>> new
>> >> >>> hint
>> >> >>>>>> does make it more difficult to use to some extent, for example,
>> >> >>> when a
>> >> >>>>> user
>> >> >>>>>> uses both 'OPTIONS' hint and 'WATERMARK_PARAMS' hint. For this
>> >> >>> point,
>> >> >>>>> maby
>> >> >>>>>> it is more appropriate to use uniform 'OPTIONS' hint.
>> >> >>>>>> On the other hand, if we enrich more watermark option keys in
>> >> >>> 'OPTIONS'
>> >> >>>>>> hints, The question will be what we treat the definatrions of
>> >> >>>> 'OPTIONS'
>> >> >>>>>> hint, is this only specific to the connector options or could be
>> >> >>> more?
>> >> >>>>>> Maybe @Jark could share more insights here. In my opion,
>> 'OPTIONS'
>> >> >>> is
>> >> >>>>> only
>> >> >>>>>> related to the connector options, which is not like the gernal
>> >> >>>> watermark
>> >> >>>>>> options.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Shuo Cheng <njucs...@gmail.com> 于2023年2月22日周三 19:17写道:
>> >> >>>>>>
>> >> >>>>>>> Hi Kui,
>> >> >>>>>>>
>> >> >>>>>>> Thanks for driving the discussion. It's quite useful to
>> introduce
>> >> >>>>> Watermark
>> >> >>>>>>> options. I have some questions:
>> >> >>>>>>>
>> >> >>>>>>> What kind of hints is "WATERMARK_PARAMS"?
>> >> >>>>>>> Currently, we have two kinds of hints in Flink: Dynamic Table
>> >> >>>> Options &
>> >> >>>>>>> Query Hints. As described in the Flip, "WATERMARK_PARAMS" is
>> more
>> >> >>>> like
>> >> >>>>>>> Dynamic Table Options. So two questions arise here:
>> >> >>>>>>>
>> >> >>>>>>> 1)  Are these watermark options to be exposed as connector WITH
>> >> >>>>> options? Aa
>> >> >>>>>>> described in SQL Hints doc[1],  "Dynamic Table Options allow to
>> >> >>>>> specify or
>> >> >>>>>>> override table options dynamically", which implies that these
>> >> >>> options
>> >> >>>>> can
>> >> >>>>>>> also be configured in WITH options.
>> >> >>>>>>>
>> >> >>>>>>> 2)  Do we really need a new hint name like 'WATERMARK_PARAMS',
>> >> >>> table
>> >> >>>>>>> options use "OPTIONS" as hint name, like '/*+
>> >> >>>>>>> OPTIONS('csv.ignore-parse-errors'='true') */', maybe we can
>> enrich
>> >> >>>> more
>> >> >>>>>>> table option keys for watermark, e.g., /*+
>> >> >>>>>>> OPTIONS('watermark.emit-strategy'='ON_PERIODIC') */.
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> [1]
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>
>> >> >>>>
>> >> >>>
>> >>
>> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/
>> >> >>>>>>>
>> >> >>>>>>> On Wed, Feb 22, 2023 at 10:22 AM kui yuan <catye...@gmail.com>
>> >> >>>> wrote:
>> >> >>>>>>>
>> >> >>>>>>>> Hi devs,
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> I'd like to start a discussion thread for FLIP-296[1]. This
>> >> >>> comes
>> >> >>>>> from an
>> >> >>>>>>>> offline discussion with @Yun Tang, and we hope to enrich table
>> >> >>> API
>> >> >>>> &
>> >> >>>>> SQL
>> >> >>>>>>> to
>> >> >>>>>>>> support many watermark-related features which were only
>> >> >>> implemented
>> >> >>>>> at
>> >> >>>>>>> the
>> >> >>>>>>>> datastream API level.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Basically, we want to introduce watermark options in table
>> API &
>> >> >>>> SQL
>> >> >>>>> via
>> >> >>>>>>>> SQL hint named 'WATERMARK_PARAMS' to support features:
>> >> >>>>>>>>
>> >> >>>>>>>> 1、Configurable watermark emit strategy
>> >> >>>>>>>>
>> >> >>>>>>>> 2、Dealing with idle sources
>> >> >>>>>>>>
>> >> >>>>>>>> 3、Watermark alignment
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Last but not least, thanks to Qingsheng and Jing Zhang for the
>> >> >>>>> initial
>> >> >>>>>>>> reviews.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Looking forward to your thoughts and any feedback is
>> >> >>> appreciated!
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> [1]
>> >> >>>>>>>>
>> >> >>>>>>>
>> >> >>>>>
>> >> >>>>
>> >> >>>
>> >>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240884405
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Best
>> >> >>>>>>>>
>> >> >>>>>>>> Yuan Kui
>> >> >>>>>>>>
>> >> >>>>>>>
>> >> >>>>>
>> >> >>>>
>> >> >>>
>> >> >>
>> >> >
>> >>
>> >>
>>
>

Reply via email to