Hi everyone,

The discussion has not received any responses for a while. If there are no further comments this week, I will close the discussion and initiate the vote next Monday.

Thank you all for your input!

Best regards,
Feng Jin

On Fri, Dec 20, 2024 at 10:38 AM Feng Jin <jinfeng1...@gmail.com> wrote:

> Hi Zuo and Lincoln,
>
> Thanks for your reply.
>
> @Zuo
>
> > detects whether the modified state is compatible or not with the
> > previous state automatically?
>
> Automatic detection of state compatibility is technically feasible, but
> currently there is no ready-made interface to check whether the job
> before and after modification is compatible. However, this issue may be
> beyond the scope of this FLIP. Regarding the topic of Flink SQL state
> compatibility, I believe a separate FLIP is needed to describe its
> behavior in more detail.
>
> @Lincoln
>
> > And for the default behavior of alter operation under continuous mode,
> > can you add an example of starting the new job with hints (similar to
> > the case in the User Story section)?
>
> Thank you for the suggestion. Compatible recovery is not easy for users
> to judge. I think removing it is reasonable, as it is not suitable to be
> provided as a public feature. I have already updated the relevant
> content.
>
> Best,
> Feng Jin
>
> On Thu, Dec 19, 2024 at 10:08 PM Lincoln Lee <lincoln.8...@gmail.com>
> wrote:
>
>> Thanks Feng for driving this!
>> Supporting modification is an important improvement for Materialized
>> Table.
>>
>> Regarding altering a table while retaining historical data, I have a
>> question similar to Ron's: users can't easily judge whether a change is
>> simple enough to keep state compatibility with the old refresh job
>> under continuous mode. Therefore, I suggest removing the description of
>> “Compatible Recovery” from the public interface section.
>>
>> And for the default behavior of the alter operation under continuous
>> mode, can you add an example of starting the new job with hints
>> (similar to the case in the User Story section)?
>>
>> Best,
>> Lincoln Lee
>>
>> On Thu, Dec 19, 2024 at 14:05, Wei Zuo <1015766...@qq.com.invalid> wrote:
>>
>> > Hi, Feng
>> >
>> > Is it possible that the framework detects whether the modified state
>> > is compatible or not with the previous state automatically? It would
>> > be better to recognize query state compatibility automatically.
>> >
>> > Best,
>> > Zuo Wei
>> >
>> > ------------------ Original Message ------------------
>> > From: "dev" <ron9....@gmail.com>;
>> > Sent: Thursday, Dec 19, 2024, 10:15 AM
>> > To: "Feng Jin" <jinfeng1...@gmail.com>;
>> > Cc: "dev" <dev@flink.apache.org>; "Lincoln Lee" <lincoln.8...@gmail.com>;
>> > Subject: Re: [DISCUSS] FLIP-492: Support Query Modifications for
>> > Materialized Tables.
>> >
>> > Hi, Feng
>> >
>> > The reply looks good to me. But I have one question: you mentioned the
>> > `DESC MATERIALIZED TABLE` syntax in the FLIP, but we haven't provided
>> > this syntax so far. I think we should add it to this FLIP if needed.
>> >
>> > Best,
>> > Ron
>> >
>> > On Wed, Dec 18, 2024 at 16:52, Feng Jin <jinfeng1...@gmail.com> wrote:
>> >
>> > > Hi Ron
>> > >
>> > > Thanks for your reply.
>> > >
>> > > > Is it only possible to add columns at the end and not anywhere in
>> > > > the table schema? Some databases have this limitation; does lake
>> > > > storage such as Iceberg/Paimon have this limitation too?
>> > >
>> > > Currently, we can restrict adding columns only to the end of the
>> > > schema. Although both Paimon and Iceberg already support adding
>> > > columns anywhere, there are still some systems that do not. I will
>> > > include this in the FLIP.
>> > >
>> > > > In the Refresh Task Behavior section you mention partition hints,
>> > > > is it possible to clarify what it is in the FLIP?
>> > >
>> > > I have added the relevant details.
>> > >
>> > > > Are you able to articulate the default behavior?
>> > >
>> > > The detailed explanation for this part has been updated.
>> > >
>> > > > How can users determine if states are compatible?
>> > >
>> > > Users can only rely on their experience to make modifications.
>> > > Currently, the Flink framework does not guarantee that changes to
>> > > SQL logic will maintain state compatibility.
>> > >
>> > > I think we can add some suggestions in the user documentation in
>> > > the future. While the framework itself cannot ensure state
>> > > compatibility, some simple modification scenarios can indeed be
>> > > compatible.
>> > >
>> > > For now, the responsibility is left to the users.
>> > >
>> > > Even if recovery ultimately fails, users still have the option to
>> > > roll back to the original query or start consuming from a new
>> > > offset by disabling recovery parameters.
>> > >
>> > > Best,
>> > > Feng
>> > >
>> > > On Tue, Dec 17, 2024 at 10:37 AM Ron Liu <ron9....@gmail.com> wrote:
>> > >
>> > >> Hi Feng
>> > >>
>> > >> Thanks for initiating this FLIP. In the lakehouse, schema evolution
>> > >> of tables due to modification of business logic is a very common
>> > >> scenario, so Materialized Table's support for modifying the query
>> > >> can greatly improve flexibility and usability, and we've seen that
>> > >> other similar products in the industry also support this
>> > >> capability.
>> > >>
>> > >> I read the content of this FLIP and the overall design looks good,
>> > >> +1. However, I have some questions as follows:
>> > >>
>> > >> 1. When using the `ALTER MATERIALIZED TABLE ... AS select`
>> > >> statement to add a column, is it only possible to add columns at
>> > >> the end and not anywhere in the table schema? Some databases have
>> > >> this limitation; does lake storage such as Iceberg/Paimon have
>> > >> this limitation too?
>> > >> 2. In the Refresh Task Behavior section you mention partition
>> > >> hints, is it possible to clarify what it is in the FLIP?
>> > >>
>> > >> >>> *CONTINUOUS Mode:* Stops the old job and starts a new one with
>> > >> >>> the updated query.
>> > >>
>> > >>    - The initial position of the new job is controlled by the
>> > >>      source parameters.
>> > >>    - For compatible logic changes, recovery parameters
>> > >>      (execution.state-recovery.path) can be manually set if state
>> > >>      compatibility is confirmed.
>> > >>
>> > >> 4. Are you able to articulate the default behavior?
>> > >> 5. How can users determine if states are compatible?
>> > >>
>> > >> Best,
>> > >> Ron
>> > >>
>> > >> On Mon, Dec 16, 2024 at 10:49, Feng Jin <jinfeng1...@gmail.com> wrote:
>> > >>
>> > >>> Hi, everyone,
>> > >>>
>> > >>> I’d like to initiate a discussion on FLIP-492: Support Query
>> > >>> Modifications for Materialized Tables[1].
>> > >>>
>> > >>> In FLIP-435[2], we introduced *MATERIALIZED TABLES*. By defining
>> > >>> query logic and specifying data freshness requirements, users can
>> > >>> efficiently build data pipelines, greatly improving development
>> > >>> productivity.
>> > >>> FLIP-492 builds on this by addressing a common need: the ability
>> > >>> to modify the query logic of an existing MATERIALIZED TABLE. Two
>> > >>> approaches are proposed:
>> > >>>
>> > >>> *1. Modifying the Query Logic: ALTER MATERIALIZED TABLE AS <query>*
>> > >>> Retain historical data while modifying the query logic:
>> > >>>
>> > >>> ```
>> > >>> ALTER MATERIALIZED TABLE [catalog_name.][db_name.]table_name AS <query>
>> > >>> ```
>> > >>>
>> > >>> *2. Replacing the Table: CREATE OR REPLACE MATERIALIZED TABLE*
>> > >>> Reconstruct the table with a new query, discarding all historical
>> > >>> data:
>> > >>>
>> > >>> ```
>> > >>> CREATE [OR REPLACE] MATERIALIZED TABLE [catalog_name.][db_name.]table_name
>> > >>> [ ([<table_constraint>]) ]
>> > >>> [COMMENT table_comment]
>> > >>> [PARTITIONED BY (partition_column_name1, partition_column_name2, ...)]
>> > >>> [WITH (key1=val1, key2=val2, ...)]
>> > >>> FRESHNESS = INTERVAL '<num>' { SECOND | MINUTE | HOUR | DAY }
>> > >>> [REFRESH_MODE = { CONTINUOUS | FULL }]
>> > >>> AS <select_statement>
>> > >>> ```
>> > >>>
>> > >>> For a more detailed explanation of this proposal, please refer to
>> > >>> the FLIP-492[1] documentation.
>> > >>> Your feedback and suggestions are highly appreciated to help
>> > >>> refine this proposal further.
>> > >>>
>> > >>> Lastly, I’d like to thank Ron and Lincoln (cc’d) for their
>> > >>> valuable input and suggestions during the drafting process.
>> > >>>
>> > >>> Looking forward to hearing your thoughts!
>> > >>>
>> > >>> Best,
>> > >>> Feng Jin
>> > >>>
>> > >>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-492%3A+Support+Query+Modifications+for+Materialized+Tables
>> > >>> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-435%3A+Introduce+a+New+Materialized+Table+for+Simplifying+Data+Pipelines
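
For illustration, the two statements proposed in this thread can be sketched as follows. This is only a sketch under assumptions: the materialized table `dwd_orders`, the source table `orders`, and every column name are hypothetical and do not come from the FLIP. The ALTER statement appends the new column at the end of the schema, in line with the restriction discussed in the thread.

```sql
-- Hypothetical initial definition (all names are illustrative only).
CREATE MATERIALIZED TABLE dwd_orders
FRESHNESS = INTERVAL '3' MINUTE
REFRESH_MODE = CONTINUOUS
AS SELECT order_id, user_id, amount FROM orders;

-- Approach 1: modify the query while retaining historical data.
-- The new column `currency` is appended at the end of the schema.
ALTER MATERIALIZED TABLE dwd_orders
AS SELECT order_id, user_id, amount, currency FROM orders;

-- Approach 2: rebuild the table with a new query,
-- discarding all historical data.
CREATE OR REPLACE MATERIALIZED TABLE dwd_orders
FRESHNESS = INTERVAL '3' MINUTE
REFRESH_MODE = CONTINUOUS
AS SELECT order_id, user_id, amount * 100 AS amount_cents FROM orders;
```

Which approach fits depends on whether the existing rows remain meaningful under the new query: the first keeps them, the second starts from an empty table.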