Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-06-09 Thread liu ron
Hi, Mang Thanks for your update; the FLIP looks good to me now. Best, Ron On Fri, Jun 9, 2023 at 12:08, Mang Zhang wrote: > Hi Ron, > Thanks for your reply! > After our offline discussion: at present, there may be many Flink jobs > using non-atomic CTAS, especially streaming jobs. > If we only

Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-06-08 Thread liu ron
Hi, Mang In FLIP-214, we have discussed that atomicity is not needed in streaming mode, so we have implemented the initial version that doesn't support atomicity. In addition, we introduce the option "table.ctas.atomicity-enabled" to enable the atomic ability. According to your FLIP-315

Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-06-08 Thread Jark Wu
Thank you for the great work, Mang! The updated proposal looks good to me. Best, Jark > On Jun 8, 2023, at 11:49, Jingsong Li wrote: > > Thanks Mang for updating! > > Looks good to me! > > Best, > Jingsong > > On Wed, Jun 7, 2023 at 2:31 PM Mang Zhang wrote: >> >> Hi Jingsong, >> >>> I have some

Re: Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-06-07 Thread Jingsong Li
Thanks Mang for updating! Looks good to me! Best, Jingsong On Wed, Jun 7, 2023 at 2:31 PM Mang Zhang wrote: > > Hi Jingsong, > > >I have some doubts about the `TwoPhaseCatalogTable`. Generally, our > >Flink design places execution in the TableFactory or directly in the > >Catalog, so

Re:Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-06-07 Thread Mang Zhang
Hi Jingsong, >I have some doubts about the `TwoPhaseCatalogTable`. Generally, our >Flink design places execution in the TableFactory or directly in the >Catalog, so introducing an executable table makes me feel a bit >strange. (Spark is this style, but Flink may not be) On this issue, we

Re:Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-05-14 Thread Mang Zhang
Hi Jingsong, Thank you for your reply! We introduced `TwoPhaseCatalogTable` for two reasons: 1. The `TwoPhaseCatalogTable` of each data source can support more operations; going through the Catalog allows only simple create table and drop table, which is not flexible enough. For example, deleting a

Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-05-11 Thread Jingsong Li
Hi Mang, Thanks for starting this FLIP. I have some doubts about the `TwoPhaseCatalogTable`. Generally, our Flink design places execution in the TableFactory or directly in the Catalog, so introducing an executable table makes me feel a bit strange. (Spark is this style, but Flink may not be)

Re:Re: Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-05-10 Thread Mang Zhang
Hi Jing, Currently, we cannot determine in the planner whether the source is bounded or unbounded. So when we design the API, we use the execution model to help determine if atomicity can be supported. Thank you very much for your reply! -- Best regards, Mang Zhang At 2023-04-28

Re: Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-04-28 Thread Jing Ge
Hi Mang, Boundedness and execution modes are two orthogonal concepts. Atomic CTAS will only be supported for bounded data, which means it does not depend on the execution mode. I was wondering if it is possible to only provide (or call) twoPhaseCreateTable for bounded data (in both

Re:Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-04-25 Thread Mang Zhang
Hi Jing, Yes, the atomic CTAS will be only supported for bounded data, but the execution modes can be stream or batch. I introduced the isStreamingMode parameter in the twoPhaseCreateTable API to make it easier for users to provide different levels of atomicity implementation depending on the

Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-04-24 Thread Jing Ge
Hi Mang, Thanks for clarifying it. I am trying to understand your thoughts. Do you actually mean the boundedness[1] instead of the execution modes[2]? I.e. the atomic CTAS will be only supported for bounded data. Best regards, Jing [1]

Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-04-23 Thread liu ron
Hi, Mang I have a question about the implementation details. In the atomic case, the target table is not created before the JobGraph is generated, yet the target table is required to exist when optimizing the plan to generate the JobGraph. So how do you solve this problem? Best, Ron

Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-04-19 Thread yuxia
Sharing some insights about the new TwoPhaseCatalogTable proposed after an offline discussion with Mang. The main reason is that the TwoPhaseCatalogTable enables external connectors to implement their own logic for commit / abort. In FLIP-218, for atomic CTAS, the Catalog will then
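
The commit / abort idea mentioned above can be illustrated with a minimal sketch. Everything in it is hypothetical and exists only for illustration: `TwoPhaseTableSketch`, `InMemoryTwoPhaseTable`, and `AtomicCtasSketch` are not Flink classes, and the `begin` / `commit` / `abort` method names are an assumption about what such a connector hook could look like, not the interface proposed in the FLIP.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a connector-provided two-phase table; NOT a Flink API. */
interface TwoPhaseTableSketch {
    void begin();   // assumed hook: called before the job starts
    void commit();  // assumed hook: called when the job finishes successfully
    void abort();   // assumed hook: called when the job fails
}

/** Toy in-memory table that only becomes visible in the "catalog" on commit. */
final class InMemoryTwoPhaseTable implements TwoPhaseTableSketch {
    private final Set<String> visibleTables;
    private final String name;
    final List<String> events = new ArrayList<>();

    InMemoryTwoPhaseTable(Set<String> visibleTables, String name) {
        this.visibleTables = visibleTables;
        this.name = name;
    }

    @Override
    public void begin() {
        events.add("begin");
    }

    @Override
    public void commit() {
        // Publish the table only after the job has succeeded.
        visibleTables.add(name);
        events.add("commit");
    }

    @Override
    public void abort() {
        // Make sure nothing is left behind for a failed job.
        visibleTables.remove(name);
        events.add("abort");
    }
}

final class AtomicCtasSketch {
    /** Runs the job between begin() and commit(); calls abort() if the job throws. */
    static boolean run(TwoPhaseTableSketch table, Runnable job) {
        table.begin();
        try {
            job.run();
        } catch (RuntimeException e) {
            table.abort();
            return false;
        }
        table.commit();
        return true;
    }
}
```

In this sketch a successful job leaves the table visible, while a failing job ends with `abort()` and leaves the toy catalog unchanged. A real connector would put its own logic into these hooks, which is the flexibility the message above argues for.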

Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-04-14 Thread Jing Ge
Hi Mang, This is the FLIP I was looking forward to after FLIP-218. Thanks for driving it. I have two questions and would like to know your thoughts, thanks: 1. It looks like you found another way to design the atomic CTAS with new serializable TwoPhaseCatalogTable instead of making Catalog

Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-04-14 Thread yuxia
Hi, Mang. +1 for completing the support for atomicity of CTAS; this is very useful in batch scenarios and when integrating with data lakes that support transactions. I just have one question: IIUC, the DynamicTableSink will need to know whether it's for the normal case or the atomicity with CTAS as well as

Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-04-12 Thread Lincoln Lee
Hi, Mang +1 for completing the support for atomicity of CTAS, this is very useful in batch scenarios. I have two questions: 1. naming wise: a) can we rename the `Catalog#getTwoPhaseCommitCreateTable` to `Catalog#twoPhaseCreateTable` (and we may add

Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-04-12 Thread liu ron
Hi, Mang Atomicity is very important for CTAS, especially for batch jobs. This FLIP is a continuation of FLIP-218, which is valuable for CTAS. I just have one question, in the Motivation part of FLIP-218, we mentioned three levels of atomicity semantics, can this current design do the same as