Hi Martijn,

Thank you for your reply; these are two good questions.

>1. The FLIP mentions that if the user doesn't specify the WITH option part
>in the query of the sink table, it will be assumed that the user wants to
>create a managed table. What will happen if the user doesn't have Table
>Store configured/installed? Will we throw an error?

If the Catalog does not support managed tables and no `connector` is
specified, the corresponding TableSink cannot be generated and the
statement will fail.
If the Catalog supports managed tables and no `connector` is specified, it
will fail because the Table Store configuration is not set and no Table
Store jar is present.

>2. Will there be support included for FLIP-190 (version upgrades)?

FLIP-190 mainly solves the problem of upgrades in Streaming mode, while
FLIP-218 is used mostly in Batch mode scenarios.
The atomic CTAS implementation requires serialization support for the
Catalog and the hook, which currently cannot be serialized into JSON, so it
cannot support FLIP-190.
The non-atomic implementation is able to support FLIP-190.

--

Best regards,
Mang Zhang

At 2022-06-30 16:47:38, "Martijn Visser" <martijnvis...@apache.org> wrote:
>Hi Mang,
>
>I have two questions/remarks:
>
>1. The FLIP mentions that if the user doesn't specify the WITH option part
>in the query of the sink table, it will be assumed that the user wants to
>create a managed table. What will happen if the user doesn't have Table
>Store configured/installed? Will we throw an error?
>
>2. Will there be support included for FLIP-190 (version upgrades)?
>
>Best regards,
>
>Martijn
>
>On Wed, 29 Jun 2022 at 05:18, Mang Zhang <zhangma...@163.com> wrote:
>
>> Hi everyone,
>> Thank you to all those who participated in the discussion. After many
>> rounds of discussion, the proposal has been gradually revised and
>> improved. We look forward to further feedback and will launch a vote in
>> the next day or two.
>>
>> --
>>
>> Best regards,
>> Mang Zhang
>>
>> At 2022-06-28 22:23:16, "Mang Zhang" <zhangma...@163.com> wrote:
>> >Hi Yuxia,
>> >Thank you very much for your reply.
>> >
>> >>1: Also, the mixture of ctas and rtas confuses me as the FLIP says
>> >>nothing about rtas but suddenly refers to it in the configuration.
>> >>And if we're not to implement rtas in this FLIP, it may be better not to
>> >>refer to it, and `rtas` shouldn't be exposed to users as a configuration.
>> >
>> >RTAS is currently not supported because how to unify its semantics
>> >across streaming and batch mode, and the concrete business scenarios,
>> >are not yet clear. We will support it in the future; if we renamed the
>> >option only when adding RTAS support, users would bear the cost of
>> >changing their configuration.
>> >
>> >>2: How will the CTASJobStatusHook be passed to StreamGraph as a hook?
>> >>Could you please explain it? Some pseudocode would be much better if
>> >>possible. I'm lost in this part.
>> >
>> >This part is too much of an implementation detail, and of course we had
>> >to make some changes to achieve this. The FLIP focuses on semantic
>> >consistency between streaming and batch mode, with optional atomicity
>> >support.
>> >
>> >>3: The name `AtomicCatalog` confuses me. It seems the background for the
>> >>naming is to implement atomicity for ctas: we propose an interface for
>> >>catalogs to support serialization, then we name it `AtomicCatalog`. At
>> >>least, the interface is for the atomicity of ctas. But if we want to
>> >>implement other features like isolation, which may also require a
>> >>serializable catalog in the future, should we introduce a new interface
>> >>named `IsolateCatalog`? Have you considered other names like
>> >>`SerializableCatalog`? As it's a public interface, maybe we should be
>> >>careful about the name.
>> >
>> >Regarding the name of the Catalog interface, we also discussed
>> >`SerializableCatalog`, but it is too specific and does not convey the
>> >atomicity we want to express. For CTAS/RTAS to support atomicity, the
>> >Catalog needs to implement `AtomicCatalog`, which is more
>> >straightforward to understand.
>> >
>> >Hope this answers your question.
>> >
>> >--
>> >
>> >Best regards,
>> >Mang Zhang
>> >
>> >At 2022-06-28 11:36:51, "yuxia" <luoyu...@alumni.sjtu.edu.cn> wrote:
>> >>Thanks for updating. The FLIP looks generally good to me. I have only
>> >>minor questions:
>> >>
>> >>1: Also, the mixture of ctas and rtas confuses me as the FLIP says
>> >>nothing about rtas but suddenly refers to it in the configuration. And
>> >>if we're not to implement rtas in this FLIP, it may be better not to
>> >>refer to it, and `rtas` shouldn't be exposed to users as a configuration.
>> >>
>> >>2: How will the CTASJobStatusHook be passed to StreamGraph as a hook?
>> >>Could you please explain it? Some pseudocode would be much better if
>> >>possible. I'm lost in this part.
>> >>
>> >>3: The name `AtomicCatalog` confuses me. It seems the background for the
>> >>naming is to implement atomicity for ctas: we propose an interface for
>> >>catalogs to support serialization, then we name it `AtomicCatalog`. At
>> >>least, the interface is for the atomicity of ctas. But if we want to
>> >>implement other features like isolation, which may also require a
>> >>serializable catalog in the future, should we introduce a new interface
>> >>named `IsolateCatalog`? Have you considered other names like
>> >>`SerializableCatalog`? As it's a public interface, maybe we should be
>> >>careful about the name.
>> >>
>> >>Best regards,
>> >>Yuxia
>> >>
>> >>----- Original Message -----
>> >>From: "Mang Zhang" <zhangma...@163.com>
>> >>To: "dev" <dev@flink.apache.org>
>> >>Cc: imj...@gmail.com
>> >>Sent: Monday, June 27, 2022, 5:43:50 PM
>> >>Subject: Re:Re: Re:Re: Re: Re: Re: [DISCUSS] FLIP-218: Support SELECT
>> >>clause in CREATE TABLE(CTAS)
>> >>
>> >>Hi Jark,
>> >>First of all, thank you for your very good advice!
>> >>The RTAS point you mentioned is a good one, and we should support it as
>> >>well.
>> >>However, after investigating the semantics of RTAS and how RTAS is used
>> >>within the company, I found that:
>> >>1. The semantics of RTAS say that if the table exists, the old data must
>> >>be deleted and replaced with the new data.
>> >>These semantics are easier to implement in Batch mode; for example, if
>> >>the target table is a Hive table, the old data files can be deleted
>> >>directly.
>> >>But in Streaming mode, the target table is probably a Kafka topic, and
>> >>we can't delete the data.
>> >>So the semantics cannot easily be kept consistent between streaming and
>> >>batch scenarios.
>> >>2. I checked the big data SQL run in the company over the last week and
>> >>found that RTAS was not used.
>> >>No users in the company have mentioned the need for RTAS yet, so this
>> >>application scenario is not very clear.
>> >>
>> >>It is not clear what semantics RTAS should provide in streaming mode,
>> >>and the users' business scenarios are not very clear either.
>> >>Maybe we don't have to support RTAS soon, but we can leave room in the
>> >>interface definition for supporting RTAS in the future.
>> >>What do you think? Looking forward to your response!
>> >>
>> >>By the way, the other points raised have been updated. Thanks.
>> >>
>> >>--
>> >>
>> >>Best regards,
>> >>Mang Zhang
>> >>
>> >>At 2022-06-26 11:56:53, "Jark Wu" <imj...@gmail.com> wrote:
>> >>>Thanks for the update, Mang and Ron,
>> >>>
>> >>>The new proposal looks good to me in general, especially keeping the
>> >>>behavior consistent between batch and streaming mode by default. This
>> >>>is how we did it for the previous "table.dml-sync" option on the ML [1].
>> >>>
>> >>>Besides that, I just have some final minor comments regarding some
>> >>>interfaces.
>> >>>
>> >>>1) table.ctas-or-rtas.atomicity-enabled
>> >>>The "OR" keyword sounds like this configuration can only take effect on
>> >>>one of CTAS and RTAS.
>> >>>What about "table.ctas-and-rtas" or "table.ctas-rtas"?
>> >>>
>> >>>2) In the FLIP, you have mentioned RTAS many times, but have no plan to
>> >>>support it.
>> >>>RTAS is another widely used statement similar to CTAS. It seems there
>> >>>is not much difference between CTAS and RTAS. Considering we are
>> >>>introducing RTAS configurations, is it possible to support RTAS in this
>> >>>FLIP as well?
>> >>>
>> >>>3) connector.type
>> >>>"connector.type" has been deprecated since FLIP-95; could you replace
>> >>>it with 'connector'?
>> >>>
>> >>>4) SupportsAtomicCatalog
>> >>>I have some concerns about using the "Supports.." prefix, which is known
>> >>>as the ability extension for DynamicTableSource and DynamicTableSink.
>> >>>Maybe "AtomicCatalog" is enough?
>> >>>
>> >>>Best,
>> >>>Jark
>> >>>
>> >>>[1]: https://lists.apache.org/thread/78r8ybh4q3hkxf935vzjkb7782hqzcj2
>> >>>
>> >>>On Fri, 24 Jun 2022 at 22:51, Mang Zhang <zhangma...@163.com> wrote:
>> >>>
>> >>>> Hi all,
>> >>>> Thank you to all those who participated in the discussion and made
>> >>>> suggestions!
>> >>>> After several rounds of online and offline discussions, the solution
>> >>>> in the FLIP has been updated.
>> >>>> Looking forward to more feedback from everyone.
>> >>>>
>> >>>> --
>> >>>>
>> >>>> Best regards,
>> >>>> Mang Zhang
>> >>>>
>> >>>> At 2022-06-24 21:58:01, "Mang Zhang" <zhangma...@163.com> wrote:
>> >>>> >Hi godfrey and ron,
>> >>>> >Thank you very much for your replies and suggestions.
>> >>>> >Special thanks to ron for helping to review and improve the FLIP.
>> >>>> >Looking forward to further feedback from others.
>> >>>> >
>> >>>> >--
>> >>>> >
>> >>>> >Best regards,
>> >>>> >Mang Zhang
>> >>>> >
>> >>>> >At 2022-06-24 19:52:58, "ron" <ld...@zju.edu.cn> wrote:
>> >>>> >>Thanks, godfrey, for the further feedback. Your suggestions look
>> >>>> >>very good to me, and the FLIP has been updated according to your
>> >>>> >>feedback.
>> >>>> >>It would be great if you could look at it again.
>> >>>> >>
>> >>>> >>Also looking forward to further feedback from others.
>> >>>> >>
>> >>>> >>> -----Original Message-----
>> >>>> >>> From: "godfrey he" <godfre...@gmail.com>
>> >>>> >>> Sent: 2022-06-24 17:00:51 (Friday)
>> >>>> >>> To: dev <dev@flink.apache.org>
>> >>>> >>> Cc: "Yun Gao" <yungao...@aliyun.com>
>> >>>> >>> Subject: Re: Re: Re: [DISCUSS] FLIP-218: Support SELECT clause in
>> >>>> >>> CREATE TABLE(CTAS)
>> >>>> >>>
>> >>>> >>> Hi all,
>> >>>> >>>
>> >>>> >>> Sorry for the late reply.
>> >>>> >>>
>> >>>> >>> >table.cor-table-as-select.atomicity-enabled
>> >>>> >>> Regarding `cor`, this abbreviation is not commonly used.
>> >>>> >>>
>> >>>> >>> >Create Table As Select(CTAS) feature depends on the
>> >>>> >>> >serializability of the catalog. To quickly see if the catalog
>> >>>> >>> >supports CTAS, we need to try to serialize the catalog when
>> >>>> >>> >compile SQL in planner and if it fails, an exception will be
>> >>>> >>> >thrown to indicate to user that the catalog does not support
>> >>>> >>> >CTAS because it cannot be serialized.
>> >>>> >>> This behavior is too cryptic, and will break the current catalog
>> >>>> >>> behavior when using 1.16.
>> >>>> >>> I suggest we introduce a new interface for atomic catalogs which
>> >>>> >>> implements Serializable.
>> >>>> >>> Existing catalogs can choose whether to implement the new catalog
>> >>>> >>> interface.
>> >>>> >>>
>> >>>> >>> > Catalog#inferTableOptions
>> >>>> >>> I strongly recommend not introducing this feature now, because the
>> >>>> >>> behavior is unclear.
>> >>>> >>> 1) If the catalog supports managed tables, the connector option is
>> >>>> >>> empty. But if the user forgets to set the connector option for a
>> >>>> >>> CTAS statement, the created table will be a managed table.
>> >>>> >>> 2) The options and their values may differ between the catalog and
>> >>>> >>> the connector, so using the catalog options may cause unexpected
>> >>>> >>> errors.
>> >>>> >>>
>> >>>> >>> > StreamGraph#addJobStatusHook
>> >>>> >>> I prefer `registerJobStatusHook`
>> >>>> >>>
>> >>>> >>> Best,
>> >>>> >>> Godfrey
>> >>>> >>>
>> >>>> >>> Mang Zhang <zhangma...@163.com> wrote on Mon, Jun 13, 2022 at 16:43:
>> >>>> >>> >
>> >>>> >>> > Hi Yun,
>> >>>> >>> > Thanks for your reply!
>> >>>> >>> > After offline communication with Dalong, I added the
>> >>>> >>> > JobStatusHook part to the FLIP. Looking forward to your feedback.
>> >>>> >>> >
>> >>>> >>> > --
>> >>>> >>> >
>> >>>> >>> > Best regards,
>> >>>> >>> > Mang Zhang
>> >>>> >>> >
>> >>>> >>> > At 2022-05-31 14:34:25, "Yun Gao" <yungao...@aliyun.com.INVALID>
>> >>>> >>> > wrote:
>> >>>> >>> > >Hi,
>> >>>> >>> > >
>> >>>> >>> > >Regarding the drop operation, after some offline discussion with
>> >>>> >>> > >Dalong and Zhu, we think that listening on the client side might
>> >>>> >>> > >be problematic, since the client exits after submitting the job
>> >>>> >>> > >in detached mode; thus the operation might need to be on the
>> >>>> >>> > >JobMaster side.
>> >>>> >>> > >
>> >>>> >>> > >For the listener interface, currently JobListener only resides on
>> >>>> >>> > >the client side and contains methods unsuitable for this scenario,
>> >>>> >>> > >like onJobSubmitted, and the internal JobStatusListener is
>> >>>> >>> > >designed to be used inside the JM and is not serializable. Thus
>> >>>> >>> > >we tend to add a new interface, JobStatusHook, which could be
>> >>>> >>> > >attached to the JobGraph and executed in the JobMaster.
>> >>>> >>> > >The interface will also be marked as Internal.
>> >>>> >>> > >
>> >>>> >>> > >Best,
>> >>>> >>> > >Yun
>> >>>> >>> > >
>> >>>> >>> > >------------------------------------------------------------------
>> >>>> >>> > >From:Mang Zhang <zhangma...@163.com>
>> >>>> >>> > >Send Time:2022 May 25 (Wed.) 10:24
>> >>>> >>> > >To:dev <dev@flink.apache.org>
>> >>>> >>> > >Subject:Re:Re: [DISCUSS] FLIP-218: Support SELECT clause in
>> >>>> >>> > >CREATE TABLE(CTAS)
>> >>>> >>> > >
>> >>>> >>> > >Hi, Martijn
>> >>>> >>> > >Thanks for your reply!
>> >>>> >>> > >I looked at the SQL standard; CTAS is part of the SQL standard.
>> >>>> >>> > >Feature T172 is "AS subquery clause in table definition".
>> >>>> >>> > >
>> >>>> >>> > >--
>> >>>> >>> > >
>> >>>> >>> > >Best regards,
>> >>>> >>> > >Mang Zhang
>> >>>> >>> > >
>> >>>> >>> > >At 2022-05-04 21:49:00, "Martijn Visser"
>> >>>> >>> > ><martijnvis...@apache.org> wrote:
>> >>>> >>> > >>Hi everyone,
>> >>>> >>> > >>
>> >>>> >>> > >>Can we identify if this proposed syntax is part of the SQL
>> >>>> >>> > >>standard?
>> >>>> >>> > >>
>> >>>> >>> > >>Best regards,
>> >>>> >>> > >>
>> >>>> >>> > >>Martijn Visser
>> >>>> >>> > >>https://twitter.com/MartijnVisser82
>> >>>> >>> > >>https://github.com/MartijnVisser
>> >>>> >>> > >>
>> >>>> >>> > >>On Fri, 29 Apr 2022 at 11:19, yuxia
>> >>>> >>> > >><luoyu...@alumni.sjtu.edu.cn> wrote:
>> >>>> >>> > >>
>> >>>> >>> > >>> Thanks for driving this work; it will be a useful feature.
>> >>>> >>> > >>> About FLIP-218, I have some questions.
>> >>>> >>> > >>>
>> >>>> >>> > >>> 1: Does our CTAS syntax support specifying the target table's
>> >>>> >>> > >>> schema, including column names and data types? I think it may
>> >>>> >>> > >>> be a useful feature in case we want to change the data types
>> >>>> >>> > >>> in the target table instead of always copying the source
>> >>>> >>> > >>> table's schema. It'll be more flexible with this feature.
>> >>>> >>> > >>> Btw, MySQL's "CREATE TABLE ... SELECT Statement"[1] supports
>> >>>> >>> > >>> this feature.
>> >>>> >>> > >>>
>> >>>> >>> > >>> 2: It seems this will require the sink to implement a public
>> >>>> >>> > >>> interface to drop the table, so what will the interface look
>> >>>> >>> > >>> like?
>> >>>> >>> > >>>
>> >>>> >>> > >>> [1]
>> >>>> >>> > >>> https://dev.mysql.com/doc/refman/8.0/en/create-table-select.html
>> >>>> >>> > >>>
>> >>>> >>> > >>> Best regards,
>> >>>> >>> > >>> Yuxia
>> >>>> >>> > >>>
>> >>>> >>> > >>> ----- Original Message -----
>> >>>> >>> > >>> From: "Mang Zhang" <zhangma...@163.com>
>> >>>> >>> > >>> To: "dev" <dev@flink.apache.org>
>> >>>> >>> > >>> Sent: Thursday, April 28, 2022, 4:57:24 PM
>> >>>> >>> > >>> Subject: [DISCUSS] FLIP-218: Support SELECT clause in CREATE
>> >>>> >>> > >>> TABLE(CTAS)
>> >>>> >>> > >>>
>> >>>> >>> > >>> Hi, everyone
>> >>>> >>> > >>>
>> >>>> >>> > >>> I would like to open a discussion on supporting the SELECT
>> >>>> >>> > >>> clause in CREATE TABLE (CTAS).
>> >>>> >>> > >>> With the development of business and the enhancement of Flink
>> >>>> >>> > >>> SQL capabilities, queries become more and more complex.
>> >>>> >>> > >>> Currently the user needs to use the CREATE TABLE statement to
>> >>>> >>> > >>> create the target table first, and then execute the INSERT
>> >>>> >>> > >>> statement.
>> >>>> >>> > >>> However, the target table may have many columns, which brings
>> >>>> >>> > >>> the user a lot of work outside the business logic.
>> >>>> >>> > >>> At the same time, the user must ensure that the schema of the
>> >>>> >>> > >>> created target table is consistent with the schema of the
>> >>>> >>> > >>> query result.
>> >>>> >>> > >>> Using a CTAS syntax like that of Hive/Spark can make this much
>> >>>> >>> > >>> easier for the user.
>> >>>> >>> > >>>
>> >>>> >>> > >>> You can find more details in FLIP-218[1]. Looking forward to
>> >>>> >>> > >>> your feedback.
>> >>>> >>> > >>> >> >>>> >>> > >>> >> >>>> >>> > >>> >> >>>> >>> > >>> [1] >> >>>> >>> > >>> >> >>>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-218%3A+Support+SELECT+clause+in+CREATE+TABLE(CTAS) >> >>>> >>> > >>> >> >>>> >>> > >>> >> >>>> >>> > >>> >> >>>> >>> > >>> >> >>>> >>> > >>> -- >> >>>> >>> > >>> >> >>>> >>> > >>> Best regards, >> >>>> >>> > >>> Mang Zhang >> >>>> >>> > >>> >> >>>> >>> > > >> >>>> >> >> >>>> >> >> >>>> >>------------------------------ >> >>>> >>Best, >> >>>> >>Ron >> >>>> >>