Hi Martjin, Thank you for good feedback! I might not be able to address them all especially around catalog table schema inference design.
>* What I'm missing overall, is the section on 'Public Interfaces'. The FLIP has a large Proposed Changes section I updated https://cwiki.apache.org/confluence/display/FLINK/FLIP-237 %3A+Thrift+Format+Support as well as google doc with public interfaces and summary section outline areas we propose to change. Let me know if that helps. I agree, swallow errors should be opt-in based, user needs to add code in datastream code and check metrics in their code. Please check datastrem api section under public interfaces section. I am not aware if we have way to capture ser/deser errors with DLQ, the best knowledge is having side output but that's one level up in stack.( >I think 'Proposed Changes' needs a summary because you're suggesting multiple things and it easy to miss one of them. I categorized based on changes in summary. I agree this FLIP seems more like a journey, let me know if summary can focus on what audience needs to pay attention to (aka what need to change) > * With regards to mapping ENUM to STRING, what is to be expected if Flink will support ENUM in the future? Yes, we will move to ENUM as well. > * Is there a tight dependency between Thrift and Hive? When we externalize the Hive connector, can the Thrift format still work? We plan to introduce a flink-thrift in flink-format subproject with proper thrift.version that users can define in mvn build. Instead of getting thrift dep from the hive connector implicitly. More detail in public interfaces section. * The FLIP mentions usage of Thrift on both DataStream and SQL jobs; however, the FLIP is very SQL oriented. >I split a dedicated section in public changes with code examples how datastream api user can leverage our proposed change. * With regards to the Table/View Inference DDL, this is only providing the Hive metastore as options. I would like to understand how this could work with Catalogs in general, not with Hive only. What type of compatibility guarantees (backward, forward, full) does Thrift offer? > this part needs more work, current implementation is more like a hack as I described in the proposal doc. Looking for more suggestions if folks are already considering along inference schema support. I also adds section on schema compatibility (limit to der/deser, instead of catalog table topic) Thanks, Chen On Thu, Jun 9, 2022 at 12:34 AM Martijn Visser <martijnvis...@apache.org> wrote: > Hi Chen, > > Thanks for creating the FLIP and opening the discussion. I have a couple of > questions/remarks: > > * What I'm missing overall, is the section on 'Public Interfaces'. The FLIP > has a large Proposed Changes section, but it reads more like your journey > when you implemented Thrift in your fork. For example, it mentions corrupt > Thrift payloads causing issues, but I can't determine if you want to > propose to deal with this upfront or not. (I would not deal with this > upfront in the format implementation, because there will be users who want > to job the fail while others just want it to continue and send something to > a DLQ) > * I think 'Proposed Changes' needs a summary because you're suggesting > multiple things and it easy to miss one of them. > * With regards to mapping ENUM to STRING, what is to be expected if Flink > will support ENUM in the future? > * Is there a tight dependency between Thrift and Hive? When we externalize > the Hive connector, can the Thrift format still work? > * The FLIP mentions usage of Thrift on both DataStream and SQL jobs; > however, the FLIP is very SQL oriented. > * With regards to the Table/View Inference DDL, this is only providing the > Hive metastore as options. I would like to understand how this could work > with Catalogs in general, not with Hive only. What type of > compatibility guarantees (backward, forward, full) does Thrift offer? > > Best regards, > > Martijn > > Op di 7 jun. 2022 om 18:56 schreef Chen Qin <qinnc...@gmail.com>: > > > Thanks for the pointers, I moved the proposal to the wiki and updated the > > FLIP status page. > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-237%3A+Thrift+Format+Support > > > > > > Looking forward to getting community feedback. > > > > Chen > > > > > > > > On Tue, Jun 7, 2022 at 12:12 AM Jark Wu <imj...@gmail.com> wrote: > > > > > Yes. The community recommends keeping content on the wiki page > > > and discuss in the mailing list. Discussions on the google doc are > > > not so visible to the community, and "If it didn’t happen on a mailing > > > list, it didn’t happen." > > > > > > Best, > > > Jark > > > > > > On Tue, 7 Jun 2022 at 14:15, Jing Ge <j...@ververica.com> wrote: > > > > > > > Hi Chen, > > > > > > > > Thanks for driving this! Afaik, the community has the consensus to > > *Start > > > > a [DISCUSS] thread on the Apache mailing list*[1]. I just walked > > through > > > > some existing FLIPs and didn't find any that have been using google > doc > > > as > > > > the discussion thread. Would you like to follow the current process > and > > > > move the content to the FLIP Wiki page? Another question would be: > > could > > > > anyone in the community confirm that it is also fine to use google > doc > > to > > > > discuss? We'd better clarify it before kicking off the discussion. > > > Thanks! > > > > > > > > By the way, you might want to reserve the FLIP number 237 on [1] in > the > > > > table *Adopted/Accepted but unreleased FLIPs.* > > > > > > > > Best regards, > > > > Jing > > > > > > > > [1] > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-WhatshouldbeincludedinaFLIP > > > > ? > > > > > > > > On Tue, Jun 7, 2022 at 6:41 AM Chen Qin <qinnc...@gmail.com> wrote: > > > > > > > > > Hi there, > > > > > > > > > > I want to kick off the first round of FLIP-237 > > > > > < > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-237%3A+Thrift+Format+Support > > > > > >: > > > > > Thrift Format Support discussion. Notice for the area marked as > WIP, > > we > > > > are > > > > > looking for more feedback from folks, those areas would either stay > > in > > > > the > > > > > scope of current FLIP or removed based on feedback and discussion > > > > results. > > > > > > > > > > Google Doc > > > > > < > > > > > > > > > > > > > > > https://docs.google.com/document/d/1EhHewAW39pm-TX6fuUZLogWHK7vtgJ616WRwH7OOrgg/edit# > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > Chen Q > > > > > > > > > > > > > > >