Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-23 Thread godfrey he
Hi, everyone. Thanks for all the inputs. If there is no further feedback, I will start the vote tomorrow. Best, Godfrey On Wed, Jun 22, 2022 at 15:50, Jing Ge wrote: > > sounds good to me. Thanks! > > Best regards, > Jing > > On Fri, Jun 17, 2022 at 5:37 AM godfrey he wrote: > > > Hi, Jing. > > > > Thanks

Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-22 Thread Jing Ge
sounds good to me. Thanks! Best regards, Jing On Fri, Jun 17, 2022 at 5:37 AM godfrey he wrote: > Hi, Jing. > > Thanks for the feedback. > > >When will the converted SELECT statement of the ANALYZE TABLE be > > submitted? right after the CREATE TABLE? > The SELECT job will be submitted only

Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-16 Thread godfrey he
Hi Jark, I have created the issue, and it will be done in release 1.16; see https://issues.apache.org/jira/browse/FLINK-28074 Best, Godfrey On Thu, Jun 16, 2022 at 18:03, Jark Wu wrote: > > Hi Godfrey, > > > we just need a JIRA to support it. > Could you create the JIRA issue? I think it would be better if we

Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-16 Thread godfrey he
Hi, Jing. Thanks for the feedback. >When will the converted SELECT statement of the ANALYZE TABLE be > submitted? right after the CREATE TABLE? The SELECT job will be submitted only when `ANALYZE TABLE` is executed, and it has nothing to do with CREATE TABLE. Because the `ANALYZE TABLE` is

Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-16 Thread Jark Wu
Hi Godfrey, > we just need a JIRA to support it. Could you create the JIRA issue? I think it would be better if we can support `DESC EXTENDED` and `ANALYZE TABLE` together in the 1.16 release. Otherwise, it's hard for users to determine when to call ANALYZE TABLE. Best, Jark On Thu, 16 Jun

Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-15 Thread Jing Ge
Hi Godfrey, Thanks for driving this! There are some areas where I couldn't find enough information in the FLIP, just wondering if I could get more explanation from you w.r.t. the following questions: 1. When will the converted SELECT statement of the ANALYZE TABLE be submitted? right after the

Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-15 Thread godfrey he
Hi Jark, Thanks for the inputs. >Do we need to provide DESC EXTENDED statement like Spark[1] to >show statistic for table/partition/columns? We already support the `DESC EXTENDED` syntax, but currently only the table schema is displayed. I think we just need a JIRA to support it. > is it possible

Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-14 Thread Jark Wu
Hi Godfrey, thanks for starting this discussion, this is a great feature for batch users. The FLIP looks good to me in general. I only have 2 comments: 1) How do users know whether the given table or partition contains the required statistics? Do we need to provide a DESC EXTENDED statement like

Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-13 Thread Jing Ge
Hi 华宗, to unsubscribe, please send any message to dev-unsubscr...@flink.apache.org Thanks Best regards, Jing On Tue, Jun 14, 2022 at 2:05 AM 华宗 wrote: > unsubscribe > > At 2022-06-13 22:44:24, "cao zou" wrote: > >Hi

Re:Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-13 Thread 华宗
unsubscribe At 2022-06-13 22:44:24, "cao zou" wrote: >Hi godfrey, thanks for your detailed explanation. >After your explanation and a glance over FLIP-231, I think it is >really needed, +1 for this and looking forward to it. > >best >zoucao > >On Mon, Jun 13, 2022 at 14:43, godfrey he wrote: > >> Hi Ingo, >>

Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-13 Thread cao zou
Hi godfrey, thanks for your detailed explanation. After your explanation and a glance over FLIP-231, I think it is really needed, +1 for this and looking forward to it. best zoucao On Mon, Jun 13, 2022 at 14:43, godfrey he wrote: > Hi Ingo, > > The semantics does not distinguish batch and streaming, > It works for

Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-13 Thread godfrey he
Hi Ingo, The semantics do not distinguish between batch and streaming. It works for both batch and streaming, but the result for unbounded sources is meaningless. Currently, I throw an exception in streaming mode, and we can support streaming mode with bounded sources in the future. Best, Godfrey Ingo
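The behaviour described in this message (reject `ANALYZE TABLE` in streaming mode, allow it in batch mode) can be pictured with a minimal Python sketch. The names `RuntimeMode`, `UnsupportedModeError`, and `check_analyze_allowed` are hypothetical and exist only for illustration; they are not Flink APIs, and the real check lives in Flink's planner.

```python
from enum import Enum


class RuntimeMode(Enum):
    """Hypothetical stand-in for the execution mode of a job."""
    BATCH = "batch"
    STREAMING = "streaming"


class UnsupportedModeError(Exception):
    """Hypothetical error raised when a statement is not allowed in a mode."""


def check_analyze_allowed(mode: RuntimeMode) -> None:
    """Raise if ANALYZE TABLE is requested in streaming mode.

    Assumption for illustration: statistics over unbounded input are
    meaningless, so streaming mode is rejected outright.
    """
    if mode is RuntimeMode.STREAMING:
        raise UnsupportedModeError(
            "ANALYZE TABLE is not supported in streaming mode"
        )


if __name__ == "__main__":
    check_analyze_allowed(RuntimeMode.BATCH)  # passes silently
    try:
        check_analyze_allowed(RuntimeMode.STREAMING)
    except UnsupportedModeError as err:
        print(err)
```

Supporting streaming mode with bounded sources later, as the message suggests, would mean replacing the mode check with a check on whether every source is bounded.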

Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-13 Thread Ingo Bürk
Hi Godfrey, thank you for the explanation. A SELECT is definitely more generic and will work for all connectors automatically. As such I think it's a good baseline solution regardless. We can also think about allowing connector-specific optimizations in the future, but I do like your idea

Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-12 Thread godfrey he
Hi Ingo, Thanks for the inputs. I think converting `ANALYZE TABLE` to a `SELECT` statement is a more generic approach. Because query plan optimization is more generic, we can provide more optimization rules to optimize not only the `SELECT` statement converted from `ANALYZE TABLE` but also the `SELECT`
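As a rough illustration of the `ANALYZE TABLE` to `SELECT` conversion discussed in this message, the Python sketch below builds an aggregate query for a table and an optional list of columns. The helper name `build_analyze_select`, the chosen aggregates, and the output aliases are assumptions made for illustration; they are not taken from the FLIP text or from Flink's planner.

```python
from typing import Sequence


def build_analyze_select(table: str, columns: Sequence[str] = ()) -> str:
    """Build a SELECT that gathers a row count plus simple column statistics.

    Hypothetical helper: the aggregates and aliases are illustrative
    assumptions, not the statement Flink actually generates.
    """
    # Table-level statistic: total number of rows.
    projections = ["COUNT(1) AS row_count"]
    for col in columns:
        quoted = f"`{col}`"
        projections.extend([
            # Number of distinct values in the column.
            f"COUNT(DISTINCT {quoted}) AS `{col}_ndv`",
            # NULL count: all rows minus rows where the column is non-NULL.
            f"COUNT(1) - COUNT({quoted}) AS `{col}_null_count`",
            # Value range of the column.
            f"MAX({quoted}) AS `{col}_max`",
            f"MIN({quoted}) AS `{col}_min`",
        ])
    return f"SELECT {', '.join(projections)} FROM `{table}`"


if __name__ == "__main__":
    print(build_analyze_select("orders", ["amount"]))
```

Because the result is an ordinary query, any optimization rule that applies to `SELECT` statements applies to it as well, which is the point made in the message.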

Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-12 Thread godfrey he
Hi cao, Thanks for the feedback. AFAIK, unlike databases' behavior, statistics are not collected automatically when writing data in many big data compute engines. FLIP-231[1] has introduced the SupportsStatisticsReport interface, through which the planner will collect the statistics from the connector when

Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-10 Thread Ingo Bürk
Hi Godfrey, compared to the solution proposed in the FLIP (using a SELECT statement), I wonder if you have considered adding APIs to catalogs / connectors to perform this task as an alternative? I could imagine that for many connectors, statistics could be implemented in a less expensive way

Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-10 Thread cao zou
Hi godfrey, Thanks for driving this meaningful topic. I think statistics are essential and meaningful for the optimizer; I'm just wondering in which situations this is needed. From the user side, the optimizer should be executed by the framework, and users may not want to think too much about it. Could

[DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-10 Thread godfrey he
Hi all, I would like to open a discussion on FLIP-240: Introduce "ANALYZE TABLE" Syntax. As FLIP-231 mentioned, statistics are one of the most important inputs to the optimizer. Accurate and complete statistics allow the optimizer to be more powerful. "ANALYZE TABLE" syntax is a very common