Hi, everyone.
Thanks for all the input.
If there is no further feedback, I will start the vote tomorrow.
Best,
Godfrey
Jing Ge wrote on Wed, Jun 22, 2022 at 15:50:
>
> sounds good to me. Thanks!
>
> Best regards,
> Jing
>
> On Fri, Jun 17, 2022 at 5:37 AM godfrey he wrote:
>
> > Hi, Jing.
> >
> > Thanks
sounds good to me. Thanks!
Best regards,
Jing
On Fri, Jun 17, 2022 at 5:37 AM godfrey he wrote:
> Hi, Jing.
>
> Thanks for the feedback.
>
> >When will the converted SELECT statement of the ANALYZE TABLE be
> > submitted? right after the CREATE TABLE?
> The SELECT job will be submitted only
Hi Jark,
I have created the issue, and it will be done in release 1.16,
see https://issues.apache.org/jira/browse/FLINK-28074
Best,
Godfrey
Jark Wu wrote on Thu, Jun 16, 2022 at 18:03:
>
> Hi Godfrey,
>
> > we just need a JIRA to support it.
> Could you create the JIRA issue? I think it would be better if we
Hi, Jing.
Thanks for the feedback.
>When will the converted SELECT statement of the ANALYZE TABLE be
> submitted? right after the CREATE TABLE?
The SELECT job will be submitted only when `ANALYZE TABLE` is executed;
it has nothing to do with CREATE TABLE. Because the `ANALYZE TABLE`
is
Hi Godfrey,
> we just need a JIRA to support it.
Could you create the JIRA issue? I think it would be better if we can support
`DESC EXTENDED` and `ANALYZE TABLE` together in the 1.16 release.
Otherwise, it's hard for users to determine when to call ANALYZE TABLE.
Best,
Jark
On Thu, 16 Jun
Hi Godfrey,
Thanks for driving this! There are some areas where I couldn't find enough
information in the FLIP, just wondering if I could get more
explanation from you w.r.t. the following questions:
1. When will the converted SELECT statement of the ANALYZE TABLE be
submitted? right after the
Hi Jark,
Thanks for the input.
>Do we need to provide DESC EXTENDED statement like Spark[1] to
>show statistic for table/partition/columns?
We already support the `DESC EXTENDED` syntax, but currently only the table schema
is displayed. I think we just need a JIRA to support it.
> is it possible
Hi Godfrey, thanks for starting this discussion, this is a great feature
for batch users.
The FLIP looks good to me in general.
I only have 2 comments:
1) How do users know whether the given table or partition contains the required
statistics?
Do we need to provide DESC EXTENDED statement like
Hi 华宗
To unsubscribe, please send an email with any content to
dev-unsubscr...@flink.apache.org
Thanks
Best regards,
Jing
On Tue, Jun 14, 2022 at 2:05 AM 华宗 wrote:
> Unsubscribe
>
> At 2022-06-13 22:44:24, "cao zou" wrote:
> >Hi
Unsubscribe
At 2022-06-13 22:44:24, "cao zou" wrote:
>Hi godfrey, thanks for your detailed explanation.
>After the explanation and a glance over FLIP-231, I think it is
>really needed. +1 for this, and looking forward to it.
>
>best
>zoucao
>
>godfrey he wrote on Mon, Jun 13, 2022 at 14:43:
>
>> Hi Ingo,
>>
Hi godfrey, thanks for your detailed explanation.
After the explanation and a glance over FLIP-231, I think it is
really needed. +1 for this, and looking forward to it.
best
zoucao
godfrey he wrote on Mon, Jun 13, 2022 at 14:43:
> Hi Ingo,
>
> The semantics do not distinguish between batch and streaming.
> It works for
Hi Ingo,
The semantics do not distinguish between batch and streaming.
It works for both batch and streaming, but the result for
unbounded sources is meaningless.
Currently, I throw an exception for streaming mode,
and we can support streaming mode with bounded sources
in the future.
Best,
Godfrey
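
[Editor's illustration] The behavior described in the reply above can be sketched roughly as follows. This is a minimal sketch only, not the planner's actual code: the helper name `analyze_to_select`, its parameters, the chosen aggregates, and the output aliases are all hypothetical, picked to show the idea of turning an `ANALYZE TABLE` request into a plain `SELECT` and rejecting streaming mode.

```python
from typing import Sequence


def analyze_to_select(table: str,
                      columns: Sequence[str],
                      runtime_mode: str = "batch") -> str:
    """Build an illustrative SELECT that gathers basic statistics.

    Hypothetical helper: the name, signature, aggregates, and aliases
    are assumptions for illustration, not taken from the FLIP.
    """
    if runtime_mode != "batch":
        # Mirrors the behavior described in the thread: streaming mode
        # is rejected because results on unbounded sources are meaningless.
        raise ValueError("ANALYZE TABLE is only supported in batch mode")

    # Table-level statistic: total row count.
    exprs = ["COUNT(1) AS row_count"]
    for col in columns:
        # Column-level statistics, one group of aggregates per column.
        exprs.append(f"COUNT(1) - COUNT(`{col}`) AS `{col}_null_count`")
        exprs.append(f"COUNT(DISTINCT `{col}`) AS `{col}_ndv`")
        exprs.append(f"MIN(`{col}`) AS `{col}_min`")
        exprs.append(f"MAX(`{col}`) AS `{col}_max`")
    return "SELECT " + ", ".join(exprs) + f" FROM `{table}`"
```

Calling `analyze_to_select("orders", ["amount"])` returns a single aggregate query over `orders`, while passing `runtime_mode="streaming"` raises a `ValueError`. Because the result is an ordinary `SELECT`, it would go through the same optimization rules as any other query.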
Ingo
Hi Godfrey,
thank you for the explanation. A SELECT is definitely more generic and
will work for all connectors automatically. As such I think it's a good
baseline solution regardless.
We can also think about allowing connector-specific optimizations in the
future, but I do like your idea
Hi Ingo,
Thanks for the inputs.
I think converting `ANALYZE TABLE` to a `SELECT` statement is
a more generic approach. Because query plan optimization is more generic,
we can provide more optimization rules to optimize not only `SELECT` statement
converted from `ANALYZE TABLE` but also the `SELECT`
Hi cao,
Thanks for the feedback.
AFAIK, unlike in databases, statistics are not collected automatically
when writing data in many big data compute engines.
FLIP-231[1] has introduced the SupportsStatisticsReport interface, through which the planner
will collect the statistics from the connector when
Hi Godfrey,
compared to the solution proposed in the FLIP (using a SELECT
statement), I wonder if you have considered adding APIs to catalogs /
connectors to perform this task as an alternative?
I could imagine that for many connectors, statistics could be
implemented in a less expensive way
Hi godfrey, Thanks for driving this meaningful topic.
I think statistics are essential and meaningful for the optimizer; I'm just
wondering in which situations this is needed. From the user's side, the optimizer
should be run by the framework, and users may not want to think too
much about it. Could
Hi all,
I would like to open a discussion on FLIP-240: Introduce "ANALYZE
TABLE" Syntax.
As FLIP-231 mentioned, statistics are one of the most important inputs
to the optimizer. Accurate and complete statistics allow the
optimizer to be more powerful. "ANALYZE TABLE" syntax is a very common