Hi Chesnay and Konstantin,
thanks for your feedback. I have added a section to the doc about how we
support setting the description in the DataStream API.


Bests,
Wenlong

On Tue, 16 Nov 2021 at 21:05, Konstantin Knauf <kna...@apache.org> wrote:

> Hi everyone,
>
> Thanks for starting this discussion. I am in favor of solving this for
> DataStream and Table API at the same time, using the same configuration
> keys. IMO we shouldn't introduce any additional fragmentation if we can
> avoid it.
>
> Cheers,
>
> Konstantin
>
> On Tue, Nov 16, 2021 at 1:50 PM wenlong.lwl <wenlong88....@gmail.com>
> wrote:
>
> > Hi Chesnay, we focus on SQL first because the operators and topology of
> > SQL jobs are generated by the engine, which causes most of the naming
> > problems: not only are the names long, but the topology can also be more
> > complex than in DataStream jobs.
> >
> > The situation in DataStream is much better. Most of the names in the
> > DataStream API are quite concise, except for the windowing you mentioned,
> > and the topology is usually simpler. What's more, we can easily expose
> > this to the DataStream API as a second step once the foundational
> > implementation is done. If necessary, we can also cover the changes to
> > the DataStream API now, maybe taking windowing as a first example?
> >
> > Best,
> > Wenlong
> >
> > On Tue, 16 Nov 2021 at 19:14, Chesnay Schepler <ches...@apache.org> wrote:
> >
> > > Why should this be specific to the table API? The datastream API has
> > > similar issues with long operator names (like windowing).
> > >
> > > On 16/11/2021 11:22, wenlong.lwl wrote:
> > > > > Thanks Godfrey for the suggestion.
> > > > > Regarding 1, how about table.optimizer.simplify-operator-name-enabled,
> > > > > which means that we would simplify the name of the operator and keep
> > > > > the details in the description only?
> > > > > I think "table.optimizer.operator-name.description-enabled" does not
> > > > > describe what the option means.
> > > > > Regarding 2, I agree that it is better to use an enum instead of a
> > > > > boolean. For the key, I think you mean
> > > > > "pipeline.vertex-description-pattern" instead of
> > > > > "pipeline.vertex-name-pattern", and I would like to choose
> > > > > DEFAULT/TREE for the values.
> > > >
> > > > Best,
> > > > Wenlong
> > > >
> > > > > On Tue, 16 Nov 2021 at 17:28, godfrey he <godfre...@gmail.com> wrote:
> > > > >
> > > > >> Thanks for creating this FLIP, Wenlong.
> > > > >>
> > > > >> The FLIP already looks pretty solid. I think the config options can
> > > > >> be improved a little:
> > > > >> 1) About table.optimizer.separate-name-and-description: I think
> > > > >> "operator-name" should be part of the option name.
> > > > >> How about table.optimizer.operator-name.description-enabled?
> > > > >> 2) About pipeline.tree-mode-vertex-description: I think we can make
> > > > >> the mode accept a string value, which is more flexible.
> > > > >> How about pipeline.vertex-name-pattern, where the default value is
> > > > >> "TREE" and the other option is "CASCADE" (or "DEFAULT", which is
> > > > >> simpler)?
> > > > >>
> > > > >> What do you think?
> > > >>
> > > >> Best,
> > > >> Godfrey
> > > >>
> > > > >> On Mon, Nov 15, 2021 at 6:36 PM wenlong.lwl <wenlong88....@gmail.com> wrote:
> > > >>
> > > > >>> Hi all, FYI the FLIP doc has been created:
> > > > >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-195%3A+Improve+the+name+and+structure+of+vertex+and+operator+name+for+sql+job
> > > >>> Best,
> > > >>> Wenlong
> > > >>>
> > > > >>> On Mon, 15 Nov 2021 at 11:41, wenlong.lwl <wenlong88....@gmail.com> wrote:
> > > > >>>> Hi all,
> > > > >>>> Thanks for the feedback. It seems that the proposal is accepted by
> > > > >>>> all of you. I will prepare a formal FLIP document and then move on
> > > > >>>> to the vote stage.
> > > > >>>> If anyone has any other comments or suggestions, please let me
> > > > >>>> know, thanks.
> > > >>>>
> > > >>>> Best,
> > > >>>> Wenlong
> > > >>>>
> > > >>>> On Fri, 12 Nov 2021 at 05:54, Neng Lu <nl...@apache.org> wrote:
> > > >>>>
> > > >>>>> +1 (non-binding)
> > > >>>>> This change will really help to ease developer life.
> > > >>>>>
> > > > >>>>> On Thu, Nov 11, 2021 at 6:33 AM Guowei Ma <guowei....@gmail.com> wrote:
> > > > >>>>>> +1
> > > > >>>>>> This would be very helpful for debugging our online jobs.
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Guowei
> > > >>>>>>
> > > >>>>>>
> > > > >>>>>> On Thu, Nov 11, 2021 at 8:03 PM Yuepeng Pan <flin...@126.com> wrote:
> > > >>>>>>
> > > >>>>>>> +1. It's useful to understand the job topology.
> > > >>>>>>> Looking forward to this feature.
> > > >>>>>>> Best,
> > > >>>>>>> Yuepeng Pan.
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > > >>>>>>> At 2021-11-11 19:44:44, "Yangze Guo" <karma...@gmail.com> wrote:
> > > >>>>>>>> +1. That's gonna help a lot for debugging.
> > > >>>>>>>>
> > > >>>>>>>> Best,
> > > >>>>>>>> Yangze Guo
> > > >>>>>>>>
> > > > >>>>>>>> On Thu, Nov 11, 2021 at 7:37 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > > > >>>>>>>>> This improvement looks like it makes the life of our users a lot
> > > > >>>>>>>>> easier when it comes to understanding logs and reading the UI.
> > > > >>>>>>>>> Hence +1.
> > > >>>>>>>>> Cheers,
> > > >>>>>>>>> Till
> > > >>>>>>>>>
> > > > >>>>>>>>> On Thu, Nov 11, 2021 at 11:59 AM JING ZHANG <beyond1...@gmail.com> wrote:
> > > > >>>>>>>>>> Big +1.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> This is a problem we frequently encounter on our production
> > > > >>>>>>>>>> platform. Looking forward to this improvement.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best,
> > > >>>>>>>>>> Jing Zhang
> > > >>>>>>>>>>
> > > > >>>>>>>>>> On Thu, Nov 11, 2021 at 6:26 PM Martijn Visser <mart...@ververica.com> wrote:
> > > >>>>>>>>>>> +1. Looks much better now
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Thu, 11 Nov 2021 at 11:07, godfrey he <godfre...@gmail.com> wrote:
> > > > >>>>>>>>>>>> Thanks for driving this. This improvement solves a problem
> > > > >>>>>>>>>>>> that users have complained about for a long time, +1
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best,
> > > >>>>>>>>>>>> Godfrey
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Thu, Nov 11, 2021 at 5:40 PM Jark Wu <imj...@gmail.com> wrote:
> > > >>>>>>>>>>>>> +1 for this. It looks much more clear and structured.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>> Jark
> > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Thu, 11 Nov 2021 at 17:23, Chesnay Schepler <ches...@apache.org> wrote:
> > > > >>>>>>>>>>>>>> I'm generally in favor of it, and there are already tickets
> > > > >>>>>>>>>>>>>> that proposed a dedicated operator/vertex description:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-20388
> > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-21858
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On 11/11/2021 10:02, wenlong.lwl wrote:
> > > > >>>>>>>>>>>>>>> Hi all, I would like to start a discussion about improving
> > > > >>>>>>>>>>>>>>> the name and structure of job vertex names, mainly to improve
> > > > >>>>>>>>>>>>>>> the experience of debugging and analyzing SQL jobs at runtime.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> The main proposed changes include:
> > > > >>>>>>>>>>>>>>> 1. Separate the description and the name of an operator, so
> > > > >>>>>>>>>>>>>>> that we can have detailed info in the description and a shorter
> > > > >>>>>>>>>>>>>>> name, which is friendlier for external systems like
> > > > >>>>>>>>>>>>>>> logging/metrics, without losing useful information.
> > > > >>>>>>>>>>>>>>> 2. Introduce a tree-mode vertex description, which makes the
> > > > >>>>>>>>>>>>>>> description more readable and easier to understand.
> > > > >>>>>>>>>>>>>>> 3. Clean up and improve the descriptions of SQL operators.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Here is an example with the changes for a SQL job:
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> vertex name:
> > > > >>>>>>>>>>>>>>> GlobalGroupAggregate[52] -> (Calc[53] -> NotNullEnforcer[54] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_003[54], Calc[55] -> NotNullEnforcer[56] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_004[56])
> > > > >>>>>>>>>>>>>>> vertex description:
> > > > >>>>>>>>>>>>>>> [52]:GlobalGroupAggregate(groupBy=[stat_date, spm_url_ab, client], select=[stat_date, spm_url_ab, client, COUNT(count1$0) AS clk_cnt_app_mtr_001, COUNT(distinct$0 count$1) AS clk_uv_app_mtr_001, COUNT(count1$2) AS clk_cnt_app_mtr_002, COUNT(distinct$0 count$3) AS clk_uv_app_mtr_002, COUNT(count1$4) AS clk_cnt_app_mtr_003, COUNT(distinct$0 count$5) AS clk_uv_app_mtr_003])
> > > > >>>>>>>>>>>>>>> :- [53]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT(client, ':client'), CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> > > > >>>>>>>>>>>>>>> :  +- [54]:NotNullEnforcer(fields=[rowkey])
> > > > >>>>>>>>>>>>>>> :     +- [54]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_003], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> > > > >>>>>>>>>>>>>>> +- [55]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT('ddd:', stat_date), CONCAT(client, ':client')), (client = ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '92459')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '92459:app', CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> > > > >>>>>>>>>>>>>>>    +- [56]:NotNullEnforcer(fields=[rowkey])
> > > > >>>>>>>>>>>>>>>       +- [56]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_004], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> For more details on the proposal:
> > > > >>>>>>>>>>>>>>> https://docs.google.com/document/d/1VUVJeHY_We09GY53-K2lETP3HUNZG9wMKyecFWk_Wxk
> > > > >>>>>>>>>>>>>>> <https://docs.google.com/document/d/1VUVJeHY_We09GY53-K2lETP3HUNZG9wMKyecFWk_Wxk/edit#>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Looking forward to your feedback, thanks.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Bests
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Wenlong Lyu
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > >
> > >
> >
>
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
>
