+1 for the FLIP. We have run into cases where a long name stalled metric
collection for SQL jobs.
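
A minimal sketch of the name/description split proposed in the quoted thread below, assuming description entries follow the `[id]:Type(details)` form used in the example there. The helper name `short_name` is hypothetical and this is not Flink's implementation; it only illustrates how a concise name such as `Calc[53]` relates to the detailed description. Sink names in the thread's example use a different form (`Sink: table[id]`), which this sketch does not cover.

```python
import re

# Matches a description entry such as "[53]:Calc(select=[...])".
_ENTRY = re.compile(r"^\[(\d+)\]:(\w+)\(")


def short_name(description: str) -> str:
    """Derive a concise 'Type[id]' name from an '[id]:Type(details)' entry.

    Hypothetical helper for illustration only; not Flink's implementation.
    """
    match = _ENTRY.match(description)
    if match is None:
        raise ValueError(f"unrecognized description entry: {description!r}")
    node_id, node_type = match.groups()
    return f"{node_type}[{node_id}]"


if __name__ == "__main__":
    print(short_name("[54]:NotNullEnforcer(fields=[rowkey])"))
```

The short form keeps the id, so a metric or log line that carries only the name can still be matched back to the full description.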

wenlong.lwl <wenlong88....@gmail.com> wrote on Fri, Nov 19, 2021 at 10:29 PM:

> Hi Yun,
> Thanks for the suggestion, but I am not sure whether we need such a prefix,
> because the log already includes the vertex id. Once the name is concise
> enough, we can find the vertex id easily.
> Does anyone have any comments on this?
>
>
> Best,
> Wenlong Lyu
>
> On Thu, 18 Nov 2021 at 19:03, Yun Tang <tang...@apache.org> wrote:
>
> > Hi Wenlong,
> >
> > Thanks for bringing up this discussion. I believe many people have
> > suffered from long, unreadable operator names for a long time.
> >
> > I have another suggestion, inspired by Aitozi: we could add a hint that
> > shows the vertex index, for example changing the pipeline from
> > "source --> flatMap --> sink" to "[vertex-0] source --> [vertex-1]
> > flatMap --> [vertex-2] sink".
> > This would make it much easier for users and developers to tell which
> > vertex is at fault when an exception occurs.
> >
> > Best
> > Yun Tang
> >
> > On 2021/11/17 07:42:28 godfrey he wrote:
> > > Hi Wenlong, I'm fine with the config options.
> > >
> > > Best,
> > > Godfrey
> > >
> > > wenlong.lwl <wenlong88....@gmail.com> wrote on Wed, Nov 17, 2021 at 3:13 PM:
> > >
> > > >
> > > > Hi Chesnay and Konstantin,
> > > > thanks for your feedback. I have added a section to the doc on how we
> > > > support setting the description in the DataStream API.
> > > >
> > > >
> > > > Bests,
> > > > Wenlong
> > > >
> > > > On Tue, 16 Nov 2021 at 21:05, Konstantin Knauf <kna...@apache.org> wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > Thanks for starting this discussion. I am in favor of solving this for
> > > > > DataStream and Table API at the same time, using the same configuration
> > > > > keys. IMO we shouldn't introduce any additional fragmentation if we can
> > > > > avoid it.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Konstantin
> > > > >
> > > > > On Tue, Nov 16, 2021 at 1:50 PM wenlong.lwl <wenlong88....@gmail.com> wrote:
> > > > >
> > > > > > Hi Chesnay, we focus on SQL first because the operators and topology of
> > > > > > SQL jobs are generated by the engine, which causes most of the naming
> > > > > > problems: not only are the names long, but the topology can also be
> > > > > > more complex than in DataStream jobs.
> > > > > >
> > > > > > The situation in DataStream is much better: most of the names in the
> > > > > > DataStream API are quite concise, except for the windowing you
> > > > > > mentioned, and the topology is usually simpler. What's more, we can
> > > > > > easily expose this to the DataStream API as a second step once the
> > > > > > foundation is implemented. If necessary, we can also cover the changes
> > > > > > to the DataStream API now, maybe taking windowing first as an example?
> > > > > >
> > > > > > Best,
> > > > > > Wenlong
> > > > > >
> > > > > > On Tue, 16 Nov 2021 at 19:14, Chesnay Schepler <ches...@apache.org> wrote:
> > > > > > >
> > > > > > > Why should this be specific to the Table API? The DataStream API has
> > > > > > > similar issues with long operator names (like windowing).
> > > > > > >
> > > > > > > On 16/11/2021 11:22, wenlong.lwl wrote:
> > > > > > > > Thanks Godfrey for the suggestion.
> > > > > > > > Regarding 1, how about table.optimizer.simplify-operator-name-enabled,
> > > > > > > > which means that we would simplify the operator name and keep the
> > > > > > > > details in the description only? I don't think
> > > > > > > > "table.optimizer.operator-name.description-enabled" describes what the
> > > > > > > > option does.
> > > > > > > > Regarding 2, I agree that it is better to use an enum instead of a
> > > > > > > > boolean. For the key, I think you mean
> > > > > > > > "pipeline.vertex-description-pattern" instead of
> > > > > > > > "pipeline.vertex-name-pattern", and I would like to choose DEFAULT/TREE
> > > > > > > > for the values.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Wenlong
> > > > > > > >
> > > > > > > > On Tue, 16 Nov 2021 at 17:28, godfrey he <godfre...@gmail.com> wrote:
> > > > > > > >
> > > > > > > >> Thanks for creating this FLIP Wenlong.
> > > > > > > >>
> > > > > > > >> The FLIP already looks pretty solid. I think the config options can be
> > > > > > > >> improved a little:
> > > > > > > >> 1) About table.optimizer.separate-name-and-description: I think
> > > > > > > >> "operator-name" should be reflected in the option. How about
> > > > > > > >> table.optimizer.operator-name.description-enabled?
> > > > > > > >> 2) About pipeline.tree-mode-vertex-description: I think we can make
> > > > > > > >> the mode accept a string value, which is more flexible. How about
> > > > > > > >> pipeline.vertex-name-pattern, with "TREE" as the default value and
> > > > > > > >> "CASCADE" (or "DEFAULT", which is simpler) as the other option?
> > > > > > > >>
> > > > > > > >> What do you think?
> > > > > > > >>
> > > > > > > >> Best,
> > > > > > > >> Godfrey
> > > > > > > >>
> > > > > > > >> wenlong.lwl <wenlong88....@gmail.com> wrote on Mon, Nov 15, 2021 at 6:36 PM:
> > > > > > > >>
> > > > > > > >>> Hi all, FYI the FLIP doc has been created:
> > > > > > > >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-195%3A+Improve+the+name+and+structure+of+vertex+and+operator+name+for+sql+job
> > > > > > > >>> Best,
> > > > > > > >>> Wenlong
> > > > > > > >>>
> > > > > > > >>> On Mon, 15 Nov 2021 at 11:41, wenlong.lwl <wenlong88....@gmail.com> wrote:
> > > > > > > >>>> Hi all,
> > > > > > > >>>> Thanks for the feedback. It seems that the proposal is accepted by all
> > > > > > > >>>> of you. I will prepare a formal FLIP document and then go ahead to
> > > > > > > >>>> the vote stage.
> > > > > > > >>>> If anyone has any other comments or suggestions, please let me know,
> > > > > > > >>>> thanks.
> > > > > > > >>>>
> > > > > > > >>>> Best,
> > > > > > > >>>> Wenlong
> > > > > > > >>>>
> > > > > > > >>>> On Fri, 12 Nov 2021 at 05:54, Neng Lu <nl...@apache.org> wrote:
> > > > > > > >>>>
> > > > > > > >>>>> +1 (non-binding)
> > > > > > > >>>>> This change will really help to make developers' lives easier.
> > > > > > > >>>>>
> > > > > > > >>>>> On Thu, Nov 11, 2021 at 6:33 AM Guowei Ma <guowei....@gmail.com> wrote:
> > > > > > > >>>>>> +1
> > > > > > > >>>>>> This would be very helpful for debugging our online jobs.
> > > > > > > >>>>>>
> > > > > > > >>>>>> Best,
> > > > > > > >>>>>> Guowei
> > > > > > > >>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>> On Thu, Nov 11, 2021 at 8:03 PM Yuepeng Pan <flin...@126.com> wrote:
> > > > > > > >>>>>>
> > > > > > > >>>>>>> +1. It's useful to understand the job topology.
> > > > > > > >>>>>>> Looking forward to this feature.
> > > > > > > >>>>>>> Best,
> > > > > > > >>>>>>> Yuepeng Pan.
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> At 2021-11-11 19:44:44, "Yangze Guo" <karma...@gmail.com> wrote:
> > > > > > > >>>>>>>> +1. That's gonna help a lot for debugging.
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> Best,
> > > > > > > >>>>>>>> Yangze Guo
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> On Thu, Nov 11, 2021 at 7:37 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > > > > > > >>>>>>>>> This improvement looks like it makes the life of our users a lot
> > > > > > > >>>>>>>>> easier when it comes to understanding logs and reading the UI.
> > > > > > > >>>>>>>>> Hence +1.
> > > > > > > >>>>>>>>> Cheers,
> > > > > > > >>>>>>>>> Till
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> On Thu, Nov 11, 2021 at 11:59 AM JING ZHANG <beyond1...@gmail.com> wrote:
> > > > > > > >>>>>>>>>> Big +1.
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> This is a problem we frequently encounter on our production
> > > > > > > >>>>>>>>>> platform. We look forward to this improvement.
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> Best,
> > > > > > > >>>>>>>>>> Jing Zhang
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> Martijn Visser <mart...@ververica.com> wrote on Thu, Nov 11, 2021 at 6:26 PM:
> > > > > > > >>>>>>>>>>> +1. Looks much better now
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> On Thu, 11 Nov 2021 at 11:07, godfrey he <godfre...@gmail.com> wrote:
> > > > > > > >>>>>>>>>>>> Thanks for driving this. This improvement solves a problem that
> > > > > > > >>>>>>>>>>>> people have long complained about, +1
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> Best,
> > > > > > > >>>>>>>>>>>> Godfrey
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> Jark Wu <imj...@gmail.com> wrote on Thu, Nov 11, 2021 at 5:40 PM:
> > > > > > > >>>>>>>>>>>>> +1 for this. It looks much clearer and more structured.
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> Best,
> > > > > > > >>>>>>>>>>>>> Jark
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> On Thu, 11 Nov 2021 at 17:23, Chesnay Schepler <ches...@apache.org> wrote:
> > > > > > > >>>>>>>>>>>>>> I'm generally in favor of it, and there are already tickets that
> > > > > > > >>>>>>>>>>>>>> proposed a dedicated operator/vertex description:
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-20388
> > > > > > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-21858
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> On 11/11/2021 10:02, wenlong.lwl wrote:
> > > > > > > >>>>>>>>>>>>>>> Hi all, I would like to start a discussion about an improvement to
> > > > > > > >>>>>>>>>>>>>>> the name and structure of the job vertex name, mainly to improve
> > > > > > > >>>>>>>>>>>>>>> the experience of debugging and analyzing SQL jobs at runtime.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> The main proposed changes include:
> > > > > > > >>>>>>>>>>>>>>> 1. Separate the description and the name of an operator, so that
> > > > > > > >>>>>>>>>>>>>>> we can have detailed info in the description and a shorter name,
> > > > > > > >>>>>>>>>>>>>>> which is friendlier to external systems like logging/metrics
> > > > > > > >>>>>>>>>>>>>>> without losing useful information.
> > > > > > > >>>>>>>>>>>>>>> 2. Introduce a tree-mode vertex description, which makes the
> > > > > > > >>>>>>>>>>>>>>> description more readable and easier to understand.
> > > > > > > >>>>>>>>>>>>>>> 3. Clean up and improve the descriptions of SQL operators.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Here is an example with the changes for a SQL job:
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> vertex name:
> > > > > > > >>>>>>>>>>>>>>> GlobalGroupAggregate[52] -> (Calc[53] -> NotNullEnforcer[54] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_003[54], Calc[55] -> NotNullEnforcer[56] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_004[56])
> > > > > > > >>>>>>>>>>>>>>> vertex description:
> > > > > > > >>>>>>>>>>>>>>> [52]:GlobalGroupAggregate(groupBy=[stat_date, spm_url_ab, client], select=[stat_date, spm_url_ab, client, COUNT(count1$0) AS clk_cnt_app_mtr_001, COUNT(distinct$0 count$1) AS clk_uv_app_mtr_001, COUNT(count1$2) AS clk_cnt_app_mtr_002, COUNT(distinct$0 count$3) AS clk_uv_app_mtr_002, COUNT(count1$4) AS clk_cnt_app_mtr_003, COUNT(distinct$0 count$5) AS clk_uv_app_mtr_003])
> > > > > > > >>>>>>>>>>>>>>> :- [53]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT(client, ':client'), CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> > > > > > > >>>>>>>>>>>>>>> :  +- [54]:NotNullEnforcer(fields=[rowkey])
> > > > > > > >>>>>>>>>>>>>>> :     +- [54]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_003], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> > > > > > > >>>>>>>>>>>>>>> +- [55]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT('ddd:', stat_date), CONCAT(client, ':client')), (client = ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '92459')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '92459:app', CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> > > > > > > >>>>>>>>>>>>>>>    +- [56]:NotNullEnforcer(fields=[rowkey])
> > > > > > > >>>>>>>>>>>>>>>       +- [56]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_004], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> For more detail on the proposal:
> > > > > > > >>>>>>>>>>>>>>> https://docs.google.com/document/d/1VUVJeHY_We09GY53-K2lETP3HUNZG9wMKyecFWk_Wxk
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Looking forward to your feedback, thanks.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Bests
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Wenlong Lyu
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Konstantin Knauf
> > > > >
> > > > > https://twitter.com/snntrable
> > > > >
> > > > > https://github.com/knaufk
> > > > >
> > >
> >
>
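
A minimal sketch of the vertex-index hint suggested by Yun Tang in the thread above: each vertex name gets a `[vertex-N]` prefix so that logs and exceptions point at a position in the pipeline. The function name `with_vertex_index` and the `sep` parameter are assumptions made for illustration; nothing here reflects how Flink actually builds vertex names.

```python
from typing import Sequence


def with_vertex_index(names: Sequence[str], sep: str = " --> ") -> str:
    """Prefix each vertex name with '[vertex-N]' and join them into one string.

    Hypothetical helper for illustration only; not part of Flink.
    """
    return sep.join(f"[vertex-{i}] {name}" for i, name in enumerate(names))


if __name__ == "__main__":
    # Prints: [vertex-0] source --> [vertex-1] flatMap --> [vertex-2] sink
    print(with_vertex_index(["source", "flatMap", "sink"]))
```

As Wenlong notes in his reply, the prefix may be redundant once names are short, since the log already carries the vertex id.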
