Hi Yun,
Thanks for the suggestion, but I am not sure we need such a prefix: the log
already includes the vertex id, and when the name is concise enough we can
find the vertex id easily.
Does anyone have comments on this?
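
For reference, the prefix under discussion can be sketched in a few lines of
Python. This is only an illustration of the naming scheme from the example
quoted below, not Flink code: the function name add_vertex_prefix and the
" --> " separator are assumptions made for this sketch.

```python
def add_vertex_prefix(pipeline: str, sep: str = " --> ") -> str:
    """Prefix every element of a chained name with its position.

    Hypothetical helper for illustration only; it is not part of Flink.
    """
    names = pipeline.split(sep)
    # enumerate() yields the 0-based index used in the "[vertex-N]" label
    return sep.join(f"[vertex-{i}] {name}" for i, name in enumerate(names))


print(add_vertex_prefix("source --> flatMap --> sink"))
```

The index here is just the position in the name string; a real implementation
would presumably take it from the job graph rather than from parsing the name.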


Best,
Wenlong Lyu

On Thu, 18 Nov 2021 at 19:03, Yun Tang <tang...@apache.org> wrote:

> Hi Wenlong,
>
> Thanks for bringing up this discussion. I believe many people have long
> suffered from long, unreadable operator names.
>
> I have another suggestion, inspired by Aitozi: we could add a hint that
> shows the vertex index, for example turning the pipeline "source -->
> flatMap --> sink" into "[vertex-0] source --> [vertex-1] flatMap -->
> [vertex-2] sink".
> This would make it much easier for users and developers to tell which
> vertex is failing when they hit an exception.
>
> Best
> Yun Tang
>
> On 2021/11/17 07:42:28 godfrey he wrote:
> > Hi Wenlong, I'm fine with the config options.
> >
> > Best,
> > Godfrey
> >
> > wenlong.lwl <wenlong88....@gmail.com> wrote on Wed, 17 Nov 2021 at 3:13 PM:
> >
> > >
> > > Hi Chesnay and Konstantin,
> > > thanks for your feedback. I have added a section to the doc about how we
> > > support setting the description in the DataStream API.
> > >
> > >
> > > Bests,
> > > Wenlong
> > >
> > > On Tue, 16 Nov 2021 at 21:05, Konstantin Knauf <kna...@apache.org> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Thanks for starting this discussion. I am in favor of solving this for
> > > > DataStream and Table API at the same time, using the same configuration
> > > > keys. IMO we shouldn't introduce any additional fragmentation if we can
> > > > avoid it.
> > > >
> > > > Cheers,
> > > >
> > > > Konstantin
> > > >
> > > > On Tue, Nov 16, 2021 at 1:50 PM wenlong.lwl <wenlong88....@gmail.com> wrote:
> > > >
> > > > > Hi Chesnay, we focus on SQL first because the operators and topology
> > > > > of SQL jobs are generated by the engine, which causes most of the naming
> > > > > problems: not only are the names long, but the topology can also be more
> > > > > complex than in DataStream.
> > > > >
> > > > > The situation in DataStream is much better: most names in the DataStream
> > > > > API are quite concise, except for the windowing you mentioned, and the
> > > > > topology is usually simpler. What's more, we can easily expose this to the
> > > > > DataStream API as a second step once the foundational implementation is
> > > > > done. If necessary, we can also cover the DataStream API changes now,
> > > > > maybe taking windowing first as an example?
> > > > >
> > > > > Best,
> > > > > Wenlong
> > > > >
> > > > > On Tue, 16 Nov 2021 at 19:14, Chesnay Schepler <ches...@apache.org> wrote:
> > > > >
> > > > > > Why should this be specific to the Table API? The DataStream API has
> > > > > > similar issues with long operator names (like windowing).
> > > > > >
> > > > > > On 16/11/2021 11:22, wenlong.lwl wrote:
> > > > > > > Thanks Godfrey for the suggestion.
> > > > > > > Regarding 1, how about table.optimizer.simplify-operator-name-enabled,
> > > > > > > which means that we would simplify the operator name and keep the
> > > > > > > details in the description only?
> > > > > > > I think "table.optimizer.operator-name.description-enabled" does not
> > > > > > > describe what it means.
> > > > > > > Regarding 2, I agree that it is better to use an enum instead of a
> > > > > > > boolean. For the key, I think you mean "pipeline.vertex-description-pattern"
> > > > > > > instead of "pipeline.vertex-name-pattern", and I would like to choose
> > > > > > > DEFAULT/TREE for the values.
> > > > > > >
> > > > > > > Best,
> > > > > > > Wenlong
> > > > > > >
> > > > > > > On Tue, 16 Nov 2021 at 17:28, godfrey he <godfre...@gmail.com> wrote:
> > > > > > >
> > > > > > >> Thanks for creating this FLIP Wenlong.
> > > > > > >>
> > > > > > >> The FLIP already looks pretty solid. I think the config options can be
> > > > > > >> improved a little:
> > > > > > >> 1) about table.optimizer.separate-name-and-description, I think
> > > > > > >> "operator-name" should be considered in the option,
> > > > > > >> how about table.optimizer.operator-name.description-enabled ?
> > > > > > >> 2) about pipeline.tree-mode-vertex-description, I think we can make
> > > > > > >> the mode accept a string value,
> > > > > > >> which is more flexible. How about pipeline.vertex-name-pattern, where the
> > > > > > >> default value is "TREE" and
> > > > > > >> another option is "CASCADE" (or "DEFAULT", which is simpler)?
> > > > > > >>
> > > > > > >> What do you think?
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Godfrey
> > > > > > >>
> > > > > > >> wenlong.lwl <wenlong88....@gmail.com> wrote on Mon, 15 Nov 2021 at 6:36 PM:
> > > > > > >>
> > > > > > >>> Hi, all, FYI the FLIP doc has been created:
> > > > > > >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-195%3A+Improve+the+name+and+structure+of+vertex+and+operator+name+for+sql+job
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> Wenlong
> > > > > > >>>
> > > > > > >>> On Mon, 15 Nov 2021 at 11:41, wenlong.lwl <wenlong88....@gmail.com> wrote:
> > > > > > >>>> Hi all,
> > > > > > >>>> Thanks for the feedback. It seems that the proposal is accepted by all
> > > > > > >>>> of you. I will prepare a formal FLIP document and then move on to
> > > > > > >>>> the vote stage.
> > > > > > >>>> If anyone has any other comments or suggestions, please let me know,
> > > > > > >>>> thanks.
> > > > > > >>>>
> > > > > > >>>> Best,
> > > > > > >>>> Wenlong
> > > > > > >>>>
> > > > > > > >>>> On Fri, 12 Nov 2021 at 05:54, Neng Lu <nl...@apache.org> wrote:
> > > > > > >>>>
> > > > > > >>>>> +1 (non-binding)
> > > > > > >>>>> This change will really help to ease developer life.
> > > > > > >>>>>
> > > > > > > >>>>> On Thu, Nov 11, 2021 at 6:33 AM Guowei Ma <guowei....@gmail.com> wrote:
> > > > > > > >>>>>> +1
> > > > > > > >>>>>> This would be very helpful for debugging our online jobs.
> > > > > > >>>>>>
> > > > > > >>>>>> Best,
> > > > > > >>>>>> Guowei
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > > >>>>>> On Thu, Nov 11, 2021 at 8:03 PM Yuepeng Pan <flin...@126.com> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>>> +1. It's useful to understand the job topology.
> > > > > > >>>>>>> Looking forward to this feature.
> > > > > > >>>>>>> Best,
> > > > > > >>>>>>> Yuepeng Pan.
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > > >>>>>>> At 2021-11-11 19:44:44, "Yangze Guo" <karma...@gmail.com> wrote:
> > > > > > >>>>>>>> +1. That's gonna help a lot for debugging.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Best,
> > > > > > >>>>>>>> Yangze Guo
> > > > > > >>>>>>>>
> > > > > > > >>>>>>>> On Thu, Nov 11, 2021 at 7:37 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > > > > > > >>>>>>>>> This improvement looks like it makes the life of our users a lot easier
> > > > > > > >>>>>>>>> when it comes to understanding logs and reading the UI. Hence +1.
> > > > > > >>>>>>>>> Cheers,
> > > > > > >>>>>>>>> Till
> > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> On Thu, Nov 11, 2021 at 11:59 AM JING ZHANG <beyond1...@gmail.com> wrote:
> > > > > > > >>>>>>>>>> Big +1.
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> This is a problem frequently encountered on our production platform; I look
> > > > > > > >>>>>>>>>> forward to this improvement.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Best,
> > > > > > >>>>>>>>>> Jing Zhang
> > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> Martijn Visser <mart...@ververica.com> wrote on Thu, 11 Nov 2021 at 6:26 PM:
> > > > > > >>>>>>>>>>> +1. Looks much better now
> > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> On Thu, 11 Nov 2021 at 11:07, godfrey he <godfre...@gmail.com> wrote:
> > > > > > > >>>>>>>>>>>> Thanks for driving this. This improvement solves a long-standing
> > > > > > > >>>>>>>>>>>> complaint, +1
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> Best,
> > > > > > >>>>>>>>>>>> Godfrey
> > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> Jark Wu <imj...@gmail.com> wrote on Thu, 11 Nov 2021 at 5:40 PM:
> > > > > > > >>>>>>>>>>>>> +1 for this. It looks much clearer and more structured.
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> Best,
> > > > > > >>>>>>>>>>>>> Jark
> > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> On Thu, 11 Nov 2021 at 17:23, Chesnay Schepler <ches...@apache.org> wrote:
> > > > > > > >>>>>>>>>>>>>> I'm generally in favor of it, and there are already tickets that
> > > > > > > >>>>>>>>>>>>>> proposed a dedicated operator/vertex description:
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-20388
> > > > > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-21858
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> On 11/11/2021 10:02, wenlong.lwl wrote:
> > > > > > > >>>>>>>>>>>>>>> Hi all, I would like to start a discussion about improving the name and
> > > > > > > >>>>>>>>>>>>>>> structure of job vertex names, mainly to improve the experience of debugging
> > > > > > > >>>>>>>>>>>>>>> and analyzing SQL jobs at runtime.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> The main proposed changes are:
> > > > > > > >>>>>>>>>>>>>>> 1. Separate the description and the name of an operator, so that we can have
> > > > > > > >>>>>>>>>>>>>>> detailed info in the description and a shorter name, which is friendlier to
> > > > > > > >>>>>>>>>>>>>>> external systems like logging/metrics, without losing useful information.
> > > > > > > >>>>>>>>>>>>>>> 2. Introduce a tree-mode vertex description, which makes the description
> > > > > > > >>>>>>>>>>>>>>> more readable and easier to understand.
> > > > > > > >>>>>>>>>>>>>>> 3. Clean up and improve the descriptions of SQL operators.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Here is an example with the changes for a SQL job:
> > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> vertex name:
> > > > > > > >>>>>>>>>>>>>>> GlobalGroupAggregate[52] -> (Calc[53] -> NotNullEnforcer[54] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_003[54], Calc[55] -> NotNullEnforcer[56] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_004[56])
> > > > > > > >>>>>>>>>>>>>>> vertex description:
> > > > > > > >>>>>>>>>>>>>>> [52]:GlobalGroupAggregate(groupBy=[stat_date, spm_url_ab, client], select=[stat_date, spm_url_ab, client, COUNT(count1$0) AS clk_cnt_app_mtr_001, COUNT(distinct$0 count$1) AS clk_uv_app_mtr_001, COUNT(count1$2) AS clk_cnt_app_mtr_002, COUNT(distinct$0 count$3) AS clk_uv_app_mtr_002, COUNT(count1$4) AS clk_cnt_app_mtr_003, COUNT(distinct$0 count$5) AS clk_uv_app_mtr_003])
> > > > > > > >>>>>>>>>>>>>>> :- [53]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT(client, ':client'), CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> > > > > > > >>>>>>>>>>>>>>> :  +- [54]:NotNullEnforcer(fields=[rowkey])
> > > > > > > >>>>>>>>>>>>>>> :     +- [54]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_003], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> > > > > > > >>>>>>>>>>>>>>> +- [55]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT('ddd:', stat_date), CONCAT(client, ':client')), (client = ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '92459')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '92459:app', CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> > > > > > > >>>>>>>>>>>>>>>    +- [56]:NotNullEnforcer(fields=[rowkey])
> > > > > > > >>>>>>>>>>>>>>>       +- [56]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_004], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> For more detail on the proposal:
> > > > > > > >>>>>>>>>>>>>>> https://docs.google.com/document/d/1VUVJeHY_We09GY53-K2lETP3HUNZG9wMKyecFWk_Wxk
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> Looking forward to your feedback, thanks.
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> Bests
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> Wenlong Lyu
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Konstantin Knauf
> > > >
> > > > https://twitter.com/snntrable
> > > >
> > > > https://github.com/knaufk
> > > >
> >
>
