Hi Mason,

Thanks for driving the discussion. +1 for the proposal.

How about we return all operator metrics in a vertex? (the response is a
map of operatorId/operatorName -> operator-metrics). Correspondingly, the
url may be changed to /jobs/<jobid>/vertices/<vertexid>/operator-metrics

In this way, users can skip the step of obtaining the operator id.

Best,
Lijie

Hang Ruan <ruanhang1...@gmail.com> 于2024年1月17日周三 10:31写道:

> Hi, Mason.
>
> The field `operatorName` in JobManagerOperatorQueryScopeInfo refers to the
> fields in OperatorQueryScopeInfo and chooses the operatorName instead of
> OperatorID.
> It is fine by my side to change from opertorName to operatorID in this
> FLIP.
>
> Best,
> Hang
>
> Mason Chen <mas.chen6...@gmail.com> 于2024年1月17日周三 09:39写道:
>
> > Hi Xuyang and Hang,
> >
> > Thanks for your support and feedback! See my responses below:
> >
> > 1. IIRC, in a sense, operator ID and vertex ID are the same thing. The
> > > operator ID can
> > > be converted from the vertex ID[1]. Therefore, it is somewhat strange
> to
> > > have both vertex
> > > ID and operator ID in a single URL.
> > >
> > I think Hang explained it perfectly. Essentially, a vertix may contain
> one
> > or more operators so the operator ID is required to distinguish this
> case.
> >
> > 2. If I misunderstood the semantics of operator IDs here, then what is
> the
> > > relationship
> > > between vertex ID and operator ID, and do we need a url like
> > > `/jobs/<jobid>/vertices/<vertexid>/operators/`
> > > to list all operator ids under this vertices?
> > >
> > Good question, we definitely need expose operator IDs through the REST
> API
> > to make this usable. I'm looking at how users would currently discover
> the
> > vertex id to query. From the supported REST APIs [1], you can currently
> > obtain it from
> >
> > 1. `/jobs/<jobid>`
> > 2. `/jobs/<jobid>/plan`
> >
> > From the response of both these APIs, they include the vertex ids (the
> > vertices AND nodes fields), but not the operator ids. We would need to
> add
> > the logic to the plan generation [2]. The response is a little confusing
> > because there is a field in the vertices called "operator name". I
> propose
> > to add a new field called "operators" to the vertex response object,
> which
> > would be a list of objects with the structure
> >
> > Operator
> > {
> >   "id": "THE-FLINK-GENERATED-ID"
> > }.
> >
> > The JobManagerOperatorQueryScopeInfo has three fields: jobID, vertexID
> and
> > > operatorName. So we should use the operator name in the API.
> > > If you think we should use the operator id, there need be more changes
> > > about it.
> > >
> > I think we should use operator id since it uniquely identifies an
> > operator--on the contrary, the operator name does not (it may be empty or
> > repeated between operators by the user). I actually had a question on
> that
> > since you implemented the metric group. What's the reason we use operator
> > name currently? Could it also use operator id so we can match against the
> > id?
> >
> > [1]
> > https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/
> > [2]
> >
> >
> https://github.com/apache/flink/blob/416cb7aaa02c176e01485ff11ab4269f76b5e9e2/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/jsonplan/JsonPlanGenerator.java#L53
> >
> > Best,
> > Mason
> >
> >
> > On Thu, Jan 11, 2024 at 10:54 PM Hang Ruan <ruanhang1...@gmail.com>
> wrote:
> >
> > > Hi, Mason.
> > >
> > > Thanks for driving this FLIP.
> > >
> > > The JobManagerOperatorQueryScopeInfo has three fields: jobID, vertexID
> > and
> > > operatorName. So we should use the operator name in the API.
> > > If you think we should use the operator id, there need be more changes
> > > about it.
> > >
> > > About the Xuyang's questions, we add both vertexID and operatorID
> > > information because of the operator chain.
> > > A operator chain has a vertexID and contains many different operators.
> > The
> > > operator information helps to distinguish them in the same operator
> > chain.
> > >
> > > Best,
> > > Hang
> > >
> > >
> > > Xuyang <xyzhong...@163.com> 于2024年1月12日周五 10:21写道:
> > >
> > > > Hi, Mason.
> > > > Thanks for driving this Flip. I think it's important for external
> > system
> > > > to be able to
> > > > perceive the metric of the operator coordinator. +1 for it.
> > > >
> > > >
> > > > I just have the following minor questions and am looking forward to
> > your
> > > > reply. Please forgive
> > > > me if I have some misunderstandings.
> > > >
> > > >
> > > > 1. IIRC, in a sense, operator ID and vertex ID are the same thing.
> The
> > > > operator ID can
> > > > be converted from the vertex ID[1]. Therefore, it is somewhat strange
> > to
> > > > have both vertex
> > > > ID and operator ID in a single URL.
> > > >
> > > >
> > > > 2. If I misunderstood the semantics of operator IDs here, then what
> is
> > > the
> > > > relationship
> > > > between vertex ID and operator ID, and do we need a url like
> > > > `/jobs/<jobid>/vertices/<vertexid>/operators/`
> > > > to list all operator ids under this vertices?
> > > >
> > > >
> > > >
> > > >
> > > > [1]
> > > >
> > >
> >
> https://github.com/apache/flink/blob/7bebd2d9fac517c28afc24c0c034d77cfe2b43a6/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/OperatorID.java#L40C27-L40C27
> > > >
> > > > --
> > > >
> > > >     Best!
> > > >     Xuyang
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2024-01-12 04:20:03, "Mason Chen" <mas.chen6...@gmail.com> wrote:
> > > > >Hi Devs,
> > > > >
> > > > >I'm opening this thread to discuss a short FLIP for exposing
> > > > >JobManagerOperatorMetrics via REST API [1].
> > > > >
> > > > >The current set of REST APIs make it impossible to query coordinator
> > > > >metrics. This FLIP proposes a new REST API to query the
> > > > >JobManagerOperatorMetrics.
> > > > >
> > > > >[1]
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-417%3A+Expose+JobManagerOperatorMetrics+via+REST+API
> > > > >
> > > > >Best,
> > > > >Mason
> > > >
> > >
> >
>

Reply via email to