Re: [DISCUSS] FLIP-33: Standardize connector metrics

Chesnay Schepler Thu, 21 Feb 2019 06:14:27 -0800

Regarding 2) It doesn't make sense to investigate this as part of this FLIP. This is something that could be of interest for the entire metric system, and should be designed as such.


Regarding the proposal as a whole:

Histogram metrics shall not be added to the core of Flink. They are significantly more expensive than other metrics, and calculating histograms in the application is regarded as an anti-pattern by several metric backends, which instead recommend exposing the raw data and calculating the histogram in the backend.

Second, this seems overly complicated. Given that we already established that not all connectors will export all metrics, we are effectively reducing this down to a consistent naming scheme. We don't need anything sophisticated for that; basically just a few constants that all connectors use.
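As a rough illustration of the "few constants" idea, a shared holder class might look like the sketch below. The class name ConnectorMetricNames is hypothetical and not an existing Flink class; the two metric names are the ones mentioned in this thread.

```java
/** Hypothetical holder for shared connector metric names (not a Flink class). */
final class ConnectorMetricNames {
    // Metric names taken from this discussion thread.
    static final String FETCH_LATENCY = "fetchLatency";
    static final String LATENCY = "latency";

    // Constants only; no instances.
    private ConnectorMetricNames() {}
}
```

Each connector would then refer to ConnectorMetricNames.LATENCY instead of spelling the name itself, which is all a consistent naming scheme requires.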


I'm not convinced that this is worthy of a FLIP.

On 21.02.2019 14:26, Dawid Wysakowicz wrote:

Hi,

Ad 1. In general I understand and I agree. But those particular metrics
(latency, fetchLatency), right now would only be reported if the user uses
KafkaConsumer with internal timestampAssigner with StreamCharacteristic
set to EventTime, right? That sounds like a very specific case. I am not
sure if we should introduce a generic metric that will be
disabled/absent for most implementations.

Ad 2. That sounds like an orthogonal issue that might make sense to
investigate in the future.

Best,

Dawid

On 21/02/2019 13:20, Becket Qin wrote:

Hi Dawid,

Thanks for the feedback. That makes sense to me. There are two cases to be
addressed.

1. The metrics are supposed to be guidance. It is likely that a connector
only supports some but not all of the metrics. In that case, each connector
implementation should have the freedom to decide which metrics are
reported. For the metrics that are supported, the guidance should be
followed.

2. Sometimes users may want to disable certain metrics for some reason
(e.g. performance / reprocessing of data). A generic mechanism should be
provided to allow users to choose which metrics are reported. This mechanism
should also be honored by the connector implementations.
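One possible shape for the generic mechanism in point 2 is sketched below, purely as an assumption: the class MetricSwitch and its methods are hypothetical and do not exist in Flink. The user lists the metric names to disable, and a connector asks the switch before registering each metric.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical user-facing switch for connector metrics (not a Flink API). */
final class MetricSwitch {
    // Names of the metrics the user asked not to report.
    private final Set<String> disabled;

    MetricSwitch(String... disabledMetricNames) {
        this.disabled = new HashSet<>(Arrays.asList(disabledMetricNames));
    }

    /** Returns true unless the user explicitly disabled this metric. */
    boolean isEnabled(String metricName) {
        return !disabled.contains(metricName);
    }
}
```

A connector honoring the switch would skip registration of, for example, "latency" whenever isEnabled("latency") returns false, and report every other supported metric as usual.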

Does this sound reasonable to you?

Thanks,

Jiangjie (Becket) Qin



On Thu, Feb 21, 2019 at 4:22 PM Dawid Wysakowicz <[email protected]>
wrote:

Hi,

Generally I like the idea of having a unified, standard set of metrics for
all connectors. I have some slight concerns about fetchLatency and
latency though. They are computed based on EventTime, which is not a purely
technical feature. It often depends on business logic, and it might be absent
or defined after the source. Those metrics could also behave oddly when
replaying a backlog. Therefore I am not sure if we should include
those metrics by default. Maybe we could at least introduce a feature
switch for them? What do you think?
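To make the backlog concern concrete, here is a minimal sketch that assumes fetchLatency is the fetch time minus the record's event time. That formula is an assumption for illustration (the thread only says the metrics are computed from EventTime), and the class and method names are hypothetical.

```java
/** Hypothetical helper illustrating an event-time based latency metric. */
final class EventTimeLatency {
    private EventTimeLatency() {}

    /** Assumed definition: time the record was fetched minus its event time. */
    static long fetchLatencyMillis(long fetchTimeMillis, long eventTimeMillis) {
        return fetchTimeMillis - eventTimeMillis;
    }
}
```

Under this definition, a record whose event time lies one hour in the past yields 3,600,000 ms when it is replayed now, no matter how quickly the connector fetched it, so the value reflects the age of the data rather than connector performance.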

Best,

Dawid

On 21/02/2019 03:13, Becket Qin wrote:

Bump. If there are no objections to the proposed metrics, I'll start a
voting thread later today.

Thanks,

Jiangjie (Becket) Qin

On Mon, Feb 11, 2019 at 8:17 PM Becket Qin <[email protected]> wrote:


Hi folks,

I would like to start the FLIP discussion thread about standardizing the
connector metrics.

In short, we would like to provide a convention for Flink connector
metrics. It will help simplify the monitoring and alerting on Flink jobs.
The FLIP link is as follows:

https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics

Thanks,

Jiangjie (Becket) Qin
