Re: flink-metrics: how to get the application id

2023-09-15 Thread im huzi
Unsubscribe On Wed, Aug 30, 2023 at 19:14 allanqinjy wrote: > hi, > I'd like to ask a question: when reporting metrics to Prometheus, the jobname gets a randomly generated suffix; looking at the source it is also a new AbstractID(). Is there a way here to get the application id of the job being reported?

Re: flink-metrics: how to get the application id

2023-08-30 Thread Feng Jin
hi, You can try reading the _APP_ID JVM environment variable: System.getenv(YarnConfigKeys.ENV_APP_ID); https://github.com/apache/flink/blob/6c9bb3716a3a92f3b5326558c6238432c669556d/flink-yarn/src/main/java/org/apache/flink/yarn/YarnConfigKeys.java#L28 Best, Feng On Wed, Aug 30, 2023 at 7:14 PM allanqinjy wrote: > hi,
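
A minimal sketch of this suggestion, assuming the job runs on YARN and the application id is wanted at startup of some metric-related code; the literal "_APP_ID" is the value behind YarnConfigKeys.ENV_APP_ID and is used here only to avoid a flink-yarn dependency:

    // Hedged sketch, not from the thread: read the YARN application id so it can be
    // used when building metric scopes or reporter tags.
    String appId = System.getenv("_APP_ID");  // same as System.getenv(YarnConfigKeys.ENV_APP_ID)
    if (appId == null) {
        appId = "unknown-application";        // not running on YARN, or the variable is unset
    }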

Re: Flink metrics via Prometheus or OpenTelemetry

2022-02-24 Thread Nicolaus Weidner
Hi Sigalit, first of all, have you read the docs page on metrics [1], and in particular the Prometheus section on metrics reporters [2]? Apart from that, there is also a (somewhat older) blog post about integrating Flink with Prometheus, including a link to a repo with example code [3]. Hope

Re: Flink Metrics Naming

2021-06-01 Thread Chesnay Schepler
Some more background on MetricGroups: Internally there are (mostly) 3 types of metric groups: On the one hand we have the ComponentMetricGroups (like TaskManagerMetricGroup) that describe a high-level Flink entity, which just add a constant expression to the logical scope (like taskmanager, task

Re: Flink Metrics Naming

2021-06-01 Thread Mason Chen
Upon further inspection, it seems like the user scope is not universal (i.e. comes through the connectors and not UDFs (like rich map function)), but the question still stands if the process makes sense. > On Jun 1, 2021, at 10:38 AM, Mason Chen wrote: > > Makes sense. We are primarily

Re: Flink Metrics Naming

2021-06-01 Thread Mason Chen
Makes sense. We are primarily concerned with removing the metric labels from the names as the user metrics get too long. i.e. the groups from `addGroup` are concatenated in the metric name. Do you think there would be any issues with removing the group information in the metric name and

Re: Flink Metrics Naming

2021-06-01 Thread Chesnay Schepler
The uniqueness of metrics and the naming of the Prometheus reporter are somewhat related but also somewhat orthogonal. Prometheus works similarly to JMX in that the metric name (e.g., taskmanager.job.task.operator.numRecordsIn) is more or less a _class_ of metrics, with tags/labels allowing you

Re: Flink Metrics Naming

2021-06-01 Thread Till Rohrmann
Hi Mason, The idea is that a metric is not uniquely identified by its name alone but instead by its path. The groups in which it is defined specify this path (similar to directories). That's why it is valid to specify two metrics with the same name if they reside in different groups. I think
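
A hedged illustration of this point (class and group names are invented): two counters that share the name "errors" can coexist because they are registered under different groups, and the group chain becomes part of each metric's path.

    // Sketch only: same-named counters under different groups inside a RichMapFunction.
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    public class PathedCounters extends RichMapFunction<String, String> {
        private transient Counter parseErrors;
        private transient Counter writeErrors;

        @Override
        public void open(Configuration parameters) {
            // Full paths differ: <scope>.parser.errors vs. <scope>.writer.errors
            parseErrors = getRuntimeContext().getMetricGroup().addGroup("parser").counter("errors");
            writeErrors = getRuntimeContext().getMetricGroup().addGroup("writer").counter("errors");
        }

        @Override
        public String map(String value) {
            if (value.isEmpty()) {
                parseErrors.inc();  // writeErrors would be incremented by the writing path
            }
            return value;
        }
    }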

Re: Flink Metrics emitted from a Kubernetes Application Cluster

2021-04-09 Thread Chesnay Schepler
This is currently not possible. See also FLINK-8358 On 4/9/2021 4:47 AM, Claude M wrote: Hello, I've setup Flink as an Application Cluster in Kubernetes. Now I'm looking into monitoring the Flink cluster in Datadog. This is what is configured in the flink-conf.yaml to emit metrics:

Re: Flink Metrics

2021-03-03 Thread Piotr Nowojski
Hi, 1) Do you want to output those metrics as Flink metrics? Or output those "metrics"/counters as values to some external system (like Kafka)? The problem discussed in [1] was that the metrics (Counters) were not fitting in memory, so David suggested holding them in Flink's state and treating the
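
A hedged sketch of the state-backed alternative referenced here (the key type and the output tuple are assumptions): keep per-key counters in keyed state and emit them as records to an external system, instead of registering them as Flink metrics.

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    // Sketch: a keyed, state-backed counter that ships its value downstream (e.g. to a Kafka sink).
    public class StateBackedCounter extends KeyedProcessFunction<String, String, Tuple2<String, Long>> {
        private transient ValueState<Long> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(new ValueStateDescriptor<>("count", Long.class));
        }

        @Override
        public void processElement(String value, Context ctx, Collector<Tuple2<String, Long>> out) throws Exception {
            long updated = (count.value() == null ? 0L : count.value()) + 1L;
            count.update(updated);
            out.collect(Tuple2.of(ctx.getCurrentKey(), updated));
        }
    }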

Re: Flink Metrics in kubernetes

2020-05-13 Thread Averell
Hi Gary, Sorry for the false alarm. It's caused by a bug in my deployment - no metrics were added into the registry. Sorry for wasting your time. Thanks and best regards, Averell -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink Metrics in kubernetes

2020-05-12 Thread Averell
Hi Gary, Thanks for the help. Here below is the output from jstack. It does not seem to be blocked. In my JobManager log there's this WARN; I am not sure whether it's relevant at all. Attached is the full jstack dump k8xDump.txt

Re: Flink Metrics in kubernetes

2020-05-12 Thread Gary Yao
Hi Averell, If you are seeing the log message from [1] and Scheduled#report() is not called, the thread in the "Flink-MetricRegistry" thread pool might be blocked. You can use the jstack utility to see on which task the thread pool is blocked. Best, Gary [1]

Re: Flink Metrics - PrometheusReporter

2020-01-22 Thread Sidney Feiner
From: Chesnay Schepler Sent: Wednesday, January 22, 2020 6:07 PM To: Sidney Feiner ; flink-u...@apache.org Subject: Re: Flink Metrics - PrometheusReporter Metrics are exposed via reporters by each process separately, whereas the WebUI aggregates metrics. As such you have to configure

Re: Flink Metrics - PrometheusReporter

2020-01-22 Thread Chesnay Schepler
Metrics are exposed via reporters by each process separately, whereas the WebUI aggregates metrics. As such you have to configure Prometheus to also scrape the TaskExecutors. On 22/01/2020 16:58, Sidney Feiner wrote: Hey, I've been trying to use the PrometheusReporter and when I used in
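
For reference, a flink-conf.yaml sketch along the lines of the 1.x documentation (the port range is just an example). Each JobManager and TaskExecutor then opens its own endpoint on a free port from the range, and every one of them needs to appear in the Prometheus scrape configuration:

    metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
    metrics.reporter.prom.port: 9249-9259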

Re: Flink metrics reporters documentation

2019-10-10 Thread Aleksey Pak
Hi Flavio, Below is my explanation to your question, based on anecdotal evidence: As you may know, Flink distribution package is already scala version specific and bundles some jar artifacts. User Flink job is supposed to be compiled against some of those jars (with maven's `provided` scope).

Re: Flink metrics reporters documentation

2019-10-10 Thread Flavio Pompermaier
Sorry, I just discovered that those jars are actually in the opt folder within the Flink dist... However, the second point still holds: why is there a single influxdb jar in Flink's opt folder while on Maven there are 2 versions (one for Scala 2.11 and one for 2.12)? Best, Flavio On Thu, Oct 10, 2019

Re: Flink metrics scope for YARN single job

2019-08-15 Thread Vasily Melnik
Hi Biao! > Do you mean "distinguish metrics from different JobManager running on same host"? Exactly. >Will give you a feedback if there is a conclusion. Thanks! On Thu, 15 Aug 2019 at 06:40, Biao Liu wrote: > Hi Vasily, > > > Is there any way to distinguish logs from different JobManager

Re: Flink metrics scope for YARN single job

2019-08-14 Thread Biao Liu
Hi Vasily, > Is there any way to distinguish logs from different JobManager running on same host? Do you mean "distinguish metrics from different JobManager running on same host"? I guess there is no other variable you could use for now. But I think it's reasonable to support this requirement.
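
For context, the documented default scope formats look roughly like the flink-conf.yaml sketch below; <host> is the only variable in the JobManager scope, which is why two JobManagers on the same host end up indistinguishable:

    metrics.scope.jm: <host>.jobmanager
    metrics.scope.jm.job: <host>.jobmanager.<job_name>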

Re: flink metrics Reporter question

2019-05-15 Thread Xintong Song
Taking the first part of the hostname keeps the behavior consistent with HDFS; see the original issue, where the author explains specifically why it was done this way. https://issues.apache.org/jira/browse/FLINK-1170?focusedCommentId=14175285=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14175285 Thank you~ Xintong Song On Wed, May 15, 2019 at 9:11 PM Yun Tang wrote:

Re: flink metrics Reporter question

2019-05-15 Thread Yun Tang
Hi 嘉诚, I'm not sure which Flink version you are using, but the logic of showing only the first part of the host-name has always been there, because in most scenarios the first part of the host-name is enough to identify it. For the implementation see [1] and [2]. Inspired by your question, I created a JIRA [3] to track this; the fix would be a metrics option so that your scenario can show the full hostname in metrics. Best regards, Yun Tang [1]

Re: Flink Metrics

2019-04-18 Thread Zhu Zhu
Hi Brian, You can implement a new org.apache.flink.metrics.reporter.MetricReporter as you like and register it to Flink in the Flink conf, e.g. metrics.reporters: my_reporter metrics.reporter.my_reporter.class: xxx metrics.reporter.my_reporter.config1: yyy
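
A minimal sketch of such a reporter (class and option names are illustrative); the fully qualified class name is what goes into metrics.reporter.my_reporter.class:

    import org.apache.flink.metrics.Metric;
    import org.apache.flink.metrics.MetricConfig;
    import org.apache.flink.metrics.MetricGroup;
    import org.apache.flink.metrics.reporter.MetricReporter;

    public class MyReporter implements MetricReporter {
        @Override
        public void open(MetricConfig config) {
            // reads metrics.reporter.my_reporter.config1 from flink-conf.yaml
            String config1 = config.getString("config1", "default");
        }

        @Override
        public void close() {}

        @Override
        public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
            // remember the metric so it can be pushed/polled later
        }

        @Override
        public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
            // forget the metric
        }
    }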

Re: Flink metrics missing from UI 1.7.2

2019-03-23 Thread Padarn Wilson
Aha! This is almost certainly it. I remembered thinking something like this might be a problem. I'll need to change the deployment a bit to add this (not straightforward to edit the YAML in my case), but thanks! On Sun, Mar 24, 2019 at 10:01 AM dawid <

Re: Flink metrics missing from UI 1.7.2

2019-03-23 Thread dawid
Padarn Wilson-2 wrote > I am running Fink 1.7.2 on Kubernetes in a setup with task manager and job > manager separate. > > I'm having trouble seeing the metrics from my Flink job in the UI > dashboard. Actually I'm using the Datadog reporter to expose most of my > metrics, but latency tracking

Re: Flink metrics missing from UI 1.7.2

2019-03-23 Thread Padarn Wilson
Thanks David. I cannot see the metrics there, so let me play around a bit more and make sure they are enabled correctly. On Sat, Mar 23, 2019 at 9:19 PM David Anderson wrote: > > I have done this (actually I do it in my flink-conf.yaml), but I am not > seeing any metrics at all in the Flink UI,

Re: Flink metrics missing from UI 1.7.2

2019-03-23 Thread David Anderson
> I have done this (actually I do it in my flink-conf.yaml), but I am not seeing any metrics at all in the Flink UI, > let alone the latency tracking. The latency tracking itself does not seem to be exported to datadog (should it be?) The latency metrics are job metrics, and are not shown in the

Re: Flink metrics missing from UI 1.7.2

2019-03-23 Thread David Anderson
Because latency tracking is expensive, it is turned off by default. You turn it on by setting the interval; that looks something like this: env.getConfig().setLatencyTrackingInterval(1000); The full set of configuration options is described in the docs:
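
A hedged config sketch: the same interval can also be set cluster-wide in flink-conf.yaml (value in milliseconds; 0 keeps latency tracking disabled):

    metrics.latency.interval: 1000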

Re: Flink metrics in kubernetes deployment

2018-12-18 Thread Chesnay Schepler
If you're working with 1.7/master you're probably running into https://issues.apache.org/jira/browse/FLINK-11127 . On 17.12.2018 18:12, eric hoffmann wrote: Hi, In a Kubernetes deployment, I'm not able to display metrics in the dashboard. I try to expose and fix the

Re: Flink metrics related problems/questions

2017-05-22 Thread Aljoscha Krettek
Ah ok, the onTimer() and processElement() methods are all protected by synchronized blocks on the same lock. So that shouldn’t be a problem. > On 22. May 2017, at 15:08, Chesnay Schepler wrote: > > Yes, that could cause the observed issue. > > The default implementations

Re: Flink metrics related problems/questions

2017-05-22 Thread Chesnay Schepler
Yes, that could cause the observed issue. The default implementations are not thread-safe; if you do concurrent writes they may be lost/overwritten. You will have to either guard accesses to that metric with a synchronized block or implement your own thread-safe counter. On 22.05.2017 14:17,
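
A hedged sketch of the second option, a custom thread-safe counter (the class name is made up); it can be registered like any other counter via MetricGroup#counter(String, Counter):

    import java.util.concurrent.atomic.AtomicLong;
    import org.apache.flink.metrics.Counter;

    public class AtomicCounter implements Counter {
        private final AtomicLong count = new AtomicLong();

        @Override
        public void inc() { count.incrementAndGet(); }

        @Override
        public void inc(long n) { count.addAndGet(n); }

        @Override
        public void dec() { count.decrementAndGet(); }

        @Override
        public void dec(long n) { count.addAndGet(-n); }

        @Override
        public long getCount() { return count.get(); }
    }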

Re: Flink metrics related problems/questions

2017-05-22 Thread Aljoscha Krettek
@Chesnay With timers it will happen that onTimer() is called from a different Thread than the Thread that is calling processElement(). If metrics updates happen in both, would that be a problem? > On 19. May 2017, at 11:57, Chesnay Schepler wrote: > > 2. isn't quite

Re: Flink metrics related problems/questions

2017-05-19 Thread Chesnay Schepler
2. isn't quite accurate actually; metrics on the TaskManager are not persisted across restarts. On 19.05.2017 11:21, Chesnay Schepler wrote: 1. This shouldn't happen. Do you access the counter from different threads? 2. Metrics in general are not persisted across restarts, and there is no

Re: Flink metrics related problems/questions

2017-05-19 Thread Chesnay Schepler
1. This shouldn't happen. Do you access the counter from different threads? 2. Metrics in general are not persisted across restarts, and there is no way to configure flink to do so at the moment. 3. Counters are sent as gauges since as far as I know StatsD counters are not allowed to be

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-03 Thread Philipp Bussche
Hi there, I am using Graphite and querying it in Grafana is super easy. You just select fields and they come up automatically for you to select from, depending on what your metric structure in Graphite looks like. You can also use wildcards. The only thing I had to do because I am also using

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-02 Thread Anchit Jatana
Hi Jamie, Thanks for sharing your thoughts. I'll try and integrate with Graphite to see if this gets resolved. Regards, Anchit -- View this message in context:

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-02 Thread Jamie Grier
Hi Anchit, That last bit is very interesting - the fact that it works fine with subtasks <= 30. It could be that either Influx or Grafana are not able to keep up with the data being produced. I would guess that the culprit is Grafana if looking at any particular subtask index works fine and

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-01 Thread Anchit Jatana
I've set the metric reporting frequency to InfluxDB as 10s. In the screenshot, I'm using a Grafana query interval of 1s. I've tried 10s and more too; the graph shape changes a bit but the incorrect negative values are still plotted (makes no difference). Something to add: If the subtasks are less

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-01 Thread Jamie Grier
Hmm. I can't recreate that behavior here. I have seen some issues like this if you're grouping by a time interval different from the metrics reporting interval you're using, though. How often are you reporting metrics to Influx? Are you using the same interval in your Grafana queries? I see

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-01 Thread Anchit Jatana
Hi Jamie, Thank you so much for your response. The below query: SELECT derivative(sum("count"), 1s) FROM "numRecordsIn" WHERE "task_name" = 'Sink: Unnamed' AND $timeFilter GROUP BY time(1s) behaves the same as with the use of the templating variable in the 'All' case, i.e. shows a plot of

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-01 Thread Jamie Grier
Ahh.. I haven't used templating all that much but this also works for your subtask variable so that you don't have to enumerate all the possible values: Template Variable Type: query query: SHOW TAG VALUES FROM numRecordsIn WITH KEY = "subtask_index" On Tue, Nov 1, 2016 at 2:51 PM, Jamie

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-01 Thread Jamie Grier
Another note. In the example the template variable type is "custom" and the values have to be enumerated manually. So in your case you would have to configure all the possible values of "subtask" to be 0-49. On Tue, Nov 1, 2016 at 2:43 PM, Jamie Grier wrote: > This

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-01 Thread Jamie Grier
This works well for me. This will aggregate the data across all sub-task instances: SELECT derivative(sum("count"), 1s) FROM "numRecordsIn" WHERE "task_name" = 'Sink: Unnamed' AND $timeFilter GROUP BY time(1s) You can also plot each sub-task instance separately on the same graph by doing:
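
A guess at the per-subtask variant hinted at here (not the original continuation of this message): grouping by the subtask_index tag in addition to time should plot one series per subtask.

    SELECT derivative(sum("count"), 1s) FROM "numRecordsIn"
    WHERE "task_name" = 'Sink: Unnamed' AND $timeFilter
    GROUP BY time(1s), "subtask_index"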

Re: Flink Metrics

2016-10-18 Thread Aljoscha Krettek
> *Sent:* Monday, October 17, 2016 12:52 AM > *Subject:* Re: Flink Metrics > > Hi Govind, > > I think the DropwizardMeterWrapper implementation is just a reference > implementation where it was decided to report the minute rate. You can > define your own

Re: Flink Metrics

2016-10-17 Thread amir bahmanyari
Subject: Re: Flink Metrics Hi Govind, I think the DropwizardMeterWrapper implementation is just a reference implementation where it was decided to report the minute rate. You can define your own meter class which allows to configure the rate interval accordingly. Concerning Timers, I think nobody

Re: Flink Metrics

2016-10-17 Thread Chesnay Schepler
Hello, we could also offer a small utility method that creates 3 flink meters, each reporting one rate of a DW meter. Timers weren't added yet since, as Till said, no one requested them yet and we haven't found a proper internal use-case for them Regards, Chesnay On 17.10.2016 09:52, Till

Re: Flink Metrics

2016-10-17 Thread Till Rohrmann
Hi Govind, I think the DropwizardMeterWrapper implementation is just a reference implementation where it was decided to report the minute rate. You can define your own meter class which allows to configure the rate interval accordingly. Concerning Timers, I think nobody requested this metric so
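
A hedged sketch of such a custom meter class (the name is illustrative): it mirrors DropwizardMeterWrapper but reports the five-minute rate instead of the one-minute rate.

    import org.apache.flink.metrics.Meter;

    public class FiveMinuteRateMeter implements Meter {
        private final com.codahale.metrics.Meter dropwizardMeter;

        public FiveMinuteRateMeter(com.codahale.metrics.Meter dropwizardMeter) {
            this.dropwizardMeter = dropwizardMeter;
        }

        @Override
        public void markEvent() { dropwizardMeter.mark(); }

        @Override
        public void markEvent(long n) { dropwizardMeter.mark(n); }

        @Override
        public double getRate() { return dropwizardMeter.getFiveMinuteRate(); }

        @Override
        public long getCount() { return dropwizardMeter.getCount(); }
    }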