GitHub user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/2753
I think we need a different way to solve this.
This pull request adds a very high overhead to the processing of each
record:
- two calls to `System.nanoTime()`
- maintaining a Dropwizard `Histogram` (see the sketch below)
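For illustration, a minimal sketch of that per-record pattern (class and
method names are hypothetical; `System.nanoTime()` and the Dropwizard
`Histogram` API are real):

```java
import com.codahale.metrics.ExponentiallyDecayingReservoir;
import com.codahale.metrics.Histogram;

// Hypothetical sketch of per-record instrumentation: two timer calls and a
// histogram update are paid on EVERY record, even for trivial functions.
public class TimedProcessor<T> {

    private final Histogram latencyHistogram =
            new Histogram(new ExponentiallyDecayingReservoir());

    public void processRecord(T record) {
        long start = System.nanoTime();            // timer call #1
        process(record);                           // the actual user function
        long elapsed = System.nanoTime() - start;  // timer call #2
        latencyHistogram.update(elapsed);          // reservoir update per record
    }

    private void process(T record) {
        // e.g. a lightweight filter or map that itself takes only nanoseconds
    }
}
```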
Without having benchmarked this, I would expect it to significantly degrade
the performance of typical operations like filters or lightweight map
functions.
Flink is building a streaming runtime that is performance-competitive with
a batch runtime, so the base runtime overhead per record needs to be minimal.
All metrics so far have been designed with that paradigm in mind: metrics
must not add any cost to processing the data.
- Metrics are gathered by asynchronous threads
- The core uses only non-synchronized counters and gauges, because they
come virtually for free (see the counter sketch after this list)
- We consciously decided not to use any metric type in the data paths that
has the overhead of creating objects or maintaining a data structure.
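As a rough sketch of that design (hypothetical names, not Flink's actual
classes): the hot path pays only a plain, unsynchronized `long` increment,
while a separate reporter thread reads the possibly slightly stale value
asynchronously:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CheapCounterExample {

    // Deliberately unsynchronized, non-volatile counter: the per-record hot
    // path pays only a plain long increment. The async reporter may read a
    // slightly stale value, which is fine for metrics.
    static final class CheapCounter {
        private long count;

        void inc() { count++; }        // called on the data path

        long get() { return count; }   // called by the reporter thread
    }

    public static void main(String[] args) {
        CheapCounter recordsIn = new CheapCounter();

        // Asynchronous reporter: all reporting cost stays off the data path.
        ScheduledExecutorService reporter =
                Executors.newSingleThreadScheduledExecutor();
        reporter.scheduleAtFixedRate(
                () -> System.out.println("records-in: " + recordsIn.get()),
                1, 1, TimeUnit.SECONDS);

        // Hot path: just increments - no objects, no locks, no timer calls.
        for (long i = 0; i < 2_000_000_000L; i++) {
            recordsIn.inc();
        }
        reporter.shutdown();
    }
}
```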
I would suggest first having a design discussion about whether we want to
measure this, and how we can do it essentially for free.
For example, have a look at the "end to end" latency measurements via
latency markers in #2386, for an idea of how to measure with minimal impact
on the data processing. A rough sketch of that idea follows below.
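The core idea there, roughly (all names here are hypothetical, not the
actual classes from #2386): a source emits a timestamped marker once per
interval, operators forward it like a regular record, and the sink derives
the end-to-end latency from it, so the timing cost is paid once per
interval instead of once per record:

```java
// Hypothetical sketch of the latency-marker idea: the source emits a
// timestamped marker only once every N milliseconds, operators forward it
// unchanged, and the sink derives end-to-end latency from it. Regular
// records carry no timing overhead at all.
public final class LatencyMarker {

    // Wall-clock time at emission. Markers may cross machine boundaries,
    // so the per-JVM System.nanoTime() would not be comparable here.
    private final long markedTimeMillis;

    public LatencyMarker(long markedTimeMillis) {
        this.markedTimeMillis = markedTimeMillis;
    }

    public long getMarkedTimeMillis() {
        return markedTimeMillis;
    }

    // At the source, once per interval (not per record):
    //   emit(new LatencyMarker(System.currentTimeMillis()));
    //
    // At the sink, once per marker:
    //   long latency = System.currentTimeMillis() - marker.getMarkedTimeMillis();
}
```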