Hi Morgan,

Regarding backpressure, it can be caused by a number of factors, e.g. writing to an external system or slow input partitions.

However, if you know that a particular resource is a bottleneck, then it makes sense to monitor its saturation. This can be done using Flink metrics. Please see the documentation for more details:
https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html

Regards,
Roman

On Tue, Feb 25, 2020 at 12:33 PM Morgan Geldenhuys <morgan.geldenh...@tu-berlin.de> wrote:

> Hello community,
>
> I am fairly new to Flink and have a question concerning utilization. I
> was hoping someone could help.
>
> As I understand it, backpressure is essentially the point at which
> utilization has reached 100% for a particular streaming pipeline, meaning
> that the application cannot "keep up" with the messages coming into the
> system.
>
> I was wondering, assuming a fairly stable input throughput, is there a
> way of determining the average utilization as a percentage? Can we
> determine how much more capacity each operator has before backpressure
> kicks in from metrics alone, e.g. 60% of capacity? Knowing
> that the maximum throughput of the DSP application is dictated by the
> slowest part of the pipeline, we would need to identify the slowest
> operator and then average horizontally.
>
> The only method that I can see of determining the point at which the
> system cannot keep up any longer is to increase the input throughput
> slowly until the backpressure HIGH alarm is shown, at which point the
> number of messages/sec is known.
>
> Yes, I know this is a gross oversimplification and there are many, many
> factors that need to be taken into account when dealing with
> backpressure, but it would be nice to have a general indicator; a rough
> estimate is fine.
>
> Thank you in advance.
>
> Regards,
> M.
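
For what it is worth, the ramp-up approach described in the quoted question can be turned into a rough utilization figure with a few lines of arithmetic. The following is only a minimal sketch under stated assumptions: it does not call any Flink API, the function names `find_saturation_rate` and `estimate_utilization` are hypothetical, and the sample numbers are made up for illustration. It assumes you have recorded, by hand or from your metrics system, the input rate at each ramp step together with whether backpressure was reported at that step.

```python
from typing import List, Tuple


def find_saturation_rate(ramp_samples: List[Tuple[float, bool]]) -> float:
    """Return the highest input rate (records/sec) seen before backpressure.

    ramp_samples is a list of (input_rate, backpressured) pairs ordered by
    increasing input rate. Both values are assumed to be collected
    externally; nothing here talks to Flink.
    """
    last_ok = None
    for rate, backpressured in ramp_samples:
        if backpressured:
            break  # first step at which the pipeline could not keep up
        last_ok = rate
    if last_ok is None:
        raise ValueError("no ramp step was free of backpressure")
    return last_ok


def estimate_utilization(current_rate: float, saturation_rate: float) -> float:
    """Rough utilization as a fraction of the measured saturation rate."""
    if saturation_rate <= 0:
        raise ValueError("saturation_rate must be positive")
    # Cap at 1.0: anything above the saturation rate is simply "saturated".
    return min(current_rate / saturation_rate, 1.0)


# Illustrative (made-up) ramp: backpressure first appears at 4000 records/sec.
ramp = [(1000.0, False), (2000.0, False), (3000.0, False), (4000.0, True)]
saturation = find_saturation_rate(ramp)
utilization = estimate_utilization(1800.0, saturation)
print(f"saturation ~{saturation:.0f} rec/s, utilization ~{utilization:.0%}")
```

This treats the pipeline as a single unit, which matches the observation that the slowest operator dictates the maximum throughput. It is only as good as the ramp measurement: if the bottleneck is an external system or a skewed input partition, the saturation rate may shift over time and the estimate with it.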