Thanks Andor!

Li
On Tue, May 4, 2021 at 2:00 PM Andor Molnar <[email protected]> wrote:

> Awesome work Li, indeed.
> I’m happy to review the PR. Thanks for the contribution.
>
> Andor
>
> > On 2021. May 4., at 3:49, Li Wang <[email protected]> wrote:
> >
> > Hi,
> >
> > Thanks Ted and Enrico again for the inputs and discussions. I would like to
> > share some updates from my side.
> >
> > 1. The main issue is that the Prometheus summary computation is expensive and
> > locked/synchronized. To reduce the perf impact, I changed the
> > PrometheusLabelledSummary.add() operation to be async, so it is handled by
> > separate threads instead of the main thread.
> >
> > 2. I ran benchmarks against the changes. The perf of read and write operations
> > is significantly improved.
> >
> > a) For the read operation, the avg latency is reduced by about 50% and the avg
> > throughput is increased by about 95%.
> > b) For the write operation, the avg latency is reduced by about 28% and the avg
> > throughput is increased by about 20%.
> >
> > 3. I opened a JIRA for the issue and can submit a PR for the change.
> >
> > https://issues.apache.org/jira/browse/ZOOKEEPER-4289
> >
> > Please let me know if you have any thoughts or questions.
> >
> > Thanks,
> >
> > Li
> >
> > On Wed, Apr 28, 2021 at 11:32 PM Enrico Olivelli <[email protected]> wrote:
> >
> >> On Wed, Apr 28, 2021 at 9:48 PM Li Wang <[email protected]> wrote:
> >>>
> >>> Hi Enrico,
> >>>
> >>> Please see my comments inline.
> >>>
> >>> Thanks,
> >>>
> >>> Li
> >>>
> >>> On Mon, Apr 26, 2021 at 10:42 PM Enrico Olivelli <[email protected]> wrote:
> >>>
> >>>> Sorry for top posting.
> >>>>
> >>>> I have never seen Prometheus have such a significant impact in other
> >>>> applications.
> >>>
> >>> Please see my reply in the previous response.
> >>>
> >>>> The first things that come to my mind:
> >>>> - collect a couple of dumps with some perf tool and dig into the problem
> >>>
> >>> Heap dumps, thread dumps and CPU profiling with Prometheus enabled vs
> >>> disabled have been collected. That's how we found the issue.
> >>>
> >>>> - verify that we have the latest version of the Prometheus client
> >>>
> >>> ZK uses 0.9.0 and the latest version of the Prometheus client is 0.10.0.
> >>> I checked the change logs, and it doesn't look like there are major changes
> >>> between these two: https://github.com/prometheus/client_java/releases We
> >>> can upgrade the version and see if it makes any difference.
> >>>
> >>>> - tune the few knobs we have in the Prometheus client
> >>>
> >>> What are the knobs we can tune? Can you provide more info on this?
> >>
> >> Unfortunately there is nothing we can tune directly in the ZooKeeper
> >> configuration file.
> >>
> >> In the ZK code we currently have only some basic configuration of
> >> summaries, and we are mostly using the default values:
> >> https://github.com/apache/zookeeper/blob/11c07921c15e2fb7692375327b53f26a583b77ca/zookeeper-metrics-providers/zookeeper-prometheus-metrics/src/main/java/org/apache/zookeeper/metrics/prometheus/PrometheusMetricsProvider.java#L340
> >> There is no way to fine-tune the buckets and all of the structures.
> >>
> >> We should work on looking for optimal values.
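For reference, the Summary.Builder in the Prometheus simpleclient exposes the main knobs mentioned here: the tracked quantiles with their error bounds, and the sliding time window (maxAgeSeconds / ageBuckets) that backs TimeWindowQuantiles. A minimal sketch of what tuning them looks like; the metric name and values below are illustrative only, not what ZooKeeper configures today.

import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.Summary;

public class SummaryTuningSketch {
    public static void main(String[] args) {
        CollectorRegistry registry = new CollectorRegistry();

        // The main cost drivers of a Summary are the tracked quantiles and the
        // sliding time window behind TimeWindowQuantiles.
        Summary readLatency = Summary.build()
                .name("example_read_latency")  // illustrative name, not a real ZK metric
                .help("Read latency in ms")
                .quantile(0.5, 0.05)           // looser error bound -> cheaper bookkeeping
                .quantile(0.99, 0.01)
                .maxAgeSeconds(60)             // shorter window -> smaller quantile streams
                .ageBuckets(2)                 // fewer rotating buckets -> less work per insert
                .register(registry);

        readLatency.observe(4.2);

        // The count sample confirms the observation was recorded.
        System.out.println("observations: "
                + registry.getSampleValue("example_read_latency_count"));
    }
}

Fewer or looser quantiles and a smaller window reduce the per-observe() bookkeeping, at the cost of coarser percentiles.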
> >>>> In Apache Pulsar and in Apache BookKeeper we have some customizations in
> >>>> the Prometheus metrics collectors; we could take a look and port those to
> >>>> ZooKeeper (initially I worked on the Prometheus integration based on the
> >>>> use cases I have with Pulsar, BookKeeper and other systems that already use
> >>>> Prometheus, but here in ZooKeeper we are using the basic Prometheus client).
> >>>
> >>> Are there any customizations for minimizing the performance impact? Is
> >>> there anything that we can port to ZK to help the issue?
> >>
> >> Yes, there is much we can port. This is the Prometheus metrics
> >> provider in BookKeeper:
> >> https://github.com/apache/bookkeeper/tree/master/bookkeeper-stats-providers/prometheus-metrics-provider/src/main/java/org/apache/bookkeeper/stats/prometheus
> >>
> >> Enrico
> >>
> >>>> Enrico
> >>>>
> >>>> On Tue, Apr 27, 2021 at 06:35 Ted Dunning <[email protected]> wrote:
> >>>>
> >>>>> Batching metrics reporting is very similar to option (c) but with locking
> >>>>> like option (a). That can usually be made faster by passing a reference to
> >>>>> the metrics accumulator to the reporting thread, which can do the batch
> >>>>> update without locks. It usually requires ping-pong metrics accumulators so
> >>>>> that a thread can accumulate in one accumulator for a bit, pass that to the
> >>>>> merge thread and switch to using the alternate accumulator. Since all
> >>>>> threads typically report at the same time, this avoids a stampede on the
> >>>>> global accumulator.
> >>>>>
> >>>>> On Mon, Apr 26, 2021 at 9:30 PM Li Wang <[email protected]> wrote:
> >>>>>
> >>>>>> Batching metrics reporting can help. For example, in the CommitProcessor,
> >>>>>> increasing maxCommitBatchSize helps improve the performance of
> >>>>>> write operations.
> >>>>>>
> >>>>>> On Mon, Apr 26, 2021 at 9:21 PM Li Wang <[email protected]> wrote:
> >>>>>>
> >>>>>>> Yes, I am thinking of handling metrics reporting in a separate thread,
> >>>>>>> so it doesn't impact the "main" thread.
> >>>>>>>
> >>>>>>> I am not sure about the idea of merging at the end of a reporting period.
> >>>>>>> Can you elaborate a bit on it?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Li
> >>>>>>>
> >>>>>>> On Mon, Apr 26, 2021 at 9:11 PM Ted Dunning <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Would it help to keep per-thread metrics that are either reported
> >>>>>>>> independently or are merged at the end of a reporting period?
> >>>>>>>>
> >>>>>>>> On Mon, Apr 26, 2021 at 8:51 PM Li Wang <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Community,
> >>>>>>>>>
> >>>>>>>>> I've done further investigation on the issue and found the following:
> >>>>>>>>>
> >>>>>>>>> 1. The perf of the read operation was decreased due to the lock contention
> >>>>>>>>> in the Prometheus TimeWindowQuantiles APIs. 3 out of 4 CommitProcWorker
> >>>>>>>>> threads were blocked on the TimeWindowQuantiles.insert() API while the
> >>>>>>>>> test was running.
> >>>>>>>>>
> >>>>>>>>> 2. The perf of the write operation was decreased because of the high CPU
> >>>>>>>>> usage from the Prometheus Summary type of metrics. The CPU usage of
> >>>>>>>>> CommitProcessor increased by about 50% with Prometheus enabled compared
> >>>>>>>>> to disabled (46% vs 80% with 4 CPUs, 63% vs 99% with 12 CPUs).
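A rough sketch of the per-thread / ping-pong accumulator idea Ted describes above, which would keep the contended TimeWindowQuantiles lock off the request path: each worker appends to its own buffer pair, and a single reporter thread periodically swaps the buffers and feeds the drained values to the shared (locked) summary in one batch. All names here (PingPongRecorder, ThreadBuffers) are hypothetical, not existing ZooKeeper or BookKeeper classes.

import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongConsumer;

public class PingPongRecorder {

    // One pair of buffers per worker thread, registered so the reporter can find them.
    private final Set<ThreadBuffers> allBuffers = ConcurrentHashMap.newKeySet();
    private final ThreadLocal<ThreadBuffers> local = ThreadLocal.withInitial(() -> {
        ThreadBuffers b = new ThreadBuffers();
        allBuffers.add(b);
        return b;
    });
    private final LongConsumer slowSink; // e.g. value -> prometheusSummary.observe(value)

    public PingPongRecorder(LongConsumer slowSink) {
        this.slowSink = slowSink;
        ScheduledExecutorService reporter = Executors.newSingleThreadScheduledExecutor();
        // The single reporter thread pays for the locked summary, not the request path.
        reporter.scheduleAtFixedRate(this::flush, 1, 1, TimeUnit.SECONDS);
    }

    // Hot path: touches only this thread's buffer, so the lock is effectively uncontended.
    public void record(long value) {
        local.get().add(value);
    }

    // Called only from the reporter thread: drain every thread's filled buffer in one batch.
    private void flush() {
        for (ThreadBuffers b : allBuffers) {
            List<Long> drained = b.swapForDrain();
            for (long v : drained) {
                slowSink.accept(v);
            }
            drained.clear(); // reused as the active buffer at the next swap
        }
    }

    private static final class ThreadBuffers {
        private List<Long> active = new ArrayList<>();
        private List<Long> spare = new ArrayList<>();

        synchronized void add(long value) {
            active.add(value);
        }

        // Ping-pong: the filled buffer goes to the reporter, the spare becomes active.
        synchronized List<Long> swapForDrain() {
            List<Long> filled = active;
            active = spare;
            spare = filled;
            return filled;
        }
    }
}

A worker would then call record(latencyMillis) on the hot path, with the recorder built as new PingPongRecorder(v -> summary.observe(v)).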
> >>>>>>>>>
> >>>>>>>>> Prometheus integration is a great feature, however the negative performance
> >>>>>>>>> impact is very significant. I wonder if anyone has any thoughts on how to
> >>>>>>>>> reduce the perf impact.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>> Li
> >>>>>>>>>
> >>>>>>>>> On Tue, Apr 6, 2021 at 12:33 PM Li Wang <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I would like to reach out to the community to see if anyone has some
> >>>>>>>>>> insights or experience with the performance impact of enabling Prometheus
> >>>>>>>>>> metrics.
> >>>>>>>>>>
> >>>>>>>>>> I have done load comparison tests for Prometheus enabled vs disabled and
> >>>>>>>>>> found that the performance is reduced by about 40%-60% for both read and
> >>>>>>>>>> write operations (i.e. getData, getChildren and createNode).
> >>>>>>>>>>
> >>>>>>>>>> The load test was done with ZooKeeper 3.7, a cluster size of 5 participants
> >>>>>>>>>> and 5 observers, each ZK server with a 10G heap size and 4 CPUs, and 500
> >>>>>>>>>> concurrent users sending requests.
> >>>>>>>>>>
> >>>>>>>>>> The performance impact is quite significant. I wonder if this is expected
> >>>>>>>>>> and what things we can do to have ZK perform the same while
> >>>>>>>>>> leveraging the new feature of Prometheus metrics.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>>
> >>>>>>>>>> Li
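To make the shape of Li's May 4 change concrete: a minimal sketch of offloading the summary observation to a separate thread pool, so the request path only pays for an enqueue while the metrics threads absorb the cost of the locked quantile insert inside observe(). The class name and wiring are assumptions for illustration; see ZOOKEEPER-4289 for the actual change.

import io.prometheus.client.Summary;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncSummarySketch {

    private final Summary summary;
    private final ExecutorService metricsPool;

    public AsyncSummarySketch(Summary summary, int threads) {
        this.summary = summary;
        // A bounded queue with a drop policy would be safer in production; kept simple here.
        this.metricsPool = Executors.newFixedThreadPool(threads);
    }

    // Cheap on the caller: the expensive quantile bookkeeping inside observe()
    // runs on the metrics pool, not on the request-processing thread.
    public void add(long latencyMillis) {
        metricsPool.execute(() -> summary.observe(latencyMillis));
    }

    public void shutdown() throws InterruptedException {
        metricsPool.shutdown();
        metricsPool.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        Summary summary = Summary.build()
                .name("example_latency")
                .help("Latency in ms")
                .quantile(0.5, 0.05)
                .quantile(0.99, 0.01)
                .register();

        AsyncSummarySketch async = new AsyncSummarySketch(summary, 2);
        for (int i = 0; i < 1000; i++) {
            async.add(i % 50); // request path returns immediately
        }
        async.shutdown();
    }
}

The trade-off is a small amount of buffering and the need to bound the queue so a slow metrics thread cannot grow memory without limit.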
