Hi Enrico, Andor,

The PR has been updated: https://github.com/apache/zookeeper/pull/1698
Can you help review it so it can be merged?

Thanks,

Li

On Tue, May 18, 2021 at 5:01 PM Li Wang <li4w...@gmail.com> wrote:

> Hi Andor, Ted, Enrico,
>
> Thanks again for your discussions and inputs. The changes have been
> submitted. https://github.com/apache/zookeeper/pull/1698
>
> Would you mind reviewing the PR?
>
> Best,
>
> Li
>
> On Wed, May 5, 2021 at 10:00 AM Li Wang <li4w...@gmail.com> wrote:
>
>> Thanks Andor!
>>
>> Li
>>
>> On Tue, May 4, 2021 at 2:00 PM Andor Molnar <an...@apache.org> wrote:
>>
>>> Awesome work Li, indeed.
>>> I’m happy to review the PR. Thanks for the contribution.
>>>
>>> Andor
>>>
>>>
>>>
>>> > On 2021. May 4., at 3:49, Li Wang <li4w...@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > Thanks Ted and Enrico again for the inputs and discussions. I would
>>> like to
>>> > share some updates from my side.
>>> >
>>> > 1. The main issue is that Prometheus summary computation is expensive
>>> and
>>> > locked/synchronized. To reduce the perf impact, I changed the
>>> > PrometheusLabelledSummary.add() operation to be async, so it is
>>> handled by
>>> > separate threads from the main thread.
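>>> >
>>> > To illustrate the idea (only a sketch, not the actual PR code; the class
>>> > and field names are invented for the example):
>>> >
>>> >     import java.util.concurrent.ExecutorService;
>>> >     import java.util.concurrent.Executors;
>>> >     import io.prometheus.client.Summary;
>>> >
>>> >     // Hand the expensive, synchronized Summary.observe() off to a
>>> >     // dedicated metrics thread instead of the request-processing thread.
>>> >     class AsyncSummary {
>>> >         private final Summary summary;
>>> >         private final ExecutorService metricsExecutor =
>>> >                 Executors.newSingleThreadExecutor();
>>> >
>>> >         AsyncSummary(Summary summary) {
>>> >             this.summary = summary;
>>> >         }
>>> >
>>> >         void add(long value) {
>>> >             metricsExecutor.submit(() -> summary.observe(value));
>>> >         }
>>> >     }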
>>> >
>>> > 2. Ran benchmarks against the changes. The perf of both read and write
>>> > operations is significantly improved.
>>> >
>>> > a) For read operations, the avg latency is reduced by about 50% and avg
>>> > throughput is increased by about 95%.
>>> > b) For write operations, the avg latency is reduced by about 28% and avg
>>> > throughput is increased by about 20%.
>>> >
>>> > 3. I opened a JIRA for the issue and can submit a PR for the change.
>>> >
>>> > https://issues.apache.org/jira/browse/ZOOKEEPER-4289
>>> >
>>> > Please let me know if you have any thoughts or questions.
>>> >
>>> > Thanks,
>>> >
>>> > Li
>>> >
>>> >
>>> >
>>> > On Wed, Apr 28, 2021 at 11:32 PM Enrico Olivelli <eolive...@gmail.com>
>>> > wrote:
>>> >
>>> >> Il giorno mer 28 apr 2021 alle ore 21:48 Li Wang <li4w...@gmail.com>
>>> ha
>>> >> scritto:
>>> >>>
>>> >>> Hi Enrico,
>>> >>>
>>> >>> Please see my comments inline.
>>> >>>
>>> >>> Thanks,
>>> >>>
>>> >>> Li.
>>> >>>
>>> >>> On Mon, Apr 26, 2021 at 10:42 PM Enrico Olivelli <
>>> eolive...@gmail.com>
>>> >>> wrote:
>>> >>>
>>> >>>> Sorry for top posting.
>>> >>>>
>>> >>>> I have never seen Prometheus have such a significant impact in other
>>> >>>> applications.
>>> >>>
>>> >>>
>>> >>> Please see my reply in the previous response.
>>> >>>
>>> >>>
>>> >>>>
>>> >>>> The first things that come to mind:
>>> >>>> - collect a couple of dumps with some perf tool and dig into the
>>> >>>> problem
>>> >>>
>>> >>>
>>> >>> Heap dump, thread dump and CPU profiling with Prometheus enabled vs
>>> >>> disabled have been collected. That's how we found the issue.
>>> >>>
>>> >>>
>>> >>>> -  verify that we have the latest version of Prometheus client
>>> >>>>
>>> >>>
>>> >>> ZK uses 0.9.0 and the latest version of the Prometheus client is 0.10.0.
>>> >>> I checked the change logs, and it doesn't look like there are major
>>> >>> changes between these two:
>>> >>> https://github.com/prometheus/client_java/releases
>>> >>> We can upgrade the version and see if it makes any difference.
>>> >>>
>>> >>> - tune the few knobs we have in the Prometheus client
>>> >>>>
>>> >>>
>>> >>> What are the knobs we can tune? Can you provide more info on this?
>>> >>
>>> >> Unfortunately there is nothing we can tune directly with the zookeeper
>>> >> configuration file
>>> >>
>>> >> In ZK code we currently have only some basic configuration of
>>> >> summaries, and we are using mostly the default values
>>> >>
>>> >>
>>> https://github.com/apache/zookeeper/blob/11c07921c15e2fb7692375327b53f26a583b77ca/zookeeper-metrics-providers/zookeeper-prometheus-metrics/src/main/java/org/apache/zookeeper/metrics/prometheus/PrometheusMetricsProvider.java#L340
>>> >> There is no way to fine-tune the buckets and the other structures.
>>> >>
>>> >> We should work on finding optimal values.
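>>> >>
>>> >> For example, with the plain client the Summary builder already exposes a
>>> >> few knobs we could experiment with (just a sketch; the metric name and
>>> >> the numbers below are illustrative, not proposed values):
>>> >>
>>> >>     import io.prometheus.client.Summary;
>>> >>
>>> >>     // Looser quantile error targets and fewer rotating age buckets make
>>> >>     // each insert() cheaper.
>>> >>     Summary latency = Summary.build()
>>> >>             .name("example_latency")
>>> >>             .help("request latency")
>>> >>             .quantile(0.5, 0.05)   // median, 5% tolerated error
>>> >>             .quantile(0.99, 0.01)  // p99, 1% tolerated error
>>> >>             .maxAgeSeconds(60)     // rolling window length
>>> >>             .ageBuckets(5)         // number of rotating buckets
>>> >>             .register();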
>>> >>
>>> >
>>> >
>>> >
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>>>
>>> >>>> In Apache Pulsar and in Apache Bookkeeper we have some customizations
>>> >>>> in the Prometheus metrics collectors; we could take a look and port
>>> >>>> those to Zookeeper (initially I worked on the Prometheus integration
>>> >>>> based on the use cases I have with Pulsar, Bookkeeper and other systems
>>> >>>> that already use Prometheus, but here in Zookeeper we are using the
>>> >>>> basic Prometheus client)
>>> >>>>
>>> >>>
>>> >>> Are there any customizations for minimizing the performance impact?
>>> Is
>>> >>> there anything that we can port to ZK to help the issue?
>>> >>
>>> >> Yes, there is much we can port. This is the Prometheus metrics
>>> >> provider in BookKeeper:
>>> >>
>>> >>
>>> https://github.com/apache/bookkeeper/tree/master/bookkeeper-stats-providers/prometheus-metrics-provider/src/main/java/org/apache/bookkeeper/stats/prometheus
>>> >>
>>> >> Enrico
>>> >>
>>> >>>
>>> >>>
>>> >>>> Enrico
>>> >>>>
>>> >>>> Il Mar 27 Apr 2021, 06:35 Ted Dunning <ted.dunn...@gmail.com> ha
>>> >> scritto:
>>> >>>>
>>> >>>>> Batching metrics reporting is very similar to option (c) but with
>>> >> locking
>>> >>>>> like option (a). That can usually be made faster by passing a
>>> >> reference
>>> >>>> to
>>> >>>>> the metrics accumulator to the reporting thread which can do the
>>> >> batch
>>> >>>>> update without locks. Usually requires ping-pong metrics
>>> >> accumulators so
>>> >>>>> that a thread can accumulate in one accumulator for a bit, pass
>>> that
>>> >> to
>>> >>>> the
>>> >>>>> merge thread and switch to using the alternate accumulator. Since
>>> all
>>> >>>>> threads typically report at the same time, this avoids a stampede
>>> on
>>> >> the
>>> >>>>> global accumulator.
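>>> >>>>>
>>> >>>>> A rough sketch of that pattern (names are purely illustrative, and
>>> >>>>> this is a simplified single-buffer variant rather than a true
>>> >>>>> two-accumulator ping-pong): each worker fills a thread-local buffer
>>> >>>>> with no locking and hands the full buffer to the merge thread, which
>>> >>>>> then does one batched, locked update into the shared summary.
>>> >>>>>
>>> >>>>>     import java.util.Arrays;
>>> >>>>>     import java.util.concurrent.BlockingQueue;
>>> >>>>>
>>> >>>>>     class PerThreadRecorder {
>>> >>>>>         private final BlockingQueue<long[]> toMerge;
>>> >>>>>         private final long[] active = new long[1024]; // single-writer
>>> >>>>>         private int count = 0;
>>> >>>>>
>>> >>>>>         PerThreadRecorder(BlockingQueue<long[]> toMerge) {
>>> >>>>>             this.toMerge = toMerge;
>>> >>>>>         }
>>> >>>>>
>>> >>>>>         // hot path: owned by one thread, no locks
>>> >>>>>         void record(long value) {
>>> >>>>>             active[count++] = value;
>>> >>>>>             if (count == active.length) {
>>> >>>>>                 flush();
>>> >>>>>             }
>>> >>>>>         }
>>> >>>>>
>>> >>>>>         // pass a copy to the merge thread and reuse the buffer;
>>> >>>>>         // the merge thread drains toMerge and applies one batched
>>> >>>>>         // update to the global accumulator
>>> >>>>>         void flush() {
>>> >>>>>             toMerge.offer(Arrays.copyOf(active, count));
>>> >>>>>             count = 0;
>>> >>>>>         }
>>> >>>>>     }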
>>> >>>>>
>>> >>>>>
>>> >>>>> On Mon, Apr 26, 2021 at 9:30 PM Li Wang <li4w...@gmail.com> wrote:
>>> >>>>>
>>> >>>>>> Batching metrics reporting can help. For example, in the
>>> >>>>>> CommitProcessor, increasing the maxCommitBatchSize helps improve
>>> >>>>>> the performance of write operations.
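>>> >>>>>>
>>> >>>>>> For reference, that is the existing CommitProcessor system property
>>> >>>>>> (if I recall the name correctly); the value below is only an example,
>>> >>>>>> not a tuned recommendation:
>>> >>>>>>
>>> >>>>>>     -Dzookeeper.maxCommitBatchSize=500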
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On Mon, Apr 26, 2021 at 9:21 PM Li Wang <li4w...@gmail.com>
>>> wrote:
>>> >>>>>>
>>> >>>>>>> Yes, I am thinking of handling metrics reporting in a separate
>>> >>>>>>> thread, so it doesn't impact the "main" thread.
>>> >>>>>>>
>>> >>>>>>> Not sure about the idea of merging at the end of a reporting
>>> >> period.
>>> >>>>> Can
>>> >>>>>>> you elaborate a bit on it?
>>> >>>>>>>
>>> >>>>>>> Thanks,
>>> >>>>>>>
>>> >>>>>>> Li
>>> >>>>>>>
>>> >>>>>>> On Mon, Apr 26, 2021 at 9:11 PM Ted Dunning <
>>> >> ted.dunn...@gmail.com>
>>> >>>>>> wrote:
>>> >>>>>>>
>>> >>>>>>>> Would it help to keep per thread metrics that are either
>>> >> reported
>>> >>>>>>>> independently or are merged at the end of a reporting period?
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> On Mon, Apr 26, 2021 at 8:51 PM Li Wang <li4w...@gmail.com>
>>> >> wrote:
>>> >>>>>>>>
>>> >>>>>>>>> Hi Community,
>>> >>>>>>>>>
>>> >>>>>>>>> I've done further investigation on the issue and found the
>>> >>>> following
>>> >>>>>>>>>
>>> >>>>>>>>> 1. The perf of the read operation decreased due to lock contention
>>> >>>>>>>>> in the Prometheus TimeWindowQuantiles APIs. 3 out of 4
>>> >>>>>>>>> CommitProcWorker threads were blocked on the
>>> >>>>>>>>> TimeWindowQuantiles.insert() API when the test was running.
>>> >>>>>>>>>
>>> >>>>>>>>> 2. The perf of the write operation decreased because of the high
>>> >>>>>>>>> CPU usage from the Prometheus Summary type of metrics. The CPU
>>> >>>>>>>>> usage of CommitProcessor increased by about 50% when Prometheus
>>> >>>>>>>>> was enabled compared to disabled (46% vs 80% with 4 CPUs, 63% vs
>>> >>>>>>>>> 99% with 12 CPUs).
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> Prometheus integration is a great feature; however, the negative
>>> >>>>>>>>> performance impact is very significant. I wonder if anyone has
>>> >>>>>>>>> any thoughts on how to reduce the perf impact.
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> Thanks,
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> Li
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> On Tue, Apr 6, 2021 at 12:33 PM Li Wang <li4w...@gmail.com>
>>> >>>> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>>> Hi,
>>> >>>>>>>>>>
>>> >>>>>>>>>> I would like to reach out to the community to see if anyone
>>> >> has
>>> >>>>> some
>>> >>>>>>>>>> insights or experience with the performance impact of
>>> >> enabling
>>> >>>>>>>> prometheus
>>> >>>>>>>>>> metrics.
>>> >>>>>>>>>>
>>> >>>>>>>>>> I have done load comparison tests for Prometheus enabled vs
>>> >>>>>>>>>> disabled and found the performance is reduced by about 40%-60%
>>> >>>>>>>>>> for both read and write operations (i.e. getData, getChildren
>>> >>>>>>>>>> and createNode).
>>> >>>>>>>>>>
>>> >>>>>>>>>> The load test was done with Zookeeper 3.7, a cluster of 5
>>> >>>>>>>>>> participants and 5 observers, each ZK server with a 10G heap and
>>> >>>>>>>>>> 4 CPUs, and 500 concurrent users sending requests.
>>> >>>>>>>>>>
>>> >>>>>>>>>> The performance impact is quite significant. I wonder if this is
>>> >>>>>>>>>> expected and what we can do to have ZK perform the same while
>>> >>>>>>>>>> leveraging the new Prometheus metrics feature.
>>> >>>>>>>>>>
>>> >>>>>>>>>> Best,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Li
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>
>>> >>>>
>>> >>
>>>
>>>
