Re: Performance impact of enabling Prometheus Metrics

2021-05-03 Thread Ted Dunning
On Mon, May 3, 2021 at 6:50 PM Li Wang wrote:
> ...
>
> Please let me know if you have any thoughts or questions.

My thought is that you have done some nice work.

Re: write performance issue in 3.6.2

2021-05-03 Thread Michael Han
>> because the tests were run with Prometheus enabled, which is new in 3.6 and has significant negative perf impact.

Interesting, let's see what the numbers are without Prometheus involved. It could be that the increased latency we observed in CommitProcessor is just a symptom rather than the

Re: Performance impact of enabling Prometheus Metrics

2021-05-03 Thread Li Wang
Hi,

Thanks Ted and Enrico again for the inputs and discussions. I would like to share some updates from my side.

1. The main issue is that Prometheus summary computation is expensive and locked/synchronized. To reduce the perf impact, I changed the PrometheusLabelledSummary.add() operation to be

[jira] [Created] (ZOOKEEPER-4289) Reduce the performance impact of Prometheus metrics

2021-05-03 Thread Li Wang (Jira)
Li Wang created ZOOKEEPER-4289:
Summary: Reduce the performance impact of Prometheus metrics
Key: ZOOKEEPER-4289
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4289
Project: ZooKeeper

Re: write performance issue in 3.6.2

2021-05-03 Thread Li Wang
Hi Michael,

Thanks for your additional inputs.

On Mon, May 3, 2021 at 3:13 PM Michael Han wrote:
> Hi Li,
>
> Thanks for following up.
>
> >> write_commitproc_time_ms were large
>
> This measures how long a local write op hears back from the leader. If it's
> big, then either the leader is

Re: write performance issue in 3.6.2

2021-05-03 Thread Michael Han
Hi Li,

Thanks for following up.

>> write_commitproc_time_ms were large

This measures how long a local write op hears back from the leader. If it's big, then either the leader is very busy acking the request, or your network RTT is high. What does the local fsync time (fsynctime) look like