> On 10 Dec 2019, at 19:15, thierry bordaz <[email protected]> wrote:
> 
> Hi William,
> 
> Thanks for these very interesting results.
> It would help if you can provide the stap scripts to make sure what you are
> accounting the latency.
Yes, I plan to put them into a PR soon once I have done a bit more data
collection and polishing of the script setup.

> Also just to be sure is latency a synonym for response time ?

Yep, here I mean the execution time of a single operation.

> Regarding the comparison (tail) 1client/16client. It looks to me that the
> tail are equivalent: The more we have clients the more we have long latency.
> So in a first approach I would eliminate contention effect.

I disagree; the tail is much more pronounced in the 16 client version, and
there is a significant shift of response times from the 32768 bucket to the
65536 bucket, indicating that many operations are now being "held up".

> Regarding the ratio shorter/longer latency (assuming the search are
> equivalent) this is interesting to know why we have such effect. One of the
> possible cause I was thinking of is the impact of DB thread (checkpointing or
> trickling). But if it exists similar long tail in ldbm_back, the absolute
> value is much lower than the opshare_search: in short ldbm_back accounted at
> most for 4ms while opshared for 67ms. So there is something else (aci,
> network, frontend..).
> 
> Regarding USDT I think it is very good idea. However, just to share some
> recent stap experience, I found it intrusive. In short, I had not the same
> throughput with and without. In my case it was not a problem, as I wanted to
> investigate a reproducible contention. But if we want support/user/customers
> to use it, the performance impact in production will be valid point.

I haven't noticed any "intrusiveness" from USDT so far. How were you running
the stap scripts?

> best regards
> thierry
> 
> On 12/9/19 6:16 AM, William Brown wrote:
>> Hi all,
>> 
>> Following last week's flamegraph runs, I wanted to find more details on
>> exactly what was happening. While flamegraphs highlighted that a change
>> existed between the single and multithreaded servers, they did not help to
>> isolate where the change was occurring.
>> 
>> To understand this I have started to work on a set of systemtap scripts
>> that we can use to profile 389ds. These will be included in a future PR.
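
To give a concrete idea before the PR lands, the core of the do_search
measurement is roughly the shape below. Treat this as a sketch of my test
setup rather than the final script: the ns-slapd path, the reliance on
debuginfo function probes, and the output formatting may all change.

# Rough shape of the do_search latency measurement. Assumes ns-slapd is at
# /usr/sbin/ns-slapd and was built with debuginfo so the function probes
# resolve; both of these are details of my test setup.

global entry_time, latencies

probe process("/usr/sbin/ns-slapd").function("do_search") {
    # Stamp the entry time for this worker thread.
    entry_time[tid()] = gettimeofday_ns()
}

probe process("/usr/sbin/ns-slapd").function("do_search").return {
    t = entry_time[tid()]
    if (t) {
        # Accumulate the per-call latency in nanoseconds.
        latencies <<< gettimeofday_ns() - t
        delete entry_time[tid()]
    }
}

probe end {
    if (@count(latencies) == 0) {
        printf("no samples collected\n")
    } else {
        printf("Distribution of latencies (in nanoseconds) for %d samples\n",
               @count(latencies))
        printf("max/avg/min: %d/%d/%d\n",
               @max(latencies), @avg(latencies), @min(latencies))
        print(@hist_log(latencies))
    }
}

It runs as root against the live server, and the histogram is dumped when the
script is interrupted with Ctrl-C, which is where the "^C" in the output below
comes from.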
>> 
>> Here are the histograms from an initial test of profiling "do_search".
>> 
>> 1 thread
>> 
>> stap test_do_search.stp
>> ^CDistribution of latencies (in nanoseconds) for 441148 samples
>> max/avg/min: 35911542/85694/38927
>>     value |-------------------------------------------------- count
>>      8192 | 0
>>     16384 | 0
>>     32768 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 167285
>>     65536 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 239280
>>    131072 |@@@@@ 25788
>>    262144 |@ 8530
>>    524288 | 252
>>   1048576 | 6
>>   2097152 | 1
>>   4194304 | 3
>>   8388608 | 0
>>  16777216 | 2
>>  33554432 | 1
>>  67108864 | 0
>> 134217728 | 0
>> 
>> 16 thread
>> 
>> stap test_do_search.stp
>> ^CDistribution of latencies (in nanoseconds) for 407806 samples
>> max/avg/min: 100315928/112407/39368
>>     value |-------------------------------------------------- count
>>      8192 | 0
>>     16384 | 0
>>     32768 |@@@@@@@@@@@@@@@@@@@@ 100281
>>     65536 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 249656
>>    131072 |@@@@@@@ 37837
>>    262144 |@@@ 18322
>>    524288 | 1171
>>   1048576 | 203
>>   2097152 | 90
>>   4194304 | 74
>>   8388608 | 83
>>  16777216 | 58
>>  33554432 | 21
>>  67108864 | 10
>> 134217728 | 0
>> 268435456 | 0
>> 
>> It's interesting to note the tail latency here: in the 16 thread version we
>> see about 67000 fewer operations in the 32768 bucket, shifted mostly into
>> the 131072 and 262144 buckets, as well as a much greater number of calls in
>> the tail. With 1 thread no operation made it to 67108864, but 10 did with
>> 16 threads, along with ~200 more above 1048576 and ~1500 more above 524288.
>> This kind of tailing means we have "spikes" of latency throughout the
>> execution, which then have a minor flow-on effect of increasing the latency
>> of other operations.
>> 
>> These are all in nanoseconds, so most operations are in do_search for
>> 0.000131072 seconds or less, while there are spikes of operations taking
>> between nearly 0.001048576 and 0.067108864 seconds to complete (an 8x to
>> 512x increase in execution time).
>> 
>> From this point I took two measurements, of back_ldbm and the access log,
>> knowing these were easy targets for potential areas of latency. Both were
>> performed with the 16 thread server. I didn't need to do this on the 1
>> thread server because I'm now looking for tail latency, not comparisons.
>> Because we know that there is an increase of ~1500 high latency events, we
>> are looking for similar numbers to appear in other tail latencies.
>> 
>> ^CDistribution of back_ldbm latencies (in nanoseconds) for 364223 samples
>> max/avg/min: 4757744/31192/13622
>>    value |-------------------------------------------------- count
>>     2048 | 0
>>     4096 | 0
>>     8192 |@@ 14318
>>    16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 261265
>>    32768 |@@@@@@@@@@@@ 63533
>>    65536 |@@@ 20624
>>   131072 | 4062
>>   262144 | 304
>>   524288 | 68
>>  1048576 | 33
>>  2097152 | 15
>>  4194304 | 1
>>  8388608 | 0
>> 16777216 | 0
>> 
>> This shows some tail latency in back_ldbm, however it's infrequent: only a
>> handful of spikes, and they taper off quickly as they approach 2097152.
>> Certainly these should be looked at, as they will likely add up to affect
>> do_search.
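
One thing I want to add to the scripts for exactly this case is reporting the
individual slow calls as they happen, so the spikes can be lined up in time
against things like the checkpoint/trickle threads you mentioned. Roughly
along these lines (the ldbm_back_search target and the 1ms threshold are just
illustrative choices for this mail, not necessarily what ends up in the PR):

global entry_ts

probe process("/usr/sbin/ns-slapd").function("ldbm_back_search") {
    entry_ts[tid()] = gettimeofday_ns()
}

probe process("/usr/sbin/ns-slapd").function("ldbm_back_search").return {
    t = entry_ts[tid()]
    delete entry_ts[tid()]
    if (t == 0) next
    elapsed = gettimeofday_ns() - t
    # Report only the outliers, with a wall-clock timestamp so they can be
    # correlated against background DB activity and the access log.
    if (elapsed > 1000000) {
        printf("%s: slow ldbm_back_search, %d us on tid %d\n",
               ctime(gettimeofday_s()), elapsed / 1000, tid())
    }
}

If the slow calls cluster at regular intervals that would point towards a
periodic background job; if they are scattered it looks more like contention.
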
>> 
>> stap -v test_access_log.stp
>> ^CDistribution of slapi_log_access latencies (in nanoseconds) for 1284350 samples
>> max/avg/min: 23843992/5138/1036
>>    value |-------------------------------------------------- count
>>      256 | 0
>>      512 | 0
>>     1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 379710
>>     2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 361243
>>     4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 412480
>>     8192 |@@@@@@@@@@ 83815
>>    16384 |@@@@ 40087
>>    32768 | 5076
>>    65536 | 1253
>>   131072 | 406
>>   262144 | 130
>>   524288 | 72
>>  1048576 | 33
>>  2097152 | 16
>>  4194304 | 19
>>  8388608 | 1
>>  16777216 | 9
>>  33554432 | 0
>>  67108864 | 0
>> 
>> From this I can see that while some tail latency exists in some conditions
>> of back_ldbm, there is a much longer and more present tail in the access
>> log, with nearly 2000 operations at or exceeding the 65536 mark. Remember,
>> in do_search most operations take around 65536 as a whole, so we have
>> latency spikes in slapi_log_access that are as large as or larger than
>> entire search operations.
>> 
>> We can also see that "most" operations are firmly below 16384. This is what
>> we want to see, but it's interesting that a slapi_log_access call can take
>> up to a quarter of a whole operation's processing in the average case.
>> 
>> What's truly telling here is the tail out to 16777216, with ~250 operations
>> exceeding 262144 ns. This shows that within slapi_log_access we have large
>> spiked delays in the behaviour, which will both affect the tail latency of
>> do_search as a whole and cause other slapi_log_access operations to be
>> "held up".
>> 
>> My next steps from here will be:
>> 
>> * To add USDT probes to the logs and back_ldbm to get better, fine detail
>>   about what is causing the excessive latency.
>>   * These probes are also needed to resolve what appears to be a symbol
>>     resolution issue with systemtap when optimisations are enabled.
>> * To add probes in other parts of the code base to get better visualisation
>>   of where and how long timings are taking through an operation.
>> * To run a lock contention profile (I was unable to do this today due to
>>   bugs in systemtap).
>> * To document the setup procedures.
>> * To commit all of these into a PR.
>> * To document some actionable improvements we can make to improve the
>>   server performance.
>> 
>> Hope that is interesting to everyone,
>> 
>> -- 
>> Sincerely,
>> 
>> William
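
On the USDT point above, to make it concrete: each probe on the C side is a
single STAP_PROBE*() marker from <sys/sdt.h>, which compiles to a nop plus an
ELF note, so it should cost nothing when no script is attached, and it
sidesteps the debuginfo/symbol resolution problem. The consumer side would
look something like the sketch below; the marker names are placeholders I made
up for this mail, not what the PR will necessarily use.

# Sketch of attaching to the planned USDT markers. "access_log_entry" and
# "access_log_return" are placeholder names; the C side would emit them
# with STAP_PROBE*() from <sys/sdt.h>.

global log_start, log_lat

probe process("/usr/sbin/ns-slapd").mark("access_log_entry") {
    log_start[tid()] = gettimeofday_ns()
}

probe process("/usr/sbin/ns-slapd").mark("access_log_return") {
    t = log_start[tid()]
    if (t) {
        # Same histogram shape as the function probes, but without needing
        # debuginfo to resolve the probe point.
        log_lat <<< gettimeofday_ns() - t
        delete log_start[tid()]
    }
}

probe end {
    if (@count(log_lat))
        print(@hist_log(log_lat))
}

This should also help answer the intrusiveness question, since we can compare
throughput with the markers compiled in but no script attached.
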
-- 
Sincerely,

William
_______________________________________________
389-devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/[email protected]