po 1. 6. 2020 v 15:59 odesÃlatel Amit Khandekar <amitdkhan...@gmail.com> napsal:
> On Mon, 1 Jun 2020 at 12:27, Pavel Stehule <pavel.steh...@gmail.com> > wrote: > > po 1. 6. 2020 v 8:15 odesÃlatel Amit Khandekar <amitdkhan...@gmail.com> > napsal: > >> > >> On Sat, 30 May 2020 at 11:11, Pavel Stehule <pavel.steh...@gmail.com> > wrote: > >> > I think so the effect of these patches strongly depends on CPU and > compile > >> > >> I quickly tried pi() with gcc 10 as well, and saw more or less the > >> same benefit. I think, we are bound to see some differences in the > >> benefits across architectures, kernels and compilers; but looks like > >> some benefit is always there. > >> > >> > but it is micro optimization, and when I look to profiler, the bottle > neck is elsewhere. > >> > >> Please check the perf numbers in my reply to Michael. I suppose you > >> meant CachedPlanIsSimplyValid() when you say the bottle neck is > >> elsewhere ? Yeah, this function is always the hottest spot, which I > >> recall is being discussed elsewhere. But I always see exec_stmt(), > >> exec_assign_value as the next functions. > > > > > > It is hard to read the profile result, because these functions are > nested together. For your example > > > > 18.22% postgres postgres [.] CachedPlanIsSimplyValid > > > > Is little bit strange, and probably this is real bottleneck in your > simple example, and maybe some work can be done there, because you assign > just constant. > > I had earlier had a quick look on this one. CachedPlanIsSimplyValid() > was, I recall, hitting a hotspot when it tries to access > plansource->search_path (possibly cacheline miss). But didn't get a > chance to further dig on that. For now, i am focusing on these other > functions for which the patches were submitted. > > > > > > On second hand, your example is pretty unrealistic - and against any > developer's best practices for writing cycles. > > > > I think so we can look on PostGIS, where is some computing heavy > routines in PLpgSQL, and we can look on real profiles. > > > > Probably the most people will have benefit from these optimization. > > I understand it's not a real world example. For generating perf > figures, I had to use an example which amplifies the benefits, so that > the effect of the patches on the perf figures also becomes visible. > Hence, used that example. I had shown the benefits up-thread using a > practical function avg_space(). But the perf figures for that example > were varying a lot. > > So below, what I did was : Run the avg_space() ~150 times, and took > perf report. This stabilized the results a bit : > > HEAD : > + 16.10% 17.29% 16.82% postgres postgres [.] > ExecInterpExpr > + 13.80% 13.56% 14.49% postgres plpgsql.so [.] > exec_assign_value > + 12.64% 12.10% 12.09% postgres plpgsql.so [.] > plpgsql_param_eval_var > + 12.15% 11.28% 11.05% postgres plpgsql.so [.] > exec_stmt > + 10.81% 10.24% 10.55% postgres plpgsql.so [.] > exec_eval_expr > + 9.50% 9.35% 9.37% postgres plpgsql.so [.] > exec_cast_value > ..... > + 1.19% 1.06% 1.21% postgres plpgsql.so [.] > exec_stmts > > > 0001+0002 patches applied (i.e. inline exec_stmt) : > + 16.90% 17.20% 16.54% postgres postgres [.] > ExecInterpExpr > + 16.42% 15.37% 15.28% postgres plpgsql.so [.] > exec_assign_value > + 11.34% 11.92% 11.93% postgres plpgsql.so [.] > plpgsql_param_eval_var > + 11.18% 11.86% 10.99% postgres plpgsql.so [.] > exec_stmts.part.0 > + 10.51% 9.52% 10.61% postgres plpgsql.so [.] > exec_eval_expr > + 9.39% 9.48% 9.30% postgres plpgsql.so [.] > exec_cast_value > > HEAD : exec_stmts + exec_stmt = ~12.7 % > Patched (0001+0002): exec_stmts = 11.3 % > > Just 0003 patch applied (i.e. inline exec_cast_value) : > + 17.00% 16.77% 17.09% postgres postgres [.] ExecInterpExpr > + 15.21% 15.64% 15.09% postgres plpgsql.so [.] > exec_assign_value > + 14.48% 14.06% 13.94% postgres plpgsql.so [.] exec_stmt > + 13.26% 13.30% 13.14% postgres plpgsql.so [.] > plpgsql_param_eval_var > + 11.48% 11.64% 12.66% postgres plpgsql.so [.] exec_eval_expr > .... > + 1.03% 0.85% 0.87% postgres plpgsql.so [.] exec_stmts > > HEAD : exec_assign_value + exec_cast_value = ~23.4 % > Patched (0001+0002): exec_assign_value = 15.3% > > > Time in milliseconds after calling avg_space() 150 times : > HEAD : 7210 > Patch 0001+0002 : 6925 > Patch 0003 : 6670 > Patch 0001+0002+0003 : 6346 > Is your patch in commitfest in commitfest application? Regards Pavel