Re: [PATCH] Add extra statistics to explain for Nested Loop

2023-09-22 Thread Andrey Lepikhov
On 31/7/2022 10:49, Julien Rouhaud wrote: On Sat, Jul 30, 2022 at 08:54:33PM +0800, Julien Rouhaud wrote: Anyway, 1% is in my opinion still too much overhead for extensions that won't get any extra information. I have read all the thread and still can't understand something. What valuable data

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-07-30 Thread Julien Rouhaud
On Sat, Jul 30, 2022 at 08:54:33PM +0800, Julien Rouhaud wrote: > > It turns out that having pg_stat_statements with INSTRUMENT_EXTRA indirectly > requested by INSTRUMENT_ALL adds a ~27% overhead. > > I'm not sure that I actually believe these results, but they're really > consistent, so maybe

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-07-30 Thread Julien Rouhaud
Hi, On Fri, Jun 24, 2022 at 08:16:06PM +0300, Ekaterina Sokolova wrote: > > We started discussion about overheads and how to calculate it correctly. > > Julien Rouhaud wrote: > > Can you give a bit more details on your bench scenario? > > [...] > > Ideally you would need a custom scenario with a

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-06-24 Thread Ekaterina Sokolova
Hi, hackers! We started discussion about overheads and how to calculate it correctly. Julien Rouhaud wrote: Can you give a bit more details on your bench scenario? I see contradictory results, where the patched version with more code is sometimes way faster, sometimes way slower. If you're

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-04-11 Thread Justin Pryzby
On Tue, Apr 05, 2022 at 05:14:09PM -0400, Greg Stark wrote: > This is not passing regression tests due to some details of the plan > output - marking Waiting on Author: It's unstable due to parallel workers. I'm not sure what the usual workarounds are here. Maybe set parallel_leader_participation=no

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-04-05 Thread Greg Stark
This is not passing regression tests due to some details of the plan output - marking Waiting on Author: diff -w -U3 c:/cirrus/src/test/regress/expected/partition_prune.out c:/cirrus/src/test/recovery/tmp_check/results/partition_prune.out ---

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-04-02 Thread Julien Rouhaud
Hi, On Fri, Apr 01, 2022 at 11:46:47PM +0300, Ekaterina Sokolova wrote: > > > Most of the comments I have are easy to fix. But I think that the real > > problem > > is the significant overhead shown by Ekaterina that for now would apply > > even if > > you don't consume the new stats, for

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-04-02 Thread Justin Pryzby
This message lost track of the email headers so CFBOT isn't processing the new patches. Which I'm attempting to remedy now. https://www.postgresql.org/message-id/flat/ae576cac3f451d318374f2a2e494a...@postgrespro.ru On Fri, Apr 01, 2022 at 11:46:47PM +0300, Ekaterina Sokolova wrote: > Hi,

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-04-01 Thread Ekaterina Sokolova
Hi, hackers. Thank you for your attention to this topic. Julien Rouhaud wrote: +static void show_loop_info(Instrumentation *instrument, bool isworker, + ExplainState *es); I think this should be done as a separate refactoring commit. Sure. I divided the patch. Now

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-03-28 Thread Julien Rouhaud
Hi, On Mon, Mar 28, 2022 at 03:09:12PM -0400, Greg Stark wrote: > This patch got some very positive feedback and some significant amount > of work earlier in the release cycle. The feedback from Julien earlier > this month seemed pretty minor. > > Ekaterina, is there any chance you'll be able to

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-03-28 Thread Justin Pryzby
> > +static void show_loop_info(Instrumentation *instrument, bool isworker, > > + ExplainState *es); > > > > I think this should be done as a separate refactoring commit. Right - the 0001 patch I sent seems independently beneficial, and makes the changes in 0002 more

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-03-28 Thread Greg Stark
This patch got some very positive feedback and some significant amount of work earlier in the release cycle. The feedback from Julien earlier this month seemed pretty minor. Ekaterina, is there any chance you'll be able to work on this this week and do you think it has a chance of making this

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-03-06 Thread Julien Rouhaud
Hi, On Thu, Feb 03, 2022 at 12:59:03AM +0300, Ekaterina Sokolova wrote: > > I apply the new version of patch. > > I wanted to measure overheads, but couldn't choose correct way. Thanks for > idea with auto_explain. > I loaded it and made 10 requests of pgbench (number of clients: 1, of > threads:

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-02-02 Thread Ekaterina Sokolova
are in file overhead_v0.txt. Please feel free to share your suggestions and comments. Regards, -- Ekaterina Sokolova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company From: Ekaterina Sokolova Subject: [PATCH] Add extra statistics to explain for Nested Loop For some

Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-01-06 Thread Lukas Fittl
On Sun, Nov 21, 2021 at 8:55 PM Justin Pryzby wrote: > I'm curious to hear what you and others think of the refactoring. > > It'd be nice if there's a good way to add a test case for verbose output > involving parallel workers, but the output is unstable ... > I've reviewed this patch, and it

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-11-21 Thread Justin Pryzby
ingInfo(es->str, + "actual rows=%.0f loops=%.0f", + rows, nloops); + + if (!isworker) + appendStringInfoChar(es->str, ')'); + } + else + { + if (es->timing) + { + ExplainPropertyFloat("Actual Startup Time", "ms", startup_ms, +

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-08-17 Thread Ekaterina Sokolova
Hi, hackers. Here is the new version of the patch that adds printing of min, max and total statistics for time and rows across all loops to EXPLAIN ANALYSE. 1) Please add VERBOSE to display extra statistics. 2) Format of extra statistics is: a) FORMAT TEXT Loop min_time: N max_time: N

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-07-14 Thread vignesh C
On Wed, Apr 14, 2021 at 4:57 PM wrote: > > Thank you for working on this issue. Your comments helped me make this > patch more correct. > > > Lines with "colon" format shouldn't use equal signs, and should use two > > spaces > > between fields. > Done. Now extra line looks like "Loop min_rows:

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-04-14 Thread e . sokolova
Thank you for working on this issue. Your comments helped me make this patch more correct. Lines with "colon" format shouldn't use equal signs, and should use two spaces between fields. Done. Now extra line looks like "Loop min_rows: %.0f max_rows: %.0f total_rows: %.0f" or "Loop min_time:
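The per-loop statistics described in this message can be pictured with a small standalone sketch. It is not the patch's code: `LoopStats` and the `loop_stats_*` helpers are hypothetical names introduced only for illustration, and the two spaces between fields are assumed from the review comment quoted above (the archive snippet collapses whitespace). The idea is to track the minimum, maximum and total row counts across loops and print them in the "colon" format.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/*
 * Hypothetical accumulator for per-loop row counts. The field names are
 * illustrative and are not the fields the patch adds to Instrumentation.
 */
typedef struct LoopStats
{
	double		nloops;		/* number of loops recorded so far */
	double		min_rows;	/* fewest rows returned by a single loop */
	double		max_rows;	/* most rows returned by a single loop */
	double		total_rows;	/* rows summed over all loops */
} LoopStats;

static void
loop_stats_init(LoopStats *s)
{
	assert(s != NULL);
	s->nloops = 0;
	s->min_rows = DBL_MAX;	/* sentinel: any real count is smaller */
	s->max_rows = 0;
	s->total_rows = 0;
}

/* Record the row count of one finished loop. */
static void
loop_stats_add(LoopStats *s, double rows)
{
	assert(s != NULL);
	if (rows < s->min_rows)
		s->min_rows = rows;
	if (rows > s->max_rows)
		s->max_rows = rows;
	s->total_rows += rows;
	s->nloops += 1;
}

/*
 * Format the statistics as a single text line. Two spaces separate the
 * fields (an assumption based on the review comment). Returns the value
 * returned by snprintf.
 */
static int
loop_stats_format(const LoopStats *s, char *buf, size_t buflen)
{
	/* With no loops recorded, report 0 instead of the sentinel. */
	double		min_rows = (s->nloops > 0) ? s->min_rows : 0;

	assert(buf != NULL);
	return snprintf(buf, buflen,
					"Loop min_rows: %.0f  max_rows: %.0f  total_rows: %.0f",
					min_rows, s->max_rows, s->total_rows);
}
```

Calling `loop_stats_add` once per loop and then `loop_stats_format` yields one extra line per plan node. A skewed join, where a single loop returns most of the rows, then shows up as a large gap between `min_rows` and `max_rows` that an average alone would hide.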

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-03-25 Thread Justin Pryzby
> diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c > index afc45429ba4..723eccca013 100644 > --- a/src/backend/commands/explain.c > +++ b/src/backend/commands/explain.c > @@ -1589,29 +1589,82 @@ ExplainNode(PlanState *planstate, List *ancestors, > double

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-03-25 Thread e . sokolova
Thank you all for your feedback and reforms. I attach a new version of the patch with some changes and fixes. Here's a list of the major changes: 1) New format of extra statistics. This is now contained in a line separate from the main statistics. Julien Rouhaud wrote on 2021-02-01 08:28: On

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-02-01 Thread Yugo NAGATA
On Mon, 1 Feb 2021 13:28:45 +0800 Julien Rouhaud wrote: > On Thu, Jan 28, 2021 at 8:38 PM Yugo NAGATA wrote: > > > > postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j; > > > > QUERY PLAN > >

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-01-31 Thread Julien Rouhaud
On Thu, Jan 28, 2021 at 8:38 PM Yugo NAGATA wrote: > > postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j; > > QUERY PLAN >

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-01-28 Thread Yugo NAGATA
Hello, On Thu, 12 Nov 2020 23:10:05 +0300 e.sokol...@postgrespro.ru wrote: > New version of this patch prints extra statistics for all cases of > multiple loops, not only for Nested Loop. Also I fixed the example by > adding VERBOSE. I think this extra statistics seems good because it is

Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-01-18 Thread Michael Christofides
> New version of this patch prints extra statistics for all cases of > multiple loops, not only for Nested Loop. Also I fixed the example by > adding VERBOSE. > > Please don't hesitate to share any thoughts on this topic! Thanks a lot for working on this! I really like the extra details, and

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-11-10 Thread Georgios Kokolatos
Hi, I noticed that this patch fails on the cfbot. For this, I changed the status to: 'Waiting on Author'. Cheers, //Georgios The new status of this patch is: Waiting on Author

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-30 Thread Tomas Vondra
Hello Ekaterina, seems like an interesting and useful improvement. I did a quick review of the patch - attached is a 0002 patch with a couple minor changes (the 0001 is just your v1 patch, to keep cfbot happy). 1) There's a couple changes to follow project code style (e.g. brackets after "if"

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-23 Thread e . sokolova
wrote: You should update the explain_parallel_append() plpgsql function created in that test file to make sure that both "rows" and the two new counters are changed to "N". There might be other similar changes needed. Thank you for watching this issue. I made the necessary changes in tests

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-19 Thread Andres Freund
Hi, On 2020-10-16 10:42:43 +0300, e.sokol...@postgrespro.ru wrote: > For some distributions of data in tables, different loops in nested loop > joins can take different time and process different amounts of entries. It > makes average statistics returned by explain analyze not very useful for >

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-19 Thread Pierre Giraud
On 17/10/2020 at 06:26, Julien Rouhaud wrote: > On Sat, Oct 17, 2020 at 12:15 PM Pavel Stehule > wrote: >> >> On Sat, 17 Oct 2020 at 0:11, Anastasia Lubennikova >> wrote: >>> >>> On 16.10.2020 12:07, Julien Rouhaud wrote: >>> >>> On Fri, 16 Oct 2020 at 16:12, Pavel Stehule >>>

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-18 Thread Julien Rouhaud
On Sat, Oct 17, 2020 at 6:11 AM Anastasia Lubennikova wrote: > > On 16.10.2020 12:07, Julien Rouhaud wrote: > > On Fri, 16 Oct 2020 at 16:12, Pavel Stehule wrote: >> >> >> >> On Fri, 16 Oct 2020 at 9:43 wrote: >>> >>> Hi, hackers. >>> For some distributions of data in tables,

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-17 Thread David G. Johnston
On Fri, Oct 16, 2020 at 3:11 PM Anastasia Lubennikova < a.lubennik...@postgrespro.ru> wrote: > User visible change is: > > > - -> Nested Loop (actual rows=N loops=N) > + -> Nested Loop (actual min_rows=0 rows=0 max_rows=0 > loops=2) > I'd be inclined to append both

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-17 Thread hubert depesz lubaczewski
On Sat, Oct 17, 2020 at 12:26:08PM +0800, Julien Rouhaud wrote: > >> - -> Nested Loop (actual rows=N loops=N) > >> + -> Nested Loop (actual min_rows=0 rows=0 max_rows=0 > >> loops=2) > > This interface is ok - there is not too much space for creativity. > Yes I also

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-16 Thread Pavel Stehule
On Sat, 17 Oct 2020 at 6:26, Julien Rouhaud wrote: > On Sat, Oct 17, 2020 at 12:15 PM Pavel Stehule > wrote: > > > > On Sat, 17 Oct 2020 at 0:11, Anastasia Lubennikova < > a.lubennik...@postgrespro.ru> wrote: > >> > >> On 16.10.2020 12:07, Julien Rouhaud wrote: > >> > >> On Fri, 16

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-16 Thread Julien Rouhaud
On Sat, Oct 17, 2020 at 12:15 PM Pavel Stehule wrote: > > On Sat, 17 Oct 2020 at 0:11, Anastasia Lubennikova > wrote: >> >> On 16.10.2020 12:07, Julien Rouhaud wrote: >> >> On Fri, 16 Oct 2020 at 16:12, Pavel Stehule >> wrote: >>> >>> >>> On Fri, 16 Oct 2020 at 9:43

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-16 Thread Pavel Stehule
On Sat, 17 Oct 2020 at 0:11, Anastasia Lubennikova < a.lubennik...@postgrespro.ru> wrote: > On 16.10.2020 12:07, Julien Rouhaud wrote: > > On Fri, 16 Oct 2020 at 16:12, Pavel Stehule > wrote: > >> >> >> On Fri, 16 Oct 2020 at 9:43 wrote: >> >>> Hi, hackers. >>> For some

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-16 Thread Anastasia Lubennikova
On 16.10.2020 12:07, Julien Rouhaud wrote: On Fri, 16 Oct 2020 at 16:12, Pavel Stehule wrote: On Fri, 16 Oct 2020 at 9:43, e.sokol...@postgrespro.ru wrote: Hi, hackers. For some distributions of data in tables, different

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-16 Thread Julien Rouhaud
On Fri, 16 Oct 2020 at 16:12, Pavel Stehule wrote: > > > On Fri, 16 Oct 2020 at 9:43 wrote: > >> Hi, hackers. >> For some distributions of data in tables, different loops in nested loop >> joins can take different time and process different amounts of entries. >> It makes average

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-16 Thread Pavel Stehule
On Fri, 16 Oct 2020 at 9:43 wrote: > Hi, hackers. > For some distributions of data in tables, different loops in nested loop > joins can take different time and process different amounts of entries. > It makes average statistics returned by explain analyze not very useful > for DBA. > To