Re: Question: test "aggregates" failed in 32-bit machine

2022-10-03 Thread Tomas Vondra


On 10/3/22 16:05, Tom Lane wrote:
> "Jonathan S. Katz"  writes:
>> Based on the above discussion, the RMT asks for a revert of db0d67db2 in 
>> the v15 release. The RMT also recommends a revert in HEAD but does not 
>> have the power to request that.
> 
> Roger, I'll push these shortly.
> 

Thanks for resolving this, and apologies for not noticing this thread
earlier (and for the bugs in the code, ofc).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-03 Thread Tom Lane
[ Just for the archives' sake at this point, in case somebody has
another go at this feature. ]

I wrote:
> ... I'm now discovering that the code I'd hoped to salvage in
> pathkeys.c (get_useful_group_keys_orderings and related) has its very own
> bugs.  It's imagining that it can rearrange a PathKeys list arbitrarily
> and then rearrange the GROUP BY SortGroupClause list to match, but that's
> easier said than done, for a couple of different reasons.

It strikes me that the easy solution here is to *not* rearrange the
SortGroupClause list at all.  What that would be used for later is
to generate a Unique node's list of columns to compare, but since
Unique only cares about equality-or-not, there's no strong reason
why it has to compare the columns in the same order they're sorted
in.  (Indeed, if anything we should prefer to compare them in the
opposite order, since the least-significant column should be the
most likely to be different from the previous row.)
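To illustrate the point, here's a minimal standalone sketch of such an
equality-only check (just an illustration, not executor code; the
int-array row representation is invented for the example):

#include <stdbool.h>

/*
 * Unique-style duplicate test: only equality matters, so we can scan
 * the columns in reverse and let the least-significant column, the
 * one most likely to differ, end the check early.
 */
static bool
rows_equal(const int *a, const int *b, int ncols)
{
    for (int col = ncols - 1; col >= 0; col--)
    {
        if (a[col] != b[col])
            return false;       /* first mismatch decides */
    }
    return true;
}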

I'm fairly sure that the just-reverted code is buggy on its
own terms, in that it might sometimes produce a clause list
that's not ordered the same as the pathkeys; but there's no
visible misbehavior, because that does not in fact matter.

So this'd let us simplify the APIs here; in particular, PathKeyInfo
seems unnecessary, because we don't have to pass the SortGroupClause
list into or out of the pathkey-reordering logic.

regards, tom lane




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-03 Thread Tom Lane
"Jonathan S. Katz"  writes:
> Based on the above discussion, the RMT asks for a revert of db0d67db2 in 
> the v15 release. The RMT also recommends a revert in HEAD but does not 
> have the power to request that.

Roger, I'll push these shortly.

regards, tom lane




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-03 Thread Jonathan S. Katz

On 10/2/22 8:45 PM, Michael Paquier wrote:
> On Sun, Oct 02, 2022 at 02:11:12PM -0400, Tom Lane wrote:
>> "Jonathan S. Katz"  writes:
>>> OK. For v15 I am heavily in favor of the least risky approach given the
>>> point we are at in the release cycle. The RMT hasn’t met yet to discuss,
>>> but from re-reading this thread again, I would recommend reverting
>>> (i.e. the “straight up revert”).
>>
>> OK by me.
>
> I don't quite see why we would let this code live on HEAD if it is not
> ready to be merged, as there is a risk of creating side issues with
> things tied to the costing that are still waiting to be merged, so I
> agree that a reversion on both branches is the way to go for now.
> This could always be reworked and reproposed in the future.


[RMT-hat]

Just to follow things procedure-wise[1]: while there do not seem to be 
any objections to reverting through regular community processes, I do 
think the RMT has to make this ask, as Tomas (the patch committer) has 
not commented and we are up against release deadlines.


Based on the above discussion, the RMT asks for a revert of db0d67db2 in 
the v15 release. The RMT also recommends a revert in HEAD but does not 
have the power to request that.


We do hope to see continued work on this feature and its inclusion in 
a future release. We understand that the work on this optimization is 
complicated and appreciate all of the efforts on it.


Thanks,

Jonathan

[1] https://wiki.postgresql.org/wiki/Release_Management_Team





Re: Question: test "aggregates" failed in 32-bit machine

2022-10-02 Thread Michael Paquier
On Sun, Oct 02, 2022 at 02:11:12PM -0400, Tom Lane wrote:
> "Jonathan S. Katz"  writes:
>> OK. For v15 I am heavily in favor of the least risky approach given the
>> point we are at in the release cycle. The RMT hasn’t met yet to discuss,
>> but from re-reading this thread again, I would recommend reverting
>> (i.e. the “straight up revert”).
> 
> OK by me.

I don't quite see why we would let this code live on HEAD if it is not
ready to be merged, as there is a risk of creating side issues with
things tied to the costing that are still waiting to be merged, so I
agree that a reversion on both branches is the way to go for now.
This could always be reworked and reproposed in the future.

> I'm just about to throw up my hands and go for reversion in both branches,
> because I'm now discovering that the code I'd hoped to salvage in
> pathkeys.c (get_useful_group_keys_orderings and related) has its very own
> bugs.  It's imagining that it can rearrange a PathKeys list arbitrarily
> and then rearrange the GROUP BY SortGroupClause list to match, but that's
> easier said than done, for a couple of different reasons.  (I now
> understand why db0d67db2 made a cowboy hack in get_eclass_for_sort_expr ...
> but it's still a cowboy hack with difficult-to-foresee side effects.)
> There are other things in there that make it painfully obvious that
> this code wasn't very carefully reviewed, eg XXX comments that should
> have been followed up and were not, or a reference to a nonexistent
> "debug_group_by_match_order_by" flag (maybe that was a GUC at some point?).

Okay.  Ugh.
--
Michael




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-02 Thread Tom Lane
David Rowley  writes:
> As for the slight misuse of group_pathkeys, since there are no users
> that require just the plain pathkeys belonging to the GROUP BY, the
> best thing would likely be to rename that field to something like
> groupagg_pathkeys.  Maintaining two separate fields and concatenating
> them every time we want group_pathkeys does not seem that appealing to
> me; it seems like a waste of memory and effort.  I don't want to
> hijack this thread to discuss that, but if you have a preferred course
> of action, then I'm happy to kick off a discussion on a new thread.

I don't feel any great urgency to resolve this.  Let's wait and see
what comes out of the other thread.

regards, tom lane




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-02 Thread David Rowley
On Mon, 3 Oct 2022 at 09:59, Tom Lane  wrote:
>
> David Rowley  writes:
> > For the master version, I think it's safe just to get rid of
> > PlannerInfo.num_groupby_pathkeys now.  I only added that so I could
> > strip off the ORDER BY / DISTINCT aggregate PathKeys from the group by
> > pathkeys before passing to the functions that rearranged the GROUP BY
> > clause.
>
> I was kind of unhappy with that data structure too, but from the
> other direction: I didn't like that you were folding aggregate-derived
> pathkeys into root->group_pathkeys in the first place.  That seems like
> a kluge that might work all right for the moment but will cause problems
> down the road.  (Despite the issues with the patch at hand, I don't
> think it's unreasonable to suppose that somebody will have a more
> successful go at optimizing GROUP BY sorting later.)  If we keep the
> data structure like this, I think we absolutely need num_groupby_pathkeys,
> or some other way of recording which pathkeys came from what source.

Ok, I don't feel too strongly about removing num_groupby_pathkeys; I'm
fine with leaving it there.  However, I have a slight concern that
we'll receive sporadic submissions of cleanup patches that remove the
unused field over the course of the next few years, and that dealing
with those might take up more time than just removing it now and
putting it back when we need it.  We have been receiving quite a few
patches along those lines lately.

As for the slight misuse of group_pathkeys, since there are no users
that require just the plain pathkeys belonging to the GROUP BY, the
best thing would likely be to rename that field to something like
groupagg_pathkeys.  Maintaining two separate fields and concatenating
them every time we want group_pathkeys does not seem that appealing to
me; it seems like a waste of memory and effort.  I don't want to
hijack this thread to discuss that, but if you have a preferred course
of action, then I'm happy to kick off a discussion on a new thread.

David




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-02 Thread Tom Lane
David Rowley  writes:
> For the master version, I think it's safe just to get rid of
> PlannerInfo.num_groupby_pathkeys now.  I only added that so I could
> strip off the ORDER BY / DISTINCT aggregate PathKeys from the group by
> pathkeys before passing to the functions that rearranged the GROUP BY
> clause.

I was kind of unhappy with that data structure too, but from the
other direction: I didn't like that you were folding aggregate-derived
pathkeys into root->group_pathkeys in the first place.  That seems like
a kluge that might work all right for the moment but will cause problems
down the road.  (Despite the issues with the patch at hand, I don't
think it's unreasonable to suppose that somebody will have a more
successful go at optimizing GROUP BY sorting later.)  If we keep the
data structure like this, I think we absolutely need num_groupby_pathkeys,
or some other way of recording which pathkeys came from what source.

One way to manage that would be to insist that the length of
root->group_clauses should indicate the number of associated grouping
pathkeys.  Right now they might not be the same because we might discover
some of the pathkeys to be redundant --- but if we do, ISTM that the
corresponding GROUP BY clauses are also redundant and could get dropped.
That ties into the stuff I was worried about in [1], though.  I'll keep
this in mind when I get back to messing with that.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/1657885.1657647073%40sss.pgh.pa.us




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-02 Thread David Rowley
On Mon, 3 Oct 2022 at 08:10, Tom Lane  wrote:
> As attached.

For the master version, I think it's safe just to get rid of
PlannerInfo.num_groupby_pathkeys now.  I only added that so I could
strip off the ORDER BY / DISTINCT aggregate PathKeys from the group by
pathkeys before passing to the functions that rearranged the GROUP BY
clause.

David




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-02 Thread Tom Lane
"Jonathan S. Katz"  writes:
> OK. For v15 I am heavily in favor of the least risky approach given the
> point we are at in the release cycle. The RMT hasn’t met yet to discuss,
> but from re-reading this thread again, I would recommend reverting
> (i.e. the “straight up revert”).

OK by me.

> I’m less opinionated on the approach for what’s in HEAD, but the rewrite
> you suggest sounds promising.

I'm just about to throw up my hands and go for reversion in both branches,
because I'm now discovering that the code I'd hoped to salvage in
pathkeys.c (get_useful_group_keys_orderings and related) has its very own
bugs.  It's imagining that it can rearrange a PathKeys list arbitrarily
and then rearrange the GROUP BY SortGroupClause list to match, but that's
easier said than done, for a couple of different reasons.  (I now
understand why db0d67db2 made a cowboy hack in get_eclass_for_sort_expr ...
but it's still a cowboy hack with difficult-to-foresee side effects.)
There are other things in there that make it painfully obvious that
this code wasn't very carefully reviewed, eg XXX comments that should
have been followed up and were not, or a reference to a nonexistent
"debug_group_by_match_order_by" flag (maybe that was a GUC at some point?).

On top of that, it's producing several distinct pathkey orderings for
the caller to try, but it's completely unclear to me that the subsequent
choice of cheapest path isn't going to largely reduce to the question
of whether we can accurately estimate the relative costs of different
sort-column orders.  Which is exactly what we're finding we can't do.
So that end of it seems to need a good deal of rethinking as well.

In short, this needs a whole lotta work, and I'm not volunteering.

regards, tom lane




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-02 Thread Jonathan S. Katz


> On Oct 2, 2022, at 1:12 PM, Tom Lane  wrote:
> 
> "Jonathan S. Katz"  writes:
>> On 10/1/22 6:57 PM, Tom Lane wrote:
>>> I plan to have a look tomorrow at the idea of reverting only the cost_sort
>>> changes, and rewriting get_cheapest_group_keys_order() to just sort the
>>> keys by decreasing numgroups estimates as I suggested upthread.  That
>>> might be substantially less messy, because of fewer interactions with
>>> 1349d2790.
> 
>> Maybe this leads to a follow-up question of whether we continue to improve
>> what is in HEAD while reverting the code in v15 (particularly if it's
>> easier to do it that way)?
> 
> No.  I see no prospect that the cost_sort code currently in HEAD is going
> to become shippable in the near future.  Quite aside from the plain bugs,
> I think it's based on untenable assumptions about how accurately we can
> estimate the CPU costs associated with different sort-column orders.

OK.

> Having said that, it's certainly possible that we should do something
> different in HEAD than in v15.  We could do the rewrite I suggest above
> in HEAD while doing a straight-up revert in v15.  I've been finding that
> 1349d2790 is sufficiently entwined with this code that the patches would
> look significantly different in any case, so that might be the most
> reliable way to proceed in v15.

OK. For v15 I am heavily in favor of the least risky approach given the
point we are at in the release cycle. The RMT hasn’t met yet to discuss,
but from re-reading this thread again, I would recommend reverting
(i.e. the “straight up revert”).

I’m less opinionated on the approach for what’s in HEAD, but the rewrite
you suggest sounds promising.

Thanks,

Jonathan



Re: Question: test "aggregates" failed in 32-bit machine

2022-10-02 Thread Tom Lane
"Jonathan S. Katz"  writes:
> On 10/1/22 6:57 PM, Tom Lane wrote:
>> I plan to have a look tomorrow at the idea of reverting only the cost_sort
>> changes, and rewriting get_cheapest_group_keys_order() to just sort the
>> keys by decreasing numgroups estimates as I suggested upthread.  That
>> might be substantially less messy, because of fewer interactions with
>> 1349d2790.

> Maybe this leads to a follow-up question of whether we continue to improve
> what is in HEAD while reverting the code in v15 (particularly if it's
> easier to do it that way)?

No.  I see no prospect that the cost_sort code currently in HEAD is going
to become shippable in the near future.  Quite aside from the plain bugs,
I think it's based on untenable assumptions about how accurately we can
estimate the CPU costs associated with different sort-column orders.

Having said that, it's certainly possible that we should do something
different in HEAD than in v15.  We could do the rewrite I suggest above
in HEAD while doing a straight-up revert in v15.  I've been finding that
1349d2790 is sufficiently entwined with this code that the patches would
look significantly different in any case, so that might be the most
reliable way to proceed in v15.

regards, tom lane




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-02 Thread Jonathan S. Katz

On 10/1/22 6:57 PM, Tom Lane wrote:
> "Jonathan S. Katz"  writes:
>> On 10/1/22 3:13 PM, Tom Lane wrote:
>>> I'm still of the opinion that we need to revert this code for now.
>
>> [RMT hat, but speaking just for me] reading through Tom's analysis, this
>> seems to be the safest path forward. I have a few questions to better
>> understand:
>
>> 1. How invasive would the revert be?
>
> I've just finished constructing a draft full-reversion patch.  I'm not
> confident in this yet; in particular, teasing it apart from 1349d2790
> ("Improve performance of ORDER BY / DISTINCT aggregates") was fairly
> messy.  I need to look through the regression test changes and make
> sure that none are surprising.  But this is approximately the right
> scope if we rip it out entirely.
>
> I plan to have a look tomorrow at the idea of reverting only the cost_sort
> changes, and rewriting get_cheapest_group_keys_order() to just sort the
> keys by decreasing numgroups estimates as I suggested upthread.  That
> might be substantially less messy, because of fewer interactions with
> 1349d2790.

Maybe this leads to a follow-up question of whether we continue to
improve what is in HEAD while reverting the code in v15 (particularly
if it's easier to do it that way)?

I know we're generally not in favor of that approach, but wanted to ask.

>> 2. Are there other user-visible items that would be impacted?
>
> See above.  (But note that 1349d2790 is HEAD-only, not in v15.)

With the RMT hat, I'm hyperfocused on PG15 stability. We have plenty of
time to stabilize HEAD for v16 :)

>> 3. Is the option of disabling the feature by default viable?
>
> Not one that usefully addresses my concerns.  The patch did add an
> enable_group_by_reordering GUC which we could change to default-off,
> but it does nothing about the cost_sort behavioral changes.  I would
> be a little inclined to rip out that GUC in either case, because
> I doubt that we need it with the more restricted change.

Understood.

I'll wait for your analysis of reverting only the cost_sort changes
etc. mentioned above.


Thanks,

Jonathan




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-01 Thread Tom Lane
"Jonathan S. Katz"  writes:
> On 10/1/22 3:13 PM, Tom Lane wrote:
>> I'm still of the opinion that we need to revert this code for now.

> [RMT hat, but speaking just for me] reading through Tom's analysis, this 
> seems to be the safest path forward. I have a few questions to better 
> understand:

> 1. How invasive would the revert be?

I've just finished constructing a draft full-reversion patch.  I'm not
confident in this yet; in particular, teasing it apart from 1349d2790
("Improve performance of ORDER BY / DISTINCT aggregates") was fairly
messy.  I need to look through the regression test changes and make
sure that none are surprising.  But this is approximately the right
scope if we rip it out entirely.

I plan to have a look tomorrow at the idea of reverting only the cost_sort
changes, and rewriting get_cheapest_group_keys_order() to just sort the
keys by decreasing numgroups estimates as I suggested upthread.  That
might be substantially less messy, because of fewer interactions with
1349d2790.
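For concreteness, that rewrite could be about this shape (a standalone
sketch only -- the GroupKey struct and its fields are invented here,
not actual planner data structures):

#include <stdlib.h>

typedef struct GroupKey
{
    int         sortgroupref;   /* identifies the GROUP BY column */
    double      numgroups;      /* estimated number of distinct values */
} GroupKey;

/* order keys so the most selective (highest numgroups) sorts first */
static int
cmp_numgroups_desc(const void *a, const void *b)
{
    double      ga = ((const GroupKey *) a)->numgroups;
    double      gb = ((const GroupKey *) b)->numgroups;

    if (ga > gb)
        return -1;
    if (ga < gb)
        return 1;
    return 0;
}

static void
choose_group_keys_order(GroupKey *keys, int nkeys)
{
    qsort(keys, nkeys, sizeof(GroupKey), cmp_numgroups_desc);
}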

> 2. Are there other user-visible items that would be impacted?

See above.  (But note that 1349d2790 is HEAD-only, not in v15.)

> 3. Is the option of disabling the feature by default viable?

Not one that usefully addresses my concerns.  The patch did add an
enable_group_by_reordering GUC which we could change to default-off,
but it does nothing about the cost_sort behavioral changes.  I would
be a little inclined to rip out that GUC in either case, because
I doubt that we need it with the more restricted change.

regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 2e4e82a94f..cc9e39c4a5 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2862,13 +2862,16 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i
 -- GROUP BY clause in various forms, cardinal, alias and constant expression
 explain (verbose, costs off)
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
-                                                  QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
- Foreign Scan
+                                       QUERY PLAN                                        
+------------------------------------------------------------------------------------------
+ Sort
Output: (count(c2)), c2, 5, 7.0, 9
-   Relations: Aggregate on (public.ft1)
-   Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
-(4 rows)
+   Sort Key: ft1.c2
+   ->  Foreign Scan
+ Output: (count(c2)), c2, 5, 7.0, 9
+ Relations: Aggregate on (public.ft1)
+ Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5
+(7 rows)
 
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
   w  | x | y |  z  
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d8848bc774..d750290f13 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5062,20 +5062,6 @@ ANY <replaceable class="parameter">num_sync</replaceable> (
       </para>
      </listitem>
     </varlistentry>
 
-     <varlistentry id="guc-enable-group-by-reordering" xreflabel="enable_group_by_reordering">
-      <term><varname>enable_group_by_reordering</varname> (<type>boolean</type>)
-      <indexterm>
-       <primary><varname>enable_group_by_reordering</varname> configuration parameter</primary>
-      </indexterm>
-      </term>
-      <listitem>
-       <para>
-        Enables or disables reordering of keys in a <literal>GROUP BY</literal>
-        clause. The default is <literal>on</literal>.
-       </para>
-      </listitem>
-     </varlistentry>
-
      <varlistentry id="guc-enable-hashagg" xreflabel="enable_hashagg">
       <term><varname>enable_hashagg</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index f486d42441..5ef29eea69 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1814,327 +1814,6 @@ cost_recursive_union(Path *runion, Path *nrterm, Path *rterm)
 	rterm->pathtarget->width);
 }
 
-/*
- * is_fake_var
- *		Workaround for generate_append_tlist() which generates fake Vars with
- *		varno == 0, that will cause a fail of estimate_num_group() call
- *
- * XXX Ummm, why would estimate_num_group fail with this?
- */
-static bool
-is_fake_var(Expr *expr)
-{
-	if (IsA(expr, RelabelType))
-		expr = (Expr *) ((RelabelType *) expr)->arg;
-
-	return (IsA(expr, Var) && ((Var *) expr)->varno == 0);
-}
-
-/*
- * get_width_cost_multiplier
- *		Returns relative complexity of comparing two values based on its width.
- * The idea behind is that the comparison becomes more expensive the longer the
- * value is. Return value is in cpu_operator_cost units.
- */
-static double
-get_width_cost_multiplier(PlannerInfo *root, Expr *expr)
-{
-	double		width = -1.0;	/* fake value */
-
-	if (IsA(expr, RelabelType))
-		expr = (Expr *) ((RelabelType *) expr)->arg;
-
-	/* Try to find actual stat in corresponding relation */
-	if (IsA(expr, Var))
-	{
-		Var		   *var = (Var *) expr;
-
-		if (var->varno > 0 && var->varno < 

Re: Question: test "aggregates" failed in 32-bit machine

2022-10-01 Thread Jonathan S. Katz

On 10/1/22 3:13 PM, Tom Lane wrote:
> I'm still of the opinion that we need to revert this code for now.

[RMT hat, but speaking just for me] reading through Tom's analysis, this 
seems to be the safest path forward. I have a few questions to better 
understand:

1. How invasive would the revert be?
2. Are there other user-visible items that would be impacted?
3. Is the option of disabling the feature by default viable?

Thanks,

Jonathan





Re: Question: test "aggregates" failed in 32-bit machine

2022-10-01 Thread Tom Lane
Peter Geoghegan  writes:
> Reminds me of the other sort testing program that you wrote when the
> B&M code first went in:
> https://www.postgresql.org/message-id/18732.1142967...@sss.pgh.pa.us

Ha, I'd totally forgotten about that ...

regards, tom lane




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-01 Thread Peter Geoghegan
On Sat, Oct 1, 2022 at 12:14 PM Tom Lane  wrote:
> I spent some time today looking into the question of what our qsort
> code actually does.  I wrote a quick-n-dirty little test module
> (attached) to measure the number of comparisons qsort really uses
> for assorted sample inputs.

Reminds me of the other sort testing program that you wrote when the
B&M code first went in:

https://www.postgresql.org/message-id/18732.1142967...@sss.pgh.pa.us

This was notable for recreating the tests from the original B&M paper.
The paper uses various types of test inputs with characteristics that
were challenging to the implementation and worth specifically getting
right. For example, "saw tooth" input.

-- 
Peter Geoghegan




Re: Question: test "aggregates" failed in 32-bit machine

2022-10-01 Thread Tom Lane
I wrote:
> So at this point I've lost all faith in the estimates being meaningful
> at all.

I spent some time today looking into the question of what our qsort
code actually does.  I wrote a quick-n-dirty little test module
(attached) to measure the number of comparisons qsort really uses
for assorted sample inputs.  The results will move a bit from run
to run because of randomization, but the average counts should be
pretty stable I think.  I got results like these:

regression=# create temp table data as
select * from qsort_comparisons(100000);
SELECT 100000
regression=# select n * log(groups)/log(2) as est, 100*(n * log(groups)/log(2) 
- avg_cmps)/avg_cmps as pct_err, * from data;
        est         |      pct_err       |   n    | groups | avg_cmps | min_cmps | max_cmps |          note          
--------------------+--------------------+--------+--------+----------+----------+----------+------------------------
                  0 |               -100 | 100000 |      1 |    99999 |    99999 |    99999 | all values the same
 1660964.0474436812 | -5.419880052975057 | 100000 | 100000 |  1756145 |  1722569 |  1835627 | all values distinct
             100000 | -33.33911061041376 | 100000 |      2 |   150013 |   150008 |   150024 | 2 distinct values
             400000 | 11.075628618635713 | 100000 |     16 |   360115 |   337586 |   431376 | 16 distinct values
             600000 |  8.369757612975473 | 100000 |     64 |   553660 |   523858 |   639492 | 64 distinct values
             800000 |  4.770461016221087 | 100000 |    256 |   763574 |   733898 |   844450 | 256 distinct values
            1000000 | 1.5540821186618827 | 100000 |   1024 |   984697 |   953830 |  1111384 | 1024 distinct values
 1457116.0087927429 |  41.97897366170798 | 100000 |  24342 |  1026290 |   994694 |  1089503 | Zipfian, parameter 1.1
 1150828.9986140348 | 158.28880094758154 | 100000 |   2913 |   445559 |   426575 |   511214 | Zipfian, parameter 1.5
  578135.9713524659 |  327.6090378488971 | 100000 |     55 |   135202 |   132541 |   213467 | Zipfian, parameter 3.0
(10 rows)

So "N * log(NumberOfGroups)" is a pretty solid estimate for
uniformly-sized groups ... except when NumberOfGroups = 1 ... but it
is a significant overestimate if the groups aren't uniformly sized.
Now a factor of 2X or 3X isn't awful --- we're very happy to accept
estimates only that good in other contexts --- but I still wonder
whether this is reliable enough to justify the calculations being
done in compute_cpu_sort_cost.  I'm still very afraid that the
conclusions we're drawing about the sort costs for different column
orders are mostly junk.

In any case, something's got to be done about the failure at
NumberOfGroups = 1.  Instead of this:

correctedNGroups = Max(1.0, ceil(correctedNGroups));
per_tuple_cost += totalFuncCost * LOG2(correctedNGroups);

I suggest

if (correctedNGroups > 1.0)
per_tuple_cost += totalFuncCost * LOG2(correctedNGroups);
else  /* Sorting N all-alike tuples takes only N-1 comparisons */
per_tuple_cost += totalFuncCost;

(Note that the ceil() here is a complete waste, because all paths leading
to this produced integral estimates already.  Even if they didn't, I see
no good argument why ceil() makes the result better.)
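For clarity, here's the suggested behavior as a tiny standalone program
(an illustration only, not the costsize.c code):

#include <math.h>
#include <stdio.h>

/*
 * N * log2(groups) comparisons, except that sorting N all-alike
 * tuples needs about N - 1 comparisons rather than zero.
 */
static double
est_comparisons(double n, double groups)
{
    if (groups > 1.0)
        return n * log2(groups);
    return n - 1.0;
}

int
main(void)
{
    printf("%.0f\n", est_comparisons(100000, 1024));    /* 1000000 */
    printf("%.0f\n", est_comparisons(100000, 1));       /* 99999, matching
                                                         * the all-alike row
                                                         * in the table */
    return 0;
}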

I'm still of the opinion that we need to revert this code for now.

regards, tom lane

/*

create function qsort_comparisons(N int)
  returns table (N int, groups int,
                 avg_cmps int8, min_cmps int8, max_cmps int8,
                 note text)
strict volatile language c as '/path/to/count_comparisons.so';

select * from qsort_comparisons(10000);

 */

#include "postgres.h"

#include <stdint.h>

#include "common/pg_prng.h"
#include "fmgr.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "utils/tuplestore.h"

PG_MODULE_MAGIC;

/*
 * qsort_arg comparator function for integers.
 * The number of calls is tracked in *arg.
 */
static int
cmp_func(const void *a, const void *b, void *arg)
{
	int			aa = *(const int *) a;
	int			bb = *(const int *) b;
	size_t	   *count_ptr = (size_t *) arg;

	(*count_ptr)++;
	if (aa < bb)
		return -1;
	if (aa > bb)
		return 1;
	return 0;
}

/*
 * Randomly shuffle an array of integers.
 */
static void
shuffle(int *x, int n)
{
	/*
	 * Shuffle array using Fisher-Yates algorithm. Iterate array and swap head
	 * (i) with a randomly chosen same-or-later item (j) at each iteration.
	 */
	for (int i = 0; i < n; i++)
	{
		int			j = (int) pg_prng_uint64_range(_global_prng_state,
   i, n - 1);
		int			tmp = x[i];

		x[i] = x[j];
		x[j] = tmp;
	}
}

/*
 * Test qsort for the given test case.
 * Shuffle the given array, sort it using qsort_arg,
 * and report the numbers of comparisons made.
 */
static void
qsort_test_case(int *array, int n, char *note, int trials,
Tuplestorestate *tupstore, AttInMetadata *attinmeta)
{
	size_t		tot_count = 0;
	size_t		min_count = SIZE_MAX;
	

Re: Question: test "aggregates" failed in 32-bit machine

2022-09-30 Thread Tom Lane
I wrote:
> Given the previous complaints about db0d67db2, I wonder if it's not
> most prudent to revert it.  I doubt we are going to get satisfactory
> behavior out of it until there's fairly substantial improvements in
> all these underlying estimates.

After spending some more time looking at the code, I think that that
is something we absolutely have to discuss.  I already complained at
[1] about how db0d67db2 made very significant changes in sort cost
estimation behavior, which seem likely to result in significant
user-visible plan changes that might or might not be for the better.
But I hadn't read any of the code at that point.  Now I have, and
frankly it's not ready for prime time.  Beyond the question of
whether we have sufficiently accurate input values, I see these
issues in and around compute_cpu_sort_cost():

1. The algorithm is said to be based on Sedgewick & Bentley 2002 [2].
I have the highest regard for those two gentlemen, so I'm quite
prepared to believe that their estimate of the number of comparisons
used by Quicksort is good.  However, the expression given in our
comments:

 *  log(N! / (X1! * X2! * ..))  ~  sum(Xi * log(N/Xi))

doesn't look much like anything they wrote.  More, what we're actually
doing is

 * We assume all Xi the same because now we don't have any estimation of
 * group sizes, we have only know the estimate of number of groups (distinct
 * values). In that case, formula becomes:
 *  N * log(NumberOfGroups)

That's a pretty drastic simplification.  No argument is given as to why
that's still reliable enough to be useful for the purposes to which this
code tries to put it --- especially when you consider that real-world
data is more likely to follow Zipf's law than have uniform group sizes.
If you're going to go as far as doing this:

 * For multi-column sorts we need to estimate the number of comparisons for
 * each individual column - for example with columns (c1, c2, ..., ck) we
 * can estimate that number of comparisons on ck is roughly
 *  ncomparisons(c1, c2, ..., ck) / ncomparisons(c1, c2, ..., c(k-1))

you'd better pray that your number-of-comparisons estimates are pretty
darn good, or what you're going to get out is going to be mostly
fiction.
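To put a rough number on that concern, here's a tiny standalone program
(an illustration only; the Zipf weights are invented for the example)
comparing the full sum against the uniform-groups simplification:

#include <math.h>
#include <stdio.h>

int
main(void)
{
    const double    N = 100000.0;
    const int       groups = 64;
    double          zipf_norm = 0.0;
    double          zipf_est = 0.0;

    /* uniform group sizes: sum(Xi * log2(N/Xi)) collapses exactly
     * to N * log2(groups) when every Xi = N/groups */
    double          uniform_est = N * log2((double) groups);

    /* Zipf-ish group sizes: Xi proportional to 1/i^1.5 */
    for (int i = 1; i <= groups; i++)
        zipf_norm += 1.0 / pow(i, 1.5);
    for (int i = 1; i <= groups; i++)
    {
        double      Xi = N / (pow(i, 1.5) * zipf_norm);

        zipf_est += Xi * log2(N / Xi);
    }

    printf("uniform:   %.0f comparisons\n", uniform_est);   /* 600000 */
    printf("Zipf(1.5): %.0f comparisons\n", zipf_est);      /* far less */
    return 0;
}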

2. Sedgewick & Bentley analyzed a specific version of Quicksort,
which is ... um ... not the version we are using.  It doesn't look
to me like the choice of partitioning element is the same.  Maybe
that doesn't matter much in the end, but there's sure no discussion
of the point in this patch.

So at this point I've lost all faith in the estimates being meaningful
at all.  And that's assuming that the simplified algorithm is
implemented accurately, which it is not:

3. totalFuncCost is started off at 1.0.  Surely that should be zero?
If not, at least a comment to justify it would be nice.

4. The code around the add_function_cost call evidently wants to carry
the procost lookup result from one column to the next, because it
skips the lookup when prev_datatype == em->em_datatype.  However, the
value of funcCost isn't carried across columns, because it's local to
the loop.  The effect of this is that anyplace where adjacent GROUP BY
columns are of the same datatype, we'll use the fixed 1.0 value of
funcCost instead of looking up the real procost.  Admittedly, since
the real procost is probably also 1.0, this might not mean much in
practice.  Nonetheless it's broken code.  (Oh, btw: I doubt that
using add_function_cost rather than raw procost is of any value
whatsoever if you're just going to pass it a NULL node tree.)
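Distilled into a runnable toy, the effect looks like this (not the
costsize.c code; lookup_procost here is a made-up stand-in for the
procost lookup):

#include <stdio.h>

typedef unsigned int Oid;       /* stand-in for the real typedef */
#define InvalidOid 0

static double
lookup_procost(Oid type)
{
    return 2.0;                 /* pretend this type's comparison costs 2.0 */
}

int
main(void)
{
    Oid         cols[3] = {23, 23, 23};     /* adjacent columns, same type */
    Oid         prev_datatype = InvalidOid;
    double      totalFuncCost = 0.0;

    for (int i = 0; i < 3; i++)
    {
        double      funcCost = 1.0;     /* re-initialized every column */

        if (cols[i] != prev_datatype)
        {
            funcCost = lookup_procost(cols[i]);
            prev_datatype = cols[i];
        }
        /* else: meant to reuse the previous lookup, but funcCost
         * was just reset to 1.0 above */

        totalFuncCost += funcCost;
    }

    printf("totalFuncCost = %.1f\n", totalFuncCost);    /* 4.0, not 6.0 */
    return 0;
}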

5. I'm pretty dubious about the idea that we can use the rather-random
first element of the EquivalenceClass to determine the datatype that
will be compared, much less the average widths of the columns.  It's
entirely possible for an EC to contain both int4 and int8 vars, or
text vars of substantially different average widths.  I think we
really need to be going back to the original GroupClauses and looking
at the variables named there.

6. Worse than that, we're also using the first element of the
EquivalenceClass to calculate the number of groups of this sort key.
This is FLAT OUT WRONG, as certainly different EC members can have
very different stats.

7. The code considers that presorted-key columns do not add to the
comparison costs, yet the comment about it claims the opposite:

/*
 * Presorted keys are not considered in the cost above, but we still
 * do have to compare them in the qsort comparator. So make sure to
 * factor in the cost in that case.
 */
if (i >= nPresortedKeys)
{

I'm not entirely sure whether the code is broken or the comment is,
but at least one of them is.  I'm also pretty confused about why
we still add such columns' comparison functions to the running
totalFuncCost if we think they're not sorted on.

8. In the case complained of to start this thread, we're unable
to perceive any sort-cost difference between "p, 

Re: Question: test "aggregates" failed in 32-bit machine

2022-09-30 Thread Tom Lane
I wrote:
> The most likely theory, I think, is that that compiler is generating
> slightly different floating-point code causing different plans to
> be costed slightly differently than what the test case is expecting.
> Probably, the different orderings of the keys in this test case have
> exactly the same cost, or almost exactly, so that different roundoff
> error could be enough to change the selected plan.

I added some debug printouts to get_cheapest_group_keys_order()
and verified that in the two problematic queries, there are two
different orderings that have (on my machine) exactly equal lowest
cost.  So the code picks the first of those and ignores the second.
Different roundoff error would be enough to make it do something
else.

I find this problematic because "exactly equal" costs are not going
to be unusual.  That's because the values that cost_sort_estimate
relies on are, sadly, just about completely fictional.  It's expecting
that it can get a good cost estimate based on:

* procost.  In case you hadn't noticed, this is going to be 1 for
just about every function we might be considering here.

* column width.  This is either going to be a constant (e.g. 4
for integers) or, again, largely fictional.  The logic for
converting widths to cost multipliers adds yet another layer
of debatability.

* numdistinct estimates.  Sometimes we know what we're talking
about there, but often we don't.

So what I'm afraid we are dealing with here is usually going to
be garbage in, garbage out.  And we're expending an awful lot
of code and cycles to arrive at these highly questionable choices.
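If anything like this survives in some form, ties probably shouldn't be
left to float roundoff at all.  A sketch of one defensive shape (an
illustration only -- it borrows the 1% fuzz-factor idea that add_path
already uses when comparing path costs):

#define COST_FUZZ_FACTOR 1.01

static int
compare_costs_fuzzily(double cost1, double cost2)
{
    if (cost1 > cost2 * COST_FUZZ_FACTOR)
        return 1;               /* cost1 clearly worse */
    if (cost2 > cost1 * COST_FUZZ_FACTOR)
        return -1;              /* cost2 clearly worse */
    return 0;                   /* effectively tied; the caller should
                                 * apply a deterministic tie-break */
}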

Given the previous complaints about db0d67db2, I wonder if it's not
most prudent to revert it.  I doubt we are going to get satisfactory
behavior out of it until there's fairly substantial improvements in
all these underlying estimates.

regards, tom lane




Re: Question: test "aggregates" failed in 32-bit machine

2022-09-30 Thread Andres Freund
Hi,

On 2022-09-30 12:13:11 -0400, Tom Lane wrote:
> "kuroda.hay...@fujitsu.com"  writes:
> > Hmm, I was not sure about additional conditions, sorry.
> > I could reproduce with followings steps: 
> 
> I tried this on a 32-bit VM with gcc 11.3, but couldn't reproduce.
> You said earlier
> 
> >> OS: RHEL 6.10 server 
> >> Arch: i686
> >> Gcc: 4.4.7
> 
> That is an awfully old compiler; I fear I no longer have anything
> comparable on a working platform.
> 
> The most likely theory, I think, is that that compiler is generating
> slightly different floating-point code causing different plans to
> be costed slightly differently than what the test case is expecting.
> Probably, the different orderings of the keys in this test case have
> exactly the same cost, or almost exactly, so that different roundoff
> error could be enough to change the selected plan.

Yea. I suspect that's because that compiler version doesn't have
-fexcess-precision=standard:

> CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith 
> -Wdeclaration-after-statement -Werror=vla -Wendif-labels 
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g 
> -O2

It's possible one could work around the issue with -msse -mfpmath=sse instead
of -fexcess-precision=standard. 
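For anyone who hasn't run into this failure mode, here's a tiny
standalone example of the kind of surprise x87 excess precision can
produce (not from the thread; whether it misbehaves depends on the
compiler, flags, and target):

#include <stdio.h>

int
main(void)
{
    volatile double x = 1.0;
    volatile double y = 3.0;
    double          a = x / y;  /* rounded to 64 bits when stored */

    /*
     * With 32-bit x87 code generation and without
     * -fexcess-precision=standard (or -mfpmath=sse), the right-hand
     * x / y may be evaluated in 80-bit registers, so the comparison
     * can come out false even though both sides look identical.
     */
    if (a == x / y)
        printf("consistent\n");
    else
        printf("excess precision changed the result\n");
    return 0;
}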

Greetings,

Andres Freund




Re: Question: test "aggregates" failed in 32-bit machine

2022-09-30 Thread Tom Lane
"kuroda.hay...@fujitsu.com"  writes:
> Hmm, I am not sure about the additional conditions, sorry.
> I could reproduce it with the following steps:

I tried this on a 32-bit VM with gcc 11.3, but couldn't reproduce.
You said earlier

>> OS: RHEL 6.10 server 
>> Arch: i686
>> Gcc: 4.4.7

That is an awfully old compiler; I fear I no longer have anything
comparable on a working platform.

The most likely theory, I think, is that that compiler is generating
slightly different floating-point code causing different plans to
be costed slightly differently than what the test case is expecting.
Probably, the different orderings of the keys in this test case have
exactly the same cost, or almost exactly, so that different roundoff
error could be enough to change the selected plan.

This probably doesn't have a lot of real-world impact, but it's
still annoying on a couple of grounds.  Failing regression isn't
nice, and also this suggests that db0d67db2 is causing us to waste
time considering multiple plans with effectively equal costs.
Maybe that code needs to filter a little harder.

regards, tom lane




RE: Question: test "aggregates" failed in 32-bit machine

2022-09-29 Thread kuroda.hay...@fujitsu.com
Dear Tom,

> Hmm, we're not seeing any such failures in the buildfarm's 32-bit
> animals, so there must be some additional condition needed to make
> it happen.  Can you be more specific?

Hmm, I am not sure about the additional conditions, sorry.
I could reproduce it with the following steps: 

$ git clone https://github.com/postgres/postgres.git
$ cd postgres
$ ./configure --enable-cassert --enable-debug
$ make -j2
$ make check LANG=C

->  aggregates   ... FAILED 3562 ms




The hypervisor of the virtual machine is "VMware vSphere 7.0".

And I picked up some more information about the machine.
Could you find something?

```

pg_config]$ ./pg_config 
...
CONFIGURE =  '--enable-cassert' '--enable-debug'
CC = gcc -std=gnu99
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Werror=vla -Wendif-labels 
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -O2
CFLAGS_SL = -fPIC
LDFLAGS = -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags
LDFLAGS_EX = 
LDFLAGS_SL = 
LIBS = -lpgcommon -lpgport -lz -lreadline -lrt -ldl -lm 
VERSION = PostgreSQL 16devel

$ locale
LANG=C
...

$ arch 
i686


$ cat /proc/cpuinfo 
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 85
model name  : Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
stepping: 7
microcode   : 83898371
cpu MHz : 2394.374
cache size  : 36608 KB
physical id : 0
siblings: 1
core id : 0
cpu cores   : 1
apicid  : 0
initial apicid  : 0
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 22
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ss nx pdpe1gb rdtscp lm constant_tsc 
arch_perfmon xtopology tsc_reliable nonstop_tsc unfair_spinlock eagerfpu pni 
pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt 
tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 
3dnowprefetch arat xsaveopt ssbd ibrs ibpb stibp fsgsbase bmi1 avx2 smep bmi2 
invpcid avx512f rdseed adx avx512cd md_clear flush_l1d arch_capabilities
bogomips: 4788.74
clflush size: 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model   : 85
model name  : Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
stepping: 7
microcode   : 83898371
cpu MHz : 2394.374
cache size  : 36608 KB
physical id : 2
siblings: 1
core id : 0
cpu cores   : 1
apicid  : 2
initial apicid  : 2
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 22
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ss nx pdpe1gb rdtscp lm constant_tsc 
arch_perfmon xtopology tsc_reliable nonstop_tsc unfair_spinlock eagerfpu pni 
pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt 
tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 
3dnowprefetch arat xsaveopt ssbd ibrs ibpb stibp fsgsbase bmi1 avx2 smep bmi2 
invpcid avx512f rdseed adx avx512cd md_clear flush_l1d arch_capabilities
bogomips: 4788.74
clflush size: 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED





Re: Question: test "aggregates" failed in 32-bit machine

2022-09-29 Thread Tom Lane
"kuroda.hay...@fujitsu.com"  writes:
> While running `make check LANG=C` on a 32-bit virtual machine,
> I found that it failed at "aggregates".

Hmm, we're not seeing any such failures in the buildfarm's 32-bit
animals, so there must be some additional condition needed to make
it happen.  Can you be more specific?

regards, tom lane