Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Wed, Jul 10, 2019 at 09:11:41AM -0700, Ashwin Agrawal wrote: > Will post patch for the tool, once I get in little decent shape. That would be nice! We may be able to get something into v13 this way then. -- Michael signature.asc Description: PGP signature
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Wed, Jul 10, 2019 at 6:46 AM Tom Lane wrote: > Michael Paquier writes: > > On Wed, Jul 10, 2019 at 12:51:28PM +0530, Amit Kapila wrote: > >> It would be good if we can come up with something like that. It will > >> be helpful for zheap, where in some cases we get different row > >> ordering due to in-place updates. As of now, we try to add Order By > >> or do some extra magic to get consistent row ordering. > > > That was an issue for me as well when working with Postgres-XC when > > the row ordering was not guaranteed depending on the number of nodes > > (speaking of which Greenplum has the same issues, no?). Adding ORDER > > BY clauses to a set of tests may make sense, but then this may impact > > the plans generated for some of them.. > > Yeah, I do not want to get into a situation where we can't test > queries that lack ORDER BY. Also, the fact that tableam X doesn't > reproduce heap's row ordering is not a good reason to relax the > strength of the tests for heap. So I'm wondering about some > postprocessing that we could optionally apply. Perhaps the tools > Melanie mentions could help. > Surprisingly, I have been working from a couple of days to use those Perl tools from Greenplum for Zedstore. As for Zedstore plans differ for many regress tests because relation size not being the same as heap and all. Also, for similar reasons, row orders change as well. So, to effectively use the test untouched to validate Zedstore and yes was thinking will help Zheap testing as well. I also tested the same for regressplans.sh and it will lift a lot of manual burden of investigating the results. As one can specify to completely ignore explain plan outputs from the comparison between results and expected. Will post patch for the tool, once I get in little decent shape.
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
Michael Paquier writes: > On Wed, Jul 10, 2019 at 12:51:28PM +0530, Amit Kapila wrote: >> It would be good if we can come up with something like that. It will >> be helpful for zheap, where in some cases we get different row >> ordering due to in-place updates. As of now, we try to add Order By >> or do some extra magic to get consistent row ordering. > That was an issue for me as well when working with Postgres-XC when > the row ordering was not guaranteed depending on the number of nodes > (speaking of which Greenplum has the same issues, no?). Adding ORDER > BY clauses to a set of tests may make sense, but then this may impact > the plans generated for some of them.. Yeah, I do not want to get into a situation where we can't test queries that lack ORDER BY. Also, the fact that tableam X doesn't reproduce heap's row ordering is not a good reason to relax the strength of the tests for heap. So I'm wondering about some postprocessing that we could optionally apply. Perhaps the tools Melanie mentions could help. regards, tom lane
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Wed, Jul 10, 2019 at 12:40 AM Michael Paquier wrote: > On Wed, Jul 10, 2019 at 12:51:28PM +0530, Amit Kapila wrote: > > It would be good if we can come up with something like that. It will > > be helpful for zheap, where in some cases we get different row > > ordering due to in-place updates. As of now, we try to add Order By > > or do some extra magic to get consistent row ordering. > > That was an issue for me as well when working with Postgres-XC when > the row ordering was not guaranteed depending on the number of nodes > (speaking of which Greenplum has the same issues, no?). Adding ORDER > BY clauses to a set of tests may make sense, but then this may impact > the plans generated for some of them.. > -- > Michael > We have a tool that does this. gpdiff [1] is used for results post-processing and it uses a perl module called atmsort [2] to deal with the specific ORDER BY case discussed here. [1] https://github.com/greenplum-db/gpdb/blob/master/src/test/regress/gpdiff.pl [2] https://github.com/greenplum-db/gpdb/blob/master/src/test/regress/atmsort.pl -- Melanie Plageman
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Wed, Jul 10, 2019 at 12:51:28PM +0530, Amit Kapila wrote: > It would be good if we can come up with something like that. It will > be helpful for zheap, where in some cases we get different row > ordering due to in-place updates. As of now, we try to add Order By > or do some extra magic to get consistent row ordering. That was an issue for me as well when working with Postgres-XC when the row ordering was not guaranteed depending on the number of nodes (speaking of which Greenplum has the same issues, no?). Adding ORDER BY clauses to a set of tests may make sense, but then this may impact the plans generated for some of them.. -- Michael signature.asc Description: PGP signature
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Wed, Jul 10, 2019 at 10:12 AM Michael Paquier wrote: > > On Tue, Jul 09, 2019 at 11:54:29AM -0700, Melanie Plageman wrote: > > It might be worth post-processing results files to ignore row ordering > > in some cases to allow for easier comparison. Has this been proposed > > in the past? > > Not that I recall. > It would be good if we can come up with something like that. It will be helpful for zheap, where in some cases we get different row ordering due to in-place updates. As of now, we try to add Order By or do some extra magic to get consistent row ordering. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Tue, Jul 09, 2019 at 11:54:29AM -0700, Melanie Plageman wrote: > It might be worth post-processing results files to ignore row ordering > in some cases to allow for easier comparison. Has this been proposed > in the past? Not that I recall. -- Michael signature.asc Description: PGP signature
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Mon, Jul 8, 2019 at 12:21 PM Tom Lane wrote: > The point of regressplans.sh is to see if anything goes seriously > wrong when forcing non-default plan choices --- seriously wrong being > defined as crashes or semantically wrong answers. It's not expected > that the regression tests will automatically pass when you do that, > because of their dependencies on output row ordering, not to mention > all the EXPLAINs. I'm not for removing it --- the fact that its > results require manual evaluation doesn't make it useless. > > It might be worth post-processing results files to ignore row ordering in some cases to allow for easier comparison. Has this been proposed in the past? -- Melanie Plageman
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Tue, Jul 09, 2019 at 03:04:27PM +1200, Thomas Munro wrote: > It's my mistake. I'll fix it later today. Thanks! -- Michael signature.asc Description: PGP signature
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Tue, Jul 9, 2019 at 2:45 PM Michael Paquier wrote: > On Tue, Jul 09, 2019 at 02:30:51PM +1200, Thomas Munro wrote: > > Yeah. I had obviously never noticed that test script. +1 for just > > enabling hash joins the top of join_hash.sql in 12+, and the > > equivalent section in 11's join.sql (which is luckily at the end of > > the file). > > Right, I did not pay much attention to REL_11_STABLE. In this case > the test begins around line 2030 and reaches the bottom of the file. > I would actually add a RESET at the bottom of it to avoid any tests to > be impacted, as usually bug-fix tests are just appended. Thomas, > perhaps you would prefer fixing it yourself? Or should I? It's my mistake. I'll fix it later today. -- Thomas Munro https://enterprisedb.com
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
Michael Paquier writes: > On Tue, Jul 09, 2019 at 02:30:51PM +1200, Thomas Munro wrote: >> Yeah. I had obviously never noticed that test script. +1 for just >> enabling hash joins the top of join_hash.sql in 12+, and the >> equivalent section in 11's join.sql (which is luckily at the end of >> the file). > Right, I did not pay much attention to REL_11_STABLE. In this case > the test begins around line 2030 and reaches the bottom of the file. > I would actually add a RESET at the bottom of it to avoid any tests to > be impacted, as usually bug-fix tests are just appended. Agreed that the scope should be limited. But in 12/HEAD, I think the relevant tests are all wrapped into one transaction block, so that using SET LOCAL would be enough. Not sure if 11 is the same. regards, tom lane
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Tue, Jul 09, 2019 at 02:30:51PM +1200, Thomas Munro wrote: > Yeah. I had obviously never noticed that test script. +1 for just > enabling hash joins the top of join_hash.sql in 12+, and the > equivalent section in 11's join.sql (which is luckily at the end of > the file). Right, I did not pay much attention to REL_11_STABLE. In this case the test begins around line 2030 and reaches the bottom of the file. I would actually add a RESET at the bottom of it to avoid any tests to be impacted, as usually bug-fix tests are just appended. Thomas, perhaps you would prefer fixing it yourself? Or should I? -- Michael signature.asc Description: PGP signature
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Tue, Jul 9, 2019 at 2:22 PM Tom Lane wrote: > Thomas Munro writes: > > Based on a suggestion from Andres (if I recall correctly), I wrapped > > each individual test in savepoint/rollback, and then set just the GUCs > > needed to get the plan shape and execution code path I wanted to > > exercise, and I guess I found that I only needed to disable merge > > joins for some of them. The idea was that the individual tests could > > be understood independently. > > But per this discussion, they can only be "understood independently" > if you make some assumptions about the prevailing values of the > planner GUCs. Yeah. I had obviously never noticed that test script. +1 for just enabling hash joins the top of join_hash.sql in 12+, and the equivalent section in 11's join.sql (which is luckily at the end of the file). -- Thomas Munro https://enterprisedb.com
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
Thomas Munro writes: > On Tue, Jul 9, 2019 at 2:19 AM Tom Lane wrote: >> Given the purposes of this test, I think it'd be reasonable to force >> both enable_hashjoin = on and enable_mergejoin = off at the very top >> of the join_hash script, or the corresponding place in join.sql in >> v11. Thomas, was there a specific reason for forcing enable_mergejoin >> = off for only some of these tests? > Based on a suggestion from Andres (if I recall correctly), I wrapped > each individual test in savepoint/rollback, and then set just the GUCs > needed to get the plan shape and execution code path I wanted to > exercise, and I guess I found that I only needed to disable merge > joins for some of them. The idea was that the individual tests could > be understood independently. But per this discussion, they can only be "understood independently" if you make some assumptions about the prevailing values of the planner GUCs. regards, tom lane
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
Michael Paquier writes: > On Mon, Jul 08, 2019 at 03:21:41PM -0400, Tom Lane wrote: >> Having said that, join_hash.sql in particular seems to have zero >> value if it's not testing hash joins, so I think it'd be reasonable >> for it to override a global enable_hashjoin = off setting. None of >> the other regression test scripts seem to take nearly as much of a >> performance hit from globally forcing poor plans. > I am a bit confused here. Don't you mean to have enable_hashjoin = > *on* at the top of hash_join.sql instead like in the attached? Right, overriding any enable_hashjoin = off that might've come from PGOPTIONS or wherever. regards, tom lane
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Tue, Jul 9, 2019 at 2:19 AM Tom Lane wrote: > Given the purposes of this test, I think it'd be reasonable to force > both enable_hashjoin = on and enable_mergejoin = off at the very top > of the join_hash script, or the corresponding place in join.sql in > v11. Thomas, was there a specific reason for forcing enable_mergejoin > = off for only some of these tests? Based on a suggestion from Andres (if I recall correctly), I wrapped each individual test in savepoint/rollback, and then set just the GUCs needed to get the plan shape and execution code path I wanted to exercise, and I guess I found that I only needed to disable merge joins for some of them. The idea was that the individual tests could be understood independently. -- Thomas Munro https://enterprisedb.com
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Mon, Jul 08, 2019 at 03:21:41PM -0400, Tom Lane wrote: > Having said that, join_hash.sql in particular seems to have zero > value if it's not testing hash joins, so I think it'd be reasonable > for it to override a global enable_hashjoin = off setting. None of > the other regression test scripts seem to take nearly as much of a > performance hit from globally forcing poor plans. I am a bit confused here. Don't you mean to have enable_hashjoin = *on* at the top of hash_join.sql instead like in the attached? -- Michael diff --git a/src/test/regress/expected/join_hash.out b/src/test/regress/expected/join_hash.out index 9eee39bdd3..0890608acf 100644 --- a/src/test/regress/expected/join_hash.out +++ b/src/test/regress/expected/join_hash.out @@ -1,6 +1,9 @@ -- -- exercises for the hash join code -- +-- Enforce the use of hash joins in this test, which could get disabled +-- at system-level. +set enable_hashjoin = on; begin; set local min_parallel_table_scan_size = 0; set local parallel_setup_cost = 0; diff --git a/src/test/regress/sql/join_hash.sql b/src/test/regress/sql/join_hash.sql index ae352e9b0b..b38081d1d5 100644 --- a/src/test/regress/sql/join_hash.sql +++ b/src/test/regress/sql/join_hash.sql @@ -2,6 +2,10 @@ -- exercises for the hash join code -- +-- Enforce the use of hash joins in this test, which could get disabled +-- at system-level. +set enable_hashjoin = on; + begin; set local min_parallel_table_scan_size = 0; signature.asc Description: PGP signature
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
Michael Paquier writes: > Well, another thing I'd like to think about is if there is any point > to keep regressplans.sh and the relevant options in postgres at this > stage. From the top of the file one can read that: The point of regressplans.sh is to see if anything goes seriously wrong when forcing non-default plan choices --- seriously wrong being defined as crashes or semantically wrong answers. It's not expected that the regression tests will automatically pass when you do that, because of their dependencies on output row ordering, not to mention all the EXPLAINs. I'm not for removing it --- the fact that its results require manual evaluation doesn't make it useless. Having said that, join_hash.sql in particular seems to have zero value if it's not testing hash joins, so I think it'd be reasonable for it to override a global enable_hashjoin = off setting. None of the other regression test scripts seem to take nearly as much of a performance hit from globally forcing poor plans. regards, tom lane
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Mon, Jul 8, 2019 at 5:53 PM Michael Paquier wrote: > I have begun playing with regressplans.sh which enforces various > combinations of "-f s|i|n|m|h" when running the regression tests, and > I have noticed that -fh can cause the server to become stuck in the > test join_hash.sql with this query (not sure which portion of the SET > LOCAL parameters are involved) : > select count(*) from simple r join extremely_skewed s using (id); > > This does not happen with REL_10_STABLE where the test executes > immediately, so we has visibly an issue caused by v11 here. If you don't allow hash joins it makes this plan: Aggregate -> Nested Loop Join Filter: (r.id = s.id) -> Seq Scan on simple r -> Materialize -> Seq Scan on extremely_skewed s "simple" has 20k rows and "extremely_skewed" has 20k rows but the planner thinks it only has 2. So this going to take O(n^2) time and n is 20k. Not sure what to do about that. Maybe "join_hash" should be skipped for the -h (= no hash joins please) case? -- Thomas Munro https://enterprisedb.com
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
On Mon, Jul 08, 2019 at 06:49:44PM +1200, Thomas Munro wrote: > "simple" has 20k rows and "extremely_skewed" has 20k rows but the > planner thinks it only has 2. So this going to take O(n^2) time and n > is 20k. Not sure what to do about that. Maybe "join_hash" should be > skipped for the -h (= no hash joins please) case? Ah, thanks. Yes that's going to take a while :) Well, another thing I'd like to think about is if there is any point to keep regressplans.sh and the relevant options in postgres at this stage. From the top of the file one can read that: # This script runs the Postgres regression tests with all useful combinations # of the backend options that disable various query plan types. If the # results are not all the same, it may indicate a bug in a particular # plan type, or perhaps just a regression test whose results aren't fully # determinate (eg, due to lack of an ORDER BY keyword). However if you run any option with make check, then in all runs there are tests failing. We can improve the situation for some of them with ORDER BY queries by looking at the query outputs, but some EXPLAIN queries are sensitive to that, and the history around regressplans.sh does not play in favor of it (some people really use it?). If you look at the latest commits for it, it has not been really touched in 19 years. So I would be rather in favor in nuking it. -- Michael signature.asc Description: PGP signature
Re: PGOPTIONS="-fh" make check gets stuck since Postgres 11
Michael Paquier writes: > I have begun playing with regressplans.sh which enforces various > combinations of "-f s|i|n|m|h" when running the regression tests, and > I have noticed that -fh can cause the server to become stuck in the > test join_hash.sql with this query (not sure which portion of the SET > LOCAL parameters are involved) : > select count(*) from simple r join extremely_skewed s using (id); > This does not happen with REL_10_STABLE where the test executes > immediately, so we has visibly an issue caused by v11 here. Yeah, these test cases were added by fa330f9ad in v11. What it looks like to me is that some of these test cases force "enable_mergejoin = off", so if you also have enable_hashjoin off then you are going to get a nestloop plan, and it's hardly surprising that that takes an unreasonable amount of time on the rather large test tables used in these tests. Given the purposes of this test, I think it'd be reasonable to force both enable_hashjoin = on and enable_mergejoin = off at the very top of the join_hash script, or the corresponding place in join.sql in v11. Thomas, was there a specific reason for forcing enable_mergejoin = off for only some of these tests? regards, tom lane
PGOPTIONS="-fh" make check gets stuck since Postgres 11
Hi all, I have begun playing with regressplans.sh which enforces various combinations of "-f s|i|n|m|h" when running the regression tests, and I have noticed that -fh can cause the server to become stuck in the test join_hash.sql with this query (not sure which portion of the SET LOCAL parameters are involved) : select count(*) from simple r join extremely_skewed s using (id); This does not happen with REL_10_STABLE where the test executes immediately, so we has visibly an issue caused by v11 here. Any thoughts? -- Michael signature.asc Description: PGP signature