Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-28 Thread Tom Lane
Mark Dilger  writes:
>> The swtich in src/backend/parser/analyze.c circa line 2819 should
>> probably have an explicit case for RTE_RESULT rather than just a
>> comment and allowing the default to log "unrecognized RTE type",
>> since it's not really unrecognized, just unexpected.  But I'm not
>> too exercised about that, and won't argue if you want to leave it
>> as is.

Meh --- I doubt we need two different "can't happen" messages there.
The reason I treated this differently from some other places is that
transformLockingClause is only used during parsing, when there certainly
shouldn't be any RTE_RESULT RTEs present.  Some of the other functions
in that file are also called from outside the parser, so that it's
less certain they couldn't see a RTE_RESULT, so I added explicit
errors for them.

There's been some talk of having more uniform handling of switches
on enums, which might change the calculus here (i.e. we might not want
to have a default: case at all).  But I don't feel a need to add code
to transformLockingClause till we have that.

>> Similarly, in src/backend/commands/explain.c, should there be a
>> case for T_Result in ExplainTargetRel's switch circa line 2974?
>> I'm not sure given your design whether this could ever be relevant,
>> but I noticed that T_Result / RTE_RELATION isn't handled here.

We don't get to that function for a T_Result plan (cf. switch in
ExplainNode), so it'd just be dead code.

> Ok, version 6 of the patch applies cleanly, compiles, and passes
> all tests for me on my platform (mac os x).

Again, thanks for reviewing!

Pushed now.  I did yield to popular demand and change the temp table's
name in join.sql to be "onerow" instead of "dual" ;-)

regards, tom lane



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-28 Thread Laurenz Albe
David Rowley wrote:
> hmm. You both know the table of that name exists only as part of a
> regression test, right?  It'll just exist for a handful of
> milliseconds during make check.

Er, I wasn't aware of that, as I didn't read the patch.
Then I think my comment should be discarded as being overly pedantic.

Yours,
Laurenz Albe




Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-28 Thread David Rowley
On Mon, 28 Jan 2019 at 22:31, Laurenz Albe  wrote:
>
> Stefan Keller wrote:
> > Pls. try to give DUAL a better name, since it's IMHO neither
> > self-explaining nor correct.
>
> I agree with the sentiment.
> On the other hand, people who migrate from Oracle might be happy if
> there is a DUAL table that allows them to migrate some of their
> statements unmodified.

hmm. You both know the table of that name exists only as part of a
regression test, right?  It'll just exist for a handful of
milliseconds during make check.

My comment about the poorly named table may have been a bit pedantic
as far as what a table in a test should be called, but I also felt a
bit of an urge to make a little bit of fun about having a table named
dual which always has just a single row. Knowing the reasons for
Oracle's naming of the table helps explain why they ended up with the
poorly named table. I didn't think that warranted us doing the same
thing, even if it's just a short-lived table inside a regression test.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-28 Thread Laurenz Albe
Stefan Keller wrote:
> Pls. try to give DUAL a better name, since it's IMHO neither
> self-explaining nor correct.

I agree with the sentiment.
On the other hand, people who migrate from Oracle might be happy if
there is a DUAL table that allows them to migrate some of their
statements unmodified.

I don't know if that is a good enough argument, though. 
Currently there is "orafce" which provides DUAL, and it might be
good enough if it defines DUAL as a view on DUMMY.

Yours,
Laurenz Albe




Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-27 Thread Stefan Keller
Dear all,

I'm following this list since years - especially PostGIS related - and
you and PG are just awesome!
Pls. let me chime in as a university teacher, therefore used to
explain every year the same things :-).
My 2 cents here are:

Pls. try to give DUAL a better name, since it's IMHO neither
self-explaining nor correct.
Taken from [1], citing the originator:
<<
The name, DUAL, seemed apt for the process of creating a pair of
rows from just one.[1]
The original DUAL table had two rows in it (hence its name), but
subsequently it only had one row.
<<

My first guess is to name the dummy table with with no columns and one
row "DUMMY" - but I actually want to leave the fun to you to name it.

:Stefan

[1] https://en.wikipedia.org/wiki/DUAL_table

Am So., 27. Jan. 2019 um 21:53 Uhr schrieb Mark Dilger
:
>
>
>
> > On Jan 27, 2019, at 12:04 PM, Mark Dilger  wrote:
> >
> >
> >
> >> On Jan 25, 2019, at 5:09 PM, Tom Lane  wrote:
> >>
> >> David Rowley  writes:
> >>> As far as I can see the patch is ready to go, but I'll defer to Mark,
> >>> who's also listed on the reviewer list for this patch.
> >>
> >> Mark, are you planning to do further review on this patch?
> >> I'd like to move it along, since (IMO anyway) we need it in
> >> before progress can be made on
> >> https://commitfest.postgresql.org/21/1664/
> >>
> >>  regards, tom lane
> >
> >
> > These two observations are not based on any deep understanding of
> > your patch, but just some cursory review:
> >
> > The swtich in src/backend/parser/analyze.c circa line 2819 should
> > probably have an explicit case for RTE_RESULT rather than just a
> > comment and allowing the default to log "unrecognized RTE type",
> > since it's not really unrecognized, just unexpected.  But I'm not
> > too exercised about that, and won't argue if you want to leave it
> > as is.
> >
> > Similarly, in src/backend/commands/explain.c, should there be a
> > case for T_Result in ExplainTargetRel's switch circa line 2974?
> > I'm not sure given your design whether this could ever be relevant,
> > but I noticed that T_Result / RTE_RELATION isn't handled here.
> >
> >
> > Applying your patch and running the regression tests is failing
> > left and right, so I'm working to pull a fresh copy from git and
> > build again -- probably just something wrong in my working directory.
>
> Ok, version 6 of the patch applies cleanly, compiles, and passes
> all tests for me on my platform (mac os x).  You can address the two
> minor observations above, or not, at your option.
>
> mark
>
>



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-27 Thread Mark Dilger



> On Jan 27, 2019, at 12:04 PM, Mark Dilger  wrote:
> 
> 
> 
>> On Jan 25, 2019, at 5:09 PM, Tom Lane  wrote:
>> 
>> David Rowley  writes:
>>> As far as I can see the patch is ready to go, but I'll defer to Mark,
>>> who's also listed on the reviewer list for this patch.
>> 
>> Mark, are you planning to do further review on this patch?
>> I'd like to move it along, since (IMO anyway) we need it in
>> before progress can be made on
>> https://commitfest.postgresql.org/21/1664/
>> 
>>  regards, tom lane
> 
> 
> These two observations are not based on any deep understanding of
> your patch, but just some cursory review:
> 
> The swtich in src/backend/parser/analyze.c circa line 2819 should
> probably have an explicit case for RTE_RESULT rather than just a
> comment and allowing the default to log "unrecognized RTE type",
> since it's not really unrecognized, just unexpected.  But I'm not
> too exercised about that, and won't argue if you want to leave it
> as is.
> 
> Similarly, in src/backend/commands/explain.c, should there be a
> case for T_Result in ExplainTargetRel's switch circa line 2974?
> I'm not sure given your design whether this could ever be relevant,
> but I noticed that T_Result / RTE_RELATION isn't handled here.
> 
> 
> Applying your patch and running the regression tests is failing
> left and right, so I'm working to pull a fresh copy from git and
> build again -- probably just something wrong in my working directory.

Ok, version 6 of the patch applies cleanly, compiles, and passes
all tests for me on my platform (mac os x).  You can address the two
minor observations above, or not, at your option.

mark




Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-27 Thread Mark Dilger



> On Jan 25, 2019, at 5:09 PM, Tom Lane  wrote:
> 
> David Rowley  writes:
>> As far as I can see the patch is ready to go, but I'll defer to Mark,
>> who's also listed on the reviewer list for this patch.
> 
> Mark, are you planning to do further review on this patch?
> I'd like to move it along, since (IMO anyway) we need it in
> before progress can be made on
> https://commitfest.postgresql.org/21/1664/
> 
>   regards, tom lane


These two observations are not based on any deep understanding of
your patch, but just some cursory review:

The swtich in src/backend/parser/analyze.c circa line 2819 should
probably have an explicit case for RTE_RESULT rather than just a
comment and allowing the default to log "unrecognized RTE type",
since it's not really unrecognized, just unexpected.  But I'm not
too exercised about that, and won't argue if you want to leave it
as is.

Similarly, in src/backend/commands/explain.c, should there be a
case for T_Result in ExplainTargetRel's switch circa line 2974?
I'm not sure given your design whether this could ever be relevant,
but I noticed that T_Result / RTE_RELATION isn't handled here.


Applying your patch and running the regression tests is failing
left and right, so I'm working to pull a fresh copy from git and
build again -- probably just something wrong in my working directory.

mark





Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-27 Thread Mark Dilger



> On Jan 25, 2019, at 5:09 PM, Tom Lane  wrote:
> 
> David Rowley  writes:
>> As far as I can see the patch is ready to go, but I'll defer to Mark,
>> who's also listed on the reviewer list for this patch.
> 
> Mark, are you planning to do further review on this patch?
> I'd like to move it along, since (IMO anyway) we need it in
> before progress can be made on
> https://commitfest.postgresql.org/21/1664/

Doing a quick review now.  Sorry I didn't see your messages earlier.

mark



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-25 Thread Tom Lane
David Rowley  writes:
> As far as I can see the patch is ready to go, but I'll defer to Mark,
> who's also listed on the reviewer list for this patch.

Mark, are you planning to do further review on this patch?
I'd like to move it along, since (IMO anyway) we need it in
before progress can be made on
https://commitfest.postgresql.org/21/1664/

regards, tom lane



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-20 Thread David Rowley
On Tue, 15 Jan 2019 at 13:17, Tom Lane  wrote:
>
> David Rowley  writes:
> > I also did a quick benchmark of v6 and found the slowdown to be
> > smaller after the change made in build_simple_rel()
>
> Thanks for confirming.  I was not very sure that was worth the extra
> few bytes of code space, but if you see a difference too, then it's
> probably worthwhile.

It occurred to me that a common case where you'll hit the new code is
INSERT INTO ... VALUES.

I thought I'd better test this, so I carefully designed the following
table so it would have as little INSERT overhead as possible.

create table t();

With fsync=off and a truncate between each pgbench run.

insert.sql = insert into t default values;

Unpatched:

$ pgbench -n -f insert.sql -T 60 postgres
tps = 27986.757396 (excluding connections establishing)
tps = 28220.905728 (excluding connections establishing)
tps = 28234.331176 (excluding connections establishing)
tps = 28254.392421 (excluding connections establishing)
tps = 28691.946948 (excluding connections establishing)

Patched:

tps = 28426.183388 (excluding connections establishing)
tps = 28464.517261 (excluding connections establishing)
tps = 28505.178616 (excluding connections establishing)
tps = 28414.275662 (excluding connections establishing)
tps = 28648.103349 (excluding connections establishing)

The patch seems to average out slightly faster on those runs, but the
variance is around the noise level.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-14 Thread David Rowley
On Tue, 15 Jan 2019 at 13:17, Tom Lane  wrote:
>
> David Rowley  writes:
> > 1. I don't think having a table named "dual" makes a whole lot of
> > sense for a table with a single row.
>
> Well, I borrowed Oracle terminology there ;-)

yep, but ... go on... break the mould :)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-14 Thread Tom Lane
David Rowley  writes:
> 1. I don't think having a table named "dual" makes a whole lot of
> sense for a table with a single row.

Well, I borrowed Oracle terminology there ;-)

> (Uppercasing these additions would also make them look less of an 
> afterthought.)

Don't really care, can do.

> I also did a quick benchmark of v6 and found the slowdown to be
> smaller after the change made in build_simple_rel()

Thanks for confirming.  I was not very sure that was worth the extra
few bytes of code space, but if you see a difference too, then it's
probably worthwhile.

regards, tom lane



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-14 Thread David Rowley
On Tue, 15 Jan 2019 at 09:48, Tom Lane  wrote:
>
> David Rowley  writes:
> > SELECT 1; I believe is a common query for some connection poolers as a
> > sort of ping to the database.  In light of that, the performance drop
> > of 2 microseconds per query is not going to amount to very much in
> > total for that use case. i.e you'll need to do half a million pings
> > before it'll cost you 1 second of additional CPU time.
>
> Yeah, I agree this is not something to get hot & bothered over, but
> I thought it was worth spending an hour seeing if there were any
> easy wins.  Not much luck.

Thanks for putting in the effort.

> Anyway, herewith v6, rebased up to HEAD, with the build_simple_rel
> improvement and the regression test fix I mentioned earlier.

I had a look at these changes, I only have 1 comment:

1. I don't think having a table named "dual" makes a whole lot of
sense for a table with a single row.  I'm sure we can come up with a
more suitably named table to serve the purpose. How about "single"?

 INSERT INTO J2_TBL VALUES (0, NULL);
 INSERT INTO J2_TBL VALUES (NULL, NULL);
 INSERT INTO J2_TBL VALUES (NULL, 0);
+-- useful in some tests below
+create temp table dual();
+insert into dual default values;
+analyze dual;

(Uppercasing these additions would also make them look less of an afterthought.)

I also did a quick benchmark of v6 and found the slowdown to be
smaller after the change made in build_simple_rel()

Test 1 = explain select 1;

Unpatched:
$ pgbench -n -f bench.sql -T 60 postgres
tps = 30259.096585 (excluding connections establishing)
tps = 30094.533610 (excluding connections establishing)
tps = 30124.154255 (excluding connections establishing)

Patched:
tps = 29667.414788 (excluding connections establishing)
tps = 29555.325522 (excluding connections establishing)
tps = 29101.083145 (excluding connections establishing)

(2.38% down)

Test 2 = select 1;

Unpatched:
tps = 36535.991023 (excluding connections establishing)
tps = 36568.604011 (excluding connections establishing)
tps = 35938.923066 (excluding connections establishing)

Patched:
tps = 35187.363260 (excluding connections establishing)
tps = 35166.993210 (excluding connections establishing)
tps = 35436.486315 (excluding connections establishing)

(2.98% down)

As far as I can see the patch is ready to go, but I'll defer to Mark,
who's also listed on the reviewer list for this patch.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-14 Thread Tom Lane
David Rowley  writes:
> I ran a few benchmarks on an AWS m5d.large instance based on top of
> c5c7fa261f5. The biggest regression I see is from a simple SELECT 1 at
> around 5-6%. A repeat of your test of SELECT 2+2 showed about half
> that regression so the simple addition function call is introducing
> enough overhead to lower the slowdown percentage by a good amount.

I can reproduce a small slowdown on "SELECT 1;", though for me it's
circa 2% not 5-6%.  I'm not entirely sure that's above the noise level
--- I tend to see variations of that size even from unrelated code
changes.  But to the extent that it's real, it seems like it must be
coming from one of these places:

* replace_empty_jointree adds a few pallocs for the new RTE and
jointree entry.

* After subquery_planner calls replace_empty_jointree, subsequent
places that loop over the rtable will see one entry instead of none.
They won't do much of anything with it, but it's a few more cycles.

* remove_useless_result_rtes is new code; it won't do much in this
case either, but it still has to examine the jointree.

* query_planner() does slightly more work before reaching its fast-path
exit.

None of these are exactly large costs, and it's hard to get rid of
any of them without ugly code contortions.  I experimented with
micro-optimizing the trivial case in remove_useless_result_rtes,
but it didn't seem to make much difference for "SELECT 1", and it
would add cycles uselessly in all larger queries.

I also noticed that I'd been lazy in adding RTE_RESULT support to
build_simple_rel: we can save a couple of palloc's if we give it its own
code path.  That did seem to reduce the penalty a shade, though I'm
still not sure that it's above the noise level.

> SELECT 1; I believe is a common query for some connection poolers as a
> sort of ping to the database.  In light of that, the performance drop
> of 2 microseconds per query is not going to amount to very much in
> total for that use case. i.e you'll need to do half a million pings
> before it'll cost you 1 second of additional CPU time.

Yeah, I agree this is not something to get hot & bothered over, but
I thought it was worth spending an hour seeing if there were any
easy wins.  Not much luck.

Anyway, herewith v6, rebased up to HEAD, with the build_simple_rel
improvement and the regression test fix I mentioned earlier.

regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index f177eba..96d9828 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2476,6 +2476,8 @@ JumbleRangeTable(pgssJumbleState *jstate, List *rtable)
 			case RTE_NAMEDTUPLESTORE:
 APP_JUMB_STRING(rte->enrname);
 break;
+			case RTE_RESULT:
+break;
 			default:
 elog(ERROR, "unrecognized RTE kind: %d", (int) rte->rtekind);
 break;
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bb92d9d..b3894d0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -5361,7 +5361,7 @@ INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass;
QUERY PLAN
 -
  Insert on public.ft2
-   Output: (tableoid)::regclass
+   Output: (ft2.tableoid)::regclass
Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
->  Result
  Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone, NULL::character varying, 'ft2   '::character(10), NULL::user_enum
diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index 12cb18c..cd77f99 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -437,9 +437,12 @@ ExecSupportsMarkRestore(Path *pathnode)
 return ExecSupportsMarkRestore(((ProjectionPath *) pathnode)->subpath);
 			else if (IsA(pathnode, MinMaxAggPath))
 return false;	/* childless Result */
+			else if (IsA(pathnode, GroupResultPath))
+return false;	/* childless Result */
 			else
 			{
-Assert(IsA(pathnode, ResultPath));
+/* Simple RTE_RESULT base relation */
+Assert(IsA(pathnode, Path));
 return false;	/* childless Result */
 			}
 
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 19b65f6..c3d27a0 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2325,10 +2325,6 @@ range_table_walker(List 

Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-05 Thread David Rowley
On Sat, 5 Jan 2019 at 08:48, Tom Lane  wrote:
> v5 attached; this responds to your comments plus Alexander's earlier
> gripe about not getting a clean build with --disable-cassert.
> No really substantive changes though.

I ran a few benchmarks on an AWS m5d.large instance based on top of
c5c7fa261f5. The biggest regression I see is from a simple SELECT 1 at
around 5-6%. A repeat of your test of SELECT 2+2 showed about half
that regression so the simple addition function call is introducing
enough overhead to lower the slowdown percentage by a good amount.
Test 3 improved performance a bit.

SELECT 1; I believe is a common query for some connection poolers as a
sort of ping to the database.  In light of that, the performance drop
of 2 microseconds per query is not going to amount to very much in
total for that use case. i.e you'll need to do half a million pings
before it'll cost you 1 second of additional CPU time.

Results and tests are:

Setup: create table t1 (id int primary key);

Test 1: explain select 1;

Unpatched:

$ pgbench -n -f bench1.sql -T 60 postgres
tps = 30899.599603 (excluding connections establishing)
tps = 30806.247429 (excluding connections establishing)
tps = 30330.971411 (excluding connections establishing)

Patched:

tps = 28971.551297 (excluding connections establishing)
tps = 28892.053072 (excluding connections establishing)
tps = 28881.105928 (excluding connections establishing)

(5.75% drop)

Test 2: explain select * from t1 inner join (select 1 as x) x on t1.id=x.x;

Unpatched:

$ pgbench -n -f bench2.sql -T 60 postgres
tps = 14340.027655 (excluding connections establishing)
tps = 14392.871399 (excluding connections establishing)
tps = 14335.615020 (excluding connections establishing)

Patched:
tps = 14269.714239 (excluding connections establishing)
tps = 14305.901601 (excluding connections establishing)
tps = 14261.319313 (excluding connections establishing)

(0.54% drop)

Test 3: explain select * from t1 left join (select 1 as x) x on t1.id=x.x;

Unpatched:

$ pgbench -n -f bench3.sql -T 60 postgres
tps = 11404.769545 (excluding connections establishing)
tps = 11477.229511 (excluding connections establishing)
tps = 11365.426342 (excluding connections establishing)

Patched:
tps = 11624.081759 (excluding connections establishing)
tps = 11649.150950 (excluding connections establishing)
tps = 11571.724571 (excluding connections establishing)

(1.74% gain)

Test 4: explain select * from t1 inner join (select * from t1) t2 on
t1.id=t2.id;

Unpatched:
$ pgbench -n -f bench4.sql -T 60 postgres
tps = 9966.796818 (excluding connections establishing)
tps = 9887.775388 (excluding connections establishing)
tps = 9906.681296 (excluding connections establishing)

Patched:
tps = 9845.451081 (excluding connections establishing)
tps = 9936.377521 (excluding connections establishing)
tps = 9915.724816 (excluding connections establishing)

(0.21% drop)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-04 Thread Tom Lane
I wrote:
> If you reverse out just that change you'll see why I added it: without it,
> the plan for the earlier "test a corner case in which we shouldn't apply
> the star-schema optimization" isn't optimized as much as I thought it
> should be.

Hmm ... looking at this closer, it seems like this patch probably breaks
what that regression test case was actually meant to test, ie once we've
deleted the VALUES subselects from the jointree, it's likely that the
planner join-ordering mistake that was testing for can no longer happen,
because the join just plain doesn't exist.

I'll plan to deal with that by running the test case with actual small
tables instead of VALUES subselects.  It might be useful to run the test
case in its current form as an exercise for the RTE_RESULT optimizations,
but that would be separate from its current intention.

regards, tom lane



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-04 Thread Tom Lane
David Rowley  writes:
> I've just looked over the v4 patch. I agree that having an RTE for a
> from-less SELECT seems like a good idea.

Thanks for looking!

> While reading the patch, I noted the following:
> 1. It's more efficient to use bms_next_member as it does not need to
> re-skip 0 words on each iteration. (Likely bms_first_member() is not
> needed anywhere in the code base)

Sure.  As the comment says, this isn't meant to be super efficient for
multiple removable RTEs, but we might as well use the other API.

> 2. The following comment seems to indicate that we can go ahead and
> make parallelise the result processing, but the code still defers to
> the checks below and may still end up not parallelising if say,
> there's a non-parallel safe function call in the SELECT's target list.

Adjusted the comment.

> 3. You may as well just ditch the variable and just do:
> Assert(rel->relid > 0);
> Assert(planner_rt_fetch(rel->relid, root)->rtekind == RTE_RESULT);
> instead of:
> RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY;
> There are a few other cases doing just that in costsize.c

Meh.  I'm not really on board with doing it that way, as it'll just
mean more to change if the code is ever modified to look at other
fields of the RTE.  Still, I see your point that other places in
costsize.c are doing it without PG_USED_FOR_ASSERTS_ONLY, so changed.

> 4. I don't quite understand why this changed in join.out

> @@ -3596,7 +3588,7 @@ select t1.* from
>> Hash Right Join
>   Output: i8.q2
>   Hash Cond: ((NULL::integer) = i8b1.q2)
> - ->  Hash Left Join
> + ->  Hash Join

> Can you explain why that's valid?

Excellent question.  The reason that plan changed is the logic I added
in find_nonnullable_rels_walker:

+ * If the PHV's syntactic scope is exactly one rel, it will be forced
+ * to be evaluated at that rel, and so it will behave like a Var of
+ * that rel: if the rel's entire output goes to null, so will the PHV.

In this case there's a PHV wrapped around b2.d2, and this change allows
reduce_outer_joins_pass2 to detect that the i8-to-b2 left join can
be simplified to an inner join because the qual just above it (on the
left join of b1 to i8/b2) is strict for b2; that is, the condition
(b2.d2 = b1.q2) cannot succeed for a null-extended i8/b2 row.

If you reverse out just that change you'll see why I added it: without it,
the plan for the earlier "test a corner case in which we shouldn't apply
the star-schema optimization" isn't optimized as much as I thought it
should be.

v5 attached; this responds to your comments plus Alexander's earlier
gripe about not getting a clean build with --disable-cassert.
No really substantive changes though.

regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index e8ef966..6696f92 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2404,6 +2404,8 @@ JumbleRangeTable(pgssJumbleState *jstate, List *rtable)
 			case RTE_NAMEDTUPLESTORE:
 APP_JUMB_STRING(rte->enrname);
 break;
+			case RTE_RESULT:
+break;
 			default:
 elog(ERROR, "unrecognized RTE kind: %d", (int) rte->rtekind);
 break;
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bb92d9d..b3894d0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -5361,7 +5361,7 @@ INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass;
QUERY PLAN
 -
  Insert on public.ft2
-   Output: (tableoid)::regclass
+   Output: (ft2.tableoid)::regclass
Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
->  Result
  Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone, NULL::character varying, 'ft2   '::character(10), NULL::user_enum
diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index 12cb18c..cd77f99 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -437,9 +437,12 @@ ExecSupportsMarkRestore(Path *pathnode)
 return ExecSupportsMarkRestore(((ProjectionPath *) pathnode)->subpath);
 			else if (IsA(pathnode, MinMaxAggPath))
 return false;	/* childless Result */
+			else if (IsA(pathnode, GroupResultPath))
+return 

Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-03 Thread David Rowley
I've just looked over the v4 patch. I agree that having an RTE for a
from-less SELECT seems like a good idea.

While reading the patch, I noted the following:

1. It's more efficient to use bms_next_member as it does not need to
re-skip 0 words on each iteration. (Likely bms_first_member() is not
needed anywhere in the code base)

int varno;

while ((varno = bms_first_member(result_relids)) >= 0)
remove_result_refs(root, varno, (Node *) f);

can also make the loop condition > 0, rather than  >= 0 to save the
final futile attempt at finding a value that'll never be there.

2. The following comment seems to indicate that we can go ahead and
make parallelise the result processing, but the code still defers to
the checks below and may still end up not parallelising if say,
there's a non-parallel safe function call in the SELECT's target list.

case RTE_RESULT:
/* Sure, execute it in a worker if you want. */
break;

3. You may as well just ditch the variable and just do:

Assert(rel->relid > 0);
Assert(planner_rt_fetch(rel->relid, root)->rtekind == RTE_RESULT);

instead of:

RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY;

/* Should only be applied to RTE_RESULT base relations */
Assert(rel->relid > 0);
rte = planner_rt_fetch(rel->relid, root);
Assert(rte->rtekind == RTE_RESULT);

There are a few other cases doing just that in costsize.c

4. I don't quite understand why this changed in join.out

@@ -3596,7 +3588,7 @@ select t1.* from
>  Hash Right Join
  Output: i8.q2
  Hash Cond: ((NULL::integer) = i8b1.q2)
- ->  Hash Left Join
+ ->  Hash Join

Can you explain why that's valid?  I understand this normally occurs
when a qual exists which would filter out the NULL rows produced by
the join, but I don't see that in this case.

--
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-11-30 Thread Alexander Kuzmenkov

On 11/29/18 22:13, Tom Lane wrote:

Ooops, I had not seen this before sending v4 patch.  Doesn't seem worth
posting a v5 for, but I'll be sure to fix it.



Thanks for updating, v4 looks good to me.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-11-29 Thread Tom Lane
Alexander Kuzmenkov  writes:
> Oh, one more thing: I see a warning without --enable-cassert in 
> create_resultscan_plan, because rte is only used in an Assert.
> src/backend/optimizer/plan/createplan.c:3457:17: warning: variable ‘rte’ 
> set but not used [-Wunused-but-set-variable]
>    RangeTblEntry *rte;

Ooops, I had not seen this before sending v4 patch.  Doesn't seem worth
posting a v5 for, but I'll be sure to fix it.

regards, tom lane



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-11-29 Thread Tom Lane
Alexander Kuzmenkov  writes:
> I was also looking at this patch, here are some things I noticed:

Thanks for reviewing!  I incorporated your suggestions in the v4
patch I just sent.

regards, tom lane



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-11-29 Thread Tom Lane
Mark Dilger  writes:
> Patch applies cleanly on master (b238527664ec6f6c9d00dba4cc2f3dab1c8b8b04), 
> compiles, and passes both 'make check-world' and 'make installcheck-world'.

Thanks for reviewing!

> The patch includes changes to the expected output of a few tests, and adds 
> new code comments and changes existing code comments, but I did not notice 
> any new tests or new documentation to specifically test or explain the 
> behavioral change this patch is intended to introduce.  None of the code 
> comments seem to adequately explain what an RTE_RESULT is and when it would 
> be used.  This information can be gleaned with some difficulty by reading 
> every file containing RTE_RESULT, but that seems rather unfriendly.

Well, there's no user-facing documentation because it's not a user-facing
feature; it just exists to make some things simpler inside the planner.

I will admit that the comment for RTE_RESULT in parsenodes.h is a bit
vague; that's mostly because when I started on this patch, it wasn't clear
to me whether or not to have RTE_RESULT present from the beginning (i.e.
have the parser create one) or let the planner inject them.  I ended up
doing the latter, so the attached update of the patch now says

@@ -950,7 +950,10 @@ typedef enum RTEKind
RTE_TABLEFUNC,  /* TableFunc(.., column list) */
RTE_VALUES, /* VALUES (), (), ... */
RTE_CTE,/* common table expr (WITH list element) */
-   RTE_NAMEDTUPLESTORE /* tuplestore, e.g. for AFTER triggers */
+   RTE_NAMEDTUPLESTORE,/* tuplestore, e.g. for AFTER triggers */
+   RTE_RESULT  /* RTE represents an empty FROM clause; such
+* RTEs are added by the planner, they're not
+* present during parsing or rewriting */
 } RTEKind;
 
 typedef struct RangeTblEntry

I'm not sure if that's enough to address your concern or not.  But none
of the other RTEKinds are documented much more than this, either ...

> As an example of where I could use a bit more documentation, see
> src/backend/rewrite/rewriteHandler.c circa line 447; I don't know why
> the switch statement lacks a case for RTE_RESULT.  Why would RTE_VALUES
> contain bare expressions but RTE_RESULT would not?

Well, it just doesn't.  The comments in struct RangeTblEntry are pretty
clear about which fields apply to which RTE kinds, and none of the ones
containing subexpressions are valid for RTE_RESULT.  As to *why* it's
like that, it's because an empty FROM clause doesn't produce any columns,
by definition.

> Does this mean that
>   INSERT INTO mytable VALUES ('foo', 'bar');
> differs from 
>   SELECT 'foo', 'bar';
> in terms of whether 'foo' and 'bar' are bare expressions?

Well, it does if there are multiple rows of VALUES items; look at the
parser's transformInsertStmt, which does things differently for a
single-row VALUES than multiple rows.  We only create a VALUES RTE
for the multi-rows case, for which "SELECT 'foo', 'bar'" doesn't work.

Attached is an updated patch that responds to your comments and
Alexander's, and also adds a test case for EvalPlanQual involving
a RTE_RESULT RTE, because I got worried about whether that really
worked.

regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 33f9a79..efa5596 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2404,6 +2404,8 @@ JumbleRangeTable(pgssJumbleState *jstate, List *rtable)
 			case RTE_NAMEDTUPLESTORE:
 APP_JUMB_STRING(rte->enrname);
 break;
+			case RTE_RESULT:
+break;
 			default:
 elog(ERROR, "unrecognized RTE kind: %d", (int) rte->rtekind);
 break;
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e653c30..fda95fc 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -5351,7 +5351,7 @@ INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass;
QUERY PLAN
 -
  Insert on public.ft2
-   Output: (tableoid)::regclass
+   Output: (ft2.tableoid)::regclass
Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
->  Result
  Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone, NULL::character varying, 'ft2   '::character(10), NULL::user_enum
diff 

Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-11-29 Thread Alexander Kuzmenkov
Oh, one more thing: I see a warning without --enable-cassert in 
create_resultscan_plan, because rte is only used in an Assert.


src/backend/optimizer/plan/createplan.c:3457:17: warning: variable ‘rte’ 
set but not used [-Wunused-but-set-variable]

  RangeTblEntry *rte;

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-11-29 Thread Alexander Kuzmenkov

I was also looking at this patch, here are some things I noticed:

In remove_useless_results_recurse where it processes JOIN_SEMI there is 
this comment:


             * However, we can't simplify if there are PHVs to 
evaluate at

                 * the RTE_RESULT ... but that's impossible isn't it?

Is that impossible because the RHS of semi join can't be used above it? 
Let's write this down. There is similar code above for JOIN_LEFT and it 
does have to check for PHVs, so a comment that clarifies the reasons for 
the difference would help.



Also around there:

                if ((varno = is_result_ref(root, j->rarg)) != 0 &&

I'd expect a function that starts with "is_" to return a bool, so this 
catches the eye. Maybe varno = get_result_relid()?



Looking at the coverage report of regression tests, most of the new code 
is covered except for the aforementioned simplification of JOIN_LEFT and 
JOIN_RIGHT, but it's probably not worth adding a special test. I checked 
these cases manually and they work OK.



I also repeated the benchmark with trivial select and can confirm that 
there is no change in performance.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-11-28 Thread Mark Dilger
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Sorry about the prior review; I neglected to select all the appropriate 
buttons, leading to an errant "tested, failed" in the review.

Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-11-28 Thread Mark Dilger
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Patch applies cleanly on master (b238527664ec6f6c9d00dba4cc2f3dab1c8b8b04), 
compiles, and passes both 'make check-world' and 'make installcheck-world'.

The patch includes changes to the expected output of a few tests, and adds new 
code comments and changes existing code comments, but I did not notice any new 
tests or new documentation to specifically test or explain the behavioral 
change this patch is intended to introduce.  None of the code comments seem to 
adequately explain what an RTE_RESULT is and when it would be used.  This 
information can be gleaned with some difficulty by reading every file 
containing RTE_RESULT, but that seems rather unfriendly.

As an example of where I could use a bit more documentation, see 
src/backend/rewrite/rewriteHandler.c circa line 447; I don't know why the 
switch statement lacks a case for RTE_RESULT.  Why would RTE_VALUES contain 
bare expressions but RTE_RESULT would not?  Does this mean that

  INSERT INTO mytable VALUES ('foo', 'bar');

differs from 

  SELECT 'foo', 'bar';

in terms of whether 'foo' and 'bar' are bare expressions?  Admittedly, I don't 
know this code very well, and this might be obvious to others.

Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-10-14 Thread Tom Lane
I wrote:
> * There's a hack in nodeResult.c to prevent the executor from crashing
> on a whole-row Var for an RTE_RESULT RTE, which is something that the
> planner will create in SELECT FOR UPDATE cases, because it thinks it
> needs to provide a ROW_MARK_COPY image of the RTE's output.  We might
> be able to get rid of that if we could teach the planner that it need
> not bother rowmarking RESULT RTEs, but that seemed like it would be
> really messy.  (At the point where the decision is made, we don't know
> whether a subquery might end up as just a RESULT, or indeed vanish
> entirely.)  Since I couldn't measure any reproducible penalty from
> having the extra setup cost for a Result plan, I left it like this.

Well, I'd barely sent this when I realized that there was a better way.
The nodeResult.c hack predates my decision to postpone cleaning up
RTE_RESULT RTEs till near the end of the preprocessing phase, and
given that code, it is easy to get rid of rowmarking RESULT RTEs ...
in fact, the code was doing it already, except in the edge case of
a SELECT with only a RESULT RTE.  So here's a version that does not
touch nodeResult.c.

regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 33f9a79..efa5596 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2404,6 +2404,8 @@ JumbleRangeTable(pgssJumbleState *jstate, List *rtable)
 			case RTE_NAMEDTUPLESTORE:
 APP_JUMB_STRING(rte->enrname);
 break;
+			case RTE_RESULT:
+break;
 			default:
 elog(ERROR, "unrecognized RTE kind: %d", (int) rte->rtekind);
 break;
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 21a2ef5..ac3722a 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -5374,7 +5374,7 @@ INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass;
QUERY PLAN
 -
  Insert on public.ft2
-   Output: (tableoid)::regclass
+   Output: (ft2.tableoid)::regclass
Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
->  Result
  Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone, NULL::character varying, 'ft2   '::character(10), NULL::user_enum
diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index 9e78421..8f7d4ba 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -437,9 +437,12 @@ ExecSupportsMarkRestore(Path *pathnode)
 return ExecSupportsMarkRestore(((ProjectionPath *) pathnode)->subpath);
 			else if (IsA(pathnode, MinMaxAggPath))
 return false;	/* childless Result */
+			else if (IsA(pathnode, GroupResultPath))
+return false;	/* childless Result */
 			else
 			{
-Assert(IsA(pathnode, ResultPath));
+/* Simple RTE_RESULT base relation */
+Assert(IsA(pathnode, Path));
 return false;	/* childless Result */
 			}
 
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index a10014f..ecaeeb3 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2319,10 +2319,6 @@ range_table_walker(List *rtable,
 if (walker(rte->tablesample, context))
 	return true;
 break;
-			case RTE_CTE:
-			case RTE_NAMEDTUPLESTORE:
-/* nothing to do */
-break;
 			case RTE_SUBQUERY:
 if (!(flags & QTW_IGNORE_RT_SUBQUERIES))
 	if (walker(rte->subquery, context))
@@ -2345,6 +2341,11 @@ range_table_walker(List *rtable,
 if (walker(rte->values_lists, context))
 	return true;
 break;
+			case RTE_CTE:
+			case RTE_NAMEDTUPLESTORE:
+			case RTE_RESULT:
+/* nothing to do */
+break;
 		}
 
 		if (walker(rte->securityQuals, context))
@@ -3150,10 +3151,6 @@ range_table_mutator(List *rtable,
 	   TableSampleClause *);
 /* we don't bother to copy eref, aliases, etc; OK? */
 break;
-			case RTE_CTE:
-			case RTE_NAMEDTUPLESTORE:
-/* nothing to do */
-break;
 			case RTE_SUBQUERY:
 if (!(flags & QTW_IGNORE_RT_SUBQUERIES))
 {
@@ -3184,6 +3181,11 @@ range_table_mutator(List *rtable,
 			case RTE_VALUES:
 MUTATE(newrte->values_lists, rte->values_lists, List *);
 break;
+			case RTE_CTE:
+			case RTE_NAMEDTUPLESTORE:
+			case RTE_RESULT:
+/* nothing to do */
+break;
 		}
 		MUTATE(newrte->securityQuals, rte->securityQuals, List *);
 		newrt 

Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-10-14 Thread Thomas Munro
On Fri, Mar 16, 2018 at 4:27 AM Tom Lane  wrote:
> We've long made fun of Oracle(TM) for the fact that if you just want
> to evaluate some expressions, you have to write "select ... from dual"
> rather than just "select ...".  But I've realized recently that there's
> a bit of method in that madness after all.

We can still make fun of that table.  Apparently it had two rows, so
you could double rows by cross joining against it, but at some point
one of them went missing, leaving a strange name behind.  Source:
https://en.wikipedia.org/wiki/DUAL_table#History

-- 
Thomas Munro
http://www.enterprisedb.com



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-10-14 Thread Tom Lane
Robert Haas  writes:
> On Thu, Mar 15, 2018 at 11:27 AM, Tom Lane  wrote:
>> We've long made fun of Oracle(TM) for the fact that if you just want
>> to evaluate some expressions, you have to write "select ... from dual"
>> rather than just "select ...".  But I've realized recently that there's
>> a bit of method in that madness after all.  Specifically, having to
>> cope with FromExprs that contain no base relation is fairly problematic
>> in the planner.  prepjointree.c is basically unable to cope with
>> flattening a subquery that looks that way, although we've inserted a
>> lot of overly-baroque logic to handle some subsets of the case (cf
>> is_simple_subquery(), around line 1500).  If memory serves, there are
>> other places that are complicated by the case.
>> 
>> Suppose that, either in the rewriter or early in the planner, we were
>> to replace such cases with nonempty FromExprs, by adding a dummy RTE
>> representing a table with no columns and one row.  This would in turn
>> give rise to an ordinary Path that converts to a Result plan, so that
>> the case is handled without any special contortions later.  Then there
>> is no case where we don't have a nonempty relids set identifying a
>> subquery, so that all that special-case hackery in prepjointree.c
>> goes away, and we can simplify whatever else is having a hard time
>> with it.

> Good idea.

Here's a proposed patch along those lines.  Some notes for review:

* The new RTE kind is "RTE_RESULT" after the kind of Plan node it will
produce.  I'm not entirely in love with that name, but couldn't think
of a better idea.

* I renamed the existing ResultPath node type to GroupResultPath to
clarify that it's not used to scan RTE_RESULT RTEs, but just for
degenerate grouping cases.  RTE_RESULT RTEs (usually) give rise to plain
Path nodes with pathtype T_Result.  It doesn't work very well to try to
unify those two cases, even though they give rise to identical Plans,
because there are different rules about where the associated quals live
during query_planner.

* The interesting changes are in prepjointree.c; almost all the rest
of the patch is boilerplate to support RTE_RESULT RTEs in mostly the
same way that other special RTE-scan plan types are handled in the
planner.  In prepjointree.c, I ended up getting rid of the original
decision to try to delete removable RTEs during pull_up_subqueries,
and instead made it happen in a separate traversal of the join tree.
That's a lot less complicated, and it has better results because we
can optimize more cases once we've seen the results of expression
preprocessing and outer-join strength reduction.

* I tried to get rid of the empty-jointree special case in query_planner
altogether.  While the patch works fine that way, it makes for a
measurable slowdown in planning trivial queries --- I saw close to 10%
degradation in TPS rate for a pgbench test case that was just 
$ cat trivialselect.sql 
select 2+2;
$ pgbench -n -T 10 -f trivialselect.sql
So I ended up putting back the special case, but it's much less of a
cheat than it was before; the RelOptInfo it hands back is basically the
same as the normal path would produce.  For me, the patch as given is
within the noise level of being the same speed as HEAD on this case.

* There's a hack in nodeResult.c to prevent the executor from crashing
on a whole-row Var for an RTE_RESULT RTE, which is something that the
planner will create in SELECT FOR UPDATE cases, because it thinks it
needs to provide a ROW_MARK_COPY image of the RTE's output.  We might
be able to get rid of that if we could teach the planner that it need
not bother rowmarking RESULT RTEs, but that seemed like it would be
really messy.  (At the point where the decision is made, we don't know
whether a subquery might end up as just a RESULT, or indeed vanish
entirely.)  Since I couldn't measure any reproducible penalty from
having the extra setup cost for a Result plan, I left it like this.

* There are several existing test cases in join.sql whose plans change
for the better with this patch, so I didn't really think we needed any
additional test cases to show that it's working.

* There are a couple of cosmetic changes in EXPLAIN output as a result
of ruleutils.c seeing more RTEs in the plan's rtable than it did before,
causing it to decide to table-qualify Var names in more cases.  We could
maybe spend more effort on ruleutils.c's heuristic for when to qualify,
but that seems like a separable problem, and anyway it's only cosmetic.

I'll add this to the next CF.

regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 33f9a79..efa5596 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2404,6 +2404,8 @@ JumbleRangeTable(pgssJumbleState *jstate, List *rtable)
 			case RTE_NAMEDTUPLESTORE:
 

Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-03-16 Thread Tom Lane
Ashutosh Bapat  writes:
> On Thu, Mar 15, 2018 at 8:57 PM, Tom Lane  wrote:
>> Suppose that, either in the rewriter or early in the planner, we were
>> to replace such cases with nonempty FromExprs, by adding a dummy RTE
>> representing a table with no columns and one row.  This would in turn
>> give rise to an ordinary Path that converts to a Result plan, so that
>> the case is handled without any special contortions later.

> Since table in the dummy FROM clause returns one row without any
> column, I guess, there will be at least one row in the output. I am
> curious how would we handle cases which do not return any row
> like

> create function set_ret_func() returns setof record as $$select * from
> pg_class where oid = 0;$$ language sql;
> select set_ret_func();
>  set_ret_func
> --
> (0 rows)

It'd be the same as now, so far as the executor is concerned:

regression=# explain select set_ret_func();
QUERY PLAN
--
 ProjectSet  (cost=0.00..5.27 rows=1000 width=32)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
(2 rows)

The difference is that, within the planner, the ResultPath would be
associated with a "real" base relation instead of needing its very
own code path in query_planner().

regards, tom lane



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-03-16 Thread Ashutosh Bapat
On Thu, Mar 15, 2018 at 8:57 PM, Tom Lane  wrote:
> We've long made fun of Oracle(TM) for the fact that if you just want
> to evaluate some expressions, you have to write "select ... from dual"
> rather than just "select ...".  But I've realized recently that there's
> a bit of method in that madness after all.  Specifically, having to
> cope with FromExprs that contain no base relation is fairly problematic
> in the planner.  prepjointree.c is basically unable to cope with
> flattening a subquery that looks that way, although we've inserted a
> lot of overly-baroque logic to handle some subsets of the case (cf
> is_simple_subquery(), around line 1500).  If memory serves, there are
> other places that are complicated by the case.
>
> Suppose that, either in the rewriter or early in the planner, we were
> to replace such cases with nonempty FromExprs, by adding a dummy RTE
> representing a table with no columns and one row.  This would in turn
> give rise to an ordinary Path that converts to a Result plan, so that
> the case is handled without any special contortions later.  Then there
> is no case where we don't have a nonempty relids set identifying a
> subquery, so that all that special-case hackery in prepjointree.c
> goes away, and we can simplify whatever else is having a hard time
> with it.
>

The idea looks neat.

Since table in the dummy FROM clause returns one row without any
column, I guess, there will be at least one row in the output. I am
curious how would we handle cases which do not return any row
like

create function set_ret_func() returns setof record as $$select * from
pg_class where oid = 0;$$ language sql;
select set_ret_func();
 set_ret_func
--
(0 rows)

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2018-03-15 Thread Robert Haas
On Thu, Mar 15, 2018 at 11:27 AM, Tom Lane  wrote:
> We've long made fun of Oracle(TM) for the fact that if you just want
> to evaluate some expressions, you have to write "select ... from dual"
> rather than just "select ...".  But I've realized recently that there's
> a bit of method in that madness after all.  Specifically, having to
> cope with FromExprs that contain no base relation is fairly problematic
> in the planner.  prepjointree.c is basically unable to cope with
> flattening a subquery that looks that way, although we've inserted a
> lot of overly-baroque logic to handle some subsets of the case (cf
> is_simple_subquery(), around line 1500).  If memory serves, there are
> other places that are complicated by the case.
>
> Suppose that, either in the rewriter or early in the planner, we were
> to replace such cases with nonempty FromExprs, by adding a dummy RTE
> representing a table with no columns and one row.  This would in turn
> give rise to an ordinary Path that converts to a Result plan, so that
> the case is handled without any special contortions later.  Then there
> is no case where we don't have a nonempty relids set identifying a
> subquery, so that all that special-case hackery in prepjointree.c
> goes away, and we can simplify whatever else is having a hard time
> with it.

Good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company