Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-21 Thread Robert Haas
On Thu, Feb 18, 2016 at 4:52 PM, Ashutosh Bapat
 wrote:
> If the list in the joining relation changes (may be because we appended
> parameterized conditions), we would be breaking links on all the upper
> relations in the join tree in an unpredictable manner. The problem may not
> show up now, but it's an avenue for unrecognizable bugs. So, it's safer to
> copy the lists in the state that we want them.

Agreed.  The lists figure to be short, so copying them shouldn't be
very expensive, and it's better to do that in all cases than to leave
shared-substructure hazards around for future patch authors to worry
about.

Committed Ashutosh's version of the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-18 Thread Ashutosh Bapat
On Thu, Feb 18, 2016 at 3:48 PM, Etsuro Fujita 
wrote:

> On 2016/02/16 16:40, Etsuro Fujita wrote:
>
>> On 2016/02/16 16:02, Ashutosh Bapat wrote:
>>
>>> On Tue, Feb 16, 2016 at 12:26 PM, Etsuro Fujita
>>> >
>>> wrote:
>>>
>>
> On 2016/02/16 15:22, Ashutosh Bapat wrote:
>>>
>>
> During join planning, the planner tries multiple combinations of
>>> joining
>>> relations, thus the same base or join relation can be part of
>>> multiple
>>> of combination. Hence remote_conds or joinclauses will get linked
>>> multiple times as they are bidirectional lists, thus breaking
>>> linkages
>>> of previous join combinations tried. E.g. while planning A join
>>> B join C
>>> join D planner will come up with combinations like A(B(CD)) or
>>> (AB)(CD)
>>> or ((AB)C)D etc. and remote_conds from A will first be linked
>>> into
>>> A(B(CD)), then AB breaking the first linkages.
>>>
>>
> Exactly, but I don't think that that needs to be considered because
>>> we have this at the beginning of postgresGetGForeignJoinPaths:
>>>
>>>  /*
>>>   * Skip if this join combination has been considered already.
>>>   */
>>>  if (joinrel->fdw_private)
>>>  return;
>>>
>>
> There will be different joinrels for A(B(CD)) and (AB) where A's
>>> remote_conds need to be pulled up.
>>>
>>
> Agreed.
>>
>
> The check you have mentioned above
>>> only protects us from adding paths multiple times to (AB) when we
>>> encounter it for (AB)(CD) and ((AB)C)D.
>>>
>>
> Sorry, I don't understand this fully.
>>
>
> Another thing I don't really understand is why list_copy is needed in the
> second list_concat for the case of INNER/FULL JOIN or in both list_concats
> for the case of LEFT/RIGHT JOIN, in your patch.  Since list_concat is
> nondestructive of its second argument, I don't think list_copy is needed in
> any such list_concat.  Maybe I'm missing something, though.
>

If the list in the joining relation changes (may be because we appended
parameterized conditions), we would be breaking links on all the upper
relations in the join tree in an unpredictable manner. The problem may not
show up now, but it's an avenue for unrecognizable bugs. So, it's safer to
copy the lists in the state that we want them.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-18 Thread Etsuro Fujita

On 2016/02/16 16:40, Etsuro Fujita wrote:

On 2016/02/16 16:02, Ashutosh Bapat wrote:

On Tue, Feb 16, 2016 at 12:26 PM, Etsuro Fujita
> wrote:



On 2016/02/16 15:22, Ashutosh Bapat wrote:



During join planning, the planner tries multiple combinations of
joining
relations, thus the same base or join relation can be part of
multiple
of combination. Hence remote_conds or joinclauses will get linked
multiple times as they are bidirectional lists, thus breaking
linkages
of previous join combinations tried. E.g. while planning A join
B join C
join D planner will come up with combinations like A(B(CD)) or
(AB)(CD)
or ((AB)C)D etc. and remote_conds from A will first be linked
into
A(B(CD)), then AB breaking the first linkages.



Exactly, but I don't think that that needs to be considered because
we have this at the beginning of postgresGetGForeignJoinPaths:

 /*
  * Skip if this join combination has been considered already.
  */
 if (joinrel->fdw_private)
 return;



There will be different joinrels for A(B(CD)) and (AB) where A's
remote_conds need to be pulled up.



Agreed.



The check you have mentioned above
only protects us from adding paths multiple times to (AB) when we
encounter it for (AB)(CD) and ((AB)C)D.



Sorry, I don't understand this fully.


Another thing I don't really understand is why list_copy is needed in 
the second list_concat for the case of INNER/FULL JOIN or in both 
list_concats for the case of LEFT/RIGHT JOIN, in your patch.  Since 
list_concat is nondestructive of its second argument, I don't think 
list_copy is needed in any such list_concat.  Maybe I'm missing 
something, though.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-15 Thread Etsuro Fujita

On 2016/02/16 16:02, Ashutosh Bapat wrote:

On Tue, Feb 16, 2016 at 12:26 PM, Etsuro Fujita
> wrote:



On 2016/02/16 15:22, Ashutosh Bapat wrote:



During join planning, the planner tries multiple combinations of
joining
relations, thus the same base or join relation can be part of
multiple
of combination. Hence remote_conds or joinclauses will get linked
multiple times as they are bidirectional lists, thus breaking
linkages
of previous join combinations tried. E.g. while planning A join
B join C
join D planner will come up with combinations like A(B(CD)) or
(AB)(CD)
or ((AB)C)D etc. and remote_conds from A will first be linked into
A(B(CD)), then AB breaking the first linkages.



Exactly, but I don't think that that needs to be considered because
we have this at the beginning of postgresGetGForeignJoinPaths:

 /*
  * Skip if this join combination has been considered already.
  */
 if (joinrel->fdw_private)
 return;



There will be different joinrels for A(B(CD)) and (AB) where A's
remote_conds need to be pulled up.


Agreed.


The check you have mentioned above
only protects us from adding paths multiple times to (AB) when we
encounter it for (AB)(CD) and ((AB)C)D.


Sorry, I don't understand this fully.

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-15 Thread Ashutosh Bapat
On Tue, Feb 16, 2016 at 12:26 PM, Etsuro Fujita  wrote:

> On 2016/02/16 15:22, Ashutosh Bapat wrote:
>
>> During join planning, the planner tries multiple combinations of joining
>> relations, thus the same base or join relation can be part of multiple
>> of combination. Hence remote_conds or joinclauses will get linked
>> multiple times as they are bidirectional lists, thus breaking linkages
>> of previous join combinations tried. E.g. while planning A join B join C
>> join D planner will come up with combinations like A(B(CD)) or (AB)(CD)
>> or ((AB)C)D etc. and remote_conds from A will first be linked into
>> A(B(CD)), then AB breaking the first linkages.
>>
>
> Exactly, but I don't think that that needs to be considered because we
> have this at the beginning of postgresGetGForeignJoinPaths:
>
> /*
>  * Skip if this join combination has been considered already.
>  */
> if (joinrel->fdw_private)
> return;
>
>
There will be different joinrels for A(B(CD)) and (AB) where A's
remote_conds need to be pulled up. The check you have mentioned above only
protects us from adding paths multiple times to (AB) when we encounter it
for (AB)(CD) and ((AB)C)D.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-15 Thread Etsuro Fujita

On 2016/02/16 15:22, Ashutosh Bapat wrote:

During join planning, the planner tries multiple combinations of joining
relations, thus the same base or join relation can be part of multiple
of combination. Hence remote_conds or joinclauses will get linked
multiple times as they are bidirectional lists, thus breaking linkages
of previous join combinations tried. E.g. while planning A join B join C
join D planner will come up with combinations like A(B(CD)) or (AB)(CD)
or ((AB)C)D etc. and remote_conds from A will first be linked into
A(B(CD)), then AB breaking the first linkages.


Exactly, but I don't think that that needs to be considered because we 
have this at the beginning of postgresGetGForeignJoinPaths:


/*
 * Skip if this join combination has been considered already.
 */
if (joinrel->fdw_private)
return;

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-15 Thread Etsuro Fujita

On 2016/02/15 21:33, Ashutosh Bapat wrote:

Here's patch with better way to fix it. I think while concatenating the
lists, we need to copy the lists being appended and in all the cases. If
we don't copy, a change in those lists can cause changes in the upward
linkages and thus lists of any higher level joins.


Maybe I'm missing something, but I don't understand why such a change in 
those lists happens.  Could you explain about that in more detail?


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-15 Thread Ashutosh Bapat
During join planning, the planner tries multiple combinations of joining
relations, thus the same base or join relation can be part of multiple of
combination. Hence remote_conds or joinclauses will get linked multiple
times as they are bidirectional lists, thus breaking linkages of previous
join combinations tried. E.g. while planning A join B join C join D planner
will come up with combinations like A(B(CD)) or (AB)(CD) or ((AB)C)D etc.
and remote_conds from A will first be linked into A(B(CD)), then AB
breaking the first linkages.

On Tue, Feb 16, 2016 at 11:36 AM, Etsuro Fujita  wrote:

> On 2016/02/15 21:33, Ashutosh Bapat wrote:
>
>> Here's patch with better way to fix it. I think while concatenating the
>> lists, we need to copy the lists being appended and in all the cases. If
>> we don't copy, a change in those lists can cause changes in the upward
>> linkages and thus lists of any higher level joins.
>>
>
> Maybe I'm missing something, but I don't understand why such a change in
> those lists happens.  Could you explain about that in more detail?
>
> Best regards,
> Etsuro Fujita
>
>
>


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-15 Thread Ashutosh Bapat
Thanks Fujita-san for bug report and the fix. Sorry for bug.

Here's patch with better way to fix it. I think while concatenating the
lists, we need to copy the lists being appended and in all the cases. If we
don't copy, a change in those lists can cause changes in the upward
linkages and thus lists of any higher level joins.

On Mon, Feb 15, 2016 at 1:10 PM, Etsuro Fujita 
wrote:

> On 2016/02/10 4:16, Robert Haas wrote:
>
>> On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
>>  wrote:
>>
>>> Thanks Jeevan for your review and comments. PFA the patch which fixes
>>> those.
>>>
>>
> Committed with a couple more small adjustments.
>>
>
> Thanks for working on this, Robert, Ashutosh, and everyone involved!
>
> I happened to notice that this code in foreign_join_ok():
>
> switch (jointype)
> {
> case JOIN_INNER:
> fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
>fpinfo_i->remote_conds);
> fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
>fpinfo_o->remote_conds);
> break;
>
> case JOIN_LEFT:
> fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
>   fpinfo_i->remote_conds);
> fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
>fpinfo_o->remote_conds);
> break;
>
> case JOIN_RIGHT:
> fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
>   fpinfo_o->remote_conds);
> fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
>fpinfo_i->remote_conds);
> break;
>
> case JOIN_FULL:
> fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
>   fpinfo_i->remote_conds);
> fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
>   fpinfo_o->remote_conds);
> break;
>
> default:
> /* Should not happen, we have just check this above */
> elog(ERROR, "unsupported join type %d", jointype);
> }
>
> would break the list fpinfo_i->remote_conds in the case of INNER JOIN or
> FULL JOIN.  You can see the list breakage from e.g., the following queries
> on an Assert-enabled build:
>
> postgres=# create extension postgres_fdw;
> CREATE EXTENSION
> postgres=# create server myserver foreign data wrapper postgres_fdw
> options (dbname 'mydatabase');
> CREATE SERVER
> postgres=# create user mapping for current_user server myserver;
> CREATE USER MAPPING
> postgres=# create foreign table foo (a int) server myserver options
> (table_name 'foo');
> CREATE FOREIGN TABLE
> postgres=# create foreign table bar (a int) server myserver options
> (table_name 'bar');
> CREATE FOREIGN TABLE
> postgres=# create foreign table baz (a int) server myserver options
> (table_name 'baz');
> CREATE FOREIGN TABLE
> postgres=# select * from foo, bar, baz where foo.a = bar.a and bar.a =
> baz.a and foo.a < 10 and bar.a < 10 and baz.a < 10;
>
> Attached is a patch to avoid the breakage.
>
> Best regards,
> Etsuro Fujita
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


foreign_join_ok_v2.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-14 Thread Etsuro Fujita

On 2016/02/10 4:16, Robert Haas wrote:

On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
 wrote:

Thanks Jeevan for your review and comments. PFA the patch which fixes those.



Committed with a couple more small adjustments.


Thanks for working on this, Robert, Ashutosh, and everyone involved!

I happened to notice that this code in foreign_join_ok():

switch (jointype)
{
case JOIN_INNER:
fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
   fpinfo_i->remote_conds);
fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
   fpinfo_o->remote_conds);
break;

case JOIN_LEFT:
fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
  fpinfo_i->remote_conds);
fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
   fpinfo_o->remote_conds);
break;

case JOIN_RIGHT:
fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
  fpinfo_o->remote_conds);
fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
   fpinfo_i->remote_conds);
break;

case JOIN_FULL:
fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
  fpinfo_i->remote_conds);
fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
  fpinfo_o->remote_conds);
break;

default:
/* Should not happen, we have just check this above */
elog(ERROR, "unsupported join type %d", jointype);
}

would break the list fpinfo_i->remote_conds in the case of INNER JOIN or 
FULL JOIN.  You can see the list breakage from e.g., the following 
queries on an Assert-enabled build:


postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# create server myserver foreign data wrapper postgres_fdw 
options (dbname 'mydatabase');

CREATE SERVER
postgres=# create user mapping for current_user server myserver;
CREATE USER MAPPING
postgres=# create foreign table foo (a int) server myserver options 
(table_name 'foo');

CREATE FOREIGN TABLE
postgres=# create foreign table bar (a int) server myserver options 
(table_name 'bar');

CREATE FOREIGN TABLE
postgres=# create foreign table baz (a int) server myserver options 
(table_name 'baz');

CREATE FOREIGN TABLE
postgres=# select * from foo, bar, baz where foo.a = bar.a and bar.a = 
baz.a and foo.a < 10 and bar.a < 10 and baz.a < 10;


Attached is a patch to avoid the breakage.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***
*** 3488,3495  foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
  		case JOIN_INNER:
  			fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
  			   fpinfo_i->remote_conds);
! 			fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
! 			   fpinfo_o->remote_conds);
  			break;
  
  		case JOIN_LEFT:
--- 3488,3496 
  		case JOIN_INNER:
  			fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
  			   fpinfo_i->remote_conds);
! 			if (fpinfo_o->remote_conds)
! fpinfo->remote_conds = list_concat(list_copy(fpinfo->remote_conds),
!    fpinfo_o->remote_conds);
  			break;
  
  		case JOIN_LEFT:
***
*** 3509,3516  foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
  		case JOIN_FULL:
  			fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
  			  fpinfo_i->remote_conds);
! 			fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
! 			  fpinfo_o->remote_conds);
  			break;
  
  		default:
--- 3510,3518 
  		case JOIN_FULL:
  			fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
  			  fpinfo_i->remote_conds);
! 			if (fpinfo_o->remote_conds)
! fpinfo->joinclauses = list_concat(list_copy(fpinfo->joinclauses),
!   fpinfo_o->remote_conds);
  			break;
  
  		default:

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-10 Thread Ashutosh Bapat
Here's patch to remove this declaration. The Assert next probably prevents
the warning for build with asserts. But both those lines are not needed.

On Wed, Feb 10, 2016 at 12:01 PM, Jeff Janes  wrote:

> On Tue, Feb 9, 2016 at 11:16 AM, Robert Haas 
> wrote:
> > On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
> >  wrote:
> >> Thanks Jeevan for your review and comments. PFA the patch which fixes
> those.
> >
> > Committed with a couple more small adjustments.
>
> I'm getting a compiler warning which I think is coming from this commit.
>
> postgres_fdw.c: In function 'fetch_more_data':
> postgres_fdw.c:2526:17: warning: unused variable 'fsplan'
> [-Wunused-variable]
> ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
>
> Thanks,
>
> Jeff
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


extra_fsplan.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-10 Thread Robert Haas
On Wed, Feb 10, 2016 at 7:12 AM, Ashutosh Bapat
 wrote:
> Here's patch to remove this declaration. The Assert next probably prevents
> the warning for build with asserts. But both those lines are not needed.

I like the Assert(), so I kept that and ditched the variable.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-09 Thread Robert Haas
On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
 wrote:
> Thanks Jeevan for your review and comments. PFA the patch which fixes those.

Committed with a couple more small adjustments.

Woohoo, finally!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-09 Thread Jeevan Chalke
Hi,

I have reviewed the patch and it looks good to me.
make/make install/make check is fine (when done without -Wall -Werror).

Here are few comments:
1.
With -Wall -Werror, I see couple of warnings:

postgres_fdw.c: In function ‘estimate_path_cost_size’:
postgres_fdw.c:2248:13: error: ‘run_cost’ may be used uninitialized in this
function [-Werror=uninitialized]
postgres_fdw.c: In function ‘conversion_error_callback’:
postgres_fdw.c:3832:6: error: ‘attname’ may be used uninitialized in this
function [-Werror=uninitialized]
cc1: all warnings being treated as errors
make: *** [postgres_fdw.o] Error 1

2. Typo:
scna_clauses => scan_clauses

3. Does this new addition requires documentation?


I did not see any issues with my testing. Code changes are good too.
Patch has very good test-cases testing everything required. Nice work.

Thanks.

On Mon, Feb 8, 2016 at 7:11 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
> On Mon, Feb 8, 2016 at 4:15 PM, Etsuro Fujita  > wrote:
>
>> On 2016/02/05 17:50, Ashutosh Bapat wrote:
>>
>> Btw, IIUC, I think the patch fails to adjust the targetlist of the
>>> top plan created that way, to output the fdw_scan_tlist, as
>>> discussed in [1] (ie, I think the attached patch is needed, which is
>>> created on top of your patch pg_fdw_join_v8.patch).
>>>
>>
>> fdw_scan_tlist represents the output fetched from the foreign server and
>>> is not necessarily the output of ForeignScan. ForeignScan node's output
>>> is represented by tlist argument to.
>>>
>>> 1119 return make_foreignscan(tlist,
>>> 1120 local_exprs,
>>> 1121 scan_relid,
>>> 1122 params_list,
>>> 1123 fdw_private,
>>> 1124 fdw_scan_tlist,
>>> 1125 remote_exprs,
>>> 1126 outer_plan);
>>>
>>> This tlist is built using build_path_tlist() for all join plans. IIUC,
>>> all of them output the same targetlist. We don't need to make sure that
>>> targetlist match as long as we are using the targetlist passed in by
>>> create_scan_plan(). Do you have a counter example?
>>>
>>
>> Maybe my explanation was not correct, but I'm saying that the targertlist
>> of the above outer_plan should be set to the fdw_scan_tlist, to avoid
>> misbehavior.  Here is such an example (add() in the example is a user
>> defined function that simply adds two arguments, defined by: create
>> function add(integer, integer) returns integer as '/path/to/func', 'add'
>> language c strict):
>>
>> postgres=# create foreign table foo (a int) server myserver options
>> (table_name 'foo');
>> postgres=# create foreign table bar (a int) server myserver options
>> (table_name 'bar');
>> postgres=# create table tab (a int, b int);
>> postgres=# insert into foo select a from generate_series(1, 1000) a;
>> postgres=# insert into bar select a from generate_series(1, 1000) a;
>> postgres=# insert into tab values (1, 1);
>> postgres=# analyze foo;
>> postgres=# analyze bar;
>> postgres=# analyze tab;
>>
>> [Terminal 1]
>> postgres=# begin;
>> BEGIN
>> postgres=# update tab set b = b + 1 where a = 1;
>> UPDATE 1
>>
>> [Terminal 2]
>> postgres=# explain verbose select tab.* from tab, foo, bar where foo.a =
>> bar.a and add(foo.a, bar.a) > 0 limit 1 for update;
>>
>> QUERY PLAN
>>
>>
>> 
>> -
>>  Limit  (cost=100.00..107.70 rows=1 width=70)
>>Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
>>->  LockRows  (cost=100.00..2663.48 rows=333 width=70)
>>  Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
>>  ->  Nested Loop  (cost=100.00..2660.15 rows=333 width=70)
>>Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
>>->  Foreign Scan  (cost=100.00..2654.97 rows=333 width=56)
>>  Output: foo.*, bar.*
>>  Filter: (add(foo.a, bar.a) > 0)
>>  Relations: (public.foo) INNER JOIN (public.bar)
>>  Remote SQL: SELECT ROW(r2.a), ROW(r3.a), r2.a, r3.a
>> FROM (public.foo r2 INNER JOIN public.bar r3 ON (TRUE)) WHERE ((r2.a =
>> r3.a)) F
>> OR UPDATE OF r2 FOR UPDATE OF r3
>>  ->  Hash Join  (cost=247.50..301.25 rows=333
>> width=56)
>>Output: foo.*, bar.*
>>Hash Cond: (foo.a = bar.a)
>>Join Filter: (add(foo.a, bar.a) > 0)
>>->  Foreign Scan on public.foo
>> (cost=100.00..135.00 rows=1000 width=32)
>>  Output: foo.*, foo.a
>>  Remote SQL: SELECT a FROM public.foo FOR
>> UPDATE
>>->  Hash  

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-09 Thread Ashutosh Bapat
Yay, finally!

Thanks.

On Wed, Feb 10, 2016 at 12:46 AM, Robert Haas  wrote:

> On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
>  wrote:
> > Thanks Jeevan for your review and comments. PFA the patch which fixes
> those.
>
> Committed with a couple more small adjustments.
>
> Woohoo, finally!
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-09 Thread Jeff Janes
On Tue, Feb 9, 2016 at 11:16 AM, Robert Haas  wrote:
> On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
>  wrote:
>> Thanks Jeevan for your review and comments. PFA the patch which fixes those.
>
> Committed with a couple more small adjustments.

I'm getting a compiler warning which I think is coming from this commit.

postgres_fdw.c: In function 'fetch_more_data':
postgres_fdw.c:2526:17: warning: unused variable 'fsplan' [-Wunused-variable]
ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;

Thanks,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-08 Thread Robert Haas
On Mon, Feb 8, 2016 at 5:45 AM, Etsuro Fujita
 wrote:
> Maybe my explanation was not correct, but I'm saying that the targertlist of
> the above outer_plan should be set to the fdw_scan_tlist, to avoid
> misbehavior.

Yeah, I think you're right.  So in this hunk:

+   if (foreignrel->reloptkind == RELOPT_JOINREL)
+   {
+   /* For a join relation, get the conditions from
fdw_private structure */
+   remote_conds = fpinfo->remote_conds;
+   local_exprs = fpinfo->local_conds;
+
+   /* Build the list of columns to be fetched from the
foreign server. */
+   fdw_scan_tlist = build_tlist_to_deparse(foreignrel);
+   }

I think we should also be doing outer_plan->targetlist =
fdw_scan_tlist in this block, with a comment like "Ensure that the
outer plan produces the a tuple whose descriptor matches our scan
tuple slot.  This is safe because all scans and joins support
projection, so we never need to insert a Result node."   It would
probably be good to Assert(outer_plan != NULL) before doing the
assignment, too, just as a guard against future bugs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-08 Thread Etsuro Fujita

On 2016/02/05 17:50, Ashutosh Bapat wrote:


Btw, IIUC, I think the patch fails to adjust the targetlist of the
top plan created that way, to output the fdw_scan_tlist, as
discussed in [1] (ie, I think the attached patch is needed, which is
created on top of your patch pg_fdw_join_v8.patch).



fdw_scan_tlist represents the output fetched from the foreign server and
is not necessarily the output of ForeignScan. ForeignScan node's output
is represented by tlist argument to.

1119 return make_foreignscan(tlist,
1120 local_exprs,
1121 scan_relid,
1122 params_list,
1123 fdw_private,
1124 fdw_scan_tlist,
1125 remote_exprs,
1126 outer_plan);

This tlist is built using build_path_tlist() for all join plans. IIUC,
all of them output the same targetlist. We don't need to make sure that
targetlist match as long as we are using the targetlist passed in by
create_scan_plan(). Do you have a counter example?


Maybe my explanation was not correct, but I'm saying that the 
targertlist of the above outer_plan should be set to the fdw_scan_tlist, 
to avoid misbehavior.  Here is such an example (add() in the example is 
a user defined function that simply adds two arguments, defined by: 
create function add(integer, integer) returns integer as 
'/path/to/func', 'add' language c strict):


postgres=# create foreign table foo (a int) server myserver options 
(table_name 'foo');
postgres=# create foreign table bar (a int) server myserver options 
(table_name 'bar');

postgres=# create table tab (a int, b int);
postgres=# insert into foo select a from generate_series(1, 1000) a;
postgres=# insert into bar select a from generate_series(1, 1000) a;
postgres=# insert into tab values (1, 1);
postgres=# analyze foo;
postgres=# analyze bar;
postgres=# analyze tab;

[Terminal 1]
postgres=# begin;
BEGIN
postgres=# update tab set b = b + 1 where a = 1;
UPDATE 1

[Terminal 2]
postgres=# explain verbose select tab.* from tab, foo, bar where foo.a = 
bar.a and add(foo.a, bar.a) > 0 limit 1 for update;


QUERY PLAN


-
 Limit  (cost=100.00..107.70 rows=1 width=70)
   Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
   ->  LockRows  (cost=100.00..2663.48 rows=333 width=70)
 Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
 ->  Nested Loop  (cost=100.00..2660.15 rows=333 width=70)
   Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
   ->  Foreign Scan  (cost=100.00..2654.97 rows=333 width=56)
 Output: foo.*, bar.*
 Filter: (add(foo.a, bar.a) > 0)
 Relations: (public.foo) INNER JOIN (public.bar)
 Remote SQL: SELECT ROW(r2.a), ROW(r3.a), r2.a, 
r3.a FROM (public.foo r2 INNER JOIN public.bar r3 ON (TRUE)) WHERE 
((r2.a = r3.a)) F

OR UPDATE OF r2 FOR UPDATE OF r3
 ->  Hash Join  (cost=247.50..301.25 rows=333 width=56)
   Output: foo.*, bar.*
   Hash Cond: (foo.a = bar.a)
   Join Filter: (add(foo.a, bar.a) > 0)
   ->  Foreign Scan on public.foo 
(cost=100.00..135.00 rows=1000 width=32)

 Output: foo.*, foo.a
 Remote SQL: SELECT a FROM public.foo 
FOR UPDATE
   ->  Hash  (cost=135.00..135.00 rows=1000 
width=32)

 Output: bar.*, bar.a
 ->  Foreign Scan on public.bar 
(cost=100.00..135.00 rows=1000 width=32)

   Output: bar.*, bar.a
   Remote SQL: SELECT a FROM 
public.bar FOR UPDATE

   ->  Materialize  (cost=0.00..1.01 rows=1 width=14)
 Output: tab.a, tab.b, tab.ctid
 ->  Seq Scan on public.tab  (cost=0.00..1.01 
rows=1 width=14)

   Output: tab.a, tab.b, tab.ctid
(27 rows)

postgres=# select tab.* from tab, foo, bar where foo.a = bar.a and 
add(foo.a, bar.a) > 0 limit 1 for update;


[Terminal 1]
postgres=# commit;
COMMIT

[Terminal 2] (After the commit in Terminal 1, Terminal 2 will show the 
following.)

 a | b
---+---
(0 rows)

This is wrong.  (Note that since the SELECT FOR UPDATE doesn't impose 
any condition on a tuple from the local table tab, the EvalPlanQual 
recheck executed should succeed.)  The reason for that is that the 
targetlist of the local join plan is the same as for the ForeignScan, 
which outputs neither foo.a nor bar.a required as an argument of the 
function add().


Best regards,
Etsuro Fujita





Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-06 Thread Robert Haas
On Sat, Feb 6, 2016 at 12:46 AM, Ashutosh Bapat
 wrote:
> Here it is rebased. Thanks for the pgindent run and committing core changes.
> I have to manage only one patch now :)
>
> pgindent is giving trouble with following two comments
>
> 2213 /* Run time cost includes:
> 2214  * 1. Run time cost (total_cost - startup_cost) of
> relations being
> 2215  *joined
> 2216  * 2. Run time cost of applying join clauses on the cross
> product of
> 2217  *the joining relations.
> 2218  * 3. Run time cost of applying pushed down other clauses
> on the
> 2219  *result of join
> 2220  * 4. Run time cost of applying nonpushable other clauses
> locally
> 2221  *on the result fetched from the foreign server.
>   */
>
> which I want itemized with each item starting on separate line. pgindent
> just bunches everything together.

The thing to do here is leave a blank line between each one.  You can
also put a line of dashes before and after the comment (see many
examples elsewhere in the source tree) to force pgindent to leave that
section completely untouched, but I think that this sort of list looks
better with blank lines anyway, so I'd go for that solution.

> 1159 /*
> 1160  * For a join relation FROM clause entry is deparsed as
> 1161  * ((outer relation)  (inner relation) ON
> (joinclauses)
> 1162  */
> where I want the second line as a separate line, but pgindent puts those two
> line together breaking the continuity of second line content.
>
> How do I make pgindent respect those changes as they are?

Same idea here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-05 Thread Etsuro Fujita

On 2016/02/04 21:42, Ashutosh Bapat wrote:


* Is it safe to replace outerjoinpath with its fdw_outerpath the
following way?  I think that if the join relation represented by
outerjoinpath has local conditions that can't be executed remotely,
we have to keep outerjoinpath in the path tree; we will otherwise
fail to execute the local conditions.  No?

+   /*
+* If either inner or outer path is a
ForeignPath corresponding to
+* a pushed down join, replace it with the
fdw_outerpath, so that we
+* maintain path for EPQ checks built
entirely of local join
+* strategies.
+*/
+   if (IsA(joinpath->outerjoinpath, ForeignPath))
+   {
+   ForeignPath *foreign_path;
+   foreign_path = (ForeignPath
*)joinpath->outerjoinpath;
+   if
(foreign_path->path.parent->reloptkind == RELOPT_JOINREL)
+   joinpath->outerjoinpath =
foreign_path->fdw_outerpath;
+   }



all the conditions (local and remote) should be part of fdw_outerpath as
well, since that's the alternate local path, which should produce (when
converted to the plan) the same result as the foreign path.
fdw_outerpath should be a local path set when paths for
outerjoinpath->parent was being created. Am I missing something?


I assumed by mistake that only the remote conditions were evaluated in a 
plan created from each fdw_outerpath.  Sorry for that.  I think that is 
a good idea!


Btw, IIUC, I think the patch fails to adjust the targetlist of the top 
plan created that way, to output the fdw_scan_tlist, as discussed in [1] 
(ie, I think the attached patch is needed, which is created on top of 
your patch pg_fdw_join_v8.patch).


Best regards,
Etsuro Fujita

[1] 
http://www.postgresql.org/message-id/ca+tgmoba4mskgquicgt5ckbpqj-tmpqefht_wy49ndwa91w...@mail.gmail.com
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***
*** 1067,1072  postgresGetForeignPlan(PlannerInfo *root,
--- 1067,1076 
  
  		/* Build the list of columns to be fetched from the foreign server. */
  		fdw_scan_tlist = build_tlist_to_deparse(foreignrel);
+ 
+ 		/* Replace the targetlist of outer_plan with fdw_scan_tlist, if any */
+ 		if (outer_plan)
+ 			outer_plan->targetlist = fdw_scan_tlist;
  	}
  
  	/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-05 Thread Robert Haas
On Fri, Feb 5, 2016 at 4:23 AM, Ashutosh Bapat
 wrote:
> I have corrected this in this set of patches. Also, I have included the
> change to build the join relation description while constructing fpinfo in
> the main patch since that avoids repeated building of the same at a small
> cost of constructing relation name for base relations, which goes waste if
> that relation is not going to be part of any pushable join tree.
>
> Ran pgindent as well.

pg_fdw_join_v9.patch does not aplpy.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-05 Thread Robert Haas
On Fri, Feb 5, 2016 at 9:09 AM, Ashutosh Bapat
 wrote:
>> Would it be nuts to set fdw_scan_tlist in all cases?  Would the code
>> come out simpler than what we have now?
>
> PFA the patch that can be applied on v9 patches.
>
> If there is a whole-row reference for base relation, instead of adding that
> as an additional column deparseTargetList() creates a list of all the
> attributes of that foreign table . The whole-row reference gets constructed
> during projection. This saves some network bandwidth while transferring the
> data from the foreign server. If we build the target list for base relation,
> we should include Vars for all the columns (like deparseTargetList). Thus we
> borrow some code from deparseTargetList to get all the attributes of a
> relation. I included this logic in function build_tlist_from_attrs_used(),
> which is being called by build_tlist_to_deparse(). So, before calling
> deparseSelectStmtForRel() the callers can just call build_tlist_to_deparse()
> and pass the targetlist to deparseSelectStmtForRel() and use the same to be
> handed over to the core code as fdw_scan_tlist. build_tlist_to_deparse()
> also constructs retrieved_attrs list, so that doesn't need to be passed
> around in deparse routines.
>
> But we now have similar code in three places deparseTargetList(),
> deparseAnalyzeSql() and build_tlist_from_attrs_used(). So, I re-wrote
> deparseTargetList() (which is now used to deparse returning list) to call
> build_tlist_from_attrs_used() followed by deparseExplicitTargetList(). The
> later and its minion deparseVar requires a deparse_expr_cxt to be passed.
> deparse_expr_cxt has a member to store RelOptInfo of the relation being
> deparsed. The callers of deparseReturningList() do not have it and thus
> deparse_expr_cxt required changes, which in turn required changes in other
> places. All in all, a larger refactoring. It touches more places than
> necessary for foreign join push down. So, I think, we should try that as a
> separate refactoring work. We may just do the work described in the first
> paragraph for join pushdown, but we will then be left with duplicate code,
> which I don't think is worth the output.

Yeah, I'm not convinced this is actually simpler; at first look, it
seems like it is just moving the complexity around.  I don't like the
fact that there are so many places where we have to do one thing for a
baserel and something totally different for a joinrel, which was the
point of my comment.  But this seems to add one instance of that and
remove one instance of that, so I don't see how we're coming out
better than a wash.  Maybe it's got more merit than I'm giving it
credit for, and I'm just not seeing it right at the moment...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-05 Thread Ashutosh Bapat
> Btw, IIUC, I think the patch fails to adjust the targetlist of the top
> plan created that way, to output the fdw_scan_tlist, as discussed in [1]
> (ie, I think the attached patch is needed, which is created on top of your
> patch pg_fdw_join_v8.patch).
>
>
fdw_scan_tlist represents the output fetched from the foreign server and is
not necessarily the output of ForeignScan. ForeignScan node's output is
represented by tlist argument to.

1119 return make_foreignscan(tlist,
1120 local_exprs,
1121 scan_relid,
1122 params_list,
1123 fdw_private,
1124 fdw_scan_tlist,
1125 remote_exprs,
1126 outer_plan);

This tlist is built using build_path_tlist() for all join plans. IIUC, all
of them output the same targetlist. We don't need to make sure that
targetlist match as long as we are using the targetlist passed in by
create_scan_plan(). Do you have a counter example?
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-05 Thread Ashutosh Bapat
Would it be nuts to set fdw_scan_tlist in all cases?  Would the code
> come out simpler than what we have now?
>
>
PFA the patch that can be applied on v9 patches.

If there is a whole-row reference for base relation, instead of adding that
as an additional column deparseTargetList() creates a list of all the
attributes of that foreign table . The whole-row reference gets constructed
during projection. This saves some network bandwidth while transferring the
data from the foreign server. If we build the target list for base
relation, we should include Vars for all the columns (like
deparseTargetList). Thus we borrow some code from deparseTargetList to get
all the attributes of a relation. I included this logic in function
build_tlist_from_attrs_used(), which is being called by
build_tlist_to_deparse(). So, before calling deparseSelectStmtForRel() the
callers can just call build_tlist_to_deparse() and pass the targetlist to
deparseSelectStmtForRel() and use the same to be handed over to the core
code as fdw_scan_tlist. build_tlist_to_deparse() also constructs
retrieved_attrs list, so that doesn't need to be passed around in deparse
routines.

But we now have similar code in three places deparseTargetList(),
deparseAnalyzeSql() and build_tlist_from_attrs_used(). So, I re-wrote
deparseTargetList() (which is now used to deparse returning list) to call
build_tlist_from_attrs_used() followed by deparseExplicitTargetList(). The
later and its minion deparseVar requires a deparse_expr_cxt to be passed.
deparse_expr_cxt has a member to store RelOptInfo of the relation being
deparsed. The callers of deparseReturningList() do not have it and thus
deparse_expr_cxt required changes, which in turn required changes in other
places. All in all, a larger refactoring. It touches more places than
necessary for foreign join push down. So, I think, we should try that as a
separate refactoring work. We may just do the work described in the first
paragraph for join pushdown, but we will then be left with duplicate code,
which I don't think is worth the output.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index fb72f45..7a2a67b 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -93,9 +93,10 @@ typedef struct foreign_loc_cxt
 typedef struct deparse_expr_cxt
 {
PlannerInfo *root;  /* global planner state */
-   RelOptInfo *foreignrel; /* the foreign relation we are planning 
for */
+   Relids  relids; /* Relids for which to deparse 
*/
StringInfo  buf;/* output buffer to append to */
List  **params_list;/* exprs that will become remote Params 
*/
+   boolis_joinrel; /* Are we deparsing for a join 
relation */
 } deparse_expr_cxt;
 
 #define REL_ALIAS_PREFIX   "r"
@@ -122,8 +123,7 @@ static void deparseTargetList(StringInfo buf,
  Bitmapset *attrs_used,
  List **retrieved_attrs,
  bool qualify_col);
-static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs,
- deparse_expr_cxt *context);
+static void deparseExplicitTargetList(List *tlist, deparse_expr_cxt *context);
 static void deparseReturningList(StringInfo buf, PlannerInfo *root,
 Index rtindex, Relation rel,
 bool trig_after_row,
@@ -151,13 +151,15 @@ static void printRemoteParam(int paramindex, Oid 
paramtype, int32 paramtypmod,
 deparse_expr_cxt *context);
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
   deparse_expr_cxt *context);
-static void deparseSelectSql(List *tlist, List **retrieved_attrs,
-deparse_expr_cxt *context);
+static void deparseSelectSql(List *tlist, RelOptInfo *rel,
+deparse_expr_cxt 
*context);
 static void deparseLockingClause(deparse_expr_cxt *context);
 static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context);
 static void appendConditions(List *exprs, deparse_expr_cxt *context);
 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
RelOptInfo *joinrel, bool use_alias, 
List **params_list);
+static List *build_tlist_from_attrs_used(Index rtindex, Bitmapset *attrs_used,
+   PlannerInfo *root, List 
**retrieved_attrs);
 
 
 /*
@@ -715,26 +717,119 @@ deparse_type_name(Oid type_oid, int32 typemod)
 }
 
 /*
+ * Convert input bitmap representation of columns into targetlist of
+ * corresponding Var nodes.
+ *
+ * List of 

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-05 Thread Etsuro Fujita

On 2016/02/04 21:57, Ashutosh Bapat wrote:


One more: I think the following in postgresGetForeignJoinPaths() is
a good idea, but I think it's okay to just check whether
root->rowMarks is non-NIL, because that since we have rowmarks for
all base relations except the target, if we have
root->parse->commandType==CMD_DELETE (or
root->parse->commandType==CMD_UPDATE), then there would be at least
one non-target base relation in the joinrel, which would have a rowmark.



Sorry, I am unable to understand it fully. But what you are suggesting
that if there are root->rowMarks, then we are sure that there is at
least one base relation apart from the target, which needs locking rows.
Even if we don't have one, still changes in a row of target relation
after it was scanned, can result in firing EPQ check, which would need
the local plan to be executed, thus even if root->rowMarks is NIL, EPQ
check can fire and we will need alternate local plan.


Yeah, I think that is true, but if root->rowMarks==NIL, we won't have 
non-target foreign tables, and therefore postgresGetForeignJoinPaths() 
will never be called.  No?


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-04 Thread Ashutosh Bapat
On Thu, Feb 4, 2016 at 3:28 PM, Etsuro Fujita 
wrote:

> On 2016/02/04 17:58, Etsuro Fujita wrote:
>
>> On 2016/02/04 8:00, Robert Haas wrote:
>>
>>> On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas 
>>> wrote:
>>>
 On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
  wrote:

> PFA patches with naming conventions similar to previous ones.
> pg_fdw_core_v7.patch: core changes
> pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
> pg_join_pd_v7.patch: combined patch for ease of testing.
>

> One more: I think the following in postgresGetForeignJoinPaths() is a good
> idea, but I think it's okay to just check whether root->rowMarks is
> non-NIL, because that since we have rowmarks for all base relations except
> the target, if we have root->parse->commandType==CMD_DELETE (or
> root->parse->commandType==CMD_UPDATE), then there would be at least one
> non-target base relation in the joinrel, which would have a rowmark.
>
>
Sorry, I am unable to understand it fully. But what you are suggesting that
if there are root->rowMarks, then we are sure that there is at least one
base relation apart from the target, which needs locking rows. Even if we
don't have one, still changes in a row of target relation after it was
scanned, can result in firing EPQ check, which would need the local plan to
be executed, thus even if root->rowMarks is NIL, EPQ check can fire and we
will need alternate local plan.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-04 Thread Ashutosh Bapat
>
>If so, my concern is, the helper function probably wouldn't
>> extend to the parameterized-foreign-join-path cases, though that
>> would work well for the unparameterized-foreign-join-path cases.  We
>> don't support parameterized-foreign-join paths for 9.6?
>>
>
> If we do not find a local path with given parameterization, it means
>> there are other local parameterized paths which are superior to it. This
>> possibly indicates that there will be foreign join parameterised paths
>> which are superior to this parameterized path, so we basically do not
>> create foreign join path with that parameterization.
>>
>
> The latest version of the postgres_fdw join pushdown patch will support
> only the unparameterized-path case, so we don't have to consider this, but
> why do you think the superiority of parameterizations is preserved between
> remote joining and local joining?
>

AFAIU, parameterization for local paths bubbles up from base relations. For
foreign relations, we calculate the cost of parameterization when
use_remote_estimate is ON, which means it's accurate. So, except that we
will get clause selectivity wrong (if foreign tables were analyzed
regularly even that won't be the case, I guess) resulting in some small
sway in the costs as compared to those of parameterized foreign join paths.
So, I am guessing that the local estimates for parameterized join paths
would be closer to parameterized foreign paths (if we were to produce
those). Hence my statement. There is always a possibility that those two
costs are way too different, hence I have used phrase "possibly" there. I
could be wrong.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-04 Thread Etsuro Fujita

On 2016/02/04 17:58, Etsuro Fujita wrote:

On 2016/02/04 8:00, Robert Haas wrote:

On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas 
wrote:

On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
 wrote:

PFA patches with naming conventions similar to previous ones.
pg_fdw_core_v7.patch: core changes
pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
pg_join_pd_v7.patch: combined patch for ease of testing.


One more: I think the following in postgresGetForeignJoinPaths() is a 
good idea, but I think it's okay to just check whether root->rowMarks is 
non-NIL, because that since we have rowmarks for all base relations 
except the target, if we have root->parse->commandType==CMD_DELETE (or 
root->parse->commandType==CMD_UPDATE), then there would be at least one 
non-target base relation in the joinrel, which would have a rowmark.


+   if (root->parse->commandType == CMD_DELETE ||
+   root->parse->commandType == CMD_UPDATE ||
+   root->rowMarks)
+   {
+   epq_path = GetPathForEPQRecheck(joinrel);
+   if (!epq_path)
+   {
+			elog(DEBUG3, "could not push down foreign join because a local path 
suitable for EPQ checks was not found");

+   return;
+   }
+   }

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-04 Thread Ashutosh Bapat
> * Is it safe to replace outerjoinpath with its fdw_outerpath the following
> way?  I think that if the join relation represented by outerjoinpath has
> local conditions that can't be executed remotely, we have to keep
> outerjoinpath in the path tree; we will otherwise fail to execute the local
> conditions.  No?
>
> +   /*
> +* If either inner or outer path is a ForeignPath
> corresponding to
> +* a pushed down join, replace it with the
> fdw_outerpath, so that we
> +* maintain path for EPQ checks built entirely of
> local join
> +* strategies.
> +*/
> +   if (IsA(joinpath->outerjoinpath, ForeignPath))
> +   {
> +   ForeignPath *foreign_path;
> +   foreign_path = (ForeignPath
> *)joinpath->outerjoinpath;
> +   if (foreign_path->path.parent->reloptkind
> == RELOPT_JOINREL)
> +   joinpath->outerjoinpath =
> foreign_path->fdw_outerpath;
> +   }
>
>
all the conditions (local and remote) should be part of fdw_outerpath as
well, since that's the alternate local path, which should produce (when
converted to the plan) the same result as the foreign path. fdw_outerpath
should be a local path set when paths for outerjoinpath->parent was being
created. Am I missing something?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-04 Thread Robert Haas
On Thu, Feb 4, 2016 at 11:55 AM, Ashutosh Bapat
 wrote:
> Patches attached with the previous mail.

The core patch seemed to me to be in good shape now, so I committed
that.  Not sure I'll be able to get to another read-through of the
main patch today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-04 Thread Robert Haas
On Thu, Feb 4, 2016 at 11:55 AM, Ashutosh Bapat
 wrote:
> A query with FOR UPDATE/SHARE will be considered parallel unsafe in
> has_parallel_hazard_walker() and root->glob->parallelModeOK will be marked
> false. This implies that none of the base relations and hence join relations
> will be marked as consider_parallel. IIUC your logic, none of the queries
> with FOR UPDATE/SHARE will get a local path which is marked parallel_safe
> and thus join will not be pushed down. Why do you think we need to skip
> paths that aren't parallel_safe? I have left aside this change in the latest
> patches.

I changed this back before committing but, ah nuts, you're right.  Sigh.  Sorry.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-04 Thread Ashutosh Bapat
On Thu, Feb 4, 2016 at 4:30 AM, Robert Haas  wrote:

> On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas  wrote:
> > On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
> >  wrote:
> >> PFA patches with naming conventions similar to previous ones.
> >> pg_fdw_core_v7.patch: core changes
> >> pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
> >> pg_join_pd_v7.patch: combined patch for ease of testing.
> >
> > Hmm, I think that GetPathForEPQRecheck() is a pretty terrible name.
> > How about GetExistingJoinPath()?
>

GetExistingLocalJoinPath()? Used that.


>
> Oops.  Hit Send too soon.  Also, how about writing if
> (path->param_info != NULL) continue; instead of burying the core of
> the function in another level of indentation?


Hmm. Done.


> I think you should skip
> paths that aren't parallel_safe, too, and the documentation should be
> clear that this will find an unparameterized, parallel-safe joinpath
> if one exists.
>

A query with FOR UPDATE/SHARE will be considered parallel unsafe in
has_parallel_hazard_walker() and root->glob->parallelModeOK will be marked
false. This implies that none of the base relations and hence join
relations will be marked as consider_parallel. IIUC your logic, none of the
queries with FOR UPDATE/SHARE will get a local path which is marked
parallel_safe and thus join will not be pushed down. Why do you think we
need to skip paths that aren't parallel_safe? I have left aside this change
in the latest patches.



>
> +   ForeignPath *foreign_path;
> +   foreign_path = (ForeignPath
> *)joinpath->outerjoinpath;
>
> Maybe insert a blank line between here, and in the other, similar case.
>

Done.

Patches attached with the previous mail.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-04 Thread Etsuro Fujita

On 2016/01/29 17:52, Ashutosh Bapat wrote:

On Fri, Jan 29, 2016 at 2:05 PM, Etsuro Fujita
> wrote:



On 2016/01/29 1:26, Ashutosh Bapat wrote:
Here is the summary of changes from the last set of patches



2. Included fix for EvalPlanQual in postgres_fdw - an alternate
local
path is obtained from the list of paths linked to the joinrel. Since
this is done before adding the ForeignPath, we should be a local
path
available for given join.



I looked at that code in the patch (ie, postgresRecheckForeignScan
and the helper function that creates a local join path for a given
foreign join path.), briefly.  Maybe I'm missing something, but I
think that is basically the same as the fix I proposed for
addressing this issue, posted before [1], right?



Yes, although I have added functions to copy the paths, not consider
pathkeys and change the relevant members of the paths. Sorry  if I have
missed giving you due credits.



   If so, my concern is, the helper function probably wouldn't
extend to the parameterized-foreign-join-path cases, though that
would work well for the unparameterized-foreign-join-path cases.  We
don't support parameterized-foreign-join paths for 9.6?



If we do not find a local path with given parameterization, it means
there are other local parameterized paths which are superior to it. This
possibly indicates that there will be foreign join parameterised paths
which are superior to this parameterized path, so we basically do not
create foreign join path with that parameterization.


The latest version of the postgres_fdw join pushdown patch will support 
only the unparameterized-path case, so we don't have to consider this, 
but why do you think the superiority of parameterizations is preserved 
between remote joining and local joining?


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-04 Thread Etsuro Fujita

On 2016/02/04 8:00, Robert Haas wrote:

On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas  wrote:

On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
 wrote:

PFA patches with naming conventions similar to previous ones.
pg_fdw_core_v7.patch: core changes
pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
pg_join_pd_v7.patch: combined patch for ease of testing.


Thank you for working on this, Ashutosh and Robert!  I've not look at 
the patches closely yet, but ISTM the patches would be really in good shape!



Hmm, I think that GetPathForEPQRecheck() is a pretty terrible name.
How about GetExistingJoinPath()?


+1


Oops.  Hit Send too soon.  Also, how about writing if
(path->param_info != NULL) continue; instead of burying the core of
the function in another level of indentation?  I think you should skip
paths that aren't parallel_safe, too, and the documentation should be
clear that this will find an unparameterized, parallel-safe joinpath
if one exists.

+   ForeignPath *foreign_path;
+   foreign_path = (ForeignPath
*)joinpath->outerjoinpath;

Maybe insert a blank line between here, and in the other, similar case.


* Is it safe to replace outerjoinpath with its fdw_outerpath the 
following way?  I think that if the join relation represented by 
outerjoinpath has local conditions that can't be executed remotely, we 
have to keep outerjoinpath in the path tree; we will otherwise fail to 
execute the local conditions.  No?


+   /*
+* If either inner or outer path is a ForeignPath 
corresponding to
+* a pushed down join, replace it with the 
fdw_outerpath, so that we
+* maintain path for EPQ checks built entirely of local 
join
+* strategies.
+*/
+   if (IsA(joinpath->outerjoinpath, ForeignPath))
+   {
+   ForeignPath *foreign_path;
+   foreign_path = (ForeignPath 
*)joinpath->outerjoinpath;
+   if (foreign_path->path.parent->reloptkind == 
RELOPT_JOINREL)
+   joinpath->outerjoinpath = 
foreign_path->fdw_outerpath;
+   }

* IIUC, that function will be used by custom joins, so I think it would 
be better to put that function somewhere in the /optimizer directory 
(pathnode.c?).


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-03 Thread Robert Haas
On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
 wrote:
> The patch implements your algorithm to deparse a query as described in
> previous mail. The logic is largely coded in deparseFromExprForRel() and
> foreign_join_ok(). The later one pulls up the clauses from joining relations
> and first one deparses the FROM clause recursively.

Cool!

+   /* Add outer relation. */
+   appendStringInfo(buf, "(%s", join_sql_o.data);
+
+   /* Add join type */
+   appendStringInfo(buf, " %s JOIN ",
get_jointype_name(fpinfo->jointype));
+
+   /* Add inner relation */
+   appendStringInfo(buf, "%s", join_sql_i.data);
+
+   /* Append ON clause; ON (TRUE) in case empty join clause list */
+   appendStringInfoString(buf, " ON ");

Uh, couldn't that all be done as a single appendStringInfo?

It seems a little tortured the way you're passing "relations" all the
way down the callstack from deparseSelectStmtForRel, and at each level
it might be NULL.  If you made a rule that the caller MUST pass a
StringInfo, then you could get rid of some conditional logic in
deparseFromExprForRel.   By the way, deparseSelectSql()'s header
comment could use an update to mention this additional argument.
Generally, it's helpful to say in each relevant function header
comment something like "May be NULL" or "Must not be NULL" in cases
like this to clarify the API contract.

Similarly, I would be inclined to continue to require that
deparseTargetList() have retrieved_attrs != NULL.  If the caller
doesn't want the value, it can pass a dummy variable and ignore the
return value.  This is of course slightly more cycles, but I think
it's unlikely to matter, and making the code simpler would be good.

+ * Function is the entry point to deparse routines while constructing
+ * ForeignScan plan or estimating cost and size for ForeignPath. It is called
+ * recursively to build SELECT statements for joining relations of a
pushed down
+ * foreign join.

"This function is the entrypoint to the routines, either when
constructing ForeignScan plan or when estimating" etc.

+ * tuple descriptor for the corresponding foreign scan. For a base relation,
+ * which is not part of a pushed down join, fpinfo->attrs_used can be used to
+ * construct SELECT clause, thus the function doesn't need tlist. Hence when
+ * tlist passed, the function assumes that it's constructing the SELECT
+ * statement to be part of a pushed down foreign join.

I thought you got rid of that assumption.  I think it should be gotten
rid of, and then the comment can go too.  If we're keeping the comment
for some reason, should be "doesn't need the tlist" and when "when the
tlist is passed".

+ * 1, since those are the attribute numbers are in the corresponding scan.

Extra "are".  Should be: "Those are the attribute numbers in the
corresponding scan."

Would it be nuts to set fdw_scan_tlist in all cases?  Would the code
come out simpler than what we have now?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-03 Thread Robert Haas
On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
 wrote:
> PFA patches with naming conventions similar to previous ones.
> pg_fdw_core_v7.patch: core changes
> pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
> pg_join_pd_v7.patch: combined patch for ease of testing.

Hmm, I think that GetPathForEPQRecheck() is a pretty terrible name.
How about GetExistingJoinPath()?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-03 Thread Robert Haas
On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas  wrote:
> On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
>  wrote:
>> PFA patches with naming conventions similar to previous ones.
>> pg_fdw_core_v7.patch: core changes
>> pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
>> pg_join_pd_v7.patch: combined patch for ease of testing.
>
> Hmm, I think that GetPathForEPQRecheck() is a pretty terrible name.
> How about GetExistingJoinPath()?

Oops.  Hit Send too soon.  Also, how about writing if
(path->param_info != NULL) continue; instead of burying the core of
the function in another level of indentation?  I think you should skip
paths that aren't parallel_safe, too, and the documentation should be
clear that this will find an unparameterized, parallel-safe joinpath
if one exists.

+   ForeignPath *foreign_path;
+   foreign_path = (ForeignPath
*)joinpath->outerjoinpath;

Maybe insert a blank line between here, and in the other, similar case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-02 Thread Ashutosh Bapat
On Tue, Feb 2, 2016 at 5:18 AM, Robert Haas  wrote:

> On Mon, Feb 1, 2016 at 8:27 AM, Ashutosh Bapat
>  wrote:
> > Here are patches rebased on recent commit
> > cc592c48c58d9c1920f8e2063756dcbcce79e4dd. Renamed original
> deparseSelectSql
> > as deparseSelectSqlForBaseRel and added deparseSelectSqlForJoinRel to
> > construct SELECT and FROM clauses for base and join relations.
> >
> > pg_fdw_core_v5.patch GetUserMappingById changes
> > pg_fdw_join_v5.patch: postgres_fdw changes for join pushdown including
> > suggestions as described below
> > pg_join_pd_v5.patch: combined patch for ease of testing.
> >
> > The patches also have following changes along with the changes described
> in
> > my last mail.
> > 1. Revised the way targetlists are handled. For a bare base relation the
> > SELECT clause is deparsed from fpinfo->attrs_used but for a base relation
> > which is part of join relation, the expected targetlist is passed down to
> > deparseSelectSqlForBaseRel(). This change removed 75 odd lines in
> > build_tlist_to_deparse() which were very similar to
> > deparseTargetListFromAttrsUsed() in the previous patch.
>
> Nice!
>
> > 2. Refactored postgresGetForeignJoinPaths to be more readable moving the
> > code to assess safety of join pushdown into a separate function.
>
> That looks good.  But maybe call the function foreign_join_ok() or
> something like that?  is_foreign_join() isn't terrible but it sounds a
> little odd to me.
>

I used name is_foreign_join(), which assesses push-down safety of a join,
to have similar naming convention with is_foreign_expr(), which checks
push-down safety of an expression. But foreign_join_ok() is fine too. Used
that.


>
> The path-copying stuff in get_path_for_epq_recheck() looks a lot
> better now, but you neglected to add a comment explaining why you did
> it that way (e.g. "Make a shallow copy of the join path, because the
> planner might free the original structure after a future add_path().
> We don't need to copy the substructure, though; that won't get freed."
>

I alluded to that in the second sentence of comment
3259  * Since we will need to replace any foreign paths for join with their
alternate
3260  * paths, we need make a copy of the local path chosen. Also, that
helps in case
3261  * the planner chooses to throw away the local path.

But that wasn't as clear as your wording. Rewrote the paragraph using your
wording.
3259  * Since we will need to replace any foreign paths for join with their
alternate
3260  * paths, we need make a copy of the local path chosen. Make a shallow
copy of
3261  * the join path, because the planner might free the original
structure after a
3262  * future add_path(). We don't need to copy the substructure, though;
that won't
3263  * get freed.


>  I would forget about setting merge_path->materialize_inner = false;
> that doesn't seem essential.


Done.


> Also, I would arrange things so that if
> you hit an unrecognized path type (like a custom join, or a gather)
> you skip that particular path instead of erroring out.


Ok. Done.


> I think this
> whole function should be moved to core,


I have moved the function to foreign.c where most of the FDW APIs are
located and declared it in fdwapi.h. Since the function deals with the
paths, I thought of adding it to some path related file, but since it's a
helper function that an FDW can use, I thought foreign.c would be better. I
have also added documentation in fdwhandler.sgml. I have renamed the
function as GetPathForEPQRecheck() in order to be consistent with other FDW
APIs. In the description I have just mentioned copy of a local path. I am
not sure whether we should say "shallow copy".


> and I think the argument
> should be a RelOptInfo * rather than a List *.
>

Done.


>
> + * We can't know VERBOSE option is specified or not, so always add
> shcema
>
> We can't know "whether" VERBOSE...
> shcema -> schema
>
>
Done.


> + * the join relaiton is already considered, so that we won't waste
> time in
>
> Typo.
>
>
Done.


> + * judging safety of join pushdow and adding the same paths again if
> found
>
> Typo.
>
> Done.

Sorry for those typos.

Attaching patches with reply to your next mail.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 11:21 AM, Ashutosh Bapat
 wrote:
>> Why does deparseSelectStmtForRel change the order of the existing
>> arguments?  I have no issue with adding new arguments as required, but
>> rearranging the existing argument order doesn't serve any useful
>> purpose that is immediately apparent.
>
> deparseSelectStmtForRel has two sets of arguments, input and output. They
> are separated in the declaration all inputs come first, followed by all
> outputs. The inputs were ordered according to their appearance in SELECT
> statement, so I added tlist before remote_conds. I should have added
> relations, which is an output argument, at the end, but I accidentally added
> it between existing output arguments. Anyway, I will go ahead and just add
> the new arguments after the existing ones.

No, that's not what I'm asking for, nor do I think it's right.  What
I'm complaining about is that originally params_list was after
retrieved_attrs, but in v5 it's before retrieved_attrs.  I'm fine with
inserting tlist after rel, or in general inserting new arguments in
the sequence.  But you reversed the relative ordering of params_list
and retrieved_attrs.

> I was thinking on the similar lines except rN aliases. I think there will be
> problem for queries like
> postgres=# explain verbose select * from lt left join (select bar.a, foo.b
> from bar left join foo on (bar.a = foo.a) where bar.b + foo.b < 10) q on
> (lt.b = q.b);
>QUERY PLAN
> 
>  Hash Right Join  (cost=318.03..872.45 rows=43 width=16)
>Output: lt.a, lt.b, bar.a, foo.b
>Hash Cond: (foo.b = lt.b)
>->  Merge Join  (cost=317.01..839.07 rows=8513 width=8)
>  Output: bar.a, foo.b
>  Merge Cond: (bar.a = foo.a)
>  Join Filter: ((bar.b + foo.b) < 10)
>  ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
>Output: bar.a, bar.b
>Sort Key: bar.a
>->  Seq Scan on public.bar  (cost=0.00..32.60 rows=2260
> width=8)
>  Output: bar.a, bar.b
>  ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
>Output: foo.b, foo.a
>Sort Key: foo.a
>->  Seq Scan on public.foo  (cost=0.00..32.60 rows=2260
> width=8)
>  Output: foo.b, foo.a
>->  Hash  (cost=1.01..1.01 rows=1 width=8)
>  Output: lt.a, lt.b
>  ->  Seq Scan on public.lt  (cost=0.00..1.01 rows=1 width=8)
>Output: lt.a, lt.b
> (21 rows)
>
> The subquery q is pulled up, so there won't be trace of q in the join tree
> except may be a useless RTE for the subquery. There will be RelOptInfo
> representing join between lt, bar and foo and a RelOptInfo for join between
> bar and foo. The join filter bar.b + foo.b < 10 needs to be evaluated before
> joining (bar, foo) with lt and should go with bar left join foo. But the
> syntax doesn't support something like "bar left join foo on (bar.a = foo.a)
> where bar.b + foo.b". So we will have to construct a SELECT statement for
> this join and add to the FROM clause with a subquery alias and then refer
> the columns of foo and bar with that subquery alias.

Hmm, does it work if we put bar.b + foo.b < 10 in the ON clause for
the join between lt and foo/bar? I think so...

> Further during the process of qual placement, quals that can be evaluated at
> the level of given relation in the join tree are attached to that relation
> if they can be pushed down. Thus if we see a qual attached to a given
> relation, AFAIU, we can not say whether it needs to be evaluated there
> (similar to above query) or planner pushed it down for optimization, and
> thus for every join relation with quals we will need to build subqueries
> with aliases.

I don't think that's true.  I theorize that every qual can either go
into the top level WHERE clause or the ON clause of some join.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-01 Thread Robert Haas
On Mon, Feb 1, 2016 at 8:27 AM, Ashutosh Bapat
 wrote:
> Here are patches rebased on recent commit
> cc592c48c58d9c1920f8e2063756dcbcce79e4dd. Renamed original deparseSelectSql
> as deparseSelectSqlForBaseRel and added deparseSelectSqlForJoinRel to
> construct SELECT and FROM clauses for base and join relations.
>
> pg_fdw_core_v5.patch GetUserMappingById changes
> pg_fdw_join_v5.patch: postgres_fdw changes for join pushdown including
> suggestions as described below
> pg_join_pd_v5.patch: combined patch for ease of testing.
>
> The patches also have following changes along with the changes described in
> my last mail.
> 1. Revised the way targetlists are handled. For a bare base relation the
> SELECT clause is deparsed from fpinfo->attrs_used but for a base relation
> which is part of join relation, the expected targetlist is passed down to
> deparseSelectSqlForBaseRel(). This change removed 75 odd lines in
> build_tlist_to_deparse() which were very similar to
> deparseTargetListFromAttrsUsed() in the previous patch.

Nice!

> 2. Refactored postgresGetForeignJoinPaths to be more readable moving the
> code to assess safety of join pushdown into a separate function.

That looks good.  But maybe call the function foreign_join_ok() or
something like that?  is_foreign_join() isn't terrible but it sounds a
little odd to me.

The path-copying stuff in get_path_for_epq_recheck() looks a lot
better now, but you neglected to add a comment explaining why you did
it that way (e.g. "Make a shallow copy of the join path, because the
planner might free the original structure after a future add_path().
We don't need to copy the substructure, though; that won't get freed."
 I would forget about setting merge_path->materialize_inner = false;
that doesn't seem essential.  Also, I would arrange things so that if
you hit an unrecognized path type (like a custom join, or a gather)
you skip that particular path instead of erroring out.  I think this
whole function should be moved to core, and I think the argument
should be a RelOptInfo * rather than a List *.

+ * We can't know VERBOSE option is specified or not, so always add shcema

We can't know "whether" VERBOSE...
shcema -> schema

+ * the join relaiton is already considered, so that we won't waste time in

Typo.

+ * judging safety of join pushdow and adding the same paths again if found

Typo.

More when I have a bit more time to look at this...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-01 Thread Robert Haas
On Mon, Feb 1, 2016 at 6:48 PM, Robert Haas  wrote:
> More when I have a bit more time to look at this...

Why does deparseSelectStmtForRel change the order of the existing
arguments?  I have no issue with adding new arguments as required, but
rearranging the existing argument order doesn't serve any useful
purpose that is immediately apparent.  There's also no real need for
the rel -> foreignrel renaming.

+/*
+ * If we have constructed the SELECT clause from the targetlist, construct
+ * the retrieved attributes list as continuously increasing list of
+ * integers.
+ */
+if (retrieved_attrs && tlist)
+{
+int i;
+*retrieved_attrs = NIL;
+for (i = 1; i <= list_length(tlist); i++)
+*retrieved_attrs = lappend_int(*retrieved_attrs, i);
+}

This is really wonky.  First, you pass retrieved_attrs to
deparseSelectSqlForBaseRel, but then you have this code which blows it
away and replaces it if tlist != NIL.  So I guess this will run always
for a join relation, and for a base relation only sometimes.  But
that's certainly not at all clear.  I think you need to find some way
of rearranging this so that it's more obvious what is going on here.

I suggest not renaming the existing deparseTargetList() and instead
coming up with a different name for the new thing you need, maybe
deparseExplicitTargetList().

How about adding another sentence to the header comment for
appendConditions() saying something like "This is used for both WHERE
clauses and for JOIN .. ON"?

+ * The targetlist is list of TargetEntry's which in turn contains Var nodes.

contain.

+/*
+ * Output join name for given join type */

Formatting.  This patch, overall, is badly in need of a pgindent run.

+/*
+ * XXX
+ * Since the query is being built in recursive manner from bottom up,
+ * the FOR UPDATE/SHARE clause referring the base relations can
not be added
+ * at the top level. They need to be added to the subqueries corresponding
+ * to the base relations. This has an undesirable effect of locking more
+ * rows than specified by user, as it locks even those rows from base
+ * relations which are not part of the final join result. To avoid this
+ * undesirable effect, we need to build the join query without the
+ * subqueries, which for now, seems hard.
+ */

This is a second good reason that we should actually do that work
instead of complaining that it's too hard.  The first one is that the
queries that come out of this right now are too long and hard to read.
I actually don't see why this is all that hard.  Deparsing the target
list is simple enough; you just need to emit tab.col using varno to
determine what tab looks like and varattno to determine what col looks
like.  The trickier part is emitting the FROM clause. But this doesn't
seem all that hard either.  Suppose that when we construct the fpinfo
(PgFdwRelationInfo *) for a relation, we include in it a FROM clause
appropriate to that rel.  So, for a baserel, it's something like "foo
r4" where 4 is foo's RTI.  For a joinrel, do this:

1. Emit the FROM clause constructed for the outer relation,
surrounding it with parentheses if the outer relation is a joinrel.
2. Emit " JOIN ", " LEFT JOIN ", " RIGHT JOIN ", or " FULL JOIN "
according to the join type.
3. Emit the FROM clause constructed for the inner relation,
surrounding it with parentheses if the inner relation is a joinrel.
4. Emit " ON ".
5. Emit the joinqual.

This will produce nice things like (foo r3 JOIN bar r4 ON r3.x = r4.x)
JOIN baz r2 ON r3.y = r2.y

Then, you'd also need to stuff the conditions into the
PgFdwRelationInfo so that those could be added to paths constructed at
higher levels.  But that's not too hard either.  Basically you'd end
up with mostly the same stuff you have now, but the PgFdwRelationInfo
would store a join clause and a set of deparsed quals to be included
in the FROM and WHERE clauses respectively.  And then you'd pull the
information from the inner and outer sides to build up what you need
at the joinrel level.

This would actually be faster than what you've got right now, because
right now you're recursing down the whole join tree all over again at
each new join level, maybe not the best plan.

+if (foreignrel->reloptkind == RELOPT_JOINREL)
+return;
+
 /* Add any necessary FOR UPDATE/SHARE. */
 deparseLockingClause();

Generally, I think in these kinds of cases it is better to reverse the
test and get rid of the return statement, like this:

if (foreignrel->reloptkind != RELOPT_JOINREL)
deparseLockingClause();

The way you wrote it, somebody who wants to add more code to the end
of the function will probably have to make that change anyhow.
Really, your intention here was to skip that code for joins, not to
skip the rest of the function for baserels.

@@ -1424,9 +1875,7 @@ deparseVar(Var *node, deparse_expr_cxt *context)
 

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-30 Thread Robert Haas
On Fri, Jan 29, 2016 at 3:46 AM, Ashutosh Bapat
 wrote:
> PFA patch to move code to deparse SELECT statement into a function
> deparseSelectStmtForRel(). This code is duplicated in
> estimate_path_cost_size() and postgresGetForeignPlan(), so moving it into a
> function avoids that duplication. As a side note, estimate_path_cost_size()
> doesn't add FOR SHARE/UPDATE clause to the statement being EXPLAINed, even
> if the actual statement during execution would have it. This difference
> looks unintentional to me. This patch corrects it as well.
> appendOrderByClause and appendWhereClause both create a context within
> themselves and pass it to deparseExpr. This patch creates the context once
> in deparseSelectStmtForRel() and then passes it to the other deparse
> functions avoiding repeated context creation.

Right, OK.  I think this is good, so, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-29 Thread Ashutosh Bapat
On Fri, Jan 29, 2016 at 2:05 PM, Etsuro Fujita 
wrote:

> On 2016/01/29 1:26, Ashutosh Bapat wrote:
>
>> Here's an updated version of the previous patches, broken up like before
>>
>
> 2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below
>>
>
> Here is the summary of changes from the last set of patches
>>
>
> 2. Included fix for EvalPlanQual in postgres_fdw - an alternate local
>> path is obtained from the list of paths linked to the joinrel. Since
>> this is done before adding the ForeignPath, we should be a local path
>> available for given join.
>>
>
> I looked at that code in the patch (ie, postgresRecheckForeignScan and the
> helper function that creates a local join path for a given foreign join
> path.), briefly.  Maybe I'm missing something, but I think that is
> basically the same as the fix I proposed for addressing this issue, posted
> before [1], right?


Yes, although I have added functions to copy the paths, not consider
pathkeys and change the relevant members of the paths. Sorry  if I have
missed giving you due credits.


>   If so, my concern is, the helper function probably wouldn't extend to
> the parameterized-foreign-join-path cases, though that would work well for
> the unparameterized-foreign-join-path cases.  We don't support
> parameterized-foreign-join paths for 9.6?
>
>
If we do not find a local path with given parameterization, it means there
are other local parameterized paths which are superior to it. This possibly
indicates that there will be foreign join parameterised paths which are
superior to this parameterized path, so we basically do not create foreign
join path with that parameterization.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-29 Thread Ashutosh Bapat
On Fri, Jan 29, 2016 at 9:51 AM, Robert Haas  wrote:

> On Thu, Jan 28, 2016 at 11:26 AM, Ashutosh Bapat
>  wrote:
> > 2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below
>
> This patch no longer quite applies because of conflicts with one of
> your other patches that I applied today (cf. commit
> fbe5a3fb73102c2cfec114a67943f4474383).
>
> And then I broke it some more by committing a patch to extract
> deparseLockingClause from postgresGetForeignPlan and move it to
> deparse.c, but that should pretty directly reduce the size of this
> patch.  I wonder if there are any other bits of refactoring of that
> sort that we can do in advance of landing the main patch, just to
> simplify review.  But I'm not sure there are: this patch removes very
> little existing code; it just adds a ton of new stuff.
>

PFA patch to move code to deparse SELECT statement into a function
deparseSelectStmtForRel(). This code is duplicated in
estimate_path_cost_size() and postgresGetForeignPlan(), so moving it into a
function avoids that duplication. As a side note, estimate_path_cost_size()
doesn't add FOR SHARE/UPDATE clause to the statement being EXPLAINed, even
if the actual statement during execution would have it. This difference
looks unintentional to me. This patch corrects it as well.
appendOrderByClause and appendWhereClause both create a context within
themselves and pass it to deparseExpr. This patch creates the context once
in deparseSelectStmtForRel() and then passes it to the other deparse
functions avoiding repeated context creation.


> copy_path_for_epq_recheck() and friends are really ugly.  Should we
> consider just adding copyObject() support for those node types
> instead?
>

the note in copyfuncs.c says
 * We also do not support copying Path trees, mainly
 * because the circular linkages between RelOptInfo and Path nodes can't
 * be handled easily in a simple depth-first traversal.

We may avoid that by just copying the pointer to RelOptInfo and not copy
the whole RelOptInfo. The other problem is paths in epq_paths will be
copied as many times as the number of 2-way joins pushed down. Let me give
it a try and produce patch with that.

I'm sure there's more -- this is a huge patch and I don't fully
> understand it yet -- but I'm out of energy for tonight.
>
>
Thanks a lot for your comments and moving this patch forward.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


pg_fdw_deparse_select.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-29 Thread Etsuro Fujita

On 2016/01/29 1:26, Ashutosh Bapat wrote:

Here's an updated version of the previous patches, broken up like before



2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below



Here is the summary of changes from the last set of patches



2. Included fix for EvalPlanQual in postgres_fdw - an alternate local
path is obtained from the list of paths linked to the joinrel. Since
this is done before adding the ForeignPath, we should be a local path
available for given join.


I looked at that code in the patch (ie, postgresRecheckForeignScan and 
the helper function that creates a local join path for a given foreign 
join path.), briefly.  Maybe I'm missing something, but I think that is 
basically the same as the fix I proposed for addressing this issue, 
posted before [1], right?  If so, my concern is, the helper function 
probably wouldn't extend to the parameterized-foreign-join-path cases, 
though that would work well for the unparameterized-foreign-join-path 
cases.  We don't support parameterized-foreign-join paths for 9.6?


Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/5666b59f.6010...@lab.ntt.co.jp




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-28 Thread Robert Haas
On Thu, Jan 28, 2016 at 11:26 AM, Ashutosh Bapat
 wrote:
> 1. pg_fdw_core_v3.patch: changes in core - more description below

I've committed most of this patch, with some modifications.  In
particular, I moved CachedPlanSource's hasForeignJoin flag to the
CachedPlan and renamed it has_foreign_join, consistent with the naming
of other members.

The GetUserMappingById() function seemed like a separate thing, so I
left that out of this commit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-28 Thread Robert Haas
On Thu, Jan 28, 2016 at 11:26 AM, Ashutosh Bapat
 wrote:
> 2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below

This patch no longer quite applies because of conflicts with one of
your other patches that I applied today (cf. commit
fbe5a3fb73102c2cfec114a67943f4474383).

And then I broke it some more by committing a patch to extract
deparseLockingClause from postgresGetForeignPlan and move it to
deparse.c, but that should pretty directly reduce the size of this
patch.  I wonder if there are any other bits of refactoring of that
sort that we can do in advance of landing the main patch, just to
simplify review.  But I'm not sure there are: this patch removes very
little existing code; it just adds a ton of new stuff.

I think the names deparseColumnRefForJoinrel and
deparseColumnRefForBaserel are better than the previous names, but I
would capitalize the last "r", so "Rel" instead of "rel".  But it's
weird that we have those functions and also just plain old
deparseColumnRef, which is logically part of
deparseColumnRefForBaserel but inexplicably remains a separate
function.  I still don't see why you can't just add a bunch of new
logic to the existing deparseColumnRef, change the last argument to be
of type deparse_expr_cxt instead of PlannerInfo, and have that one
function simply handle more cases than it does currently.

It seems unlikely to me that postgresGetForeignPlan really needs to
call list_free_deep(fdw_scan_tlist).  Can't we just let memory context
reset clean that up?

In postgresGetForeignPlan (and I think some other functions), you
renamed the argument from baserel to foreignrel.  But I think it would
be better to just go with "rel".  We do that elsewhere in various
places, and it seems fine here too.  And it's shorter.

copy_path_for_epq_recheck() and friends are really ugly.  Should we
consider just adding copyObject() support for those node types
instead?

The error message quality in conversion_error_callback() looks
unacceptably poor.  The column number we're proposing to output will
be utterly meaningless to the user.  It would ideally be desirable to
output the base table name and the column name from that table.

I'm sure there's more -- this is a huge patch and I don't fully
understand it yet -- but I'm out of energy for tonight.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-25 Thread Robert Haas
On Thu, Jan 21, 2016 at 8:36 AM, Ashutosh Bapat
 wrote:
>> What this patch does to the naming and calling conventions in
>> deparse.c does not good.  Previously, we had deparseTargetList().
>> Now, we sometimes use that and sometimes deparseAttrsUsed() for almost
>> the same thing.
>
> Previously deparseTargetList deparsed the SELECT or RETURNING clause by
> including list of name of attributes provided by attrs_used. That's now done
> by deparseAttrsUsed(). In current path deparseTargetList() deparses the
> targetlist i.e. list of TargetEntry nodes (right now only Vars). Although
> these two functions return comma separated string of column names, their
> inputs are different. deparseAttrsUsed() can never be called for more than
> one base relation. deparseTargetList() on the other hand can deparse a
> targetlist with Var nodes from multiple relations. We need those two
> functionalities separately. We might convert attrs_used into a list of
> TargetEntry nodes using build_tlist_to_deparse() and use deparseTargetList
> everywhere. A side effect of that would be separating retrieved_attrs
> collection from deparsing code. I didn't do that change in this version to
> avoid large code changes. But I am fine doing that, if that makes code
> readable.
>
> If we have to keep old deparseTargetList as is (and don't rename it as
> deparseAttrsUsed), we can rename the new deparseTargetList as something
> different may be deparseSelectList. I am fine with that too. But we need the
> later functionality, whatever its name be.

I'm not arguing that we don't need the functionality.  I'm arguing
that if we've got a set of existing functions that are named one way,
we shouldn't get a whole bunch of new functions that invent an
entirely new naming convention.  I'm not sure exactly how to clean
this up, but I think we need to find a way.

>> Previously, we had deparseColumnRef(); now we have
>> both that and deparseJoinVar() doing very similar things.  But in each
>> case, the function names are not parallel and the calling conventions
>> are totally different.  Like here:
>>
>> +   if (context->foreignrel->reloptkind == RELOPT_JOINREL)
>> +   deparseJoinVar(node, context);
>> +   else
>> +   deparseColumnRef(buf, node->varno,
>> node->varattno, context->root);
>>
>> We pass the buf separately to deparseColumnRef(), but for some reason
>> not to deparseJoinVar().I wonder if these functions need to be two
>> separate things or if the work done by deparseJoinVar() should
>> actually be part of deparseColumnRef().  But even if it needs to be
>> separate, I wonder why we can't arrange things so that they get the
>> same arguments, more or less.
>
> deparseVar() is called for any Var node that's encountered. deparseJoinVar
> is called to deparse a Var from join relation, which is supposed to output
> INNER or OUTER var's alias as used in INNER or OUTER subqueries. It does not
> output the real column names. deparseColumnRef() however is the same old
> thing; it deparses column of given base relation. They are not similar
> things.

deparseColumnRef() emits things like "foo" meaning column foo, or
"foo.bar" meaning column bar of table foo.  deparseJoinVar() emits
things like "r.a7", referring to a column called "a7" in a relation
called "r".  I feel that those *are* similar things.

I also wonder whether they couldn't be made more similar.  It seems to
me this patch is going to realias things potentially multiple times
for its own convenience.  That's not a catastrophe, but it's not
great, either, because it produces queries that are not necessarily
very human readable.  It would be nicer to get
actual_table_name.actual_column_name in more places and r.a7 in fewer.

> I agree that the code is complex for a reader. One of the reasons is
> recursive nature of deparsing. I will try to make it more cleaner and easier
> to understand. Would adding a call tree for deparsing routines help here? Or
> improving function prologues of even the existing functions?

I don't think so.  A README might help, but honestly I think some of
these APIs really just need to be revised.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-25 Thread Robert Haas
On Thu, Jan 21, 2016 at 12:47 AM, Ashutosh Bapat
 wrote:
>> Well, we kind of need to get it right, not just be guessing.
>>
>> It looks to me as though GetConnection() only uses the user ID as a
>> cache lookup key.  What it's trying to do is ensure that if user A and
>> user B need different user mapping options, we don't use the same
>> connection for both.  So, actually, passing InvalidOid seems like it's
>> not really a problem here.  It's arguably more correct than what we've
>> been doing up until now, since it means we cache the connection under
>> user OID whose options we used, rather than the user OID that asked
>> about the options.
>
> In that case, do we need GetUserMappingById changes in that patch and not
> pull it out. If we are keeping those changes, I need some clarification
> about your comment
>
>> Even if that were no issue, it doesn't seem to add anything.  The only
>> caller of the new function is  postgresBeginForeignScan(), and that
>> function already has a way of getting the user mapping.  The new way
>> doesn't seem to be better or faster, so why bother changing it?
>
> In pg_fdw_join_v2.patch, postgresBeginForeignScan() obtained user mapping
> using GetUserMappingById() instead of the earlier way of fetching it by
> userid and serverid. So even that change will remain, right?

In light of the investigation which led to the first comment you
quoted, I withdraw the second one (which I think I actually made
chronologically prior to the first one).  I'm not exactly sure what
needs to happen here, but it seems to me that maybe the connection
cache should be changed to be keyed by user mapping OID.  That seems
like it would address the problem you mention here:

http://www.postgresql.org/message-id/CAFjFpRf-LiD5bai4D6cSUseJh=xxjqipo_vn8mtnzg16tmw...@mail.gmail.com

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-21 Thread Ashutosh Bapat
> > In such a
> > case, which userid should be stored in UserMapping structure?It might
> look
> > like setting GetUserId() as userid in UserMapping is wise, but not
> really.
> > All the foreign tables might have different effective userids, each
> > different from GetUserId() and what GetUserId() would return might have a
> > user mapping different from the effective userids. What userid should
> > UserMapping structure have in that case? I thought, that's why Hanada-san
> > used invalid userid there, so left as it is.
>
> Well, we kind of need to get it right, not just be guessing.
>
> It looks to me as though GetConnection() only uses the user ID as a
> cache lookup key.  What it's trying to do is ensure that if user A and
> user B need different user mapping options, we don't use the same
> connection for both.  So, actually, passing InvalidOid seems like it's
> not really a problem here.  It's arguably more correct than what we've
> been doing up until now, since it means we cache the connection under
> user OID whose options we used, rather than the user OID that asked
> about the options.
>

That means that, right now, for two different local users which use public
user mapping we will be creating two different connections to the foreign
server with the same credentials. That doesn't look good. If we obtained
user mapping using user mapping oid (which will have invalid user id for
public user mapping) and used userid from that structure, we will get rid
of this problem.


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-21 Thread Ashutosh Bapat
On Thu, Jan 21, 2016 at 3:03 AM, Robert Haas  wrote:

> On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
>  wrote:
> > 2. pg_fdw_join_v2.patch: postgres_fdw changes for supporting join
> pushdown
>
> The very first hunk of this patch contains annoying whitespace
> changes.  Even if the result is what pgindent is going to do anyway,
> such changes should be stripped out of patches for ease of review.  In
> this case, though, I'm pretty sure it isn't what pgindent is going to
> do, so it's just useless churn.  Please remove all such changes from
> the patch.
>

Done.


>
> find_var_pos() looks like it should just be inlined into its only caller.
>
> Node *node = (Node *) var;
> TargetListEntry *tle = tlist_member(node, context->outerlist);
> if (tle)
> {
> side = OUTER_ALIAS;
> pos = tle->resno;
> }
> else
> {
> side = INNER_ALIAS;
> tle = tlist_member(node, context->innertlist);
> pos = tle->resno;
> }
>
> Why are we calling the return value "pos" instead of "resno" as we
> typically do elsewhere?
>

I have rewritten deparseJoinVar as
/*
 * Deparse given Var required for a joinrel into buf.
 */
static void
deparseJoinVar(Var *node, deparse_expr_cxt *context)
{
char*side;
TargetEntry *tle;

/* Lookup outer side */
tle = tlist_member((Node *)node, context->outertlist);
if (tle)
side = OUTER_ALIAS;
else
{
/* Not found on outer side; lookup inner */
side = INNER_ALIAS;
tle = tlist_member((Node *)node, context->innertlist);
}

/* The input var should be either on left or right side */
Assert(tle && side);

appendStringInfo(context->buf, "%s.%s%d", side, COL_ALIAS_PREFIX,
tle->resno);
}


> get_jointype_name() would probably be better written as a switch.


Done.


> On
> the flip side, deparseColumnRef() would have a lot less code churn if
> it *weren't* written as a switch.
>

Done.


>
> What this patch does to the naming and calling conventions in
> deparse.c does not good.  Previously, we had deparseTargetList().
> Now, we sometimes use that and sometimes deparseAttrsUsed() for almost
> the same thing.


Previously deparseTargetList deparsed the SELECT or RETURNING clause by
including list of name of attributes provided by attrs_used. That's now
done by deparseAttrsUsed(). In current path deparseTargetList() deparses
the targetlist i.e. list of TargetEntry nodes (right now only Vars).
Although these two functions return comma separated string of column names,
their inputs are different. deparseAttrsUsed() can never be called for more
than one base relation. deparseTargetList() on the other hand can deparse a
targetlist with Var nodes from multiple relations. We need those two
functionalities separately. We might convert attrs_used into a list of
TargetEntry nodes using build_tlist_to_deparse() and use deparseTargetList
everywhere. A side effect of that would be separating retrieved_attrs
collection from deparsing code. I didn't do that change in this version to
avoid large code changes. But I am fine doing that, if that makes code
readable.

If we have to keep old deparseTargetList as is (and don't rename it as
deparseAttrsUsed), we can rename the new deparseTargetList as something
different may be deparseSelectList. I am fine with that too. But we need
the later functionality, whatever its name be.


> Previously, we had deparseColumnRef(); now we have
> both that and deparseJoinVar() doing very similar things.  But in each
> case, the function names are not parallel and the calling conventions
> are totally different.  Like here:
>
> +   if (context->foreignrel->reloptkind == RELOPT_JOINREL)
> +   deparseJoinVar(node, context);
> +   else
> +   deparseColumnRef(buf, node->varno,
> node->varattno, context->root);
>
> We pass the buf separately to deparseColumnRef(), but for some reason
> not to deparseJoinVar().I wonder if these functions need to be two
> separate things or if the work done by deparseJoinVar() should
> actually be part of deparseColumnRef().  But even if it needs to be
> separate, I wonder why we can't arrange things so that they get the
> same arguments, more or less.
>

deparseVar() is called for any Var node that's encountered. deparseJoinVar
is called to deparse a Var from join relation, which is supposed to output
INNER or OUTER var's alias as used in INNER or OUTER subqueries. It does
not output the real column names. deparseColumnRef() however is the same
old thing; it deparses column of given base relation. They are not similar
things.

deparseJoinVar gets its buf from context, which we do not pass to
deparseColumnRef(). Not all callers of deparseColumnRef have a
deparse_expr_cxt with them. Also, outertlist and innertlist required by
deparseJoinVar are passed through deparse_expr_cxt. It doesn't look worth
to create a context just for the sake of making function 

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-20 Thread Ashutosh Bapat
On Thu, Jan 21, 2016 at 2:07 AM, Robert Haas  wrote:

>
> Well, we kind of need to get it right, not just be guessing.
>
> It looks to me as though GetConnection() only uses the user ID as a
> cache lookup key.  What it's trying to do is ensure that if user A and
> user B need different user mapping options, we don't use the same
> connection for both.  So, actually, passing InvalidOid seems like it's
> not really a problem here.  It's arguably more correct than what we've
> been doing up until now, since it means we cache the connection under
> user OID whose options we used, rather than the user OID that asked
> about the options.
>
>
In that case, do we need GetUserMappingById changes in that patch and not
pull it out. If we are keeping those changes, I need some clarification
about your comment

Even if that were no issue, it doesn't seem to add anything.  The only
> caller of the new function is  postgresBeginForeignScan(), and that
> function already has a way of getting the user mapping.  The new way
> doesn't seem to be better or faster, so why bother changing it?
>

In pg_fdw_join_v2.patch, postgresBeginForeignScan() obtained user mapping
using GetUserMappingById() instead of the earlier way of fetching it by
userid and serverid. So even that change will remain, right?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-20 Thread Robert Haas
On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
 wrote:
> 2. pg_fdw_join_v2.patch: postgres_fdw changes for supporting join pushdown

The very first hunk of this patch contains annoying whitespace
changes.  Even if the result is what pgindent is going to do anyway,
such changes should be stripped out of patches for ease of review.  In
this case, though, I'm pretty sure it isn't what pgindent is going to
do, so it's just useless churn.  Please remove all such changes from
the patch.

find_var_pos() looks like it should just be inlined into its only caller.

Node *node = (Node *) var;
TargetListEntry *tle = tlist_member(node, context->outerlist);
if (tle)
{
side = OUTER_ALIAS;
pos = tle->resno;
}
else
{
side = INNER_ALIAS;
tle = tlist_member(node, context->innertlist);
pos = tle->resno;
}

Why are we calling the return value "pos" instead of "resno" as we
typically do elsewhere?

get_jointype_name() would probably be better written as a switch.  On
the flip side, deparseColumnRef() would have a lot less code churn if
it *weren't* written as a switch.

What this patch does to the naming and calling conventions in
deparse.c does not good.  Previously, we had deparseTargetList().
Now, we sometimes use that and sometimes deparseAttrsUsed() for almost
the same thing.  Previously, we had deparseColumnRef(); now we have
both that and deparseJoinVar() doing very similar things.  But in each
case, the function names are not parallel and the calling conventions
are totally different.  Like here:

+   if (context->foreignrel->reloptkind == RELOPT_JOINREL)
+   deparseJoinVar(node, context);
+   else
+   deparseColumnRef(buf, node->varno,
node->varattno, context->root);

We pass the buf separately to deparseColumnRef(), but for some reason
not to deparseJoinVar().  I wonder if these functions need to be two
separate things or if the work done by deparseJoinVar() should
actually be part of deparseColumnRef().  But even if it needs to be
separate, I wonder why we can't arrange things so that they get the
same arguments, more or less.

Generally, I think this patch is on the right track, but I think
there's a good bit of work to be done to make it clearer and more
understandable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-20 Thread Robert Haas
On Wed, Jan 20, 2016 at 8:53 AM, Ashutosh Bapat
 wrote:
> I missed the example plan cache revalidation patch in the previous mail.
> Sorry. Here it is.

I see.  Yeah, I don't see a reason offhand why that wouldn't work.
But you need to update src/backend/nodes for the new field in each
place where PlannedStmt or PlannerGlobal is mentioned.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-20 Thread Robert Haas
On Wed, Jan 20, 2016 at 8:50 AM, Ashutosh Bapat
 wrote:
>> Third, I removed GetUserMappingById().  As mentioned in the email to
>> which I replied earlier, that doesn't actually produce the same result
>> as the way we're doing it now, and might leave the user ID invalid.
>
> The comment you quoted was my comment :). I never got a reply from
> Hanada-san on that comment. A bit of investigation revealed this: A pushed
> down foreign join which involves N foreign tables, might have different
> effective userid for each of them.
> userid = OidIsValid(rte->checkAsUser) ? rte->checkAsUser : GetUserId()
> In such a case, AFAIU, the join will be pushed down only if none of those
> have user mapping and there is public user mapping. Is that right?

Yes, I think that is right.

> In such a
> case, which userid should be stored in UserMapping structure?It might look
> like setting GetUserId() as userid in UserMapping is wise, but not really.
> All the foreign tables might have different effective userids, each
> different from GetUserId() and what GetUserId() would return might have a
> user mapping different from the effective userids. What userid should
> UserMapping structure have in that case? I thought, that's why Hanada-san
> used invalid userid there, so left as it is.

Well, we kind of need to get it right, not just be guessing.

It looks to me as though GetConnection() only uses the user ID as a
cache lookup key.  What it's trying to do is ensure that if user A and
user B need different user mapping options, we don't use the same
connection for both.  So, actually, passing InvalidOid seems like it's
not really a problem here.  It's arguably more correct than what we've
been doing up until now, since it means we cache the connection under
user OID whose options we used, rather than the user OID that asked
about the options.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-20 Thread Ashutosh Bapat
On Wed, Jan 20, 2016 at 4:58 AM, Robert Haas  wrote:

> On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
>  wrote:
> > Thanks Thom for bringing it to my notice quickly. Sorry for the same.
> >
> > Here are the patches.
> >
> > 1. pg_fdw_core_v2.patch: changes in core related to user mapping
> handling,
> > GUC
> >   enable_foreignjoin
>
> I tried to whittle this patch down to something that I'd be
> comfortable committing and ended up with nothing left.
>
> First, I removed the enable_foreignjoin GUC.  I still think an
> FDW-specific GUC is better, and I haven't heard anybody make a strong
> argument the other way. Your argument that this might be inconvenient
> if somebody is using a bunch of join-pushdown-enabled FDWs sounds like
> a strictly theoretical problem, considering how much difficulty we're
> having getting even one FDW to support join pushdown.  And if it does
> happen, the user can script it.  I'm willing to reconsider this point
> if there is a massive chorus of support for having this be a core
> option rather than an FDW option, but to me it seems that we've gone
> to a lot of trouble to make the system extensible and might as well
> get some benefit from it.
>

Ok. Removed.


>
> Second, I removed the documentation for GetForeignTable().  That
> function is already documented and doesn't need re-documenting.
>

Removed.


>
> Third, I removed GetUserMappingById().  As mentioned in the email to
> which I replied earlier, that doesn't actually produce the same result
> as the way we're doing it now, and might leave the user ID invalid.
>

The comment you quoted was my comment :). I never got a reply from
Hanada-san on that comment. A bit of investigation revealed this: A pushed
down foreign join which involves N foreign tables, might have different
effective userid for each of them.
userid = OidIsValid(rte->checkAsUser) ? rte->checkAsUser : GetUserId()
In such a case, AFAIU, the join will be pushed down only if none of those
have user mapping and there is public user mapping. Is that right? In such
a case, which userid should be stored in UserMapping structure?It might
look like setting GetUserId() as userid in UserMapping is wise, but not
really. All the foreign tables might have different effective userids, each
different from GetUserId() and what GetUserId() would return might have a
user mapping different from the effective userids. What userid should
UserMapping structure have in that case? I thought, that's why Hanada-san
used invalid userid there, so left as it is.

But that has an undesirable effect of setting it to invalid, even in case
of base relation or when all the joining relations have same effective
userid. We may cache "any" of the effective userids of joining relations if
the user mapping matches in build_join_rel() (we will need to add another
field userid in RelOptInfo along with umid). While creating the plan use
this userid to get user mapping. Does that sound good? This clubbed with
the plan cache invalidation should be full proof. We need a way to get user
mapping whether from umid (in which case public user mapping will have -1)
or serverid and some userid.


> Even if that were no issue, it doesn't seem to add anything.  The only
> caller of the new function is  postgresBeginForeignScan(), and that
> function already has a way of getting the user mapping.  The new way
> doesn't seem to be better or faster, so why bother changing it?
>

> At this point, I was down to just the changes to store the user
> mapping ID (umid) in the RelOptInfo, and to consider join pushdown
> only if the user mapping IDs match.  One observation I made is that if
> the code to initialize the FDW-related fields were lifted from
> get_relation_info() up to build_simple_rel(), we would not need to use
> planner_rt_fetch(), because the caller already has that information.
> That seems like it might be worth considering.  But then I realized a
> more fundamental problem: making the plan depend on the user ID is a
> problem, because the user ID can be changed, and the plan might be
> cached.  The same issue arises for RLS, but there is provision for
> that in RevalidateCachedQuery.  This patch makes no similar provision.
>

Thanks for the catch. Please see attached patch for a quick fix in
RevalidateCachedQuery(). Are you thinking on similar lines? However, I am
not sure of planUserId - that field actually puzzles me. It's set when the
first time we create a plan and it never changes then. This seems to be a
problem, even for RLS, in following scene

prepare statement using RLS feature
execute statement -- now planUserId is set to say user1
set session role user2;
execute statement - RevalidateCachedQuery() detects that the user has
changed and invalidates the plan and recreates it (including possibly the
query analyze and rewrite). Note planUserId is unchanged here; it's still
user1
reset role; -- now GetUserId() 

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-20 Thread Ashutosh Bapat
I missed the example plan cache revalidation patch in the previous mail.
Sorry. Here it is.

On Wed, Jan 20, 2016 at 7:20 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
> On Wed, Jan 20, 2016 at 4:58 AM, Robert Haas 
> wrote:
>
>> On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
>>  wrote:
>> > Thanks Thom for bringing it to my notice quickly. Sorry for the same.
>> >
>> > Here are the patches.
>> >
>> > 1. pg_fdw_core_v2.patch: changes in core related to user mapping
>> handling,
>> > GUC
>> >   enable_foreignjoin
>>
>> I tried to whittle this patch down to something that I'd be
>> comfortable committing and ended up with nothing left.
>>
>> First, I removed the enable_foreignjoin GUC.  I still think an
>> FDW-specific GUC is better, and I haven't heard anybody make a strong
>> argument the other way. Your argument that this might be inconvenient
>> if somebody is using a bunch of join-pushdown-enabled FDWs sounds like
>> a strictly theoretical problem, considering how much difficulty we're
>> having getting even one FDW to support join pushdown.  And if it does
>> happen, the user can script it.  I'm willing to reconsider this point
>> if there is a massive chorus of support for having this be a core
>> option rather than an FDW option, but to me it seems that we've gone
>> to a lot of trouble to make the system extensible and might as well
>> get some benefit from it.
>>
>
> Ok. Removed.
>
>
>>
>> Second, I removed the documentation for GetForeignTable().  That
>> function is already documented and doesn't need re-documenting.
>>
>
> Removed.
>
>
>>
>> Third, I removed GetUserMappingById().  As mentioned in the email to
>> which I replied earlier, that doesn't actually produce the same result
>> as the way we're doing it now, and might leave the user ID invalid.
>>
>
> The comment you quoted was my comment :). I never got a reply from
> Hanada-san on that comment. A bit of investigation revealed this: A pushed
> down foreign join which involves N foreign tables, might have different
> effective userid for each of them.
> userid = OidIsValid(rte->checkAsUser) ? rte->checkAsUser : GetUserId()
> In such a case, AFAIU, the join will be pushed down only if none of those
> have user mapping and there is public user mapping. Is that right? In such
> a case, which userid should be stored in UserMapping structure?It might
> look like setting GetUserId() as userid in UserMapping is wise, but not
> really. All the foreign tables might have different effective userids, each
> different from GetUserId() and what GetUserId() would return might have a
> user mapping different from the effective userids. What userid should
> UserMapping structure have in that case? I thought, that's why Hanada-san
> used invalid userid there, so left as it is.
>
> But that has an undesirable effect of setting it to invalid, even in case
> of base relation or when all the joining relations have same effective
> userid. We may cache "any" of the effective userids of joining relations if
> the user mapping matches in build_join_rel() (we will need to add another
> field userid in RelOptInfo along with umid). While creating the plan use
> this userid to get user mapping. Does that sound good? This clubbed with
> the plan cache invalidation should be full proof. We need a way to get user
> mapping whether from umid (in which case public user mapping will have -1)
> or serverid and some userid.
>
>
>> Even if that were no issue, it doesn't seem to add anything.  The only
>> caller of the new function is  postgresBeginForeignScan(), and that
>> function already has a way of getting the user mapping.  The new way
>> doesn't seem to be better or faster, so why bother changing it?
>>
>
>> At this point, I was down to just the changes to store the user
>> mapping ID (umid) in the RelOptInfo, and to consider join pushdown
>> only if the user mapping IDs match.  One observation I made is that if
>> the code to initialize the FDW-related fields were lifted from
>> get_relation_info() up to build_simple_rel(), we would not need to use
>> planner_rt_fetch(), because the caller already has that information.
>> That seems like it might be worth considering.  But then I realized a
>> more fundamental problem: making the plan depend on the user ID is a
>> problem, because the user ID can be changed, and the plan might be
>> cached.  The same issue arises for RLS, but there is provision for
>> that in RevalidateCachedQuery.  This patch makes no similar provision.
>>
>
> Thanks for the catch. Please see attached patch for a quick fix in
> RevalidateCachedQuery(). Are you thinking on similar lines? However, I am
> not sure of planUserId - that field actually puzzles me. It's set when the
> first time we create a plan and it never changes then. This seems to be a
> problem, even for RLS, in following scene
>
> prepare statement using RLS feature
> execute statement 

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-19 Thread Robert Haas
On Wed, Aug 19, 2015 at 8:40 AM, Ashutosh Bapat
 wrote:
> I started reviewing the other patches.
>
> In patch foreign_join_v16.patch, the user mapping structure being passed to
> GetConnection() is the one obtained from GetUserMappingById().
> GetUserMappingById() constructs the user mapping structure from the user
> mapping catalog. For public user mappings, catalog entries have InvalidOid
> as userid. Thus, with this patch there is a chance that userid in
> UserMapping being passed to GetConnection(), contains InvalidOid as userid.
> This is not the case today. The UserMapping structure constructed using
> GetUserMapping(Oid userid, Oid serverid) (which ultimately gets passed to
> GetConnection()), has the passed in userid and not the one in the catalog.
> Is this change intentional?

This point seems not to have been addressed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-19 Thread Robert Haas
On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
 wrote:
> Thanks Thom for bringing it to my notice quickly. Sorry for the same.
>
> Here are the patches.
>
> 1. pg_fdw_core_v2.patch: changes in core related to user mapping handling,
> GUC
>   enable_foreignjoin

I tried to whittle this patch down to something that I'd be
comfortable committing and ended up with nothing left.

First, I removed the enable_foreignjoin GUC.  I still think an
FDW-specific GUC is better, and I haven't heard anybody make a strong
argument the other way. Your argument that this might be inconvenient
if somebody is using a bunch of join-pushdown-enabled FDWs sounds like
a strictly theoretical problem, considering how much difficulty we're
having getting even one FDW to support join pushdown.  And if it does
happen, the user can script it.  I'm willing to reconsider this point
if there is a massive chorus of support for having this be a core
option rather than an FDW option, but to me it seems that we've gone
to a lot of trouble to make the system extensible and might as well
get some benefit from it.

Second, I removed the documentation for GetForeignTable().  That
function is already documented and doesn't need re-documenting.

Third, I removed GetUserMappingById().  As mentioned in the email to
which I replied earlier, that doesn't actually produce the same result
as the way we're doing it now, and might leave the user ID invalid.
Even if that were no issue, it doesn't seem to add anything.  The only
caller of the new function is  postgresBeginForeignScan(), and that
function already has a way of getting the user mapping.  The new way
doesn't seem to be better or faster, so why bother changing it?

At this point, I was down to just the changes to store the user
mapping ID (umid) in the RelOptInfo, and to consider join pushdown
only if the user mapping IDs match.  One observation I made is that if
the code to initialize the FDW-related fields were lifted from
get_relation_info() up to build_simple_rel(), we would not need to use
planner_rt_fetch(), because the caller already has that information.
That seems like it might be worth considering.  But then I realized a
more fundamental problem: making the plan depend on the user ID is a
problem, because the user ID can be changed, and the plan might be
cached.  The same issue arises for RLS, but there is provision for
that in RevalidateCachedQuery.  This patch makes no similar provision.

I think there are two ways forward here.  One is to figure out a way
for the plancache to invalidate queries using FDW join pushdown when
the user ID changes.  The other is to recheck at execution time
whether the user mapping IDs still match, and if not, fall back to
using the "backup" plan that we need anyway for EvalPlanQual rechecks.
This would of course mean that the backup plan would need to be
something decently efficient, not just whatever we had nearest to
hand.  But that might not be too hard to manage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-18 Thread Ashutosh Bapat
Hi All,
PFA patches for postgres_fdw join pushdown, taken care of all TODOs in my
last mail.

Here is the list of things that have been improved/added new as compared to
Hanada-san's previous patch at [1].

1. Condition handling for join
Patch in [1] allowed a foreign join to be pushed down if only all the
conditions were safe to push down to the foreign server. This patch
differentiates these conditions into 1. conditions to be applied while
joining (ON clause) 2. conditions to be applied after joining (WHERE
clause). For a join to be safe to pushdown, only conditions in 1 need to be
all safe to pushdown. The conditions in second category, which are not safe
to be pushed down can be applied locally. This allows more avenue for join
pushdown. For an INNER join all the conditions can be applied on the cross
product. Hence we can push down an INNER join even if one or more of the
conditions are not safe to be pushed down. This patch includes the
optimization as well.

2. Targetlist handling:
The columns required to evaluate the non-pushable conditions on a join
relation need to be fetched from the foreign server. In previous patch the
SELECT clauses were built from rel->reltargetlist, which doesn't contain
these columns. This patch includes those columns as well.

3. Column projection:
Earlier patch required another layer of SQL to project whole-row attribute
from a base relation. This patch takes care of that while constructing and
deparsing
targetlist. This reduces the complexity and length of the query to be sent
to the foreign server e.g.

With the projection in previous patch the query looked like
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)
ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
  QUERY PLAN


... explain output clipped
   Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT
l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10,
l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7
a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN
(SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9
FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8
a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))

With this patch it looks like
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)
ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
   QUERY
PLAN

... explain output clipped
   Remote SQL: SELECT l.a3, l.a4, l.a1, l.a2, r.a2 FROM (SELECT
"C 1", c3, ctid, ROW("C 1", c2, c3, c4, c5, c6, c7, c8) FROM "S 1"."T 1") l
(a1, a2, a3, a4) INNER JOIN (SELECT "C 1", ROW("C 1", c2, c3, c4, c5, c6,
c7, c8) FROM "S 1"."T 1") r (a1, a2) ON (TRUE) WHERE ((l.a1 = r.a1))
(9 rows)

4. Local cost estimation
Previous patch had a TODO left for estimating join cost locally, when
use_remote_estimate is false. This patch adds support for the same. The
relevant
discussion in mail thread [2], [3].

5. This patch adds a GUC enable_foreignjoin to enable or disable join
pushdown through core.

6. Added more tests to test lateral references, unsafe to push conditions
at various places in the query,

Many cosmetic improvements like adding static function declarations,
comment improvements and making code readable.

[1]
http://www.postgresql.org/message-id/CAEZqfEe9KGy=1_wagh2rgzpg0o4pqgd+iauyaj8wtze+cyj...@mail.gmail.com
[2]
http://www.postgresql.org/message-id/cafjfprcqswus+tb5iyp1m3c-w0k3xab6h5mw4+n2q2iuafs...@mail.gmail.com
[3]
http://www.postgresql.org/message-id/CAFjFpRepSC2e3mZ1uYSopJD6R19fOZ0dNNf9Z=gnyksb6wg...@mail.gmail.com

I will be working next on (in that order)
1. eval_plan_qual fix for foreign join. (Considered as a must-have for 9.6)
2. Pushing down ORDER BY clause along with join pushdown
3. Parameterization of foreign join paths (Given the complexity of the
feature this may not make it into 9.6)

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-18 Thread Thom Brown
On 18 January 2016 at 10:46, Ashutosh Bapat
 wrote:
> Hi All,
> PFA patches for postgres_fdw join pushdown, taken care of all TODOs in my
> last mail.
>
> Here is the list of things that have been improved/added new as compared to
> Hanada-san's previous patch at [1].
>
> 1. Condition handling for join
> Patch in [1] allowed a foreign join to be pushed down if only all the
> conditions were safe to push down to the foreign server. This patch
> differentiates these conditions into 1. conditions to be applied while
> joining (ON clause) 2. conditions to be applied after joining (WHERE
> clause). For a join to be safe to pushdown, only conditions in 1 need to be
> all safe to pushdown. The conditions in second category, which are not safe
> to be pushed down can be applied locally. This allows more avenue for join
> pushdown. For an INNER join all the conditions can be applied on the cross
> product. Hence we can push down an INNER join even if one or more of the
> conditions are not safe to be pushed down. This patch includes the
> optimization as well.
>
> 2. Targetlist handling:
> The columns required to evaluate the non-pushable conditions on a join
> relation need to be fetched from the foreign server. In previous patch the
> SELECT clauses were built from rel->reltargetlist, which doesn't contain
> these columns. This patch includes those columns as well.
>
> 3. Column projection:
> Earlier patch required another layer of SQL to project whole-row attribute
> from a base relation. This patch takes care of that while constructing and
> deparsing
> targetlist. This reduces the complexity and length of the query to be sent
> to the foreign server e.g.
>
> With the projection in previous patch the query looked like
> EXPLAIN (COSTS false, VERBOSE)
> SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)
> ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
>   QUERY PLAN
> ... explain output clipped
>Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT
> l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10,
> l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7
> a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN
> (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9
> FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8
> a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
>
> With this patch it looks like
> EXPLAIN (COSTS false, VERBOSE)
> SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)
> ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
>QUERY PLAN
> ... explain output clipped
>Remote SQL: SELECT l.a3, l.a4, l.a1, l.a2, r.a2 FROM (SELECT
> "C 1", c3, ctid, ROW("C 1", c2, c3, c4, c5, c6, c7, c8) FROM "S 1"."T 1") l
> (a1, a2, a3, a4) INNER JOIN (SELECT "C 1", ROW("C 1", c2, c3, c4, c5, c6,
> c7, c8) FROM "S 1"."T 1") r (a1, a2) ON (TRUE) WHERE ((l.a1 = r.a1))
> (9 rows)
>
> 4. Local cost estimation
> Previous patch had a TODO left for estimating join cost locally, when
> use_remote_estimate is false. This patch adds support for the same. The
> relevant
> discussion in mail thread [2], [3].
>
> 5. This patch adds a GUC enable_foreignjoin to enable or disable join
> pushdown through core.
>
> 6. Added more tests to test lateral references, unsafe to push conditions at
> various places in the query,
>
> Many cosmetic improvements like adding static function declarations, comment
> improvements and making code readable.
>
> [1]
> http://www.postgresql.org/message-id/CAEZqfEe9KGy=1_wagh2rgzpg0o4pqgd+iauyaj8wtze+cyj...@mail.gmail.com
> [2]
> http://www.postgresql.org/message-id/cafjfprcqswus+tb5iyp1m3c-w0k3xab6h5mw4+n2q2iuafs...@mail.gmail.com
> [3]
> http://www.postgresql.org/message-id/CAFjFpRepSC2e3mZ1uYSopJD6R19fOZ0dNNf9Z=gnyksb6wg...@mail.gmail.com
>
> I will be working next on (in that order)
> 1. eval_plan_qual fix for foreign join. (Considered as a must-have for 9.6)
> 2. Pushing down ORDER BY clause along with join pushdown
> 3. Parameterization of foreign join paths (Given the complexity of the
> feature this may not make it into 9.6)

It seems you forgot to attach the patch.

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-01-18 Thread Etsuro Fujita

On 2016/01/18 19:46, Ashutosh Bapat wrote:

PFA patches for postgres_fdw join pushdown, taken care of all TODOs in
my last mail.

Here is the list of things that have been improved/added new as compared
to Hanada-san's previous patch at [1].


Great!  Thank you for working on that!  I'll review the patch.


I will be working next on (in that order)
1. eval_plan_qual fix for foreign join. (Considered as a must-have for 9.6)
2. Pushing down ORDER BY clause along with join pushdown
3. Parameterization of foreign join paths (Given the complexity of the
feature this may not make it into 9.6)


As discussed internally, I think #3 might have some impact on the 
overall design of the EvalPlanQual fix, especially the design of a 
helper function that creates a local join execution path for a foreign 
join path for EvalPlanQual.  So, IMO, I think the order is #1, #3, and 
#2 (or #3, #1, #2).


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-11 Thread Etsuro Fujita

On 2015/12/11 14:16, Ashutosh Bapat wrote:

On Thu, Dec 10, 2015 at 11:20 PM, Robert Haas > wrote:



On Tue, Dec 8, 2015 at 6:40 AM, Etsuro Fujita
>
wrote:
> IMO I want to see the EvalPlanQual fix in the first version for 9.6.

+1.



I think there is still a lot functionality that is offered without
EvalPlanQual fix. As long as we do not push joins when there are
RowMarks involved, implementation of that hook is not required. We won't
be able to push down joins for DMLs and when there are FOR SHARE/UPDATE
clauses in the query. And there are huge number of queries, which will
be benefitted by the push down even without that support. There's
nothing in this patch, which comes in way of implementing the
EvalPlanQual fix. It can be easily added after committing the first
version. On the other hand, getting minimal (it's not really minimal,
it's much more than that) support for postgres_fdw support committed
opens up possibility to work on multiple items (as listed in my mail) in
parallel.



I am not saying that we do not need EvalPlanQual fix in 9.6. But it's
not needed in the first cut. If we get the first cut in first couple of
months of 2016, there's plenty of room for the fix to go in 9.6. It
would be really bad situation if we could not get postgres_fdw join
pushdown supported in 9.6 because EvalPlanQual hook could not be
committed while the rest of the code is ready. EvalPlanQual fix in core
was being discussed since April 2015. It took 8 months to get that
fixed. Hopefully we won't need that long to implement the hook in
postgres_fdw, but that number says something about the complexity of the
feature.


ISTM that further enhancements are of secondary importance. Let's do the 
EvalPlanQual fix first. I'll add the RecheckForeignScan callback routine 
to your version of the postgres_fdw patch as soon as possible.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-11 Thread Robert Haas
On Fri, Dec 11, 2015 at 12:16 AM, Ashutosh Bapat
 wrote:
>> +1.
>>
> I think there is still a lot functionality that is offered without
> EvalPlanQual fix.

Sure.  But I think that the EvalPlanQual-related fixes might have some
impact on the overall design, and I don't want to commit this with one
design and then have to revise it because we didn't examine the
EvalPlanQual requirements carefully enough.  We've already been down
that path once, and I don't want to go back.  It's not always possible
to get the design right the first time, but it's definitely nicer when
you do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-10 Thread Robert Haas
On Wed, Dec 2, 2015 at 6:45 AM, Ashutosh Bapat
 wrote:
> It's been a long time since last patch on this thread was posted. I have
> started
> to work on supporting join pushdown for postgres_fdw. Attached please find
> three
> patches
> 1. pg_fdw_core.patch: changes in core related to user mapping handling, GUC
>   enable_foreignjoin
> 2. pg_fdw_join.patch: postgres_fdw changes for supporting join pushdown

It seems useful to break things up this way.  However, I'm not sure we
want an enable_foreignjoin GUC; in fact, I think we probably don't.
If we want to have a way to control this, postgres_fdw can provide a
custom GUC or FDW option for that.

And to be honest, I haven't really been able to understand why join
pushdown needs changes to user mapping handling.  Just hypothetically
speaking, if I put my foot down and said we're not committing any of
that stuff, how and why would that impact our ability to have join
pushdown in 9.6?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-10 Thread Robert Haas
On Tue, Dec 8, 2015 at 6:40 AM, Etsuro Fujita
 wrote:
> IMO I want to see the EvalPlanQual fix in the first version for 9.6.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-10 Thread Ashutosh Bapat
On Thu, Dec 10, 2015 at 11:20 PM, Robert Haas  wrote:

> On Tue, Dec 8, 2015 at 6:40 AM, Etsuro Fujita
>  wrote:
> > IMO I want to see the EvalPlanQual fix in the first version for 9.6.
>
> +1.
>
> I think there is still a lot functionality that is offered without
EvalPlanQual fix. As long as we do not push joins when there are RowMarks
involved, implementation of that hook is not required. We won't be able to
push down joins for DMLs and when there are FOR SHARE/UPDATE clauses in the
query. And there are huge number of queries, which will be benefitted by
the push down even without that support. There's nothing in this patch,
which comes in way of implementing the EvalPlanQual fix. It can be easily
added after committing the first version. On the other hand, getting
minimal (it's not really minimal, it's much more than that) support for
postgres_fdw support committed opens up possibility to work on multiple
items (as listed in my mail) in parallel.

I am not saying that we do not need EvalPlanQual fix in 9.6. But it's not
needed in the first cut. If we get the first cut in first couple of months
of 2016, there's plenty of room for the fix to go in 9.6. It would be
really bad situation if we could not get postgres_fdw join pushdown
supported in 9.6 because EvalPlanQual hook could not be committed while the
rest of the code is ready. EvalPlanQual fix in core was being discussed
since April 2015. It took 8 months to get that fixed. Hopefully we won't
need that long to implement the hook in postgres_fdw, but that number says
something about the complexity of the feature.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-10 Thread Ashutosh Bapat
On Fri, Dec 11, 2015 at 3:41 AM, Robert Haas  wrote:

> On Wed, Dec 2, 2015 at 6:45 AM, Ashutosh Bapat
>  wrote:
> > It's been a long time since last patch on this thread was posted. I have
> > started
> > to work on supporting join pushdown for postgres_fdw. Attached please
> find
> > three
> > patches
> > 1. pg_fdw_core.patch: changes in core related to user mapping handling,
> GUC
> >   enable_foreignjoin
> > 2. pg_fdw_join.patch: postgres_fdw changes for supporting join pushdown
>
> It seems useful to break things up this way.  However, I'm not sure we
> want an enable_foreignjoin GUC; in fact, I think we probably don't.
> If we want to have a way to control this, postgres_fdw can provide a
> custom GUC or FDW option for that.
>

enable_foreignjoin or its FDW counterpart would be useful for debugging
purposes just like enable_hashjoin/enable_mergejoin etc. Foreign join push
down can be viewed as a join strategy just like merge/nest loop/hash join
etc. Having enable_foreignjoin complements that picture. Users find more
usage of the same. A user running multiple FDWs and needing to disable join
pushdown across FDWs for any purpose would find enable_foreignjoin very
useful. Needing to turn on/off multiple GUCs would be cumbersome.


>
> And to be honest, I haven't really been able to understand why join
> pushdown needs changes to user mapping handling.


Current join pushdown infrastructure in core allows join to be pushed down
if both the sides of join come from the same server. Those sides may have
different user mappings and thus different user properties/access
permissions/visibility on the foreign server. If FDW chooses either of
these different user mappings to push down the join, it will get wrong
results. So, a join between two foreign relations can not be pushed down if
the user mappings on both sides do not match. This has been already
discussed in this thread. I am pasting your response to Hanada-san back in
May 2015 as hint to the discussion
--
2015-05-21 23:11 GMT+09:00 Robert Haas :
> On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada
>  wrote:
>> d) All relations must have the same effective user id
>> This check is done with userid stored in PgFdwRelationInfo, which is
>> valid only when underlying relations have the same effective user id.
>> Here "effective user id" means id of the user executing the query, or
>> the owner of the view when the foreign table is accessed via view.
>> Tom suggested that it is necessary to check that user mapping matches
>> for security reason, and now I think it's better than checking
>> effective user as current postgres_fdw does.
>
> So, should this be a separate patch?
--
To add to that, checking user mapping is better than checking the effective
user id for multiple reasons. Multiple local users can share same public
user mapping, which implies that they share same permissions/visibility for
objects on the foreign server. Join involving two sides with different
effective local user but same user mapping can be pushed down to the
foreign server as same objects/data is going to visible where or not we
push the join down.



> Just hypothetically
> speaking, if I put my foot down and said we're not committing any of
> that stuff, how and why would that impact our ability to have join
>
pushdown in 9.6?
>
>
In fact, the question would be what user mapping should be used by the
connection on which we are firing join query. Unless we answer that
question we won't have join pushdown in 9.6. If we push down joins without
taking into consideration the user mapping, we will have all sorts of
security/data visibility problems because of wrong user properties used for
connection.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-08 Thread Ashutosh Bapat
On Thu, Dec 3, 2015 at 12:36 PM, Etsuro Fujita 
wrote:

> Hi Ashutosh,
>
> On 2015/12/02 20:45, Ashutosh Bapat wrote:
>
>> It's been a long time since last patch on this thread was posted. I have
>> started
>> to work on supporting join pushdown for postgres_fdw.
>>
>
> Thanks for the work!
>
> Generating paths
>> 
>>
>
> A join between two foreign relations is considered safe to push down if
>> 1. The joining sides are pushable
>> 2. The type of join is OUTER or INNER (LEFT/RIGHT/FULL/INNER). SEMI and
>> ANTI
>> joins are not considered right now, because of difficulties in
>> constructing
>> the queries involving those. The join clauses of SEMI/ANTI joins are
>> not in a
>> form that can be readily converted to IN/EXISTS/NOT EXIST kind of
>> expression.
>> We might consider this as future optimization.
>> 3. Joining sides do not have clauses which can not be pushed down to the
>> foreign
>> server. For an OUTER join this is important since those clauses need
>> to be
>> applied before performing the join and thus join can not be pushed
>> to the
>> foreign server. An example is
>> SELECT * FROM ft1 LEFT JOIN (SELECT * FROM ft2 where local_cond) ft2
>> ON (join clause)
>> Here the local_cond on ft2 needs to be executed before performing
>> LEFT JOIN
>> between ft1 and ft2.
>> This condition can be relaxed for an INNER join by pulling the local
>> clauses
>> up the join tree. But this needs more investigation and is not
>> considered in
>> this version.
>> 4. The join conditions (e.g. conditions in ON clause) are all safe to
>> push down.
>> This is important for OUTER joins as pushing down join clauses
>> partially and
>> applying rest locally changes the result. There are ways [1] by
>> which partial
>> OUTER join can be completed by applying unpushable clauses locally
>> and then
>> nullifying the nullable side and eliminating duplicate non-nullable
>> side
>> rows. But that's again out of scope of first version of postgres_fdw
>> join
>> pushdown.
>>
>
> As for 4, as commented in the patch, we could relax the requirement that
> all the join conditions (given by JoinPathExtraData's restrictlist) need to
> be safe to push down to the remote server;
> * In case of inner join, all the conditions would not need to be safe.
> * In case of outer join, all the "otherclauses" would not need to be safe,
> while I think all the "joinclauses" need to be safe to get the right
> results (where "joinclauses" and "otherclauses" are defined by
> extract_actual_join_clauses).  And I think we should do this relaxation to
> some extent for 9.6, to allow more joins to be pushed down.


agreed. I will work on those.


> I don't know about [1].  May I see more information about [1]?
>
>
Generating plan
>> ===
>>
>
> Rest of this section describes the logic to construct
>> the SQL
>> for join; the logic is implemented as function deparseSelectSqlForRel().
>>
>> deparseSelectSqlForRel() builds the SQL for given joinrel (and now for
>> baserel
>> asd well) recursively.
>> For joinrels
>> 1. it constructs SQL representing either side of join, by calling itself
>> in recursive fashion.
>> 2. These SQLs are converted into subqueries and become part of the FROM
>> clause
>> with appropriate JOIN type and clauses. The left and right
>> subqueries are
>> given aliases "l" and "r" respectively. The columns in each subquery
>> are
>> aliased as "a1", "a2", "a3" and so on. Thus the third column on left
>> side can
>> be referenced as "l.a3" at any recursion level.
>> 3. Targetlist is added representing the columns in the join result
>> expected at
>> that level.
>> 4. The join clauses are added as part of ON clause
>> 5. Any clauses that planner has deemed fit to be evaluated at that level
>> of join
>> are added as part of WHERE clause.
>>
>
> Honestly, I'm not sure that that is a good idea.  One reason for that is
> that a query string constructed by the procedure is difficult to read
> especially when the procedure is applied recursively.  So, I'm thinking to
> revise the procedure so as to construct a query string with a flattened
> FROM clause, as discussed in eg, [2].
>
>
Just to confirm, the hook discussed in [2] is not in place right? I can
find only one hook for foreign join
 50 typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
 51   RelOptInfo
*joinrel,
 52 RelOptInfo
*outerrel,
 53 RelOptInfo
*innerrel,
 54   JoinType
jointype,
 55JoinPathExtraData
*extra);
This hook takes an inner and outer relation, so can not be used for N-way
join as discussed in that thread.

Are you 

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-08 Thread Etsuro Fujita

On 2015/12/08 17:27, Ashutosh Bapat wrote:

On Thu, Dec 3, 2015 at 12:36 PM, Etsuro Fujita
> wrote:



Generating paths




A join between two foreign relations is considered safe to push
down if



4. The join conditions (e.g. conditions in ON clause) are all
safe to
push down.
 This is important for OUTER joins as pushing down join clauses
partially and
 applying rest locally changes the result. There are ways [1] by
which partial
 OUTER join can be completed by applying unpushable clauses
locally
and then
 nullifying the nullable side and eliminating duplicate
non-nullable side
 rows. But that's again out of scope of first version of
postgres_fdw
join
 pushdown.



As for 4, as commented in the patch, we could relax the requirement
that all the join conditions (given by JoinPathExtraData's
restrictlist) need to be safe to push down to the remote server;
* In case of inner join, all the conditions would not need to be safe.
* In case of outer join, all the "otherclauses" would not need to be
safe, while I think all the "joinclauses" need to be safe to get the
right results (where "joinclauses" and "otherclauses" are defined by
extract_actual_join_clauses).  And I think we should do this
relaxation to some extent for 9.6, to allow more joins to be pushed
down.



agreed. I will work on those.


Great!


Generating plan
===



Rest of this section describes the logic to construct
the SQL
for join; the logic is implemented as function
deparseSelectSqlForRel().

deparseSelectSqlForRel() builds the SQL for given joinrel (and
now for
baserel
asd well) recursively.
For joinrels
1. it constructs SQL representing either side of join, by
calling itself
 in recursive fashion.
2. These SQLs are converted into subqueries and become part of
the FROM
clause
 with appropriate JOIN type and clauses. The left and right
subqueries are
 given aliases "l" and "r" respectively. The columns in each
subquery are
 aliased as "a1", "a2", "a3" and so on. Thus the third
column on left
side can
 be referenced as "l.a3" at any recursion level.
3. Targetlist is added representing the columns in the join result
expected at
 that level.
4. The join clauses are added as part of ON clause
5. Any clauses that planner has deemed fit to be evaluated at
that level
of join
 are added as part of WHERE clause.



Honestly, I'm not sure that that is a good idea.  One reason for
that is that a query string constructed by the procedure is
difficult to read especially when the procedure is applied
recursively.  So, I'm thinking to revise the procedure so as to
construct a query string with a flattened FROM clause, as discussed
in eg, [2].



Just to confirm, the hook discussed in [2] is not in place right? I can
find only one hook for foreign join
  50 typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
  51
RelOptInfo *joinrel,
  52 RelOptInfo
*outerrel,
  53 RelOptInfo
*innerrel,
  54   JoinType
jointype,
  55
JoinPathExtraData *extra);
This hook takes an inner and outer relation, so can not be used for
N-way join as discussed in that thread.

Are you suggesting that we should add that hook before we implement join
pushdown in postgres_fdw? Am I missing something?


I don't mean it.  I'm thinking that I'll just revise the procedure so as 
to generate a FROM clause that is something like "from c left join d on 
(...) full join e on (...)" based on the existing hook you mentioned.



TODOs
=



In another thread Robert, Fujita-san and Kaigai-san are
discussing about
EvalPlanQual support for foreign joins. Corresponding changes to
postgres_fdw
will need to be added once those changes get committed.



Yeah, we would need those changes including helper functions to
create a local join execution plan for that support.  I'd like to
add those changes to your updated patch if it's okay.



Right now, we do not have any support for postgres_fdw join pushdown. I
was thinking of adding at least minimal support for the same using this
patch, may be by preventing join pushdown in case there are row marks
for now. That way, we at least have some way to play with postgres_fdw
join pushdown. 

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-02 Thread Etsuro Fujita

Hi Ashutosh,

On 2015/12/02 20:45, Ashutosh Bapat wrote:

It's been a long time since last patch on this thread was posted. I have
started
to work on supporting join pushdown for postgres_fdw.


Thanks for the work!


Generating paths




A join between two foreign relations is considered safe to push down if
1. The joining sides are pushable
2. The type of join is OUTER or INNER (LEFT/RIGHT/FULL/INNER). SEMI and ANTI
joins are not considered right now, because of difficulties in
constructing
the queries involving those. The join clauses of SEMI/ANTI joins are
not in a
form that can be readily converted to IN/EXISTS/NOT EXIST kind of
expression.
We might consider this as future optimization.
3. Joining sides do not have clauses which can not be pushed down to the
foreign
server. For an OUTER join this is important since those clauses need
to be
applied before performing the join and thus join can not be pushed
to the
foreign server. An example is
SELECT * FROM ft1 LEFT JOIN (SELECT * FROM ft2 where local_cond) ft2
ON (join clause)
Here the local_cond on ft2 needs to be executed before performing
LEFT JOIN
between ft1 and ft2.
This condition can be relaxed for an INNER join by pulling the local
clauses
up the join tree. But this needs more investigation and is not
considered in
this version.
4. The join conditions (e.g. conditions in ON clause) are all safe to
push down.
This is important for OUTER joins as pushing down join clauses
partially and
applying rest locally changes the result. There are ways [1] by
which partial
OUTER join can be completed by applying unpushable clauses locally
and then
nullifying the nullable side and eliminating duplicate non-nullable side
rows. But that's again out of scope of first version of postgres_fdw
join
pushdown.


As for 4, as commented in the patch, we could relax the requirement that 
all the join conditions (given by JoinPathExtraData's restrictlist) need 
to be safe to push down to the remote server;

* In case of inner join, all the conditions would not need to be safe.
* In case of outer join, all the "otherclauses" would not need to be 
safe, while I think all the "joinclauses" need to be safe to get the 
right results (where "joinclauses" and "otherclauses" are defined by 
extract_actual_join_clauses).  And I think we should do this relaxation 
to some extent for 9.6, to allow more joins to be pushed down.  I don't 
know about [1].  May I see more information about [1]?



Generating plan
===



Rest of this section describes the logic to construct
the SQL
for join; the logic is implemented as function deparseSelectSqlForRel().

deparseSelectSqlForRel() builds the SQL for given joinrel (and now for
baserel
asd well) recursively.
For joinrels
1. it constructs SQL representing either side of join, by calling itself
in recursive fashion.
2. These SQLs are converted into subqueries and become part of the FROM
clause
with appropriate JOIN type and clauses. The left and right
subqueries are
given aliases "l" and "r" respectively. The columns in each subquery are
aliased as "a1", "a2", "a3" and so on. Thus the third column on left
side can
be referenced as "l.a3" at any recursion level.
3. Targetlist is added representing the columns in the join result
expected at
that level.
4. The join clauses are added as part of ON clause
5. Any clauses that planner has deemed fit to be evaluated at that level
of join
are added as part of WHERE clause.


Honestly, I'm not sure that that is a good idea.  One reason for that is 
that a query string constructed by the procedure is difficult to read 
especially when the procedure is applied recursively.  So, I'm thinking 
to revise the procedure so as to construct a query string with a 
flattened FROM clause, as discussed in eg, [2].



TODOs
=
This patch is very much WIP patch to show case the approach and invite early
comments. I will continue to improve the patch and some of the areas
that will
be improved are
1. Costing of foreign join paths.
2. Various TODOs in the patch, making it more readable, finishing etc.
3. Tests
4. Any comments/suggestions on approach or the attached patch.


That would be great!


In another thread Robert, Fujita-san and Kaigai-san are discussing about
EvalPlanQual support for foreign joins. Corresponding changes to
postgres_fdw
will need to be added once those changes get committed.


Yeah, we would need those changes including helper functions to create a 
local join execution plan for that support.  I'd like to add those 
changes to your updated patch if it's okay.


Best regards,
Etsuro Fujita

[2] 
http://www.postgresql.org/message-id/ca+tgmozh9pb8bc+z3re7wo8cwuxaf7vp3066isg39qfr1jj...@mail.gmail.com





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-08-19 Thread Ashutosh Bapat
On Tue, Aug 4, 2015 at 2:20 PM, Shigeru Hanada shigeru.han...@gmail.com
wrote:

 Hi Ashutosh,

 Sorry for leaving the thread.

 2015-07-20 16:09 GMT+09:00 Ashutosh Bapat ashutosh.ba...@enterprisedb.com
 :
  In find_user_mapping(), if the first cache search returns a valid tuple,
 it
  is checked twice for validity, un-necessarily. Instead if the first
 search
  returns a valid tuple, it should be returned immediately. I see that this
  code was copied from GetUserMapping(), where as well it had the same
  problem.

 Oops.  I changed find_user_mapping to exit immediately when any valid
 cache was found.


Thanks.



  In build_join_rel(), we check whether user mappings of the two joining
  relations are same. If the user mappings are same, doesn't that imply
 that
  the foreign server and local user are same too?

 Yes, validity of umid is identical to serverid.  We can remove the
 check for serverid for some cycles.  One idea is to put Assert for
 serverid inside the if-statement block.


  Rest of the patch looks fine.

 Thanks


I started reviewing the other patches.

In patch foreign_join_v16.patch, the user mapping structure being passed to
GetConnection() is the one obtained from GetUserMappingById().
GetUserMappingById() constructs the user mapping structure from the user
mapping catalog. For public user mappings, catalog entries have InvalidOid
as userid. Thus, with this patch there is a chance that userid in
UserMapping being passed to GetConnection(), contains InvalidOid as userid.
This is not the case today. The UserMapping structure constructed using
GetUserMapping(Oid userid, Oid serverid) (which ultimately gets passed to
GetConnection()), has the passed in userid and not the one in the catalog.
Is this change intentional?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-08-04 Thread Shigeru Hanada
Hi Ashutosh,

Sorry for leaving the thread.

2015-07-20 16:09 GMT+09:00 Ashutosh Bapat ashutosh.ba...@enterprisedb.com:
 In find_user_mapping(), if the first cache search returns a valid tuple, it
 is checked twice for validity, un-necessarily. Instead if the first search
 returns a valid tuple, it should be returned immediately. I see that this
 code was copied from GetUserMapping(), where as well it had the same
 problem.

Oops.  I changed find_user_mapping to exit immediately when any valid
cache was found.

 In build_join_rel(), we check whether user mappings of the two joining
 relations are same. If the user mappings are same, doesn't that imply that
 the foreign server and local user are same too?

Yes, validity of umid is identical to serverid.  We can remove the
check for serverid for some cycles.  One idea is to put Assert for
serverid inside the if-statement block.

 Rest of the patch looks fine.

Thanks

-- 
Shigeru HANADA


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-07-20 Thread Ashutosh Bapat
Hi Hanada-san,
I have reviewed usermapping patch and here are some comments.

The patch applies cleanly on Head and compiles without problem. The make
check in regression folder does not show any failure.

In find_user_mapping(), if the first cache search returns a valid tuple, it
is checked twice for validity, un-necessarily. Instead if the first search
returns a valid tuple, it should be returned immediately. I see that this
code was copied from GetUserMapping(), where as well it had the same
problem.

In build_join_rel(), we check whether user mappings of the two joining
relations are same. If the user mappings are same, doesn't that imply that
the foreign server and local user are same too?

Rest of the patch looks fine.

On Thu, May 28, 2015 at 10:50 AM, Shigeru Hanada shigeru.han...@gmail.com
wrote:

 2015-05-22 18:37 GMT+09:00 Shigeru Hanada shigeru.han...@gmail.com:
  2015-05-21 23:11 GMT+09:00 Robert Haas robertmh...@gmail.com:
  On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada
  shigeru.han...@gmail.com wrote:
  d) All relations must have the same effective user id
  This check is done with userid stored in PgFdwRelationInfo, which is
  valid only when underlying relations have the same effective user id.
  Here effective user id means id of the user executing the query, or
  the owner of the view when the foreign table is accessed via view.
  Tom suggested that it is necessary to check that user mapping matches
  for security reason, and now I think it's better than checking
  effective user as current postgres_fdw does.
 
  So, should this be a separate patch?

 I wrote patches for this issue.  Let me describe their design.

 To require the matching of user mapping between two relations (base or
 join) which involves foreign tables, it would require these stuffs:

 a) Add new field umid to RelOptInfo to hold OID of user mapping used
 for the relation and children
 b) Propagate umid up to join relation only when both outer and inner
 have the save valid values
 c) Check matching of umid between two relations to be joined before
 calling GetForeignJoinPaths

 For a), adding an OID field would not be a serious burden.  Obtaining
 the OID of user mapping can be accomplished by calling GetUserMapping
 in get_relation_info, but it allocates an unnecessary UserMapping
 object, so I added GetUserMappingId which just returns OID.

 One concern about getting user mapping is checkAsUser.  Currently FDW
 authors are required to consider which user is valid as argument of
 GetUserMapping, because it is different from the result of GetUserId
 when the target relation is accessed via a view which is owned by
 another user.  This requirement would be easily ignored by FDW authors
 and the ignorance causes terrible security issues.  So IMO it should
 be hidden in the core code, and FDW authors should use user mappings
 determined by the core.  This would break FDW I/F, so we can't change
 it right now, but making GetUserMapping obsolete and mention it in the
 documents would guide FDW authors to the right direction.

 For b), such check should be done in build_join_rel, similarly to serverid.

 For c), we can reuse the check about RelOptInfo#fdwroutine in
 add_paths_to_joinrel, because all of serverid, umid and fdwroutine are
 valid only when both the servers and the user mappings match between
 outer and inner relations.

 Attached are the patches which implement the idea above except
 checkAsUser issue.
 usermapping_matching.patch: check matching of user mapping OIDs
 add_GetUserMappingById.patch: add helper function which is handy for
 FDWs to obtain UserMapping
 foreign_join_v16.patch: postgres_fdw which assumes user mapping
 matching is done in core

 Another idea about a) is to have an entire UserMapping object for each
 RelOptInfo, but it would require UserMapping to have extra field of
 its Oid, say umid, to compare them by OID.  But IMO creating
 UserMapping for every RelOptInfo seems waste of space and time.

 Thoughts?
 --
 Shigeru HANADA


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-05-22 Thread Shigeru Hanada
Thank for your comments.

2015-05-21 23:11 GMT+09:00 Robert Haas robertmh...@gmail.com:
 On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada
 shigeru.han...@gmail.com wrote:
 d) All relations must have the same effective user id
 This check is done with userid stored in PgFdwRelationInfo, which is
 valid only when underlying relations have the same effective user id.
 Here effective user id means id of the user executing the query, or
 the owner of the view when the foreign table is accessed via view.
 Tom suggested that it is necessary to check that user mapping matches
 for security reason, and now I think it's better than checking
 effective user as current postgres_fdw does.
 
 So, should this be a separate patch?

Yes, it should be an additional patch for Custom/Foreign join API which is 
already committed.
The patch would contain these changes:
* add new field usermappingid to RelOptInfo, it is InvalidOid for non-foreign 
tables
* obtain oid of user mapping for a foreign table, and store it in the RelOptInfo
  (we already have serverid in RelOptInfo, so userid is enough to identify user 
mapping though)
* propagate usermappingid up to the join relation when outer and inner 
relations have same valid value
* check matching of user mapping before calling GetForeignJoinPaths, rather 
than serverid

 One of my concerns about this patch is that it's got a lot of stuff in
 it that isn't obviously related to the patch.  Anything that is a
 separate change should be separated out into its own patch.  Perhaps
 you can post a set of patches that apply one on top of the next, with
 the changes for each one clearly separated.

IIUC, each patch should not break compilation, and should contain only one 
complete logical change which can't be separated into pieces.  I think whole of 
the patch is necessary to implement 

 
 e) Each source relation must not have any local filter
 Evaluating conditions of join source talbe potentially produces
 different result in OUTER join cases.  This can be relaxed for the
 cases when the join is INNER and local filters don't contain any
 volatile function/operator, but it is left as future enhancement.
 
 I think this restriction is a sign that you're not really doing this
 right. Consider:
 
 (1) SELECT * FROM a LEFT JOIN b ON a.x = b.x AND b.x = 3;
 (2) SELECT * FROM a LEFT JOIN b ON a.x = b.x WHERE b.x = 3;
 
 If you push down the scan of b, you can include the b.x = 3 qual in
 case (1) but not in case (2).  If you push down the join, you can
 include the qual in either case, but you must attach it in the same
 place where it was before.
 
 One big change about deparsing base relation is aliasing.  This patch
 adds column alias to SELECT clause even original query is a simple
 single table SELECT.
 
 fdw=# EXPLAIN (VERBOSE, COSTS false) SELECT * FROM pgbench_branches b;
 QUERY PLAN
 
 Foreign Scan on public.pgbench_branches b
   Output: bid, bbalance, filler
   Remote SQL: SELECT bid a9, bbalance a10, filler a11 FROM
 public.pgbench_branches
 (3 rows)
 
 As you see, every column has alias in format a%d with index value
 derived from pg_attribute.attnum.  Index value is attnum + 8, and the
 magic number 8 comes from FirstLowInvalidHeapAttributeNumber for the
 adjustment that makes attribute number of system attributes positive.
 
 Yeah.  I'm not sure this is a good idea.  The column labels are
 pointless at the outermost level.
 
 I'm not sure it isn't a good idea, either, but I have some doubts.

I fixed the patch to not add column alias to remote queries for single table.  
This change also reduces amount of differences from master branch slightly.


 
 One thing tricky is peusdo projection which is done by
 deparseProjectionSql for whole-row reference.  This is done by put the
 query string in FROM subquery and add whole-row reference in outer
 SELECT clause.  This is done by ExecProjection in 9.4 and older, but
 we need to do this in postgres_fdw because ExecProjection is not
 called any more.
 
 What commit changed this?

No commit changed this behavior, as Kaigai-san says.  If you still have 
comments, please refer my response to Kaigai-san.

 
 Thanks for your work on this.  Although I know progress has been slow,
 I think this work is really important to the project.

I agree.  I’ll take more time for this work.

-- 
Shigeru HANADA


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-05-17 Thread Ashutosh Bapat
On Sat, May 16, 2015 at 6:34 PM, Shigeru Hanada shigeru.han...@gmail.com
wrote:

 Attached is the v15 patch of foreign join support for postgres_fdw.

 This patch is based on current master, and having being removed some
 hunks which are not essential.

 And I wrote description of changes done by the patch.  It is little
 bit long but I hope it would help understanding what the patch does.



 The total LOC of the patch is 3.7k, 1.8k for code and 2.0k for
 regression tests.  This is not a small patch, as Robert says, so I'd
 like to summarize changed done by this patch and explain why they are
 necessary.

 Outline of join push-down support for postgres_fdw
 ==

 This patch provides new capability to join between foriegn tables
 managed by same foreign server on remote side, by constructing a
 remote query containing join clause, and executing it as source of a
 pseudo foreign scan.  This patch is based on Custom/Foreign join patch
 written by Kohei KaiGai.

 PostgreSQL's planning for a query containing join is done with these steps:

 1. generate possible scan paths for each base relations
 2. generate join paths with bottom-up approach
 3. generate plan nodes required for the cheapest path
 4. execute the plan nodes to obtain result tuples

 Generating path node
 
 As of now, postgres_fdw generates a ForeignPath which represents a
 result of a join for each RelOptInfo, and planner can determine which
 path is cheapest from its cost values.

 GetForeignJoinPaths is called once for each join combination, i.e. A
 JOIN B and B JOIN A are considered separately.  So GetForeignJoinPath
 should return immediately to skip its job when the call is the
 reversed combination of already considered one.  For this purpose, I
 added update_safe flag to PgFdwRelationInfo.  This flag is always set
 for simple foriegn scans, but for join relation it is set only when
 the join can be pushed down.  The reason of adding this flag is that
 checking RelOptInfo#fdw_private is MULL can't prevent useless
 processing for a join combination which is reversed one of already
 considered join which can't be pushed down.


This is just  a suggestion, but you may actually get rid of the flag by
restricting the path generation only when say outer relation's pointer or
OID or relid is greater/lesser than inner relation's corresponding property.



 postgres_fdw's GetForeignJoinPaths() does various checks, to ensure
 that the result has same semantics as local joins.  Now postgres_fdw
 have these criteria:

 a) join type must be one of INNER/LEFT OUTER/RIGHT OUTER/FULL OUTER join
 This check is done with given jointype argument.  IOW, CROSS joins and
 SEMI/ANTI joins are not pushed down.  This is because 1) CROSS joins
 would produe more result than separeted join sources,


We might loose on some optimizations in aggregate push-down by not creating
paths altogether for CROSS joins. If there is a count(*) on CROSS join
result, we will push count(*) since there doesn't exist a foreign path for
the join. OR it might be possible that pushing down A INNER JOIN B CROSS
JOIN C is cheaper than performing some or all of the joins locally. I think
we should create a path and let it stay in the paths list. If there is no
path which can use CROSS join path, it will discarded eventually. Sorry for
bringing this so late in the discussion.


 and 2) ANTI/SEMI
 joins need to be deparsed as sub-query and it seems to take some more
 time to implement.
 b) Both outer and inner must have RelOptInfo#fdw_private
 Having fdw_private means that the RelOptInfo is safe to push down, so
 having no fdw_private means that portion is not safe to push down and
 thus the whole join is not safe to push down.
 c) All relations in the join must belong to the same server
 This check is done with serverid stored in RelOptInfo#fdw_private as
 PgFdwRelationInfo.  Joining relations belong to different servers is
 not leagal.  Even they finally have completely same connection
 information, they should accessed via different libpq sessions.
 Responsibility of checking server matching is under discussion in the
 Custom/Foreign join thread, and I'd vote for checking it in core.  If
 it is decided, I remove this criterion soon.
 d) All relations must have the same effective user id
 This check is done with userid stored in PgFdwRelationInfo, which is
 valid only when underlying relations have the same effective user id.
 Here effective user id means id of the user executing the query, or
 the owner of the view when the foreign table is accessed via view.
 Tom suggested that it is necessary to check that user mapping matches
 for security reason, and now I think it's better than checking
 effective user as current postgres_fdw does.
 e) Each source relation must not have any local filter
 Evaluating conditions of join source talbe potentially produces
 different result in OUTER join cases.  This can be relaxed for 

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-05-02 Thread Shigeru HANADA
Thanks for the comments.

2015/05/01 22:35、Robert Haas robertmh...@gmail.com のメール:
 On Mon, Apr 27, 2015 at 5:07 AM, Shigeru Hanada
 shigeru.han...@gmail.com wrote:
 2015-04-27 11:00 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
 Hanada-san, could you adjust your postgres_fdw patch according to
 the above new (previous?) definition.
 
 The attached v14 patch is the revised version for your v13 patch.  It also 
 contains changed for Ashutosh’s comments.
 
 We should probably move this discussion to a new thread now that the
 other patch is committed.  Changing subject line accordingly.
 
 Generally, there's an awful lot of changes in this patch - it is over
 2000 insertions and more than 450 deletions - and it's not awfully
 obvious why all of those changes are there.  I think this patch needs
 a detailed README to accompany it explaining what the various changes
 in the patch are and why those things got changed; or maybe there is a
 way to break it up into multiple patches so that we can take a more
 incremental approach.  I am really suspicious of the amount of
 wholesale reorganization of code that this patch is doing.  It's
 really hard to validate that a reorganization like that is necessary,
 or that it's correct, and it's gonna make back-patching noticeably
 harder in the future.  If we really need this much code churn it needs
 careful justification; if we don't, we shouldn't do it.
 

I agree.  I’ll write detailed description for the patch and repost the new one 
with rebasing onto current HEAD.  I’m sorry but it will take a day or so...

 +SET enable_mergejoin = off; -- planner choose MergeJoin even it has
 higher costs, so disable it for testing.
 
 This seems awfully strange.  Why would the planner choose a plan if it
 had a higher cost?

I thought so but I couldn’t reproduce such situation now.  I’ll investigate it 
more.  If the issue has gone, I’ll remove that SET statement for 
straightforward tests.


 
 -* If the table or the server is configured to use remote estimates,
 -* identify which user to do remote access as during planning.  This
 +* Identify which user to do remote access as during planning.  This
 * should match what ExecCheckRTEPerms() does.  If we fail due
 to lack of
 * permissions, the query would have failed at runtime anyway.
 */
 -   if (fpinfo-use_remote_estimate)
 -   {
 -   RangeTblEntry *rte = planner_rt_fetch(baserel-relid, root);
 -   Oid userid = rte-checkAsUser ?
 rte-checkAsUser : GetUserId();
 -
 -   fpinfo-user = GetUserMapping(userid, 
 fpinfo-server-serverid);
 -   }
 -   else
 -   fpinfo-user = NULL;
 +   rte = planner_rt_fetch(baserel-relid, root);
 +   fpinfo-userid = rte-checkAsUser ? rte-checkAsUser : GetUserId();
 
 So, wait a minute, remote estimates aren't optional any more?

No, it seems to be removed accidentally.  I’ll check the reason again though, 
but I’ll revert the change unless I find any problem.

--
Shigeru HANADA
shigeru.han...@gmail.com






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers