Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/07/02 0:32, Robert Haas wrote: On Wed, Jun 29, 2016 at 1:38 AM, Ashutosh Bapat wrote: On Tue, Jun 28, 2016 at 12:52 PM, Etsuro Fujita wrote: Please find attached an updated version. This looks good to me. Regression tests pass. Committed. Thanks to Etsuro Fujita for the patch and to Ashutosh for the review. Thanks, Robert and Ashutosh! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Wed, Jun 29, 2016 at 1:38 AM, Ashutosh Bapat wrote: > On Tue, Jun 28, 2016 at 12:52 PM, Etsuro Fujita > wrote: >> On 2016/06/28 15:23, Ashutosh Bapat wrote: >>> >>> The wording "column "whole-row reference ..." doesn't look good. >>> Whole-row reference is not a column. The error context itself should be >>> "whole row reference for foreign table ft1". >> >> Ah, you are right. Please find attached an updated version. > > This looks good to me. Regression tests pass. Committed. Thanks to Etsuro Fujita for the patch and to Ashutosh for the review. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Fri, Jul 1, 2016 at 7:45 PM, Robert Haas wrote: > On Tue, Jun 28, 2016 at 8:20 AM, Ashutosh Bapat > wrote: > >> > postgres_fdw resets the search path to pg_catalog while opening > >> > connection > >> > to the server. The reason behind this is explained in deparse.c > >> > > >> > * We assume that the remote session's search_path is exactly > >> > "pg_catalog", > >> > * and thus we need schema-qualify all and only names outside > >> > pg_catalog. > >> > >> Hmm. OK, should we revert the schema-qualification part of that > >> commit, or just leave it alone? > > > > If we leave that code as is, someone who wants to add similar code later > > would get confused or will be tempted to create more instances of > > schema-qualification. I think we should revert the schema qualification. > > OK, done. > Thanks. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Tue, Jun 28, 2016 at 8:20 AM, Ashutosh Bapat wrote: >> > postgres_fdw resets the search path to pg_catalog while opening >> > connection >> > to the server. The reason behind this is explained in deparse.c >> > >> > * We assume that the remote session's search_path is exactly >> > "pg_catalog", >> > * and thus we need schema-qualify all and only names outside >> > pg_catalog. >> >> Hmm. OK, should we revert the schema-qualification part of that >> commit, or just leave it alone? > > If we leave that code as is, someone who wants to add similar code later > would get confused or will be tempted to create more instances of > schema-qualification. I think we should revert the schema qualification. OK, done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Tue, Jun 28, 2016 at 12:52 PM, Etsuro Fujita wrote: > On 2016/06/28 15:23, Ashutosh Bapat wrote: > >> The wording "column "whole-row reference ..." doesn't look good. >> Whole-row reference is not a column. The error context itself should be >> "whole row reference for foreign table ft1". >> > > Ah, you are right. Please find attached an updated version. > > This looks good to me. Regression tests pass. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
> > > > > postgres_fdw resets the search path to pg_catalog while opening > connection > > to the server. The reason behind this is explained in deparse.c > > > > * We assume that the remote session's search_path is exactly > "pg_catalog", > > * and thus we need schema-qualify all and only names outside pg_catalog. > > Hmm. OK, should we revert the schema-qualification part of that > commit, or just leave it alone? > > If we leave that code as is, someone who wants to add similar code later would get confused or will be tempted to create more instances of schema-qualification. I think we should revert the schema qualification. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/28 15:23, Ashutosh Bapat wrote: The wording "column "whole-row reference ..." doesn't look good. Whole-row reference is not a column. The error context itself should be "whole row reference for foreign table ft1". Ah, you are right. Please find attached an updated version. Best regards, Etsuro Fujita postgres-fdw-conv-error-callback-v4.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Tue, Jun 28, 2016 at 11:43 AM, Etsuro Fujita wrote: > On 2016/06/28 13:53, Ashutosh Bapat wrote: > >> Ideally, we should point out the specific column that faced the >> conversion problem and report it, instead of saying the whole row >> reference conversion caused the problem. But that may be too difficult. >> > > I think so too. > > Or at least the error message should read "whole row reference of >> foreign table ft1". >> > > OK, attached is an updated version of the patch, which uses "whole-row > reference", not "wholerow". > The wording "column "whole-row reference ..." doesn't look good. Whole-row reference is not a column. The error context itself should be "whole row reference for foreign table ft1". -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/28 13:53, Ashutosh Bapat wrote: Ideally, we should point out the specific column that faced the conversion problem and report it, instead of saying the whole row reference conversion caused the problem. But that may be too difficult. I think so too. Or at least the error message should read "whole row reference of foreign table ft1". OK, attached is an updated version of the patch, which uses "whole-row reference", not "wholerow". Best regards, Etsuro Fujita postgres-fdw-conv-error-callback-v3.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Tue, Jun 28, 2016 at 9:00 AM, Etsuro Fujita wrote: > On 2016/06/27 18:56, Ashutosh Bapat wrote: > >> On Mon, Jun 27, 2016 at 3:06 PM, Etsuro Fujita >> mailto:fujita.ets...@lab.ntt.co.jp>> wrote: >> > > I found another bug in error handling of whole-row references in >> join pushdown; conversion_error_callback fails to take into account >> that get_relid_attribute_name(Oid relid, AttrNumber attnum) can't >> handle whole-row references (ie, attnum=0), in which case that would >> cause cache lookup errors. Attached is a small patch to address >> this issue. >> > > Do you have any testcase reproducing the bug here? It would be good to >> include that test in the regression. >> > > Done. > > There is a always a possibility that a user would create a table (which >> can be used as target for the foreign table) with column named >> 'wholerow', in which case s/he will get confused with this error message. >> > > By grepping I found that there are error messages that use "whole-row > table reference", "whole-row reference", These two should be fine. > or "wholerow", so the use of "wholerow" seems to me reasonable. (And IMO > I think "wholerow" would most likely fit into that errcontext().) > As an error message text, which is not referring to any particular column, these are fine. But in this case, we are specifically referring to a particular column. That reference can be confusing. The text ' column "wholerow" of foreign table "ft1"' is confusing either way. A lay user who doesn't have created table with "wholerow" column, s/he will not understand which column the error context refers to. For a user who has created table with "wholerow" column, he won't find any problem with the data in that column. Ideally, we should point out the specific column that faced the conversion problem and report it, instead of saying the whole row reference conversion caused the problem. But that may be too difficult. Or at least the error message should read "whole row reference of foreign table ft1". -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/27 18:56, Ashutosh Bapat wrote: On Mon, Jun 27, 2016 at 3:06 PM, Etsuro Fujita mailto:fujita.ets...@lab.ntt.co.jp>> wrote: I found another bug in error handling of whole-row references in join pushdown; conversion_error_callback fails to take into account that get_relid_attribute_name(Oid relid, AttrNumber attnum) can't handle whole-row references (ie, attnum=0), in which case that would cause cache lookup errors. Attached is a small patch to address this issue. Do you have any testcase reproducing the bug here? It would be good to include that test in the regression. Done. There is a always a possibility that a user would create a table (which can be used as target for the foreign table) with column named 'wholerow', in which case s/he will get confused with this error message. By grepping I found that there are error messages that use "whole-row table reference", "whole-row reference", or "wholerow", so the use of "wholerow" seems to me reasonable. (And IMO I think "wholerow" would most likely fit into that errcontext().) Best regards, Etsuro Fujita postgres-fdw-conv-error-callback-v2.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Mon, Jun 27, 2016 at 2:47 AM, Ashutosh Bapat wrote: > On Sat, Jun 25, 2016 at 12:44 AM, Robert Haas wrote: >> On Wed, Jun 22, 2016 at 4:11 AM, Amit Langote >> wrote: >> >> In an outer join we have to differentiate between a row being null >> >> (because >> >> there was no joining row on nullable side) and a non-null row with all >> >> column values being null. If we cast the whole-row expression to a text >> >> e.g. r.*::text and test the resultant value for nullness, it gives us >> >> what >> >> we want. A null row casted to text is null and a row with all null >> >> values >> >> casted to text is not null. >> > >> > You are right. There may be non-null rows with all columns null which >> > are >> > handled wrongly (as Rushabh reports) and the hack I proposed is not >> > right >> > for. Especially if from non-nullable side as in the reported case, NULL >> > test for such a whole-row-var would produce the wrong result. Casting >> > to >> > text as your patch does produces the correct behavior. >> >> I agree, but I think we'd better cast to pg_catalog.text instead, just >> to be safe. Committed that way. > > postgres_fdw resets the search path to pg_catalog while opening connection > to the server. The reason behind this is explained in deparse.c > > * We assume that the remote session's search_path is exactly "pg_catalog", > * and thus we need schema-qualify all and only names outside pg_catalog. Hmm. OK, should we revert the schema-qualification part of that commit, or just leave it alone? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Mon, Jun 27, 2016 at 3:06 PM, Etsuro Fujita wrote: > On 2016/06/25 4:14, Robert Haas wrote: > >> Committed that way. >> > > Thanks for taking care of this! > > I found another bug in error handling of whole-row references in join > pushdown; conversion_error_callback fails to take into account that > get_relid_attribute_name(Oid relid, AttrNumber attnum) can't handle > whole-row references (ie, attnum=0), in which case that would cause cache > lookup errors. Attached is a small patch to address this issue. > Do you have any testcase reproducing the bug here? It would be good to include that test in the regression. There is a always a possibility that a user would create a table (which can be used as target for the foreign table) with column named 'wholerow', in which case s/he will get confused with this error message. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/25 4:14, Robert Haas wrote: Committed that way. Thanks for taking care of this! I found another bug in error handling of whole-row references in join pushdown; conversion_error_callback fails to take into account that get_relid_attribute_name(Oid relid, AttrNumber attnum) can't handle whole-row references (ie, attnum=0), in which case that would cause cache lookup errors. Attached is a small patch to address this issue. Best regards, Etsuro Fujita postgres-fdw-conv-error-callback.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Sat, Jun 25, 2016 at 12:44 AM, Robert Haas wrote: > On Wed, Jun 22, 2016 at 4:11 AM, Amit Langote > wrote: > >> In an outer join we have to differentiate between a row being null > (because > >> there was no joining row on nullable side) and a non-null row with all > >> column values being null. If we cast the whole-row expression to a text > >> e.g. r.*::text and test the resultant value for nullness, it gives us > what > >> we want. A null row casted to text is null and a row with all null > values > >> casted to text is not null. > > > > You are right. There may be non-null rows with all columns null which > are > > handled wrongly (as Rushabh reports) and the hack I proposed is not right > > for. Especially if from non-nullable side as in the reported case, NULL > > test for such a whole-row-var would produce the wrong result. Casting to > > text as your patch does produces the correct behavior. > > I agree, but I think we'd better cast to pg_catalog.text instead, just > to be safe. Committed that way. > postgres_fdw resets the search path to pg_catalog while opening connection to the server. The reason behind this is explained in deparse.c * We assume that the remote session's search_path is exactly "pg_catalog", * and thus we need schema-qualify all and only names outside pg_catalog. So, we should not be schema-qualifying types while casting. From deparse.c /* * Convert type OID + typmod info into a type name we can ship to the remote * server. Someplace else had better have verified that this type name is * expected to be known on the remote end. * * This is almost just format_type_with_typemod(), except that if left to its * own devices, that function will make schema-qualification decisions based * on the local search_path, which is wrong. We must schema-qualify all * type names that are not in pg_catalog. We assume here that built-in types * are all in pg_catalog and need not be qualified; otherwise, qualify. */ static char * deparse_type_name(Oid type_oid, int32 typemod) { if (is_builtin(type_oid)) return format_type_with_typemod(type_oid, typemod); else return format_type_with_typemod_qualified(type_oid, typemod); } -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Wed, Jun 22, 2016 at 5:16 AM, Ashutosh Bapat wrote: >> However, if we support deparsing subqueries, the remote query in the above >> example could be rewritten into something like this: >> >> SELECT ss.c2 FROM t1 LEFT JOIN (SELECT t2.a, ROW(a, b) FROM t2) ss(c1, c2) >> ON (t1.a = ss.c1); >> >> So we would no longer need "CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, >> r2.b) END" in the target list in the remote query. > > Right. Although, it means that the query processor at the other end has to > do extra work for pulling up the subqueries. I would be inclined to pick the method that generates cleaner SQL. I doubt that difference in optimization speed matters here - it's presumably very small, especially when compared to the cost of the network round-trip. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Wed, Jun 22, 2016 at 4:11 AM, Amit Langote wrote: >> In an outer join we have to differentiate between a row being null (because >> there was no joining row on nullable side) and a non-null row with all >> column values being null. If we cast the whole-row expression to a text >> e.g. r.*::text and test the resultant value for nullness, it gives us what >> we want. A null row casted to text is null and a row with all null values >> casted to text is not null. > > You are right. There may be non-null rows with all columns null which are > handled wrongly (as Rushabh reports) and the hack I proposed is not right > for. Especially if from non-nullable side as in the reported case, NULL > test for such a whole-row-var would produce the wrong result. Casting to > text as your patch does produces the correct behavior. I agree, but I think we'd better cast to pg_catalog.text instead, just to be safe. Committed that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/24 17:38, Ashutosh Bapat wrote: > On Fri, Jun 24, 2016 at 1:59 PM, Amit Langote wrote: >> I'm now starting to wonder if it would be outright wrong to just use the >> alias names of corresponding foreign tables directly for whole-row >> references? So, instead of these in target lists of remote queries: >> >> SELECT CASE WHEN (r1.*)::text IS NOT NULL THEN *ROW (r1.*)* END, ... > > This is wrong. The deparsed query looks like > SELECT CASE WHEN (r1.*)::text IS NOT NULL THEN *ROW (r1.col1, r1.col2, ...)* > END, Yeah, I had noticed that in explain outputs (should have written like that). My point though is why we don't consider dropping the CASE WHEN ... END target list item solution in favor of simply using the alias name for a whole-row reference without affecting the correctness behavior wrt outer joins. Using CASE WHEN to emit the correct result has revealed its downside (this thread) although a simple whole-row reference would just work without any special consideration. > The reason for this is that the foreign table definition may not match the > target table definition. This has been explained in the comments that you > have deleted in your patch. Am I missing something? What may go wrong if we requested r1 (an alias name) in target list of the sent query instead of ROW(r1.col1, ...) for a whole-row reference in the original query? Fear of wrong set of data arriving or in wrong order or something like that? This part, I'm not quite sure about. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Fri, Jun 24, 2016 at 1:59 PM, Amit Langote wrote: > On 2016/06/24 15:44, Ashutosh Bapat wrote: > >> > >> I think the proposed idea of applying record::text explicit coercion to > a > >> whole-row reference in the IS NOT NULL condition in the CASE WHEN > >> conversion would work as expected as you explained, but I'm concerned > that > >> the cost wouldn't be negligible when the foreign table has a lot of > columns. > > > > That's right, if the foreign server doesn't optimize the case for IS NOT > > NULL, which it doesn't :) > > > > I am happy to use any cheaper means e.g a function which counts number of > > columns in a record. All we need here is a way to correctly identify > when a > > record is null and not null in the way we want (as described upthread). I > > didn't find any quickly. Do you have any suggestions? > > I'm now starting to wonder if it would be outright wrong to just use the > alias names of corresponding foreign tables directly for whole-row > references? So, instead of these in target lists of remote queries: > > SELECT CASE WHEN (r1.*)::text IS NOT NULL THEN *ROW (r1.*)* END, ... > > This is wrong. The deparsed query looks like SELECT CASE WHEN (r1.*)::text IS NOT NULL THEN *ROW (r1.col1, r1.col2, ...)* END, The reason for this is that the foreign table definition may not match the target table definition. This has been explained in the comments that you have deleted in your patch. Am I missing something? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/24 15:44, Ashutosh Bapat wrote: >> >> I think the proposed idea of applying record::text explicit coercion to a >> whole-row reference in the IS NOT NULL condition in the CASE WHEN >> conversion would work as expected as you explained, but I'm concerned that >> the cost wouldn't be negligible when the foreign table has a lot of columns. > > That's right, if the foreign server doesn't optimize the case for IS NOT > NULL, which it doesn't :) > > I am happy to use any cheaper means e.g a function which counts number of > columns in a record. All we need here is a way to correctly identify when a > record is null and not null in the way we want (as described upthread). I > didn't find any quickly. Do you have any suggestions? I'm now starting to wonder if it would be outright wrong to just use the alias names of corresponding foreign tables directly for whole-row references? So, instead of these in target lists of remote queries: SELECT CASE WHEN (r1.*)::text IS NOT NULL THEN ROW (r1.*) END, ... Just: SELECT r1, ... It seems to produce the correct result. Although, I may be missing something because CASE WHEN solution seems to me to be deliberately chosen. In any case, attached patch doing the above did not change the results of related regression tests (plans obviously did change since they don't output the CASE WHENs in target lists anymore). Also see the example below: create extension postgres_fdw; create server myserver foreign data wrapper postgres_fdw options (dbname 'postgres', use_remote_estimate 'true'); create user mapping for CURRENT_USER server myserver; create table t1(a int, b int); create table t2(a int, b int); create foreign table ft1(a int, b int) server myserver options (table_name 't1'); create foreign table ft2(a int, b int) server myserver options (table_name 't2'); insert into t1 values (1), (2); insert into t1 values (null, null); insert into t2 values (1); insert into t2 values (1, 2); explain (costs off, verbose) select t1, t1 is null, t2, t2 is null from ft1 t1 left join ft2 t2 on (t1.a = t2.a); QUERY PLAN - Foreign Scan Output: t1.*, (t1.* IS NULL), t2.*, (t2.* IS NULL) Relations: (public.ft1 t1) LEFT JOIN (public.ft2 t2) Remote SQL: SELECT r1, r2 FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a = r2.a (4 rows) select t1, t1 is null as t1null, t2, t2 is null as t2null from ft1 t1 left join ft2 t2 on (t1.a = t2.a); t1 | t1null | t2 | t2null --++---+ (1,) | f | (1,) | f (1,) | f | (1,2) | f (2,) | f | | t (,) | t | | t (4 rows)) alter server myserver options (set use_remote_estimate 'false'); analyze; explain (costs off, verbose) select t1, t1 is null, t2, t2 is null from ft1 t1 left join ft2 t2 on (t1.a = t2.a); QUERY PLAN -- Merge Left Join Output: t1.*, (t1.* IS NULL), t2.*, (t2.* IS NULL) Merge Cond: (t1.a = t2.a) -> Sort Output: t1.*, t1.a Sort Key: t1.a -> Foreign Scan on public.ft1 t1 Output: t1.*, t1.a Remote SQL: SELECT a, b FROM public.t1 -> Sort Output: t2.*, t2.a Sort Key: t2.a -> Foreign Scan on public.ft2 t2 Output: t2.*, t2.a Remote SQL: SELECT a, b FROM public.t2 (15 rows) select t1, t1 is null as t1null, t2, t2 is null as t2null from ft1 t1 left join ft2 t2 on (t1.a = t2.a); t1 | t1null | t2 | t2null --++---+ (1,) | f | (1,) | f (1,) | f | (1,2) | f (2,) | f | | t (,) | t | | t (4 rows) And produces the correct result for Rushabh's case. Thoughts? Thanks, Amit diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index c91f3a5..d742693 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1609,57 +1609,18 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, } else if (varattno == 0) { - /* Whole row reference */ - Relation rel; - Bitmapset *attrs_used; - - /* Required only to be passed down to deparseTargetList(). */ - List *retrieved_attrs; - - /* Get RangeTblEntry from array in PlannerInfo. */ - rte = planner_rt_fetch(varno, root); - - /* - * The lock on the relation will be held by upper callers, so it's - * fine to open it with no lock here. - */ - rel = heap_open(rte->relid, NoLock); - - /* - * The local name of the foreign table can not be recognized by the - * foreign server and the table it references on foreign server might - * have different column ordering or different columns than those - * declared locally. Hence we have to deparse whole-row reference as - * ROW(columns referenced locally). Construct this by deparsing a -
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
> > I think the proposed idea of applying record::text explicit coercion to a > whole-row reference in the IS NOT NULL condition in the CASE WHEN > conversion would work as expected as you explained, but I'm concerned that > the cost wouldn't be negligible when the foreign table has a lot of columns. > That's right, if the foreign server doesn't optimize the case for IS NOT NULL, which it doesn't :) I am happy to use any cheaper means e.g a function which counts number of columns in a record. All we need here is a way to correctly identify when a record is null and not null in the way we want (as described upthread). I didn't find any quickly. Do you have any suggestions? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/22 19:37, Ashutosh Bapat wrote: On Wed, Jun 22, 2016 at 3:57 PM, Etsuro Fujita Maybe I'm confused, but I think that in the system-column case it's the user's responsibility to specify system columns for foreign tables in a local query only when that makes sense on the remote end, as shown in the below counter example: postgres=# create foreign table fv1 (a int, b int) server myserver options (table_name 'v1'); CREATE FOREIGN TABLE postgres=# select ctid, * from fv1; ERROR: column "ctid" does not exist CONTEXT: Remote SQL command: SELECT a, b, ctid FROM public.v1 But a ctid, when available, would rightly get nullified when referenced as a column of table. Really? What happens with the other system columns is we 0 them out locally, whether they are available at the foreign server or not. We never try to check whether they are available at the foreign server or not. If we use such column's column name to decide whether to report 0 or null and if that column is not available at the foreign table, we will get error. I think we should avoid that. Those column anyway do not make any sense. Agreed except for oid. I think we should handle oid in the same way as ctid, which I'll work on in the next CF. I think the proposed idea of applying record::text explicit coercion to a whole-row reference in the IS NOT NULL condition in the CASE WHEN conversion would work as expected as you explained, but I'm concerned that the cost wouldn't be negligible when the foreign table has a lot of columns. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Wed, Jun 22, 2016 at 3:57 PM, Etsuro Fujita wrote: > On 2016/06/22 18:16, Ashutosh Bapat wrote: > >> On Wed, Jun 22, 2016 at 2:26 PM, Etsuro Fujita >> mailto:fujita.ets...@lab.ntt.co.jp>> wrote: >> > > I think we could address this in another way once we support >> deparsing subqueries; rewrite the remote query into something that >> wouldn't need the CASE WHEN conversion. For example, we currently >> have: >> >> postgres=# explain verbose select ft2 from ft1 left join ft2 on >> ft1.a = ft2.a; >> >> QUERY PLAN >> >> -- >> Foreign Scan (cost=100.00..110.04 rows=1 width=32) >>Output: ft2.* >>Relations: (public.ft1) LEFT JOIN (public.ft2) >>Remote SQL: SELECT CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, >> r2.b) END FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a = >> r2.a >> (4 rows) >> >> However, if we support deparsing subqueries, the remote query in the >> above example could be rewritten into something like this: >> >> SELECT ss.c2 FROM t1 LEFT JOIN (SELECT t2.a, ROW(a, b) FROM t2) >> ss(c1, c2) ON (t1.a = ss.c1); >> >> So we would no longer need "CASE WHEN r2.* IS NOT NULL THEN >> ROW(r2.a, r2.b) END" in the target list in the remote query. >> > > Right. Although, it means that the query processor at the other end has >> to do extra work for pulling up the subqueries. >> > > Yeah, that's right. But this approach seems not so ugly... > > For the CASE WHEN conversion for a system column other than ctid, we >> could also address this by replacing the whole-row reference in the >> IS NOT NULL condition in that conversion with the system column >> reference. >> > > That would not work again as the system column reference would make >> sense locally but may not be available at the foreign server e.g. >> foreign table targeting a view a tableoid is requested. >> > > Maybe I'm confused, but I think that in the system-column case it's the > user's responsibility to specify system columns for foreign tables in a > local query only when that makes sense on the remote end, as shown in the > below counter example: > > postgres=# create foreign table fv1 (a int, b int) server myserver options > (table_name 'v1'); > CREATE FOREIGN TABLE > postgres=# select ctid, * from fv1; > ERROR: column "ctid" does not exist > CONTEXT: Remote SQL command: SELECT a, b, ctid FROM public.v1 > But a ctid, when available, would rightly get nullified when referenced as a column of table. What happens with the other system columns is we 0 them out locally, whether they are available at the foreign server or not. We never try to check whether they are available at the foreign server or not. If we use such column's column name to decide whether to report 0 or null and if that column is not available at the foreign table, we will get error. I think we should avoid that. Those column anyway do not make any sense. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/22 18:16, Ashutosh Bapat wrote: On Wed, Jun 22, 2016 at 2:26 PM, Etsuro Fujita mailto:fujita.ets...@lab.ntt.co.jp>> wrote: I think we could address this in another way once we support deparsing subqueries; rewrite the remote query into something that wouldn't need the CASE WHEN conversion. For example, we currently have: postgres=# explain verbose select ft2 from ft1 left join ft2 on ft1.a = ft2.a; QUERY PLAN -- Foreign Scan (cost=100.00..110.04 rows=1 width=32) Output: ft2.* Relations: (public.ft1) LEFT JOIN (public.ft2) Remote SQL: SELECT CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, r2.b) END FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a = r2.a (4 rows) However, if we support deparsing subqueries, the remote query in the above example could be rewritten into something like this: SELECT ss.c2 FROM t1 LEFT JOIN (SELECT t2.a, ROW(a, b) FROM t2) ss(c1, c2) ON (t1.a = ss.c1); So we would no longer need "CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, r2.b) END" in the target list in the remote query. Right. Although, it means that the query processor at the other end has to do extra work for pulling up the subqueries. Yeah, that's right. But this approach seems not so ugly... For the CASE WHEN conversion for a system column other than ctid, we could also address this by replacing the whole-row reference in the IS NOT NULL condition in that conversion with the system column reference. That would not work again as the system column reference would make sense locally but may not be available at the foreign server e.g. foreign table targeting a view a tableoid is requested. Maybe I'm confused, but I think that in the system-column case it's the user's responsibility to specify system columns for foreign tables in a local query only when that makes sense on the remote end, as shown in the below counter example: postgres=# create foreign table fv1 (a int, b int) server myserver options (table_name 'v1'); CREATE FOREIGN TABLE postgres=# select ctid, * from fv1; ERROR: column "ctid" does not exist CONTEXT: Remote SQL command: SELECT a, b, ctid FROM public.v1 where v1 is a view created on the remote server "myserver". Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/22 18:14, Ashutosh Bapat wrote: > I wonder whether such a whole-row-var would arise from the nullable side >> of a join? I guess not. Not that I'm saying we shouldn't account for that >> case at all since any and every whole-row-var in the targetlist currently >> gets that treatment, even those that are known non-nullable. Couldn't we >> have prevented the latter somehow? IOW, only generate the CASE WHEN when >> a Var being deparsed is known nullable as the comment there says: >> >> deparse.c: >> >> 1639 /* >> 1640 * In case the whole-row reference is under an outer join then it has >> 1641 * to go NULL whenver the rest of the row goes NULL. Deparsing a join >> 1642 * query would always involve multiple relations, thus qualify_col >> 1643 * would be true. >> 1644 */ >> 1645 if (qualify_col) >> 1646 { >> 1647 appendStringInfoString(buf, "CASE WHEN"); >> 1648 ADD_REL_QUALIFIER(buf, varno); >> 1649 appendStringInfo(buf, "* IS NOT NULL THEN "); >> 1650 } >> >> But I guess just fixing the expression as your patch does may be just fine. >> > I thought about that, but it means that we have compute the nullable relids > (which isn't a big deal I guess). That's something more than necessary for > fixing the bug, which is the focus in beta stage right now. Agreed. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Wed, Jun 22, 2016 at 2:26 PM, Etsuro Fujita wrote: > On 2016/06/22 17:11, Amit Langote wrote: > >> I wonder whether such a whole-row-var would arise from the nullable side >> of a join? I guess not. Not that I'm saying we shouldn't account for that >> case at all since any and every whole-row-var in the targetlist currently >> gets that treatment, even those that are known non-nullable. Couldn't we >> have prevented the latter somehow? IOW, only generate the CASE WHEN when >> a Var being deparsed is known nullable as the comment there says: >> >> deparse.c: >> >> 1639 /* >> 1640 * In case the whole-row reference is under an outer join then it has >> 1641 * to go NULL whenver the rest of the row goes NULL. Deparsing a join >> 1642 * query would always involve multiple relations, thus qualify_col >> 1643 * would be true. >> 1644 */ >> 1645 if (qualify_col) >> 1646 { >> 1647 appendStringInfoString(buf, "CASE WHEN"); >> 1648 ADD_REL_QUALIFIER(buf, varno); >> 1649 appendStringInfo(buf, "* IS NOT NULL THEN "); >> 1650 } >> > > I think we could address this in another way once we support deparsing > subqueries; rewrite the remote query into something that wouldn't need the > CASE WHEN conversion. For example, we currently have: > > postgres=# explain verbose select ft2 from ft1 left join ft2 on ft1.a = > ft2.a; > QUERY PLAN > > -- > Foreign Scan (cost=100.00..110.04 rows=1 width=32) >Output: ft2.* >Relations: (public.ft1) LEFT JOIN (public.ft2) >Remote SQL: SELECT CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, r2.b) END > FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a = r2.a > (4 rows) > > However, if we support deparsing subqueries, the remote query in the above > example could be rewritten into something like this: > > SELECT ss.c2 FROM t1 LEFT JOIN (SELECT t2.a, ROW(a, b) FROM t2) ss(c1, c2) > ON (t1.a = ss.c1); > > So we would no longer need "CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, > r2.b) END" in the target list in the remote query. > Right. Although, it means that the query processor at the other end has to do extra work for pulling up the subqueries. > > For the CASE WHEN conversion for a system column other than ctid, we could > also address this by replacing the whole-row reference in the IS NOT NULL > condition in that conversion with the system column reference. > > That would not work again as the system column reference would make sense locally but may not be available at the foreign server e.g. foreign table targeting a view a tableoid is requested. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
I wonder whether such a whole-row-var would arise from the nullable side > of a join? I guess not. Not that I'm saying we shouldn't account for that > case at all since any and every whole-row-var in the targetlist currently > gets that treatment, even those that are known non-nullable. Couldn't we > have prevented the latter somehow? IOW, only generate the CASE WHEN when > a Var being deparsed is known nullable as the comment there says: > > deparse.c: > > 1639 /* > 1640 * In case the whole-row reference is under an outer join then it has > 1641 * to go NULL whenver the rest of the row goes NULL. Deparsing a join > 1642 * query would always involve multiple relations, thus qualify_col > 1643 * would be true. > 1644 */ > 1645 if (qualify_col) > 1646 { > 1647 appendStringInfoString(buf, "CASE WHEN"); > 1648 ADD_REL_QUALIFIER(buf, varno); > 1649 appendStringInfo(buf, "* IS NOT NULL THEN "); > 1650 } > > But I guess just fixing the expression as your patch does may be just fine. > > I thought about that, but it means that we have compute the nullable relids (which isn't a big deal I guess). That's something more than necessary for fixing the bug, which is the focus in beta stage right now. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/22 17:11, Amit Langote wrote: I wonder whether such a whole-row-var would arise from the nullable side of a join? I guess not. Not that I'm saying we shouldn't account for that case at all since any and every whole-row-var in the targetlist currently gets that treatment, even those that are known non-nullable. Couldn't we have prevented the latter somehow? IOW, only generate the CASE WHEN when a Var being deparsed is known nullable as the comment there says: deparse.c: 1639 /* 1640 * In case the whole-row reference is under an outer join then it has 1641 * to go NULL whenver the rest of the row goes NULL. Deparsing a join 1642 * query would always involve multiple relations, thus qualify_col 1643 * would be true. 1644 */ 1645 if (qualify_col) 1646 { 1647 appendStringInfoString(buf, "CASE WHEN"); 1648 ADD_REL_QUALIFIER(buf, varno); 1649 appendStringInfo(buf, "* IS NOT NULL THEN "); 1650 } I think we could address this in another way once we support deparsing subqueries; rewrite the remote query into something that wouldn't need the CASE WHEN conversion. For example, we currently have: postgres=# explain verbose select ft2 from ft1 left join ft2 on ft1.a = ft2.a; QUERY PLAN -- Foreign Scan (cost=100.00..110.04 rows=1 width=32) Output: ft2.* Relations: (public.ft1) LEFT JOIN (public.ft2) Remote SQL: SELECT CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, r2.b) END FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a = r2.a (4 rows) However, if we support deparsing subqueries, the remote query in the above example could be rewritten into something like this: SELECT ss.c2 FROM t1 LEFT JOIN (SELECT t2.a, ROW(a, b) FROM t2) ss(c1, c2) ON (t1.a = ss.c1); So we would no longer need "CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, r2.b) END" in the target list in the remote query. For the CASE WHEN conversion for a system column other than ctid, we could also address this by replacing the whole-row reference in the IS NOT NULL condition in that conversion with the system column reference. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/21 20:42, Ashutosh Bapat wrote: > On Tue, Jun 21, 2016 at 4:36 PM, Amit Langote wrote: >> On 2016/06/21 16:27, Rushabh Lathia wrote: >>> >>> And as above documentation clearly says that IS NULL and IS NOT NULL do >> not >>> always return inverse results for row-valued expressions. So need to >> change >>> the >>> deparse logic into postgres_fdw - how ? May be to use IS NULL rather >> then IS >>> NOT NULL? >>> >>> Input/thought? >> >> Perhaps - NOT expr IS NULL? Like in the attached. >> >> > As the documentation describes row is NULL is going to return true when all > the columns in a row are NULL, even though row itself is not null. So, with > your patch a row (null, null, null) is going to be output as a NULL row. Is > that right? Right. > In an outer join we have to differentiate between a row being null (because > there was no joining row on nullable side) and a non-null row with all > column values being null. If we cast the whole-row expression to a text > e.g. r.*::text and test the resultant value for nullness, it gives us what > we want. A null row casted to text is null and a row with all null values > casted to text is not null. You are right. There may be non-null rows with all columns null which are handled wrongly (as Rushabh reports) and the hack I proposed is not right for. Especially if from non-nullable side as in the reported case, NULL test for such a whole-row-var would produce the wrong result. Casting to text as your patch does produces the correct behavior. I wonder whether such a whole-row-var would arise from the nullable side of a join? I guess not. Not that I'm saying we shouldn't account for that case at all since any and every whole-row-var in the targetlist currently gets that treatment, even those that are known non-nullable. Couldn't we have prevented the latter somehow? IOW, only generate the CASE WHEN when a Var being deparsed is known nullable as the comment there says: deparse.c: 1639 /* 1640 * In case the whole-row reference is under an outer join then it has 1641 * to go NULL whenver the rest of the row goes NULL. Deparsing a join 1642 * query would always involve multiple relations, thus qualify_col 1643 * would be true. 1644 */ 1645 if (qualify_col) 1646 { 1647 appendStringInfoString(buf, "CASE WHEN"); 1648 ADD_REL_QUALIFIER(buf, varno); 1649 appendStringInfo(buf, "* IS NOT NULL THEN "); 1650 } But I guess just fixing the expression as your patch does may be just fine. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/21 21:37, Ashutosh Bapat wrote: How about using a system column eg, ctid, for the CASE WHEN conversion; in Rushabh's example the reference to "r1" would be converted with "CASE WHEN r1.ctid IS NOT NULL THEN ROW(r1.empno, r1.ename, r1.job, r1.mgr, r1.hiredate, r1.sal, r1.comm, r1.deptno) END". IMO I think that that would be much simpler than Ashutosh's approach. A foreign table can have a view, a regular table, another foreign table or a materialised view a its target. A view does not support any of the system columns, so none of them are available. You are right. Sorry for the noise. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
> How about using a system column eg, ctid, for the CASE WHEN conversion; in > Rushabh's example the reference to "r1" would be converted with "CASE WHEN > r1.ctid IS NOT NULL THEN ROW(r1.empno, r1.ename, r1.job, r1.mgr, > r1.hiredate, r1.sal, r1.comm, r1.deptno) END". IMO I think that that would > be much simpler than Ashutosh's approach. > > A foreign table can have a view, a regular table, another foreign table or a materialised view a its target. A view does not support any of the system columns, so none of them are available. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/21 20:42, Ashutosh Bapat wrote: On Tue, Jun 21, 2016 at 4:36 PM, Amit Langote mailto:langote_amit...@lab.ntt.co.jp>> wrote: On 2016/06/21 16:27, Rushabh Lathia wrote: > Now I was under impression the IS NOT NULL should be always in inverse of > IS NULL, but clearly here its not the case with wholerow. But further > looking at > the document its saying different thing for wholerow: > > https://www.postgresql.org/docs/9.5/static/functions-comparison.html > > Note: If the expression is row-valued, then IS NULL is true when the row > expression > itself is null or when all the row's fields are null, while IS NOT NULL is > true > when the row expression itself is non-null and all the row's fields are > non-null. > Because of this behavior, IS NULL and IS NOT NULL do not always return > inverse > results for row-valued expressions, i.e., a row-valued expression that > contains > both NULL and non-null values will return false for both tests. This > definition > conforms to the SQL standard, and is a change from the inconsistent behavior > exhibited by PostgreSQL versions prior to 8.2. > And as above documentation clearly says that IS NULL and IS NOT NULL do not > always return inverse results for row-valued expressions. So need to change > the > deparse logic into postgres_fdw - how ? May be to use IS NULL rather then IS > NOT NULL? > > Input/thought? Perhaps - NOT expr IS NULL? Like in the attached. As the documentation describes row is NULL is going to return true when all the columns in a row are NULL, even though row itself is not null. So, with your patch a row (null, null, null) is going to be output as a NULL row. Is that right? Yeah, I think so. In an outer join we have to differentiate between a row being null (because there was no joining row on nullable side) and a non-null row with all column values being null. If we cast the whole-row expression to a text e.g. r.*::text and test the resultant value for nullness, it gives us what we want. A null row casted to text is null and a row with all null values casted to text is not null. postgres=# select (null, null, null)::text is not null; ?column? -- t (1 row) postgres=# select null::record::text is not null; ?column? -- f (1 row) In find_coercion_pathway(), we resort to IO coercion for record::text explicit coercion. This should always work the way we want as record_out is a strict function and for non-null values it outputs at least the parenthesis which will be treated as non-null text. 2253 /* 2254 * If we still haven't found a possibility, consider automatic casting 2255 * using I/O functions. We allow assignment casts to string types and 2256 * explicit casts from string types to be handled this way. (The 2257 * CoerceViaIO mechanism is a lot more general than that, but this is 2258 * all we want to allow in the absence of a pg_cast entry.) It would 2259 * probably be better to insist on explicit casts in both directions, 2260 * but this is a compromise to preserve something of the pre-8.3 2261 * behavior that many types had implicit (yipes!) casts to text. 2262 */ PFA the patch with the cast to text. This is probably uglier than expected, but I don't know any better test to find nullness of a record, the way we want here. The patch also includes the expected output changes in the EXPLAIN output. How about using a system column eg, ctid, for the CASE WHEN conversion; in Rushabh's example the reference to "r1" would be converted with "CASE WHEN r1.ctid IS NOT NULL THEN ROW(r1.empno, r1.ename, r1.job, r1.mgr, r1.hiredate, r1.sal, r1.comm, r1.deptno) END". IMO I think that that would be much simpler than Ashutosh's approach. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On Tue, Jun 21, 2016 at 4:36 PM, Amit Langote wrote: > On 2016/06/21 16:27, Rushabh Lathia wrote: > > Now I was under impression the IS NOT NULL should be always in inverse of > > IS NULL, but clearly here its not the case with wholerow. But further > > looking at > > the document its saying different thing for wholerow: > > > > https://www.postgresql.org/docs/9.5/static/functions-comparison.html > > > > Note: If the expression is row-valued, then IS NULL is true when the row > > expression > > itself is null or when all the row's fields are null, while IS NOT NULL > is > > true > > when the row expression itself is non-null and all the row's fields are > > non-null. > > Because of this behavior, IS NULL and IS NOT NULL do not always return > > inverse > > results for row-valued expressions, i.e., a row-valued expression that > > contains > > both NULL and non-null values will return false for both tests. This > > definition > > conforms to the SQL standard, and is a change from the inconsistent > behavior > > exhibited by PostgreSQL versions prior to 8.2. > > > > > > And as above documentation clearly says that IS NULL and IS NOT NULL do > not > > always return inverse results for row-valued expressions. So need to > change > > the > > deparse logic into postgres_fdw - how ? May be to use IS NULL rather > then IS > > NOT NULL? > > > > Input/thought? > > Perhaps - NOT expr IS NULL? Like in the attached. > > As the documentation describes row is NULL is going to return true when all the columns in a row are NULL, even though row itself is not null. So, with your patch a row (null, null, null) is going to be output as a NULL row. Is that right? In an outer join we have to differentiate between a row being null (because there was no joining row on nullable side) and a non-null row with all column values being null. If we cast the whole-row expression to a text e.g. r.*::text and test the resultant value for nullness, it gives us what we want. A null row casted to text is null and a row with all null values casted to text is not null. postgres=# select (null, null, null)::text is not null; ?column? -- t (1 row) postgres=# select null::record::text is not null; ?column? -- f (1 row) In find_coercion_pathway(), we resort to IO coercion for record::text explicit coercion. This should always work the way we want as record_out is a strict function and for non-null values it outputs at least the parenthesis which will be treated as non-null text. 2253 /* 2254 * If we still haven't found a possibility, consider automatic casting 2255 * using I/O functions. We allow assignment casts to string types and 2256 * explicit casts from string types to be handled this way. (The 2257 * CoerceViaIO mechanism is a lot more general than that, but this is 2258 * all we want to allow in the absence of a pg_cast entry.) It would 2259 * probably be better to insist on explicit casts in both directions, 2260 * but this is a compromise to preserve something of the pre-8.3 2261 * behavior that many types had implicit (yipes!) casts to text. 2262 */ PFA the patch with the cast to text. This is probably uglier than expected, but I don't know any better test to find nullness of a record, the way we want here. The patch also includes the expected output changes in the EXPLAIN output. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company pg_null_wr.patch Description: application/download -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - wrong results with whole-row reference
On 2016/06/21 16:27, Rushabh Lathia wrote: > Now I was under impression the IS NOT NULL should be always in inverse of > IS NULL, but clearly here its not the case with wholerow. But further > looking at > the document its saying different thing for wholerow: > > https://www.postgresql.org/docs/9.5/static/functions-comparison.html > > Note: If the expression is row-valued, then IS NULL is true when the row > expression > itself is null or when all the row's fields are null, while IS NOT NULL is > true > when the row expression itself is non-null and all the row's fields are > non-null. > Because of this behavior, IS NULL and IS NOT NULL do not always return > inverse > results for row-valued expressions, i.e., a row-valued expression that > contains > both NULL and non-null values will return false for both tests. This > definition > conforms to the SQL standard, and is a change from the inconsistent behavior > exhibited by PostgreSQL versions prior to 8.2. > > > And as above documentation clearly says that IS NULL and IS NOT NULL do not > always return inverse results for row-valued expressions. So need to change > the > deparse logic into postgres_fdw - how ? May be to use IS NULL rather then IS > NOT NULL? > > Input/thought? Perhaps - NOT expr IS NULL? Like in the attached. explain verbose select e, e.empno, d.deptno, d.dname from f_emp e left join f_dept d on e.deptno = d.deptno and e.sal > 3000 order by 2, 3 limit 10; QUERY PLAN - - - Limit (cost=100.00..136.86 rows=10 width=236) Output: e.*, e.empno, d.deptno, d.dname -> Foreign Scan (cost=100.00..2304.10 rows=598 width=236) Output: e.*, e.empno, d.deptno, d.dname Relations: (public.f_emp e) LEFT JOIN (public.f_dept d) Remote SQL: SELECT CASE WHEN NOT r1.* IS NULL THEN ROW(r1.empno, r1.ename, r1.job, r1.mgr, r1.hiredate, r1.sal, r1.comm, r1.deptno) END, r1.empno, r2.deptno , r2.dname FROM (public.emp r1 LEFT JOIN public.dept r2 ON (((r1.sal > 3000::numeric)) AND ((r1.deptno = r2.deptno ORDER BY r1.empno ASC NULLS LAST, r2.deptno AS C NULLS LAST (6 rows) select e, e.empno, d.deptno, d.dname from f_emp e left join f_dept d on e.deptno = d.deptno and e.sal > 3000 order by 2, 3 limit 10; e | empno | deptno | dname ---+---++ (7369,SMITH,CLERK,7902,1980-12-17,800.00,,20) | 7369 || (7499,ALLEN,SALESMAN,7698,1981-02-20,1600.00,300.00,30) | 7499 || (7521,WARD,SALESMAN,7698,1981-02-22,1250.00,500.00,30)| 7521 || (7566,JONES,MANAGER,7839,1981-04-02,2975.00,,20) | 7566 || (7654,MARTIN,SALESMAN,7698,1981-09-28,1250.00,1400.00,30) | 7654 || (7698,BLAKE,MANAGER,7839,1981-05-01,2850.00,,30) | 7698 || (7782,CLARK,MANAGER,7839,1981-06-09,2450.00,,10) | 7782 || (7788,SCOTT,ANALYST,7566,1987-04-19,3000.00,,20) | 7788 || (7839,KING,PRESIDENT,,1981-11-17,5000.00,,10) | 7839 | 10 | ACCOUNTING (7844,TURNER,SALESMAN,7698,1981-09-08,1500.00,0.00,30)| 7844 || (10 rows) Thanks, Amit diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index c91f3a5..7446506 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1644,9 +1644,9 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, */ if (qualify_col) { - appendStringInfoString(buf, "CASE WHEN "); + appendStringInfoString(buf, "CASE WHEN NOT "); ADD_REL_QUALIFIER(buf, varno); - appendStringInfo(buf, "* IS NOT NULL THEN "); + appendStringInfo(buf, "* IS NULL THEN "); } appendStringInfoString(buf, "ROW("); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers