Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-12 Thread Amit Kapila
On Wed, Apr 12, 2017 at 10:30 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> On Wed, Apr 12, 2017 at 12:40 AM, Tom Lane  wrote:
>>> Anyone want to draft a patch for this?
>
>> Please find the patch attached based on the above discussion.
>
> This patch seems fairly incomplete: you can't just whack around major data
> structures like PlannedStmt and PlannerGlobal without doing the janitorial
> work of cleaning up support logic such as outfuncs/readfuncs.
>

Oops, missed it.

> Also, when you think about what will happen when doing a copyObject()
> on a completed plan, it seems like a pretty bad idea to link subplans
> into two independent lists.  We'll end up with two separate copies of
> those subtrees in places like the plan cache.
>
> I'm inclined to think the other approach of adding a parallel_safe
> flag to struct Plan is a better answer in the long run.  It's basically
> free to have it there because of alignment considerations, and it's
> hard to believe that we're not going to need that labeling at some
> point in the future anyway.
>
> I had been a bit concerned about having to have some magic in outfuncs
> and/or readfuncs to avoid transferring the unsafe subplan(s), but I see
> from your patch there's a better way: we can have ExecSerializePlan modify
> the subplan list as it transfers it into its private PlannedStmt struct.
> But I think iterating over the list and examining each subplan's
> parallel_safe marking is a better way to do that.
>
> Will work on this approach.
>

Thanks, I see that you have committed a patch along those lines.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-12 Thread Tom Lane
Amit Kapila  writes:
> On Wed, Apr 12, 2017 at 12:40 AM, Tom Lane  wrote:
>> Anyone want to draft a patch for this?

> Please find the patch attached based on the above discussion.

This patch seems fairly incomplete: you can't just whack around major data
structures like PlannedStmt and PlannerGlobal without doing the janitorial
work of cleaning up support logic such as outfuncs/readfuncs.

Also, when you think about what will happen when doing a copyObject()
on a completed plan, it seems like a pretty bad idea to link subplans
into two independent lists.  We'll end up with two separate copies of
those subtrees in places like the plan cache.

I'm inclined to think the other approach of adding a parallel_safe
flag to struct Plan is a better answer in the long run.  It's basically
free to have it there because of alignment considerations, and it's
hard to believe that we're not going to need that labeling at some
point in the future anyway.

I had been a bit concerned about having to have some magic in outfuncs
and/or readfuncs to avoid transferring the unsafe subplan(s), but I see
from your patch there's a better way: we can have ExecSerializePlan modify
the subplan list as it transfers it into its private PlannedStmt struct.
But I think iterating over the list and examining each subplan's
parallel_safe marking is a better way to do that.

Will work on this approach.
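
For concreteness, a rough sketch of what that per-subplan check could look
like inside ExecSerializePlan (this assumes the parallel_safe flag has been
added to struct Plan; it is an illustration, not the committed code):

    ListCell   *lc;

    pstmt->subplans = NIL;
    foreach(lc, estate->es_plannedstmt->subplans)
    {
        Plan       *subplan = (Plan *) lfirst(lc);

        /*
         * Keep list positions stable so SubPlan->plan_id references still
         * resolve in the worker, but transmit NULL in place of any subplan
         * that is not marked parallel-safe.
         */
        if (subplan && !subplan->parallel_safe)
            subplan = NULL;
        pstmt->subplans = lappend(pstmt->subplans, subplan);
    }

The worker then never sees (or runs ExecInitNode on) the unsafe subtree,
while plan_id indexing stays intact.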

regards, tom lane




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-12 Thread Tom Lane
Amit Kapila  writes:
> On Wed, Apr 12, 2017 at 12:40 AM, Tom Lane  wrote:
>> Anyone want to draft a patch for this?

> Please find the patch attached based on the above discussion.

Thanks, I'll look at this later today.

regards, tom lane




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-12 Thread Amit Kapila
On Wed, Apr 12, 2017 at 12:40 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>
>> However, the worker will
>> never execute such a plan as we don't generate a plan where unsafe
>> subplan/initplan is referenced in the node passed to the worker.  If we
>> want to avoid passing parallel-unsafe subplans to workers, then I
>> think we can maintain a list of parallel safe subplans along with
>> subplans in PlannerGlobal and PlannedStmt or maybe keep a parallel
>> safe flag in Plan so that we can pass only parallel safe plans to
>> workers.
>
> Right, we could, say, leave a hole in the subplan list corresponding
> to any subplan that's not parallel-safe.  That seems like a good idea
> anyway because right now there's clearly no cross-check preventing
> a worker from trying to run such a subplan.
>
> Anyone want to draft a patch for this?
>

Please find the patch attached based on the above discussion.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


push-parallel-safe-subplans-workers_v1.patch
Description: Binary data



Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Ashutosh Bapat
On Tue, Apr 11, 2017 at 10:02 PM, Tom Lane  wrote:
> Ashutosh Bapat  writes:
>> On Tue, Apr 11, 2017 at 8:11 PM, Tom Lane  wrote:
>>> OK.  I was trying to be noninvasive, but this does look more sensible.
>>> I think this might read better if we inverted the test (so that
>>> the outer-join-joinclause-must-be-pushable case would come first).
>
>> Ok. In fact, thinking more about it, we should probably test
>> JOIN_INNER explicitly instead of !IS_OUTER_JOIN() since SEMI_JOINS are
>> not considered outer joins and I am not sure how we would be
>> treating the quals for those.
>
> No, that's correct as-is --- or at least, if it's not correct, there
> are a bunch of other places that are also not correct.

Hmm, when we support SEMI and ANTI join push-down, we will have a
bunch of other places to change. This is one of them.

>
> Thinking about this further, though, it seems like a more straightforward
> solution to the original problem is to not store quals in a Plan's
> fdw_private list in the first place.  Putting them there is at best
> useless overhead that the finished plan will have to drag around;
> and it's not terribly hard to imagine future changes that would make
> having never-processed-by-setrefs.c qual trees in a plan be broken in
> other ways.  Can't we use a field in the rel's PgFdwRelationInfo to
> carry the remote_exprs forward from postgresGetForeignPlan to
> postgresPlanDirectModify?
>

I was thinking of reusing fdw_recheck_quals, renaming it.  We don't
push DML with joins down to the foreign server, so it's OK to set
fdw_recheck_quals (or whatever name we choose) to NIL in the join and
upper-relation cases, as we do today, without changing anything else.
When we support DML with join pushdown, we will have to use a subquery
for the scan and thus will not require fdw_recheck_quals to be set.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Amit Kapila
On Wed, Apr 12, 2017 at 12:40 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>
>> I think there is a possibility of doing ExecInitNode in a parallel
>> worker for a parallel-unsafe subplan, because we pass a list of all
>> the subplans stored in the planned statement.
>
> It's more than a possibility.  If you run Andreas' test case against
> HEAD, now that the can't-transmit-RestrictInfo failure is fixed,
> you will find that the parallel worker actually creates and immediately
> destroys an FDW remote connection.  (The easiest way to prove that is
> to turn on log_connections/log_disconnections.  BTW, is there any
> convenient way to attach a debugger to a parallel worker process as it's
> being launched?
>

What I generally do to debug the parallel-worker case is to add a "while
(1) {}" kind of loop at the beginning of ParallelQueryMain() or
ParallelWorkerMain(), depending on the area I want to debug; in this
case it would be okay to add such a loop in ParallelQueryMain().  Then,
since the worker will be waiting there, attach the debugger to that
process and resume debugging.  An infinite loop makes it easier to
identify the worker process, but one can also use a sleep or some other
form of wait or debug-break mechanism.
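
For reference, a slightly more convenient variant of that hack (the variable
name is made up; none of this exists in the tree):

    /*
     * Temporary debugging hack near the top of ParallelQueryMain():
     * park the worker so a debugger can be attached to it.
     */
    volatile bool debug_wait_for_attach = true;

    while (debug_wait_for_attach)
        pg_usleep(100 * 1000L); /* 100ms; friendlier than a hot while (1) */

Once attached (e.g. "gdb -p <worker pid>"), clear the flag with
"set var debug_wait_for_attach = 0" and continue.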

>  I couldn't manage to catch the worker in the act.)
>
> So I think this is clearly a Bad Thing and we'd better do something
> about it.  The consequences for postgres_fdw aren't so awful perhaps,
> but other non-parallel-safe FDWs might have bigger problems with this
> behavior.
>
>> However, the worker will
>> never execute such a plan as we don't generate a plan where unsafe
>> subplan/initplan is referenced in the node passed to the worker.  If we
>> want to avoid passing parallel-unsafe subplans to workers, then I
>> think we can maintain a list of parallel safe subplans along with
>> subplans in PlannerGlobal and PlannedStmt or maybe keep a parallel
>> safe flag in Plan so that we can pass only parallel safe plans to
>> workers.
>
> Right, we could, say, leave a hole in the subplan list corresponding
> to any subplan that's not parallel-safe.  That seems like a good idea
> anyway because right now there's clearly no cross-check preventing
> a worker from trying to run such a subplan.
>
> Anyone want to draft a patch for this?
>

Yes, I will work on it this week, possibly today or tomorrow, and will
either produce a patch or, if I face any problems, report them here.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Tom Lane
Amit Kapila  writes:
> On Tue, Apr 11, 2017 at 5:29 AM, Tom Lane  wrote:
>> I think the fact that we see this problem at all may indicate an
>> oversight in commit 5e6d8d2bbbcace304450b309e79366c0da4063e4 ("Allow
>> parallel workers to execute subplans").  If the worker were to actually
>> run the initplan, bad things would happen (the worker would create its
>> own remote connection, which we don't want).  Now, in the problem plan
>> the InitPlan is actually attached to the topmost Gather, which I think
>> is safe because it'll be run by the master, but I wonder if we're being
>> careful enough about non-parallel-safe plans for initplans/subplans.

> Initplans are never marked parallel safe; only subplans that are
> generated for parallel safe paths are marked as parallel safe.

OK ...

>> Also, even if the worker never actually executes the plan node, just
>> doing ExecInitNode on it in a parallel worker might be more than a
>> non-parallel-safe FDW is prepared to cope with.

> I think there is a possibility of doing ExecInitNode in a parallel
> worker for a parallel-unsafe subplan, because we pass a list of all
> the subplans stored in the planned statement.

It's more than a possibility.  If you run Andreas' test case against
HEAD, now that the can't-transmit-RestrictInfo failure is fixed,
you will find that the parallel worker actually creates and immediately
destroys an FDW remote connection.  (The easiest way to prove that is
to turn on log_connections/log_disconnections.  BTW, is there any
convenient way to attach a debugger to a parallel worker process as it's
being launched?  I couldn't manage to catch the worker in the act.)

So I think this is clearly a Bad Thing and we'd better do something
about it.  The consequences for postgres_fdw aren't so awful perhaps,
but other non-parallel-safe FDWs might have bigger problems with this
behavior.

> However, the worker will
> never execute such a plan as we don't generate a plan where unsafe
> subplan/initplan is referenced in the node passed to the worker.  If we
> want to avoid passing parallel-unsafe subplans to workers, then I
> think we can maintain a list of parallel safe subplans along with
> subplans in PlannerGlobal and PlannedStmt or maybe keep a parallel
> safe flag in Plan so that we can pass only parallel safe plans to
> workers.

Right, we could, say, leave a hole in the subplan list corresponding
to any subplan that's not parallel-safe.  That seems like a good idea
anyway because right now there's clearly no cross-check preventing
a worker from trying to run such a subplan.

Anyone want to draft a patch for this?

regards, tom lane




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Tom Lane
Ashutosh Bapat  writes:
> On Tue, Apr 11, 2017 at 8:11 PM, Tom Lane  wrote:
>> OK.  I was trying to be noninvasive, but this does look more sensible.
>> I think this might read better if we inverted the test (so that
>> the outer-join-joinclause-must-be-pushable case would come first).

> Ok. In fact, thinking more about it, we should probably test
> JOIN_INNER explicitly instead of !IS_OUTER_JOIN() since SEMI_JOINS are
> not considered outer joins and I am not sure how we would be
> treating the quals for those.

No, that's correct as-is --- or at least, if it's not correct, there
are a bunch of other places that are also not correct.

Thinking about this further, though, it seems like a more straightforward
solution to the original problem is to not store quals in a Plan's
fdw_private list in the first place.  Putting them there is at best
useless overhead that the finished plan will have to drag around;
and it's not terribly hard to imagine future changes that would make
having never-processed-by-setrefs.c qual trees in a plan be broken in
other ways.  Can't we use a field in the rel's PgFdwRelationInfo to
carry the remote_exprs forward from postgresGetForeignPlan to
postgresPlanDirectModify?
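
Something along these lines, say (the field name here is only illustrative,
not necessarily what gets committed):

    /* in PgFdwRelationInfo (contrib/postgres_fdw/postgres_fdw.h) */
    List       *final_remote_exprs; /* bare remote quals, no RestrictInfos */

    /* postgresGetForeignPlan() saves them once they are built ... */
    fpinfo->final_remote_exprs = remote_exprs;

    /* ... and postgresPlanDirectModify() reads them back, instead of
     * digging RestrictInfo-laden quals out of the ForeignScan's fdw_private */
    remote_exprs = fpinfo->final_remote_exprs;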

regards, tom lane




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Ashutosh Bapat
On Tue, Apr 11, 2017 at 8:11 PM, Tom Lane  wrote:
> Ashutosh Bapat  writes:
>> On Tue, Apr 11, 2017 at 3:53 AM, Tom Lane  wrote:
>> +* there is no need for EPQ recheck at a join (and Vars or Aggrefs in
>> +* the qual might not be available locally anyway).
>
>> I am not sure whether EPQ checks make sense for an upper relation, esp. a
>> grouped relation. So mentioning Aggref can be confusing here. For joins, we
>> execute EPQ by executing the (so-called) outer plan created from fdw_outerpath.
>> For this, we fetch whole-row references for the joining relations and build the
>> output row by executing the local outer plan attached to the ForeignScan plan.
>> These whole-row references have values for all Vars, so even though the Vars
>> are not available, the corresponding column values are.  So mentioning Vars is
>> also confusing here.
>
> Well, my first attempt at fixing this merged remote_conds and remote_exprs
> together, which in the previous version of the code resulted in always
> passing the remote conditions as fdw_recheck_quals too.  And what happened
> was that I got "variable not found in subplan target list" errors for Vars
> used inside Aggrefs in pushed-to-the-remote HAVING clauses.  Which is
> unsurprising -- it'd be impossible to return such a Var if the grouping is
> being done remotely.  So I think it's important for this comment to
> explain that we *can't* put upperrel quals into fdw_recheck_quals, not just
> that "there's no need to".

The comments in the committed version are good.

> But pointing out that at a join, there's a
> different mechanism that's responsible for EPQ checks is good.  I'll
> reword this again.
>
>> We can combine separating the join clauses from the other clauses, checking
>> that all the join clauses are shippable, and classifying the other clauses
>> into local and remote ones, all in a single list traversal, as done in the
>> attached patch.
>
> OK.  I was trying to be noninvasive, but this does look more sensible.
> I think this might read better if we inverted the test (so that
> the outer-join-joinclause-must-be-pushable case would come first).

Ok. In fact, thinking more about it, we should probably test
JOIN_INNER explicitly instead of !IS_OUTER_JOIN() since SEMI_JOINS are
not considered outer joins and I am not sure how we would be
treating the quals for those.

>
> If we're going for a more-than-minimal patch, I'm inclined to also
> move the first loop in postgresGetForeignPlan into the "else" branch,
> so that it doesn't get executed in the join/upperrel case.  I realize
> that it's going to iterate zero times in that case, but it's just
> confusing to have it there when we don't expect it to do anything
> and we would only throw away the results if it did do anything.
>

The committed version looks quite clear.

I noticed that you committed the patch while I was writing this mail;
sending it anyway.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Tom Lane
Andreas Seltenreich  writes:
> I see the above ERROR logged a lot when testing master at eef8c0069e
> with a postgres_fdw around.  Below is a recipe to reproduce it on top of
> the regression DB.

I've pushed a fix that should get you past that problem, although
I suspect we still have some issues with treatment of parallel-unsafe
subplans.

regards, tom lane




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Tom Lane
Ashutosh Bapat  writes:
> On Tue, Apr 11, 2017 at 3:53 AM, Tom Lane  wrote:
> +* there is no need for EPQ recheck at a join (and Vars or Aggrefs in
> +* the qual might not be available locally anyway).

> I am not sure whether EPQ checks make sense for an upper relation, esp. a
> grouped relation. So mentioning Aggref can be confusing here. For joins, we
> execute EPQ by executing the (so-called) outer plan created from fdw_outerpath.
> For this, we fetch whole-row references for the joining relations and build the
> output row by executing the local outer plan attached to the ForeignScan plan.
> These whole-row references have values for all Vars, so even though the Vars
> are not available, the corresponding column values are.  So mentioning Vars is
> also confusing here.

Well, my first attempt at fixing this merged remote_conds and remote_exprs
together, which in the previous version of the code resulted in always
passing the remote conditions as fdw_recheck_quals too.  And what happened
was that I got "variable not found in subplan target list" errors for Vars
used inside Aggrefs in pushed-to-the-remote HAVING clauses.  Which is
unsurprising -- it'd be impossible to return such a Var if the grouping is
being done remotely.  So I think it's important for this comment to
explain that we *can't* put upperrel quals into fdw_recheck_quals, not just
that "there's no need to".  But pointing out that at a join, there's a
different mechanism that's responsible for EPQ checks is good.  I'll
reword this again.

> We can combine separating the join clauses from the other clauses, checking
> that all the join clauses are shippable, and classifying the other clauses
> into local and remote ones, all in a single list traversal, as done in the
> attached patch.

OK.  I was trying to be noninvasive, but this does look more sensible.
I think this might read better if we inverted the test (so that
the outer-join-joinclause-must-be-pushable case would come first).

If we're going for a more-than-minimal patch, I'm inclined to also
move the first loop in postgresGetForeignPlan into the "else" branch,
so that it doesn't get executed in the join/upperrel case.  I realize
that it's going to iterate zero times in that case, but it's just
confusing to have it there when we don't expect it to do anything
and we would only throw away the results if it did do anything.

regards, tom lane




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Ashutosh Bapat
On Tue, Apr 11, 2017 at 3:53 AM, Tom Lane  wrote:
> I wrote:
>> Apparently, postgres_fdw is trying to store RestrictInfos in the
>> fdw_private field of a ForeignScan node.  That won't do; those aren't
>> supposed to be present in a finished plan tree, so there's no readfuncs.c
>> support for them (and we're not adding it).
>
>> Don't know if this is a new bug, or ancient but not previously reachable.
>> It seems to be nearly the inverse of the problem you found yesterday,
>> in which postgres_fdw was stripping RestrictInfos sooner than it really
>> ought to.  Apparently that whole business needs a fresh look.
>
> Attached is a proposed patch that cleans up the mess here --- and it was
> a mess.  The comments for PgFdwRelationInfo claimed that the remote_conds
> and local_conds fields contained bare clauses, but actually they could
> contain either bare clauses or RestrictInfos or both, depending on where
> the clauses had come from.  And there was some seriously obscure and
> undocumented logic around how the fdw_recheck_quals got set up for a
> foreign join node.  (BTW, said logic is assuming that no EPQ recheck is
> needed for a foreign join.  I commented it to that effect, but I'm not
> very sure that I believe it.  If there's a bug there, though, it's
> independent of the immediate problem.)
>
> Anybody want to review this, or shall I just push it?

+* there is no need for EPQ recheck at a join (and Vars or Aggrefs in
+* the qual might not be available locally anyway).
I am not sure whether EPQ checks make sense for an upper relation, esp. a
grouped relation. So mentioning Aggref can be confusing here. For joins, we
execute EPQ by executing the (so-called) outer plan created from fdw_outerpath.
For this, we fetch whole-row references for the joining relations and build the
output row by executing the local outer plan attached to the ForeignScan plan.
These whole-row references have values for all Vars, so even though the Vars
are not available, the corresponding column values are.  So mentioning Vars is
also confusing here.  The attached patch has those comments modified; please
check if that looks OK.

-   extract_actual_join_clauses(extra->restrictlist, &joinclauses, &otherclauses);
+   {
+   /* Grovel through the clauses to separate into two lists */
+   joinclauses = otherclauses = NIL;
+   foreach(lc, extra->restrictlist)
+   {
+   RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);
+
+   if (rinfo->is_pushed_down)
+   otherclauses = lappend(otherclauses, rinfo);
+   else
+   joinclauses = lappend(joinclauses, rinfo);
+   }
+   }
We are duplicating the logic to separate join and other clauses here.  But we
are already using is_pushed_down to separate the clauses in various places,
e.g. compute_semi_anti_join_factors(), so probably this change is fine.

We can combine separating the join clauses from the other clauses, checking
that all the join clauses are shippable, and classifying the other clauses
into local and remote ones, all in a single list traversal, as done in the
attached patch.
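
Roughly, that single traversal (in postgres_fdw's foreign_join_ok(), reusing
the variables already in scope there -- lc, jointype, joinclauses, fpinfo)
could look like the sketch below; details may differ from the attached patch:

    foreach(lc, extra->restrictlist)
    {
        RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);
        bool        is_remote = is_foreign_expr(root, joinrel, rinfo->clause);

        if (IS_OUTER_JOIN(jointype) && !rinfo->is_pushed_down)
        {
            /* Join clauses of an outer join must all be shippable. */
            if (!is_remote)
                return false;   /* give up on pushing this join down */
            joinclauses = lappend(joinclauses, rinfo);
        }
        else
        {
            /* "Other" clauses are merely split into remote and local sets. */
            if (is_remote)
                fpinfo->remote_conds = lappend(fpinfo->remote_conds, rinfo);
            else
                fpinfo->local_conds = lappend(fpinfo->local_conds, rinfo);
        }
    }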

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


use-RestrictInfos-consistently-in-postgres_fdw_v2.patch
Description: Binary data



Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Ashutosh Bapat
On Tue, Apr 11, 2017 at 3:53 AM, Tom Lane  wrote:
> I wrote:
>> Apparently, postgres_fdw is trying to store RestrictInfos in the
>> fdw_private field of a ForeignScan node.  That won't do; those aren't
>> supposed to be present in a finished plan tree, so there's no readfuncs.c
>> support for them (and we're not adding it).
>
>> Don't know if this is a new bug, or ancient but not previously reachable.
>> It seems to be nearly the inverse of the problem you found yesterday,
>> in which postgres_fdw was stripping RestrictInfos sooner than it really
>> ought to.  Apparently that whole business needs a fresh look.
>
> Attached is a proposed patch that cleans up the mess here --- and it was
> a mess.  The comments for PgFdwRelationInfo claimed that the remote_conds
> and local_conds fields contained bare clauses, but actually they could
> contain either bare clauses or RestrictInfos or both, depending on where
> the clauses had come from.  And there was some seriously obscure and
> undocumented logic around how the fdw_recheck_quals got set up for a
> foreign join node.  (BTW, said logic is assuming that no EPQ recheck is
> needed for a foreign join.  I commented it to that effect, but I'm not
> very sure that I believe it.  If there's a bug there, though, it's
> independent of the immediate problem.)
>
> Anybody want to review this, or shall I just push it?
>

Will be reviewing it in a couple of hours.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-11 Thread Amit Kapila
On Tue, Apr 11, 2017 at 5:29 AM, Tom Lane  wrote:
> I wrote:
>> (BTW, I've not yet looked to see if this needs to be back-ported.)
>
> postgres_fdw will definitely include RestrictInfos in its fdw_private
> list in 9.6.  However, I've been unable to provoke a visible failure.
> After some rooting around, the reason seems to be that:
>
> 1. postgres_fdw doesn't mark its plans as parallel-safe --- it doesn't
> even have a IsForeignScanParallelSafe method.  So you'd think that one
> of its plans could never be transmitted to a parallel worker ... but:
>
> 2. Andreas' test query doesn't have the foreign scan in the main query
> tree, it's in an InitPlan.  9.6 did not transmit any subplans or initplans
> to parallel workers, but HEAD does.  So HEAD sends the illegal structure
> to the worker which spits up on trying to read a RestrictInfo.
>
> I think the fact that we see this problem at all may indicate an
> oversight in commit 5e6d8d2bbbcace304450b309e79366c0da4063e4 ("Allow
> parallel workers to execute subplans").  If the worker were to actually
> run the initplan, bad things would happen (the worker would create its
> own remote connection, which we don't want).  Now, in the problem plan
> the InitPlan is actually attached to the topmost Gather, which I think
> is safe because it'll be run by the master, but I wonder if we're being
> careful enough about non-parallel-safe plans for initplans/subplans.
>

Initplans are never marked parallel safe; only subplans that are
generated for parallel safe paths are marked as parallel safe.


> Also, even if the worker never actually executes the plan node, just
> doing ExecInitNode on it in a parallel worker might be more than a
> non-parallel-safe FDW is prepared to cope with.
>

I think there is a possibility of doing ExecInitNode in a parallel
worker for a parallel-unsafe subplan, because we pass a list of all
the subplans stored in the planned statement.  However, the worker will
never execute such a plan as we don't generate a plan where unsafe
subplan/initplan is referenced in the node passed to the worker.  If we
want to avoid passing parallel-unsafe subplans to workers, then I
think we can maintain a list of parallel safe subplans along with
subplans in PlannerGlobal and PlannedStmt or maybe keep a parallel
safe flag in Plan so that we can pass only parallel safe plans to
workers.
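
For the second option, the change itself would be tiny -- just one more flag
on struct Plan (sketch only; the surrounding fields are omitted here):

    typedef struct Plan
    {
        NodeTag     type;

        /* ... cost fields, targetlist, qual, lefttree/righttree omitted ... */

        bool        parallel_aware; /* engage parallel-aware logic? */
        bool        parallel_safe;  /* OK to use as part of parallel plan? */

        /* ... remaining fields omitted ... */
    } Plan;

Whatever transmits subplans to workers could then consult parallel_safe
before shipping each one.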


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-10 Thread Tom Lane
I wrote:
> (BTW, I've not yet looked to see if this needs to be back-ported.)

postgres_fdw will definitely include RestrictInfos in its fdw_private
list in 9.6.  However, I've been unable to provoke a visible failure.
After some rooting around, the reason seems to be that:

1. postgres_fdw doesn't mark its plans as parallel-safe --- it doesn't
even have an IsForeignScanParallelSafe method.  So you'd think that one
of its plans could never be transmitted to a parallel worker ... but:

2. Andreas' test query doesn't have the foreign scan in the main query
tree, it's in an InitPlan.  9.6 did not transmit any subplans or initplans
to parallel workers, but HEAD does.  So HEAD sends the illegal structure
to the worker which spits up on trying to read a RestrictInfo.

I think the fact that we see this problem at all may indicate an
oversight in commit 5e6d8d2bbbcace304450b309e79366c0da4063e4 ("Allow
parallel workers to execute subplans").  If the worker were to actually
run the initplan, bad things would happen (the worker would create its
own remote connection, which we don't want).  Now, in the problem plan
the InitPlan is actually attached to the topmost Gather, which I think
is safe because it'll be run by the master, but I wonder if we're being
careful enough about non-parallel-safe plans for initplans/subplans.
I do not think the planner tracks the locations of references to initplan
outputs carefully enough to be very sure about what it's doing here.

Also, even if the worker never actually executes the plan node, just
doing ExecInitNode on it in a parallel worker might be more than a
non-parallel-safe FDW is prepared to cope with.

Anyway, at present it doesn't look like we need this patch in 9.6.

regards, tom lane




Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-10 Thread Tom Lane
I wrote:
> Apparently, postgres_fdw is trying to store RestrictInfos in the
> fdw_private field of a ForeignScan node.  That won't do; those aren't
> supposed to be present in a finished plan tree, so there's no readfuncs.c
> support for them (and we're not adding it).

> Don't know if this is a new bug, or ancient but not previously reachable.
> It seems to be nearly the inverse of the problem you found yesterday,
> in which postgres_fdw was stripping RestrictInfos sooner than it really
> ought to.  Apparently that whole business needs a fresh look.

Attached is a proposed patch that cleans up the mess here --- and it was
a mess.  The comments for PgFdwRelationInfo claimed that the remote_conds
and local_conds fields contained bare clauses, but actually they could
contain either bare clauses or RestrictInfos or both, depending on where
the clauses had come from.  And there was some seriously obscure and
undocumented logic around how the fdw_recheck_quals got set up for a
foreign join node.  (BTW, said logic is assuming that no EPQ recheck is
needed for a foreign join.  I commented it to that effect, but I'm not
very sure that I believe it.  If there's a bug there, though, it's
independent of the immediate problem.)

Anybody want to review this, or shall I just push it?

(BTW, I've not yet looked to see if this needs to be back-ported.)

regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index c5149a0..1d5aa83 100644
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
*** classifyConditions(PlannerInfo *root,
*** 210,216 
  
  	foreach(lc, input_conds)
  	{
! 		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
  
  		if (is_foreign_expr(root, baserel, ri->clause))
  			*remote_conds = lappend(*remote_conds, ri);
--- 210,216 
  
  	foreach(lc, input_conds)
  	{
! 		RestrictInfo *ri = lfirst_node(RestrictInfo, lc);
  
  		if (is_foreign_expr(root, baserel, ri->clause))
  			*remote_conds = lappend(*remote_conds, ri);
*** build_tlist_to_deparse(RelOptInfo *forei
*** 869,874 
--- 869,875 
  {
  	List	   *tlist = NIL;
  	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
+ 	ListCell   *lc;
  
  	/*
  	 * For an upper relation, we have already built the target list while
*** build_tlist_to_deparse(RelOptInfo *forei
*** 884,892 
  	tlist = add_to_flat_tlist(tlist,
  	   pull_var_clause((Node *) foreignrel->reltarget->exprs,
  	   PVC_RECURSE_PLACEHOLDERS));
! 	tlist = add_to_flat_tlist(tlist,
! 			  pull_var_clause((Node *) fpinfo->local_conds,
! 			  PVC_RECURSE_PLACEHOLDERS));
  
  	return tlist;
  }
--- 885,898 
  	tlist = add_to_flat_tlist(tlist,
  	   pull_var_clause((Node *) foreignrel->reltarget->exprs,
  	   PVC_RECURSE_PLACEHOLDERS));
! 	foreach(lc, fpinfo->local_conds)
! 	{
! 		RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);
! 
! 		tlist = add_to_flat_tlist(tlist,
!   pull_var_clause((Node *) rinfo->clause,
!   PVC_RECURSE_PLACEHOLDERS));
! 	}
  
  	return tlist;
  }
*** deparseSelectSql(List *tlist, bool is_su
*** 1049,1054 
--- 1055,1061 
   * "buf".
   *
   * quals is the list of clauses to be included in the WHERE clause.
+  * (These may or may not include RestrictInfo decoration.)
   */
  static void
  deparseFromExpr(List *quals, deparse_expr_cxt *context)
*** deparseLockingClause(deparse_expr_cxt *c
*** 1262,1267 
--- 1269,1277 
   *
   * The conditions in the list are assumed to be ANDed. This function is used to
   * deparse WHERE clauses, JOIN .. ON clauses and HAVING clauses.
+  *
+  * Depending on the caller, the list elements might be either RestrictInfos
+  * or bare clauses.
   */
  static void
  appendConditions(List *exprs, deparse_expr_cxt *context)
*** appendConditions(List *exprs, deparse_ex
*** 1278,1293 
  	{
  		Expr	   *expr = (Expr *) lfirst(lc);
  
! 		/*
! 		 * Extract clause from RestrictInfo, if required. See comments in
! 		 * declaration of PgFdwRelationInfo for details.
! 		 */
  		if (IsA(expr, RestrictInfo))
! 		{
! 			RestrictInfo *ri = (RestrictInfo *) expr;
! 
! 			expr = ri->clause;
! 		}
  
  		/* Connect expressions with "AND" and parenthesize each condition. */
  		if (!is_first)
--- 1288,1296 
  	{
  		Expr	   *expr = (Expr *) lfirst(lc);
  
! 		/* Extract clause from RestrictInfo, if required */
  		if (IsA(expr, RestrictInfo))
! 			expr = ((RestrictInfo *) expr)->clause;
  
  		/* Connect expressions with "AND" and parenthesize each condition. */
  		if (!is_first)
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 6670df5..28a945f 100644
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
*** enum FdwScanPrivateIndex
*** 64,70 
  {
  	/* SQL statement to execute remotely (as a 

Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-09 Thread Tom Lane
Andreas Seltenreich  writes:
> regression=> select (select max(result) from fdw_postgres.num_result) from tt0;
> ERROR:  badly formatted node string "RESTRICTINFO :clause {NULLTEST :"...
> CONTEXT:  parallel worker

Apparently, postgres_fdw is trying to store RestrictInfos in the
fdw_private field of a ForeignScan node.  That won't do; those aren't
supposed to be present in a finished plan tree, so there's no readfuncs.c
support for them (and we're not adding it).

Don't know if this is a new bug, or ancient but not previously reachable.
It seems to be nearly the inverse of the problem you found yesterday,
in which postgres_fdw was stripping RestrictInfos sooner than it really
ought to.  Apparently that whole business needs a fresh look.
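
In other words, anything that lands in fdw_private has to be something that
outfuncs.c/readfuncs.c can round-trip, so the RestrictInfo wrappers must be
stripped first.  A generic sketch of the pattern (variable names are
illustrative; this is not the actual postgres_fdw code):

    /*
     * extract_actual_clauses() (optimizer/restrictinfo.h) returns the bare
     * clause expressions, which are safe to serialize; the RestrictInfo
     * nodes themselves are not.
     */
    List       *remote_exprs = extract_actual_clauses(remote_conds, false);

    fdw_private = lappend(fdw_private, remote_exprs);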

regards, tom lane

