Re: SegFault on 9.6.14

2019-11-26 Thread Amit Kapila
On Wed, Nov 20, 2019 at 5:12 PM Amit Kapila  wrote:
>
> On Mon, Nov 18, 2019 at 2:22 PM Amit Kapila  wrote:
> >
> > I have modified the commit message as proposed above and additionally
> > added comments in nodeLimit.c.  I think we should move ahead with this
> > bug-fix patch.  If we don't like the comment, it can anyway be
> > improved later.
> >
> > Any suggestions?
> >
>
> If there are no further suggestions or objections, I will commit this
> early next week, probably on Monday.
>

Yesterday, I pushed this patch.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-11-20 Thread Amit Kapila
On Mon, Nov 18, 2019 at 2:22 PM Amit Kapila  wrote:
>
> I have modified the commit message as proposed above and additionally
> added comments in nodeLimit.c.  I think we should move ahead with this
> bug-fix patch.  If we don't like the comment, it can anyway be
> improved later.
>
> Any suggestions?
>

If there are no further suggestions or objections, I will commit this
early next week, probably on Monday.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-11-18 Thread Amit Kapila
On Fri, Oct 18, 2019 at 10:08 AM Amit Kapila  wrote:
>
> On Thu, Oct 17, 2019 at 10:51 AM Thomas Munro  wrote:
> >
> > ===
> > Don't shut down Gather[Merge] early under Limit.
> >
> > Revert part of commit 19df1702f5.
> >
> > Early shutdown was added by that commit so that we could collect
> > statistics from workers, but unfortunately it interacted badly with
> > rescans.  Rescanning a Limit over a Gather node could produce a SEGV
> > on 9.6 and resource leak warnings on later releases.  By reverting the
> > early shutdown code, we might lose statistics in some cases of Limit
> > over Gather, but that will require further study to fix.
> >
>
> How about some text like below?  I have added slightly different text
> to explain the reason for the problem.
>
> "Early shutdown was added by that commit so that we could collect
> statistics from workers, but unfortunately, it interacted badly with
> rescans.  The problem is that we ended up destroying the parallel
> context, which is required for rescans.  This causes rescans of a
> Limit node over a Gather node to produce unpredictable results, as they
> try to access the destroyed parallel context.  By reverting the early
> shutdown code, we might lose statistics in some cases of Limit over
> Gather, but that will require further study to fix."
>
> I am not sure, but we could even add a comment in the place where we
> are removing some code (in nodeLimit.c) to indicate that 'Ideally we
> should shut down parallel resources here to get the correct stats, but
> that would lead to rescans misbehaving when there is a Gather [Merge]
> node beneath it.  (Explain the reason for the misbehavior and the ideas
> we discussed in this thread to fix the same.)'
>
> I can try to come up with comments in nodeLimit.c along the above lines
> if we think that is a good idea.
>

I have modified the commit message as proposed above and additionally
added comments in nodeLimit.c.  I think we should move ahead with this
bug-fix patch.  If we don't like the comment, it can anyway be
improved later.

Any suggestions?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


0001-Don-t-shut-down-Gather-Merge-early-under-Limit.patch
Description: Binary data


Re: SegFault on 9.6.14

2019-10-17 Thread Amit Kapila
On Thu, Oct 17, 2019 at 10:51 AM Thomas Munro  wrote:
>
> On Fri, Sep 13, 2019 at 1:35 AM Robert Haas  wrote:
> > On Thu, Sep 12, 2019 at 8:55 AM Amit Kapila  wrote:
> > > Robert, Thomas, do you have any more suggestions related to this?  I
> > > am planning to commit the above-discussed patch (Forbid Limit node to
> > > shutdown resources) this coming Monday, so that at least the reported
> > > problem gets fixed.
> >
> > I think that your commit message isn't very clear about what the
> > actual issue is.  And the patch itself doesn't add any comments or
> > anything to try to clear it up. So I wouldn't favor committing it in
> > this form.
>
> Is the proposed commit message at the bottom of this email an improvement?
>
> Do I understand correctly that, with this patch, we can only actually
> lose statistics in the case where we rescan?
>

No, it can lose statistics even without a rescan.  To understand the
details, you might want to read the emails I pointed to in one of the
earlier emails [1] in this thread.

>  That is, precisely the
> case that crashes (9.6) or spews warnings (10+)?  In a quick
> non-rescan test with the ExecShutdownNode() removed, I don't see any
> problem with the buffer numbers on my screen:
>

Try removing the aggregate function.  Basically, the Limit node has to
finish before it has consumed all the rows sent by the parallel node
beneath it.
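For example, something along these lines might show it (a sketch only;
the table t is the one from your example, and the cost settings are just
to encourage a parallel plan):

-- Sketch: with no aggregate, the Limit is satisfied after 5 rows, so the
-- Gather beneath it has to stop before its workers have finished.
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
explain (analyze, buffers, timing off, costs off)
  select * from t limit 5;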

>
> ===
> Don't shut down Gather[Merge] early under Limit.
>
> Revert part of commit 19df1702f5.
>
> Early shutdown was added by that commit so that we could collect
> statistics from workers, but unfortunately it interacted badly with
> rescans.  Rescanning a Limit over a Gather node could produce a SEGV
> on 9.6 and resource leak warnings on later releases.  By reverting the
> early shutdown code, we might lose statistics in some cases of Limit
> over Gather, but that will require further study to fix.
>

How about some text like below?  I have added slightly different text
to explain the reason for the problem.

"Early shutdown was added by that commit so that we could collect
statistics from workers, but unfortunately, it interacted badly with
rescans.  The problem is that we ended up destroying the parallel
context, which is required for rescans.  This causes rescans of a
Limit node over a Gather node to produce unpredictable results, as they
try to access the destroyed parallel context.  By reverting the early
shutdown code, we might lose statistics in some cases of Limit over
Gather, but that will require further study to fix."

I am not sure, but we could even add a comment in the place where we
are removing some code (in nodeLimit.c) to indicate that 'Ideally we
should shut down parallel resources here to get the correct stats, but
that would lead to rescans misbehaving when there is a Gather [Merge]
node beneath it.  (Explain the reason for the misbehavior and the ideas
we discussed in this thread to fix the same.)'

I can try to come up with comments in nodeLimit.c along the above lines
if we think that is a good idea.
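As a strawman, the new comment could read roughly like this (draft
wording only, not final text):

/*
 * Ideally we would shut down the subplan's parallel resources here, so
 * that worker statistics are accounted for correctly, but that destroys
 * the parallel context, and a rescan of a Gather [Merge] node beneath us
 * would then try to use the destroyed context.  Until we have a way to
 * know that no rescan can happen, leave the shutdown to normal
 * end-of-execution processing.
 */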

[1] - 
https://www.postgresql.org/message-id/CAA4eK1Ja-eoavXcr0eq7w7hP%3D64VP49k%3DNMFxwhtK28NHfBOdA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-10-16 Thread Thomas Munro
On Fri, Sep 13, 2019 at 1:35 AM Robert Haas  wrote:
> On Thu, Sep 12, 2019 at 8:55 AM Amit Kapila  wrote:
> > Robert, Thomas, do you have any more suggestions related to this?  I
> > am planning to commit the above-discussed patch (Forbid Limit node to
> > shutdown resources) this coming Monday, so that at least the reported
> > problem gets fixed.
>
> I think that your commit message isn't very clear about what the
> actual issue is.  And the patch itself doesn't add any comments or
> anything to try to clear it up. So I wouldn't favor committing it in
> this form.

Is the proposed commit message at the bottom of this email an improvement?

Do I understand correctly that, with this patch, we can only actually
lose statistics in the case where we rescan?  That is, precisely the
case that crashes (9.6) or spews warnings (10+)?  In a quick
non-rescan test with the ExecShutdownNode() removed, I don't see any
problem with the buffer numbers on my screen:

postgres=# explain (analyze, buffers, timing off, costs off) select
count(*) from t limit 5;
                              QUERY PLAN
------------------------------------------------------------------------
 Limit (actual rows=1 loops=1)
   Buffers: shared hit=16210 read=28038
   ->  Finalize Aggregate (actual rows=1 loops=1)
         Buffers: shared hit=16210 read=28038
         ->  Gather (actual rows=3 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               Buffers: shared hit=16210 read=28038
               ->  Partial Aggregate (actual rows=1 loops=3)
                     Buffers: shared hit=16210 read=28038
                     ->  Parallel Seq Scan on t (actual rows=333 loops=3)
                           Buffers: shared hit=16210 read=28038
 Planning Time: 0.086 ms
 Execution Time: 436.669 ms
(14 rows)

===
Don't shut down Gather[Merge] early under Limit.

Revert part of commit 19df1702f5.

Early shutdown was added by that commit so that we could collect
statistics from workers, but unfortunately it interacted badly with
rescans.  Rescanning a Limit over a Gather node could produce a SEGV
on 9.6 and resource leak warnings on later releases.  By reverting the
early shutdown code, we might lose statistics in some cases of Limit
over Gather, but that will require further study to fix.

Author: Amit Kapila, testcase by Vignesh C
Reported-by: Jerry Sievers
Diagnosed-by: Thomas Munro
Backpatch-through: 9.6
Discussion: https://postgr.es/m/87ims2amh6@jsievers.enova.com
===




Re: SegFault on 9.6.14

2019-09-12 Thread Robert Haas
On Thu, Sep 12, 2019 at 8:55 AM Amit Kapila  wrote:
> Robert, Thomas, do you have any more suggestions related to this?  I
> am planning to commit the above-discussed patch (Forbid Limit node to
> shutdown resources) this coming Monday, so that at least the reported
> problem gets fixed.

I think that your commit message isn't very clear about what the
actual issue is.  And the patch itself doesn't add any comments or
anything to try to clear it up. So I wouldn't favor committing it in
this form.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: SegFault on 9.6.14

2019-09-12 Thread Amit Kapila
On Thu, Sep 5, 2019 at 7:53 PM Amit Kapila  wrote:
>
> On Mon, Sep 2, 2019 at 4:51 PM Amit Kapila  wrote:
> >
> > On Fri, Aug 9, 2019 at 6:29 PM Robert Haas  wrote:
> > >
> > >
> > > But beyond that, the issue here is that the Limit node is shutting
> > > down the Gather node too early, and the right fix must be to stop
> > > doing that, not to change the definition of what it means to shut down
> > > a node, as this patch does.  So maybe a possible approach here - which
> > > I think is more or less what Tom is proposing - is:
> > >
> > > 1. Remove the code from ExecLimit() that calls ExecShutdownNode().
> > >
> >
> > Attached patch does that.  I have also added one test as a separate
> > patch so that later if we introduce shutting down resources in Limit
> > node, we don't break anything.  As of now, I have kept it separate for
> > easy verification, but if we decide to go with this approach and test
> > appears fine, we can merge it along with the fix.
> >
>
> I have merged the code change and the test case patch, as I felt it is
> good to cover this case.  I have slightly changed the test case to
> make its output predictable (made the inner scan ordered so that the
> query always produces the same result).  One more thing: I am not able
> to come up with a predictable test case for the 9.6 branch, as it
> doesn't support Gather Merge, which is required for this particular
> test to always produce predictable output.  There could be a better
> way to write this test, so any input in that regard or otherwise is
> welcome.  So, if we commit this patch, the test case will be for
> branches HEAD through 10, but the code will be for HEAD through 9.6.
>

Robert, Thomas, do you have any more suggestions related to this?  I
am planning to commit the above-discussed patch (Forbid Limit node to
shutdown resources) this coming Monday, so that at least the reported
problem gets fixed.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-09-05 Thread Amit Kapila
On Mon, Sep 2, 2019 at 4:51 PM Amit Kapila  wrote:
>
> On Fri, Aug 9, 2019 at 6:29 PM Robert Haas  wrote:
> >
> >
> > But beyond that, the issue here is that the Limit node is shutting
> > down the Gather node too early, and the right fix must be to stop
> > doing that, not to change the definition of what it means to shut down
> > a node, as this patch does.  So maybe a possible approach here - which
> > I think is more or less what Tom is proposing - is:
> >
> > 1. Remove the code from ExecLimit() that calls ExecShutdownNode().
> >
>
> Attached patch does that.  I have also added one test as a separate
> patch so that later if we introduce shutting down resources in Limit
> node, we don't break anything.  As of now, I have kept it separate for
> easy verification, but if we decide to go with this approach and test
> appears fine, we can merge it along with the fix.
>

I have merged the code change and the test case patch, as I felt it is
good to cover this case.  I have slightly changed the test case to
make its output predictable (made the inner scan ordered so that the
query always produces the same result).  One more thing: I am not able
to come up with a predictable test case for the 9.6 branch, as it
doesn't support Gather Merge, which is required for this particular
test to always produce predictable output.  There could be a better
way to write this test, so any input in that regard or otherwise is
welcome.  So, if we commit this patch, the test case will be for
branches HEAD through 10, but the code will be for HEAD through 9.6.

> > 2. Adjust ExecutePlan() so that it ensures that ExecShutdownNode()
> > gets called at the very end of execution, at least when execute_once
> > is set, before exiting parallel mode.
> >
>
> I am not sure if I completely understand this point.  AFAICS,
> ExecShutdownNode is called whenever we are done getting the tuples.
> One place where it is not called in that function is when we assume
> the destination is closed, basically the code below:
> ExecutePlan()
> {
> ..
> if (!dest->receiveSlot(slot, dest))
> break;
> ..
> }
>
> Do you expect this case to be dealt with as well, or do you have
> something else in mind?
>

It still appears problematic, but I couldn't come up with a test case
to reproduce the problem.  I'll try some more on this, but I think
this anyway can be done separately once we have a test to show the
problem.

>  The other possibility could be that we move the shutdown of the
> node to the end of the function, where we exit parallel mode, but
> doing that led to some regression failures on my machine.  I will
> investigate the same.
>

This was failing because the use_parallel_mode flag in
ExecutePlan() is not set for workers, and hence they won't get a
chance to accumulate their stats.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


0001-Forbid-Limit-node-to-shutdown-resources.patch
Description: Binary data


Re: SegFault on 9.6.14

2019-09-02 Thread Amit Kapila
On Fri, Aug 9, 2019 at 6:29 PM Robert Haas  wrote:
>
> On Wed, Aug 7, 2019 at 5:45 AM vignesh C  wrote:
> > I have made a patch based on the above lines.
> > > I have tested the scenarios which Thomas had shared in the earlier
> > > mail and a few more tests based on Thomas's tests.
> > > I'm not sure if we will be going ahead with this solution or not.
> > > Let me know your opinion on the same.
> > > If you feel this approach is ok, we can add a few of these tests to the pg tests.
>
> I think this patch is bizarre:
>
> - It introduces a new function called getParallelModeLevel(), which is
> randomly capitalized differently from the other functions that do
> similar things, and then uses it to do the same thing that could have
> been done with the existing function IsInParallelMode().
> - It contains an "if" statement whose only content is an Assert().
> Don't write if (a) Assert(b); write Assert(!a || b).
> - It contains zero lines of comment changes, which is obviously not
> enough for a patch that proposes to fix a very thorny issue.  This
> failure has two parts.  First, it adds no new comments to explain the
> bug being fixed or the theory of operation of the new code. Second, it
> does not even bother updating existing comments that are falsified by
> the patch, such as the function header comments for
> ExecParallelCleanup and ExecShutdownGather.
> - It changes what ExecParallelCleanup does while adjusting only one of
> the two callers to match the behavior change.  nodeGatherMerge.c
> manages to be completely untouched by this patch.  If you change what a
> function does, you really need to grep for all the calls to that
> function and adjust all callers to match the new set of expectations.
>
> It's a little hard to get past all of those issues and look at what
> the patch actually does, but I'm going to try: the theory of operation
> of the patch seems to be that we can skip destroying the parallel
> context when performing ExecParallelCleanup and in fact when exiting
> parallel mode, and then when we get to executor end time the context
> will still be there and we can fish the instrumentation out of it. But
> this seems problematic for several reasons.  For one thing, as Amit
> already observed, the code currently contains an assertion which
> ensures that a ParallelContext can't outlive the time spent in parallel
> mode, and it doesn't seem desirable to relax that assertion (this
> patch removes it).
>
> But beyond that, the issue here is that the Limit node is shutting
> down the Gather node too early, and the right fix must be to stop
> doing that, not to change the definition of what it means to shut down
> a node, as this patch does.  So maybe a possible approach here - which
> I think is more or less what Tom is proposing - is:
>
> 1. Remove the code from ExecLimit() that calls ExecShutdownNode().
>

Attached patch does that.  I have also added one test as a separate
patch so that later if we introduce shutting down resources in Limit
node, we don't break anything.  As of now, I have kept it separate for
easy verification, but if we decide to go with this approach and test
appears fine, we can merge it along with the fix.

> 2. Adjust ExecutePlan() so that it ensures that ExecShutdownNode()
> gets called at the very end of execution, at least when execute_once
> is set, before exiting parallel mode.
>

I am not sure if I completely understand this point.  AFAICS,
ExecShutdownNode is called whenever we are done getting the tuples.
One place where it is not called in that function is when we assume
the destination is closed, basically the code below:
ExecutePlan()
{
..
if (!dest->receiveSlot(slot, dest))
break;
..
}

Do you expect this case to be dealt with as well, or do you have
something else in mind?  The other possibility could be that we move
the shutdown of the node to the end of the function, where we exit
parallel mode, but doing that led to some regression failures on my
machine.  I will investigate the same.
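For illustration, the kind of placement being discussed would look
roughly like this (a sketch only, not an actual patch; execute_once and
use_parallel_mode are ExecutePlan's existing arguments on recent
branches):

    /*
     * Sketch: shut the plan tree down once, at the very end of execution,
     * and do it before ExitParallelMode() so that worker instrumentation
     * can still be collected.
     */
    if (execute_once)
        (void) ExecShutdownNode(planstate);

    if (use_parallel_mode)
        ExitParallelMode();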

> 3. Figure out, possibly at a later time or only in HEAD, how to make
> the early call to ExecShutdownNode() in ExecLimit(), and then put it
> back.

Okay.  Tom has suggested a design to address this, but that will be
for HEAD only.  To be clear, I am not planning to spend time on that
at the moment, but OTOH, I want the bug reported in this thread to be
closed, so for now we need to proceed with some minimum fix as
mentioned by you in the above two points.  If someone else can write a
patch, I can help review it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


0001-Forbid-Limit-node-to-shutdown-resources.patch
Description: Binary data


0002-Test-rescans-when-parallel-node-is-beneath-Limit-nod.patch
Description: Binary data


Re: SegFault on 9.6.14

2019-08-13 Thread Tom Lane
Amit Kapila  writes:
> Another point I am wondering about is why we can't use the existing
> REWIND flag to solve the current issue: basically, if we have access to
> that information in nodeLimit.c (ExecLimit), then can't we just pass it
> down to ExecShutdownNode?

The existing REWIND flag tells subnodes whether they should *optimize*
for getting rewound or not.  I don't recall right now (well past
midnight) why that seemed like a useful definition, but if you grep for
places that are paying attention to that flag, I'm sure you'll find out.

We probably don't want to give up that distinction --- if it had been
equally good to define the flag as a hard yes-or-no, I'm sure we would
have taken that definition, because it's simpler.

regards, tom lane




Re: SegFault on 9.6.14

2019-08-13 Thread Amit Kapila
On Tue, Aug 13, 2019 at 9:28 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > On Tue, Aug 13, 2019 at 3:18 AM Tom Lane  wrote:
> >> To clarify my position --- I think it's definitely possible to improve
> >> the situation a great deal.  We "just" have to pass down more information
> >> about whether rescans are possible.
>
> > Right, you have speculated above that it is possible via adding some
> > eflag bits.  Can you please describe a bit more about that idea, so
> > that somebody else can try to write a patch?
>
> Well, there are two components to solving this problem:
>
> 1. What are we going to do about the executor's external API?
>
> Right now, callers of ExecutorStart don't have to say whether they
> might call ExecutorRewind.  We need some way for callers to make a
> binding promise that they won't do any such thing.  Perhaps we just
> want to invent another flag that's like EXEC_FLAG_BACKWARD, but it's
> not clear how it should interact with the existing "soft" REWIND
> flag.

Yeah, making it interact with REWIND will be a bit of a challenge, as I
think the REWIND flag to some extent indicates the same thing.  Do I
understand correctly that we have some form of rule such that if
EXEC_FLAG_REWIND is set or the node's chgParam is NULL, then we can
expect the node to support rescan?  If that is true, then maybe we need
to design this new flag in such a way that it covers the existing cases
of REWIND as well.

Another point I am wondering about is why we can't use the existing
REWIND flag to solve the current issue: basically, if we have access to
that information in nodeLimit.c (ExecLimit), then can't we just pass it
down to ExecShutdownNode?  I guess the problem could be that if the
Limit node doesn't support REWIND, but one of the nodes beneath it does,
then we won't be able to rely on the information passed to
ExecShutdownNode.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-08-13 Thread Tom Lane
Amit Kapila  writes:
> On Tue, Aug 13, 2019 at 3:18 AM Tom Lane  wrote:
>> To clarify my position --- I think it's definitely possible to improve
>> the situation a great deal.  We "just" have to pass down more information
>> about whether rescans are possible.

> Right, you have speculated above that it is possible via adding some
> eflag bits.  Can you please describe a bit more about that idea, so
> that somebody else can try to write a patch?

Well, there are two components to solving this problem:

1. What are we going to do about the executor's external API?

Right now, callers of ExecutorStart don't have to say whether they
might call ExecutorRewind.  We need some way for callers to make a
binding promise that they won't do any such thing.  Perhaps we just
want to invent another flag that's like EXEC_FLAG_BACKWARD, but it's
not clear how it should interact with the existing "soft" REWIND
flag.  Nor do I know how far up the call stack we will have to make
changes to make it possible to promise as much as we can -- for
instance, will we have to adapt the SPI interfaces?

2. What happens inside ExecutorStart in response to such promises?

I imagine that we translate them into additional eflags bits that
get passed down to node init functions, possibly with modification
(e.g., nodeNestloop.c would have to revoke the no-rescans promise
to its inner input).  You'd need to work out what is the most
convenient set of conventions (positive or negative sense of the
flag bits, etc), and go through all the non-leaf node types to
determine what they can pass down.
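For illustration only, with a made-up EXEC_FLAG_NO_REWIND bit (the name
and sense are assumptions here, not an existing flag), the pass-down in a
node like nodeNestloop.c might look something like:

    /* keep the promise for the outer side, revoke it for the inner side,
     * since nestloop may rescan its inner input */
    outerPlanState(nlstate) = ExecInitNode(outerPlan(node), estate, eflags);
    innerPlanState(nlstate) = ExecInitNode(innerPlan(node), estate,
                                           eflags & ~EXEC_FLAG_NO_REWIND);

Which nodes strip the bit, and which sense it reads in, is exactly the
sort of convention that would need working out.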

(BTW, unless I'm missing something, there's not currently any
enforcement of EXEC_FLAG_BACKWARD, ie a caller can fail to pass
that and then try to back up anyway.  We probably want to improve
that situation, and also enforce this new flag about
ExecutorRewind.)

The reason I'm dubious about back-patching this is that each
of these things seems likely to affect external code.  Point 1
could affect external callers of the executor, and point 2 is
likely to have consequences for FDWs and custom-scan providers.
Maybe we can set things up so that everything defaults in a
safe direction for unchanged code, but I don't want to contort
the design just to do that.

regards, tom lane




Re: SegFault on 9.6.14

2019-08-12 Thread Amit Kapila
On Tue, Aug 13, 2019 at 3:18 AM Tom Lane  wrote:
>
> Robert Haas  writes:
> > Being able to do that sort of thing was one of my goals in designing
> > the ExecShutdownNode stuff.  Unfortunately, it's clear from this bug
> > report that it's still a few bricks short of a load, and Tom doesn't
> > seem real optimistic about how easy it will be to buy those bricks at
> > discount prices. But I hope we persist in trying to get there, because
> > I don't like the idea of saying that we'll never be smart enough to
> > know we're done with any part of the plan until we're definitely done
> > with the whole thing. I think that's leaving too much money on the
> > table.
>
> To clarify my position --- I think it's definitely possible to improve
> the situation a great deal.  We "just" have to pass down more information
> about whether rescans are possible.
>

Right, you have speculated above that it is possible via adding some
eflag bits.  Can you please describe a bit more about that idea, so
that somebody else can try to write a patch?  I think if someone other
than you tries to write a patch without having some sort of upfront
design, it might lead to a lot of rework.  It would be great if you
have an interest in doing the legwork, which can then be extended to
fix the issue in parallel query; if not, at least let us know
the idea you have in mind in a bit more detail.

>  What I don't believe is that that
> leads to a bug fix that would be sane to back-patch as far as 9.6.
>

Fair enough.  In such a situation, we can plan to revert the earlier
fix for Limit node and tell people that the same will be fixed in
PG-13.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-08-12 Thread Robert Haas
On Mon, Aug 12, 2019 at 5:48 PM Tom Lane  wrote:
> To clarify my position --- I think it's definitely possible to improve
> the situation a great deal.  We "just" have to pass down more information
> about whether rescans are possible.  What I don't believe is that that
> leads to a bug fix that would be sane to back-patch as far as 9.6.

Sounds like a fair opinion.  I'm not sure how complicated the fix
would be so I don't know whether I agree with your opinion, but you
usually have a fairly good intuition for such things, so...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: SegFault on 9.6.14

2019-08-12 Thread Tom Lane
Robert Haas  writes:
> Being able to do that sort of thing was one of my goals in designing
> the ExecShutdownNode stuff.  Unfortunately, it's clear from this bug
> report that it's still a few bricks short of a load, and Tom doesn't
> seem real optimistic about how easy it will be to buy those bricks at
> discount prices. But I hope we persist in trying to get there, because
> I don't like the idea of saying that we'll never be smart enough to
> know we're done with any part of the plan until we're definitely done
> with the whole thing. I think that's leaving too much money on the
> table.

To clarify my position --- I think it's definitely possible to improve
the situation a great deal.  We "just" have to pass down more information
about whether rescans are possible.  What I don't believe is that that
leads to a bug fix that would be sane to back-patch as far as 9.6.

regards, tom lane




Re: SegFault on 9.6.14

2019-08-12 Thread Thomas Munro
On Tue, Aug 13, 2019 at 7:07 AM Alvaro Herrera  wrote:
> On 2019-Aug-12, Thomas Munro wrote:
> > That's possibly relevant because it means we'd have a ParallelContext
> > or some new overarching object that has a lifetime that is longer than
> > the individual Gather nodes' processes and instrumentation data.  I'm
> > not saying we need to discuss any details of this other concern now,
> > I'm just wondering out loud if the whole problem in this thread goes
> > away automatically when we fix it.
>
> How likely is it that we would ever be able to release memory from a
> Sort (or, say, a hashjoin hash table) when it's done being read, but
> before completing the whole plan?  As I understand, right now we hold
> onto a lot of memory after such plans have been fully read, for no good
> reason other than executor being unaware of this.  This might not be
> directly related to the problem at hand, since it's not just parallel
> plans that are affected.

Right, AIUI we hold onto that memory because it's a nice optimisation
to be able to rescan the sorted data or reuse the hash table (single
batch, non-parallel hash joins only for now).  We have no
disincentive, because our memory model doesn't care about the total
peak memory usage (ie all nodes).  Some other RDBMSs do care about
that, and somehow consider the peak memory usage (that is, considering
early memory release) when comparing join orders.

However, I think it's independent of the DSM lifetime question,
because the main Parallel Context DSM segment is really small, it has
a small fixed header and then a small object per parallel-aware node,
and isn't used for holding the hash table for Parallel Hash and
probably wouldn't be used for a future hypothetical Parallel Sort (if
it's implemented the way I imagine at least).  It contains a DSA area,
which creates more DSM segments as it needs them, and nodes can opt to
free DSA memory sooner, which will likely result in those extra DSM
segments being freed; you can see that happening in Parallel Hash
which in fact does give back memory quite eagerly.  (I'm the first to
admit that it's weird that DSM segments can hold DSA areas and DSA
areas are made up of DSM segments; that falls out of the choice to use
DSM segments both for storage and as a lifetime management system for
shared resources, and I wouldn't be surprised if we reconsider that as
we get more experience and ideas.)

-- 
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-08-12 Thread Robert Haas
On Mon, Aug 12, 2019 at 3:07 PM Alvaro Herrera  wrote:
> How likely is it that we would ever be able to release memory from a
> Sort (or, say, a hashjoin hash table) when it's done being read, but
> before completing the whole plan?  As I understand, right now we hold
> onto a lot of memory after such plans have been fully read, for no good
> reason other than executor being unaware of this.  This might not be
> directly related to the problem at hand, since it's not just parallel
> plans that are affected.

Being able to do that sort of thing was one of my goals in designing
the ExecShutdownNode stuff.  Unfortunately, it's clear from this bug
report that it's still a few bricks short of a load, and Tom doesn't
seem real optimistic about how easy it will be to buy those bricks at
discount prices. But I hope we persist in trying to get there, because
I don't like the idea of saying that we'll never be smart enough to
know we're done with any part of the plan until we're definitely done
with the whole thing. I think that's leaving too much money on the
table.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: SegFault on 9.6.14

2019-08-12 Thread Alvaro Herrera
On 2019-Aug-12, Thomas Munro wrote:

> That's possibly relevant because it means we'd have a ParallelContext
> or some new overarching object that has a lifetime that is longer than
> the individual Gather nodes' processes and instrumentation data.  I'm
> not saying we need to discuss any details of this other concern now,
> I'm just wondering out loud if the whole problem in this thread goes
> away automatically when we fix it.

How likely is it that we would ever be able to release memory from a
Sort (or, say, a hashjoin hash table) when it's done being read, but
before completing the whole plan?  As I understand, right now we hold
onto a lot of memory after such plans have been fully read, for no good
reason other than the executor being unaware of this.  This might not be
directly related to the problem at hand, since it's not just parallel
plans that are affected.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: SegFault on 9.6.14

2019-08-11 Thread Thomas Munro
On Sat, Aug 10, 2019 at 12:59 AM Robert Haas  wrote:
> But beyond that, the issue here is that the Limit node is shutting
> down the Gather node too early, and the right fix must be to stop
> doing that, not to change the definition of what it means to shut down
> a node, as this patch does.  So maybe a possible approach here - which
> I think is more or less what Tom is proposing - is:
>
> 1. Remove the code from ExecLimit() that calls ExecShutdownNode().
> 2. Adjust ExecutePlan() so that it ensures that ExecShutdownNode()
> gets called at the very end of execution, at least when execute_once
> is set, before exiting parallel mode.
> 3. Figure out, possibly at a later time or only in HEAD, how to make
> the early call to ExecShutdownNode() in ExecLimit(), and then put it
> back. I think we could do this by passing down some information
> indicating which nodes are potentially rescanned by other nodes higher
> up in the tree; there's the separate question of whether rescans can
> happen due to cursor operations, but the execute_once stuff can handle
> that aspect of it, I think.
>
> I'm not quite sure that approach is altogether correct so I'd
> appreciate some analysis on that point.

I'm not sure exactly what we should do yet, but one thought I wanted
to resurrect from older discussions is that we now think it was a
mistake to give every Gather node its own DSM segment, having seen
queries in the wild where that decision interacted badly with a large
number of partitions.  In 13 we should try to figure out how to have a
single DSM segment allocated for all Gather[Merge] nodes in the tree
(and remove the embarrassing band-aid hack in commit fd7c0fa7).
That's possibly relevant because it means we'd have a ParallelContext
or some new overarching object that has a lifetime that is longer than
the individual Gather nodes' processes and instrumentation data.  I'm
not saying we need to discuss any details of this other concern now,
I'm just wondering out loud if the whole problem in this thread goes
away automatically when we fix it.
--
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-08-09 Thread Robert Haas
On Wed, Aug 7, 2019 at 5:45 AM vignesh C  wrote:
> I have made a patch based on the above lines.
> I have tested the scenarios which Thomas had shared in the earlier
> mail and a few more tests based on Thomas's tests.
> I'm not sure if we will be going ahead with this solution or not.
> Let me know your opinion on the same.
> If you feel this approach is ok, we can add a few of these tests to the pg tests.

I think this patch is bizarre:

- It introduces a new function called getParallelModeLevel(), which is
randomly capitalized differently from the other functions that do
similar things, and then uses it to do the same thing that could have
been done with the existing function IsInParallelMode().
- It contains an "if" statement whose only content is an Assert().
Don't write if (a) Assert(b); write Assert(!a || b).
- It contains zero lines of comment changes, which is obviously not
enough for a patch that proposes to fix a very thorny issue.  This
failure has two parts.  First, it adds no new comments to explain the
bug being fixed or the theory of operation of the new code. Second, it
does not even bother updating existing comments that are falsified by
the patch, such as the function header comments for
ExecParallelCleanup and ExecShutdownGather.
- It changes what ExecParallelCleanup does while adjusting only one of
the two callers to match the behavior change.  nodeGatherMerge.c
manages to be completely untouched by this patch.  If you change what a
function does, you really need to grep for all the calls to that
function and adjust all callers to match the new set of expectations.

It's a little hard to get past all of those issues and look at what
the patch actually does, but I'm going to try: the theory of operation
of the patch seems to be that we can skip destroying the parallel
context when performing ExecParallelCleanup and in fact when exiting
parallel mode, and then when we get to executor end time the context
will still be there and we can fish the instrumentation out of it. But
this seems problematic for several reasons.  For one thing, as Amit
already observed, the code currently contains an assertion which
ensures that a ParallelContext can't outlive the time spent in parallel
mode, and it doesn't seem desirable to relax that assertion (this
patch removes it).

But beyond that, the issue here is that the Limit node is shutting
down the Gather node too early, and the right fix must be to stop
doing that, not to change the definition of what it means to shut down
a node, as this patch does.  So maybe a possible approach here - which
I think is more or less what Tom is proposing - is:

1. Remove the code from ExecLimit() that calls ExecShutdownNode().
2. Adjust ExecutePlan() so that it ensures that ExecShutdownNode()
gets called at the very end of execution, at least when execute_once
is set, before exiting parallel mode.
3. Figure out, possibly at a later time or only in HEAD, how to make
the early call to ExecShutdownNode() in ExecLimit(), and then put it
back. I think we could do this by passing down some information
indicating which nodes are potentially rescanned by other nodes higher
up in the tree; there's the separate question of whether rescans can
happen due to cursor operations, but the execute_once stuff can handle
that aspect of it, I think.

I'm not quite sure that approach is altogether correct so I'd
appreciate some analysis on that point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: SegFault on 9.6.14

2019-08-07 Thread Amit Kapila
On Wed, Aug 7, 2019 at 3:15 PM vignesh C  wrote:
>
> On Wed, Jul 31, 2019 at 9:37 AM Amit Kapila  wrote:
> >
> > On Wed, Jul 31, 2019 at 12:05 AM Robert Haas  wrote:
> >
> > The other idea we had discussed which comes closer to adopting Tom's
> > position was that during ExecShutdownNode, we just destroy parallel
> > workers, collect instrumentation data and don't destroy the parallel
> > context.  The parallel context could be destroyed in ExecEndNode
> > (ExecEndGather(Merge)) code path.  The problem with this idea is that
> > ExitParallelMode doesn't expect parallel context to be active.  Now,
> > we can either change the location of Exit/EnterParallelMode or relax
> > that restriction.  As mentioned above that restriction appears good to
> > me, so I am not in favor of changing it unless we have some other
> > solid way to install it.   I am not sure if this idea is better than
> > other approaches we are discussing.
> >
> >
> I have made a patch based on the above lines.
> I have tested the scenarios which Thomas had shared in the earlier
> mail and a few more tests based on Thomas's tests.
> I'm not sure if we will be going ahead with this solution or not.
> Let me know your opinion on the same.
> If you feel this approach is ok, we can add a few of these tests to the pg tests.
>

This patch is along the lines of what I had in mind, but I see some
problems in it, which are explained below.  The other approach to fix
this was to move Enter/ExitParallelMode to an outer layer.  For example,
can we enter parallel mode during InitPlan and exit it during
ExecEndPlan?  That might not be good to back-patch, but it might turn
out to be more robust than the current approach.

Few comments on your patch:
1.
@@ -569,13 +569,6 @@ ExecParallelCleanup(ParallelExecutorInfo *pei)
     if (pei->instrumentation)
         ExecParallelRetrieveInstrumentation(pei->planstate,
                                             pei->instrumentation);
-
-    if (pei->pcxt != NULL)
-    {
-        DestroyParallelContext(pei->pcxt);
-        pei->pcxt = NULL;
-    }
-    pfree(pei);
 }

Here, you have just removed freeing of the parallel context, but I
think we can't detach from the parallel context's area here either;
otherwise, it will create similar problems in other cases.  Note that
we create the area only in ExecInitParallelPlan and just reuse it in
ExecParallelReinitialize.  So, if we allow it to get destroyed in
ExecParallelCleanup (which is called via ExecShutdownNode), we won't
have access to it in the rescan code path.  It would be better if we
had a test for that as well.  I think we should only retrieve the
instrumentation information here.  Also, if we do that, then we might
also want to change the function name and the comments atop this function.

2.
 ExecEndGather(GatherState *node)
 {
+    ParallelExecutorInfo *pei = node->pei;
     ExecShutdownGather(node);
+
+    if (pei != NULL)
+    {
+        if (pei->pcxt != NULL)
+        {
+            DestroyParallelContext(pei->pcxt);
+            pei->pcxt = NULL;
+        }
+
+        pfree(pei);
+        node->pei = NULL;
+    }

I feel it is better if you move the collection of instrumentation
information from ExecParallelCleanup into a separate function and then
use ExecParallelCleanup here.
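For instance, a sketch of that split (the new function name below is
only a placeholder, not something from the patch):

static void
ExecParallelCollectInstrumentation(ParallelExecutorInfo *pei)
{
    /* only gather the workers' instrumentation; no teardown here */
    if (pei->instrumentation)
        ExecParallelRetrieveInstrumentation(pei->planstate,
                                            pei->instrumentation);
}

ExecEndGather could then call ExecParallelCleanup for the actual teardown.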

3.
extern bool IsInParallelMode(void);
+extern bool getParallelModeLevel(void);

To be consistent, it is better to name the function GetParallelModeLevel.

4.
@@ -1461,6 +1461,8 @@ ExecEndPlan(PlanState *planstate, EState *estate)
         ExecEndNode(subplanstate);
     }

+    if (estate->es_use_parallel_mode)
+        Assert (getParallelModeLevel() > 0 || !ParallelContextActive());

Add some comments here to explain this Assert.  I am not sure if it is
correct, because it won't fail even if the parallel mode level is
non-zero and there is no parallel context.  At this stage, we must have
exited parallel mode.
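For illustration, one stricter formulation along those lines could be
the following (just a sketch; getParallelModeLevel is the function added
by the patch under review):

    /* by now we must have exited parallel mode and destroyed any contexts */
    if (estate->es_use_parallel_mode)
        Assert(getParallelModeLevel() == 0 && !ParallelContextActive());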

5.
explain analyze
   select count(*) from join_foo
left join (select b1.id from join_bar b1
 limit 1000) ss

All the tests in your test file use a left join to reproduce the issue,
but I think it should be reproducible with an inner join as well.  This
comment does not mean that your test case is wrong; I just want to see
if we can simplify it further.
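For instance, an inner-join shape like the following (illustrative only;
the join condition here is a stand-in, not the one from the test file):

explain analyze
  select count(*) from join_foo
    join (select b1.id from join_bar b1 limit 1000) ss
      on join_foo.id = ss.id;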

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-08-07 Thread vignesh C
On Wed, Jul 31, 2019 at 9:37 AM Amit Kapila  wrote:
>
> On Wed, Jul 31, 2019 at 12:05 AM Robert Haas  wrote:
> >
> > On Thu, Jul 18, 2019 at 9:45 AM Tom Lane  wrote:
> > > I think this is going in the wrong direction.  Nodes should *always*
> > > assume that a rescan is possible until ExecEndNode is called.
> > > If you want to do otherwise, you are going to be inventing a whole
> > > bunch of complicated and doubtless-initially-buggy control logic
> > > to pass down information about whether a rescan might be possible.
> > > That doesn't sound like a recipe for a back-patchable fix.  Perhaps
> > > we could consider redesigning the rules around REWIND in a future
> > > version, but that's not where to focus the bug fix effort.
> >
> > So, if I can summarize how we got here, as best I understand it:
> >
>
> Thanks for the summarization.  This looks mostly correct to me.
>
> > 0. The historic behavior of the executor is to assume it's OK to leak
> > resources for the lifetime of the query.  Nodes that are executed to
> > completion generally do some cleanup, but we feel free (as under
> > Limit) to just stop executing a node without giving it any hint that
> > it should release resources.  So a Sort may hold onto a terabyte of
> > memory and an index scan may keep holding a pin even after there's no
> > theoretical way of ever needing those resources again, and we just
> > don't care.
> >
> > 1. Parallel query made that perhaps-already-shaky assumption a lot
> > more problematic. Partly that's because workers are a more scarce
> > and considerably heavier resource than anything else, and moreover act
> > as a container for anything else, so whatever you were leaking before,
> > you can now leak N times more of it, plus N processes, until the end
> > of the query. However, there's a correctness reason too, which is that
> > when a node has a copy in the leader and a copy in every worker, each
> > copy has its own instrumentation data (startup time, run time, nloops,
> > etc) and we can only fold all that together once the node is done
> > executing, because it's really hard to add up a bunch of numbers
> > before the numbers are done changing.  We could've made the
> > instrumentation shared throughout, but if we had, we could have
> > contention for updating the instrumentation data, which seems like
> > it'd be bad.
> >
> > 2. To fix that correctness problem, we decided to try to shut down the
> > node under a limit node when we're done with it (commit
> > 85c9d3475e4f680dbca7c04fe096af018f3b8760). At a certain level, this
> > looks fundamentally necessary to me.  If you're going to have N
> > separate copies of the instrumentation, and you want to add them up
> > when you're done, then you have to decide to be done at some point;
> > otherwise you don't know when to add them up, and maybe won't add them
> > up at all, and then you'll be sad.  This does not mean that the exact
> > timing couldn't be changed somehow, but if you want a correct
> > implementation, you have to shut down Limit's sub-node after you're
> > done executing it (so that you can get the instrumentation data from
> > the workers after it's final) and before you start destroying DSM
> > segments and stuff (so that you get the instrumentation data from the
> > workers before it vanishes).
> >
> > 3. The aforementioned commit turned out to be buggy in at least two
> > ways, precisely because it didn't do a good enough job predicting when
> > the Limit needed to be shut down.  First, there was commit
> > 2cd0acfdade82f3cab362fd9129d453f81cc2745, where we missed the fact
> > that you could hit the Limit and then back up.
> >
>
> We have not missed it; rather, we decided to do it separately because it
> appears to impact some different cases as well [1][2].
>
> >  Second, there's the
> > present issue, where the Limit gets rescanned.
> >
> > So, given all that, if we want to adopt Tom's position that we should
> > always cater to a possible rescan, then we're going to have to rethink
> > the way that instrumentation data gets consolidated from workers into
> > the leader in such a way that we can consolidate multiple times
> > without ending up with the wrong answer.
> >
>
> The other idea we had discussed which comes closer to adopting Tom's
> position was that during ExecShutdownNode, we just destroy parallel
> workers, collect instrumentation data and don't destroy the parallel
> context.  The parallel context could be destroyed in ExecEndNode
> (ExecEndGather(Merge)) code path.  The problem with this idea is that
> ExitParallelMode doesn't expect parallel context to be active.  Now,
> we can either change the location of Exit/EnterParallelMode or relax
> that restriction.  As mentioned above that restriction appears good to
> me, so I am not in favor of changing it unless we have some other
> solid way to install it.   I am not sure if this idea is better than
> other approaches we are discussing.
>
>
I have made a patch based on the 

Re: SegFault on 9.6.14

2019-07-30 Thread Tom Lane
Amit Kapila  writes:
> On Wed, Jul 31, 2019 at 12:05 AM Robert Haas  wrote:
>> The other option is to do
>> what I understand Amit and Thomas to be proposing, which is to do a
>> better job identifying the case where we're "done for good" and can
>> trigger the shutdown fearlessly.

> Yes, this sounds like a safe fix for back-branches.

Actually, my point was exactly that I *didn't* think that would be a
safe fix for the back branches --- at least, not unless you're okay
with a very conservative and hence resource-leaky method for deciding
when it's safe to shut down sub-nodes.

We could do something involving (probably) adding new eflags bits to
pass this sort of info down to child plan nodes.  But that's going
to require design and coding, and it will not be backwards compatible.
At least not from the point of view of any extension that's doing
anything in that area.

regards, tom lane




Re: SegFault on 9.6.14

2019-07-30 Thread Amit Kapila
On Wed, Jul 31, 2019 at 12:05 AM Robert Haas  wrote:
>
> On Thu, Jul 18, 2019 at 9:45 AM Tom Lane  wrote:
> > I think this is going in the wrong direction.  Nodes should *always*
> > assume that a rescan is possible until ExecEndNode is called.
> > If you want to do otherwise, you are going to be inventing a whole
> > bunch of complicated and doubtless-initially-buggy control logic
> > to pass down information about whether a rescan might be possible.
> > That doesn't sound like a recipe for a back-patchable fix.  Perhaps
> > we could consider redesigning the rules around REWIND in a future
> > version, but that's not where to focus the bug fix effort.
>
> So, if I can summarize how we got here, as best I understand it:
>

Thanks for the summarization.  This looks mostly correct to me.

> 0. The historic behavior of the executor is to assume it's OK to leak
> resources for the lifetime of the query.  Nodes that are executed to
> completion generally do some cleanup, but we feel free (as under
> Limit) to just stop executing a node without giving it any hint that
> it should release resources.  So a Sort may hold onto a terabyte of
> memory and an index scan may keep holding a pin even after there's no
> theoretical way of ever needing those resources again, and we just
> don't care.
>
> 1. Parallel query made that perhaps-already-shaky assumption a lot
> more problematic. Partly that's because workers are a more scarce
> and considerably heavier resource than anything else, and moreover act
> as a container for anything else, so whatever you were leaking before,
> you can now leak N times more of it, plus N processes, until the end
> of the query. However, there's a correctness reason too, which is that
> when a node has a copy in the leader and a copy in every worker, each
> copy has its own instrumentation data (startup time, run time, nloops,
> etc) and we can only fold all that together once the node is done
> executing, because it's really hard to add up a bunch of numbers
> before the numbers are done changing.  We could've made the
> instrumentation shared throughout, but if we had, we could have
> contention for updating the instrumentation data, which seems like
> it'd be bad.
>
> 2. To fix that correctness problem, we decided to try to shut down the
> node under a limit node when we're done with it (commit
> 85c9d3475e4f680dbca7c04fe096af018f3b8760). At a certain level, this
> looks fundamentally necessary to me.  If you're going to have N
> separate copies of the instrumentation, and you want to add them up
> when you're done, then you have to decide to be done at some point;
> otherwise you don't know when to add them up, and maybe won't add them
> up at all, and then you'll be sad.  This does not mean that the exact
> timing couldn't be changed somehow, but if you want a correct
> implementation, you have to shut down Limit's sub-node after you're
> done executing it (so that you can get the instrumentation data from
> the workers after it's final) and before you start destroying DSM
> segments and stuff (so that you get the instrumentation data from the
> workers before it vanishes).
>
> 3. The aforementioned commit turned out to be buggy in at least two
> ways, precisely because it didn't do a good enough job predicting when
> the Limit needed to be shut down.  First, there was commit
> 2cd0acfdade82f3cab362fd9129d453f81cc2745, where we missed the fact
> that you could hit the Limit and then back up.
>

We have not missed it; rather, we decided to do it separately because it
appears to impact some different cases as well [1][2].

>  Second, there's the
> present issue, where the Limit gets rescanned.
>
> So, given all that, if we want to adopt Tom's position that we should
> always cater to a possible rescan, then we're going to have to rethink
> the way that instrumentation data gets consolidated from workers into
> the leader in such a way that we can consolidate multiple times
> without ending up with the wrong answer.
>

The other idea we had discussed which comes closer to adopting Tom's
position was that during ExecShutdownNode, we just destroy parallel
workers, collect instrumentation data and don't destroy the parallel
context.  The parallel context could be destroyed in ExecEndNode
(ExecEndGather(Merge)) code path.  The problem with this idea is that
ExitParallelMode doesn't expect parallel context to be active.  Now,
we can either change the location of Exit/EnterParallelMode or relax
that restriction.  As mentioned above that restriction appears good to
me, so I am not in favor of changing it unless we have some other
solid way to install it.   I am not sure if this idea is better than
other approaches we are discussing.

>  The other option is to do
> what I understand Amit and Thomas to be proposing, which is to do a
> better job identifying the case where we're "done for good" and can
> trigger the shutdown fearlessly.
>

Yes, this sounds like a safe fix for back-branches.  We might 

Re: SegFault on 9.6.14

2019-07-30 Thread Robert Haas
On Thu, Jul 18, 2019 at 9:45 AM Tom Lane  wrote:
> I think this is going in the wrong direction.  Nodes should *always*
> assume that a rescan is possible until ExecEndNode is called.
> If you want to do otherwise, you are going to be inventing a whole
> bunch of complicated and doubtless-initially-buggy control logic
> to pass down information about whether a rescan might be possible.
> That doesn't sound like a recipe for a back-patchable fix.  Perhaps
> we could consider redesigning the rules around REWIND in a future
> version, but that's not where to focus the bug fix effort.

So, if I can summarize how we got here, as best I understand it:

0. The historic behavior of the executor is to assume it's OK to leak
resources for the lifetime of the query.  Nodes that are executed to
completion generally do some cleanup, but we feel free (as under
Limit) to just stop executing a node without giving it any hint that
it should release resources.  So a Sort may hold onto a terabyte of
memory and an index scan may keep holding a pin even after there's no
theoretical way of ever needing those resources again, and we just
don't care.

1. Parallel query made that perhaps-already-shaky assumption a lot
more problematic. Partly that's because workers are a more scarce
and considerably heavier resource than anything else, and moreover act
as a container for anything else, so whatever you were leaking before,
you can now leak N times more of it, plus N processes, until the end
of the query. However, there's a correctness reason too, which is that
when a node has a copy in the leader and a copy in every worker, each
copy has its own instrumentation data (startup time, run time, nloops,
etc) and we can only fold all that together once the node is done
executing, because it's really hard to add up a bunch of numbers
before the numbers are done changing.  We could've made the
instrumentation shared throughout, but if we had, we could have
contention for updating the instrumentation data, which seems like
it'd be bad.

2. To fix that correctness problem, we decided to try to shut down the
node under a limit node when we're done with it (commit
85c9d3475e4f680dbca7c04fe096af018f3b8760). At a certain level, this
looks fundamentally necessary to me.  If you're going to have N
separate copies of the instrumentation, and you want to add them up
when you're done, then you have to decide to be done at some point;
otherwise you don't know when to add them up, and maybe won't add them
up at all, and then you'll be sad.  This does not mean that the exact
timing couldn't be changed somehow, but if you want a correct
implementation, you have to shut down Limit's sub-node after you're
done executing it (so that you can get the instrumentation data from
the workers after it's final) and before you start destroying DSM
segments and stuff (so that you get the instrumentation data from the
workers before it vanishes).

3. The aforementioned commit turned out to be buggy in at least two
ways, precisely because it didn't do a good enough job predicting when
the Limit needed to be shut down.  First, there was commit
2cd0acfdade82f3cab362fd9129d453f81cc2745, where we missed the fact
that you could hit the Limit and then back up.  Second, there's the
present issue, where the Limit gets rescanned.

So, given all that, if we want to adopt Tom's position that we should
always cater to a possible rescan, then we're going to have to rethink
the way that instrumentation data gets consolidated from workers into
the leader in such a way that we can consolidate multiple times
without ending up with the wrong answer.  The other option is to do
what I understand Amit and Thomas to be proposing, which is to do a
better job identifying the case where we're "done for good" and can
trigger the shutdown fearlessly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: SegFault on 9.6.14

2019-07-26 Thread Amit Kapila
On Sat, Jul 27, 2019 at 8:29 AM Thomas Munro  wrote:
>
> On Fri, Jul 26, 2019 at 4:13 PM Amit Kapila  wrote:
> > On Tue, Jul 23, 2019 at 5:28 PM Amit Kapila  wrote:
> > > Right, that will be lesser code churn and it can also work.  However,
> > > one thing that needs some thought is till now es_top_eflags is only
> > > set in ExecutorStart and same is mentioned in comments where it is
> > > declared and it seems we are going to change that with this idea.  How
> > > about having a separate function ExecBlahShutdown which will clean up
> > > resources as parallel context and can be called only from ExecutePlan
> > > where we are calling ExecShutdownNode?  I think both these and the
> > > other solution we have discussed are on similar lines and another idea
> > > could be to relax the assert which again is not a superb idea.
> >
> > It seems we don't have a clear preference for any particular solution
> > among these and neither there appears to be any better idea.  I guess
> > we can wait for a few days to see if Robert has any views on this,
> > otherwise, pick one of the above and move ahead.
>
> I take the EXEC_FLAG_DONE idea back.  It's ugly and too hard to verify
> that every appropriate path sets it, and a flag that means the
> opposite would be even more of a kluge, and generally I think I was
> looking at this too myopically: I was looking for a way to shut down
> processes ASAP without giving up the shared memory we'll need for
> rescanning, but what I should have been looking at is the reason you
> did that in the first place: to get the instrumentation data.  Can you
> explain why it's necessary to do that explicitly for Limit?  Wouldn't
> the right place to collect instrumentation be at the end of execution
> when Shutdown will run in all cases anyway (and possibly also during
> ExecParallelReinitialize() or something like that if it's being
> clobbered by rescans, I didn't check)?  What's special about Limit?
>

I think here you are missing the point that to collect the
instrumentation information one also needs to use InstrStartNode and
InstrStopNode.  So, for the Limit node, the InstrStopNode would already
have been done by the time we shut down the workers at the end of
execution.  For more details, see [1][2][3].
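
To make that bracketing concrete, here is an abbreviated sketch of how
the executor wraps a node's work in those calls, modeled on the 9.6
ExecProcNode(); it is trimmed to the relevant parts and is not an exact
copy of execProcnode.c:

TupleTableSlot *
ExecProcNode(PlanState *node)
{
    TupleTableSlot *result;

    CHECK_FOR_INTERRUPTS();

    if (node->chgParam != NULL)     /* something changed? */
        ExecReScan(node);           /* let ReScan handle this */

    /* open the per-node timing/row-count bracket for this call */
    if (node->instrument)
        InstrStartNode(node->instrument);

    switch (nodeTag(node))
    {
        case T_LimitState:
            result = ExecLimit((LimitState *) node);
            break;

            /* ... cases for every other executor node type ... */

        default:
            elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
            result = NULL;
            break;
    }

    /* close the bracket for this call */
    if (node->instrument)
        InstrStopNode(node->instrument, TupIsNull(result) ? 0.0 : 1.0);

    return result;
}

So the final InstrStopNode() for the Limit node runs inside this
bracket, before any end-of-execution shutdown gets a chance to run.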

> Today while poking at this and trying to answer those questions for
> myself, I realised that  the repro I posted earlier[1] crashes exactly
> as Jerry reported on REL9_6_STABLE, but in later release branches it
> runs to completion.  That's because the crashing code was removed in
> commit 41b0dd98 "Separate reinitialization of shared parallel-scan
> state from ExecReScan.".
>
> So newer branches get past that problem, but they all spit out tons of
> each of these three warnings:
>
> WARNING:  buffer refcount leak: [172] (rel=base/12558/16390,
> blockNum=5, flags=0x9380, refcount=1 2998)
> ...
> WARNING:  relcache reference leak: relation "join_bar" not closed
> ...
> WARNING:  Snapshot reference leak: Snapshot 0x7ff20383bfb0 still referenced
> ...
>
> Oops.
>

This is exactly the same problem: before the rescans happen, we have
already destroyed the shared memory.  If you apply the earlier trick of
not cleaning up the shared memory until ExecEndNode, then you won't see
this problem.


[1] - 
https://www.postgresql.org/message-id/CAA4eK1KZEbYKj9HHP-6WqqjAXuoB%2BWJu-w1s9uovj%3DeeBxC48Q%40mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/CA%2BTgmoY3kcTcc5bFCZeY5NMFna-xaMPuTHA-z-z2Bmfg%2Bdb-XQ%40mail.gmail.com
[3] - 
https://www.postgresql.org/message-id/CAA4eK1L0KAZWgnRJz%3DVNVpyS3FFbVh8E5egyziaR0E10bC204Q%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-07-26 Thread Thomas Munro
On Fri, Jul 26, 2019 at 4:13 PM Amit Kapila  wrote:
> On Tue, Jul 23, 2019 at 5:28 PM Amit Kapila  wrote:
> > Right, that will be lesser code churn and it can also work.  However,
> > one thing that needs some thought is till now es_top_eflags is only
> > set in ExecutorStart and same is mentioned in comments where it is
> > declared and it seems we are going to change that with this idea.  How
> > about having a separate function ExecBlahShutdown which will clean up
> > resources as parallel context and can be called only from ExecutePlan
> > where we are calling ExecShutdownNode?  I think both these and the
> > other solution we have discussed are on similar lines and another idea
> > could be to relax the assert which again is not a superb idea.
>
> It seems we don't have a clear preference for any particular solution
> among these and neither there appears to be any better idea.  I guess
> we can wait for a few days to see if Robert has any views on this,
> otherwise, pick one of the above and move ahead.

I take the EXEC_FLAG_DONE idea back.  It's ugly and too hard to verify
that every appropriate path sets it, and a flag that means the
opposite would be even more of a kluge, and generally I think I was
looking at this too myopically: I was looking for a way to shut down
processes ASAP without giving up the shared memory we'll need for
rescanning, but what I should have been looking at is the reason you
did that in the first place: to get the instrumentation data.  Can you
explain why it's necessary to do that explicitly for Limit?  Wouldn't
the right place to collect instrumentation be at the end of execution
when Shutdown will run in all cases anyway (and possibly also during
ExecParallelReinitialize() or something like that if it's being
clobbered by rescans, I didn't check)?  What's special about Limit?

Today while poking at this and trying to answer those questions for
myself, I realised that  the repro I posted earlier[1] crashes exactly
as Jerry reported on REL9_6_STABLE, but in later release branches it
runs to completion.  That's because the crashing code was removed in
commit 41b0dd98 "Separate reinitialization of shared parallel-scan
state from ExecReScan.".

So newer branches get past that problem, but they all spit out tons of
each of these three warnings:

WARNING:  buffer refcount leak: [172] (rel=base/12558/16390,
blockNum=5, flags=0x9380, refcount=1 2998)
...
WARNING:  relcache reference leak: relation "join_bar" not closed
...
WARNING:  Snapshot reference leak: Snapshot 0x7ff20383bfb0 still referenced
...

Oops.  I don't know exactly why yet, but the problem goes away if you
just comment out the offending ExecShutdownNode() call in nodeLimit.c.
I tried to understand whether the buffer stats were wrong with that
code commented out (Adrien Nayrat's original complaint[2]), but I ran
out of time for debugging adventures today.

[1] 
https://www.postgresql.org/message-id/CA%2BhUKGJyqDp9FZSHLTjiNMcz-c6%3DRdStB%2BUjVZsR8wfHnJXy8Q%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/86137f17-1dfb-42f9-7421-82fd786b04a1%40anayrat.info

-- 
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-25 Thread Amit Kapila
On Tue, Jul 23, 2019 at 5:28 PM Amit Kapila  wrote:
>
> On Tue, Jul 23, 2019 at 9:11 AM Thomas Munro  wrote:
> >
>
> > Another idea from the band-aid-solutions-that-are-easy-to-back-patch
> > department: in ExecutePlan() where we call ExecShutdownNode(), we
> > could write EXEC_FLAG_DONE into estate->es_top_eflags, and then have
> > ExecGatherShutdown() only run ExecParallelCleanup() if it sees that
> > flag.  That's not beautiful, but it's less churn that the 'bool final'
> > argument we discussed before, and could be removed in master when we
> > have a better way.
> >
>
> Right, that will be lesser code churn and it can also work.  However,
> one thing that needs some thought is till now es_top_eflags is only
> set in ExecutorStart and same is mentioned in comments where it is
> declared and it seems we are going to change that with this idea.  How
> about having a separate function ExecBlahShutdown which will clean up
> resources as parallel context and can be called only from ExecutePlan
> where we are calling ExecShutdownNode?  I think both these and the
> other solution we have discussed are on similar lines and another idea
> could be to relax the assert which again is not a superb idea.
>

It seems we don't have a clear preference for any particular solution
among these, nor does there appear to be any better idea.  I guess we
can wait for a few days to see if Robert has any views on this;
otherwise, we can pick one of the above and move ahead.

Robert, let us know if you have any preference or better idea to fix
this problem?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-07-23 Thread Amit Kapila
On Tue, Jul 23, 2019 at 9:11 AM Thomas Munro  wrote:
>
> On Fri, Jul 19, 2019 at 3:00 PM Amit Kapila  wrote:
> > I am thinking that why not we remove the part of destroying the
> > parallel context (and shared memory) from ExecShutdownGather (and
> > ExecShutdownGatherMerge) and then do it at the time of ExecEndGather
> > (and ExecEndGatherMerge)?   This should fix the bug in hand and seems
> > to be more consistent with our overall design principles.  I have not
> > tried to code it to see if there are any other repercussions of the
> > same but seems worth investigating.  What do you think?
>
> I tried moving ExecParallelCleanup() into ExecEndGather().  The first
> problem is that ExecutePlan() wraps execution in
> EnterParallelMode()/ExitParallelMode(), but ExitParallelMode() fails
> an assertion that no parallel context is active because
> ExecEndGather() hasn't run yet.  The enclosing
> ExecutorStart()/ExecutorEnd() calls are further down the call stack,
> in ProcessQuery().  So some more restructuring might be needed to exit
> parallel mode later, but then I feel like you might be getting way out
> of back-patchable territory, especially if it involves moving code to
> the other side of the executor hook boundary.  Is there an easier way?
>

If we have to follow a solution on these lines, then I don't see an
easier way.  One idea could be to relax the assert in ExitParallelMode
so that it doesn't expect the parallel context to be gone by that time,
but I am not sure that is a good idea because it is used in some other
places as well.  I feel that in general it is a good assertion: before
we leave parallel mode, the parallel context should be gone, as that
ensures we won't do any parallel activity after that.

> Another idea from the band-aid-solutions-that-are-easy-to-back-patch
> department: in ExecutePlan() where we call ExecShutdownNode(), we
> could write EXEC_FLAG_DONE into estate->es_top_eflags, and then have
> ExecGatherShutdown() only run ExecParallelCleanup() if it sees that
> flag.  That's not beautiful, but it's less churn that the 'bool final'
> argument we discussed before, and could be removed in master when we
> have a better way.
>

Right, that will be less code churn and it can also work.  However,
one thing that needs some thought is that until now es_top_eflags is
only set in ExecutorStart (and the same is mentioned in the comments
where it is declared), and it seems we are going to change that with
this idea.  How about having a separate function ExecBlahShutdown which
cleans up resources such as the parallel context and can be called only
from ExecutePlan, where we are calling ExecShutdownNode?  I think both
of these and the other solution we have discussed are on similar lines,
and another idea could be to relax the assert, which again is not a
superb idea.
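
For illustration only, a minimal sketch of what such a function might
look like, keeping the ExecBlahShutdown placeholder name from above;
nothing with this name or shape exists in the tree, and the details are
assumptions rather than a worked-out patch:

/*
 * Hypothetical second tree walker, called only from ExecutePlan() once the
 * plan is truly finished, that releases parallel contexts.  The regular
 * ExecShutdownNode() would then be free to stop at collecting worker stats.
 */
static bool
ExecBlahShutdown(PlanState *node)
{
    if (node == NULL)
        return false;

    /* visit children first, mirroring the ExecShutdownNode() idiom */
    planstate_tree_walker(node, ExecBlahShutdown, NULL);

    switch (nodeTag(node))
    {
        case T_GatherState:
            {
                GatherState *gstate = (GatherState *) node;

                if (gstate->pei != NULL)
                {
                    ExecParallelCleanup(gstate->pei);   /* drops the DSM */
                    gstate->pei = NULL;
                }
            }
            break;
        default:
            break;
    }

    return false;
}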

> Stepping back a bit, it seems like we need two separate tree-walking
> calls: one to free resources not needed anymore by the current rescan
> (workers), and another to free resources not needed ever again
> (parallel context).  That could be spelled ExecShutdownNode(false) and
> ExecShutdownNode(true), or controlled with the EXEC_FLAG_DONE kluge,
> or a new additional ExecSomethingSomethingNode() function, or as you
> say, perhaps the second thing could be incorporated into
> ExecEndNode().  I suspect that the Shutdown callbacks for Hash, Hash
> Join, Custom Scan and Foreign Scan might not be needed anymore if we
> could keep the parallel context around until after the run
> ExecEndNode().
>

I think we need those to collect instrumentation information.  I guess
that has to be done before we call InstrStopNode, otherwise, we might
miss some instrumentation information.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-07-22 Thread Thomas Munro
On Fri, Jul 19, 2019 at 3:00 PM Amit Kapila  wrote:
> On Thu, Jul 18, 2019 at 7:15 PM Tom Lane  wrote:
> > Thomas Munro  writes:
> > > Hmm, so something like a new argument "bool final" added to the
> > > ExecXXXShutdown() functions, which receives false in this case to tell
> > > it that there could be a rescan so keep the parallel context around.
> >
> > I think this is going in the wrong direction.  Nodes should *always*
> > assume that a rescan is possible until ExecEndNode is called.
>
> I am thinking that why not we remove the part of destroying the
> parallel context (and shared memory) from ExecShutdownGather (and
> ExecShutdownGatherMerge) and then do it at the time of ExecEndGather
> (and ExecEndGatherMerge)?   This should fix the bug in hand and seems
> to be more consistent with our overall design principles.  I have not
> tried to code it to see if there are any other repercussions of the
> same but seems worth investigating.  What do you think?

I tried moving ExecParallelCleanup() into ExecEndGather().  The first
problem is that ExecutePlan() wraps execution in
EnterParallelMode()/ExitParallelMode(), but ExitParallelMode() fails
an assertion that no parallel context is active because
ExecEndGather() hasn't run yet.  The enclosing
ExecutorStart()/ExecutorEnd() calls are further down the call stack,
in ProcessQuery().  So some more restructuring might be needed to exit
parallel mode later, but then I feel like you might be getting way out
of back-patchable territory, especially if it involves moving code to
the other side of the executor hook boundary.  Is there an easier way?

Another idea from the band-aid-solutions-that-are-easy-to-back-patch
department: in ExecutePlan() where we call ExecShutdownNode(), we
could write EXEC_FLAG_DONE into estate->es_top_eflags, and then have
ExecGatherShutdown() only run ExecParallelCleanup() if it sees that
flag.  That's not beautiful, but it's less churn than the 'bool final'
argument we discussed before, and could be removed in master when we
have a better way.
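
A sketch of how that could look, purely for illustration: EXEC_FLAG_DONE
does not exist, the bit value below is chosen arbitrarily (it would have
to avoid the existing EXEC_FLAG_* bits), and these fragments are not a
tested patch:

/* executor.h: hypothetical new flag */
#define EXEC_FLAG_DONE      0x0100  /* plan will never be rescanned again */

/*
 * execMain.c, ExecutePlan(): set the flag at the point where the existing
 * code already calls ExecShutdownNode() because execution has finished.
 */
estate->es_top_eflags |= EXEC_FLAG_DONE;
(void) ExecShutdownNode(planstate);

/*
 * nodeGather.c, ExecShutdownGather(): still wait for the workers and fold
 * in their statistics, but keep the parallel context unless we are "done".
 */
ExecShutdownGatherWorkers(node);
if (node->pei != NULL &&
    (node->ps.state->es_top_eflags & EXEC_FLAG_DONE) != 0)
{
    ExecParallelCleanup(node->pei);
    node->pei = NULL;
}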

Stepping back a bit, it seems like we need two separate tree-walking
calls: one to free resources not needed anymore by the current rescan
(workers), and another to free resources not needed ever again
(parallel context).  That could be spelled ExecShutdownNode(false) and
ExecShutdownNode(true), or controlled with the EXEC_FLAG_DONE kluge,
or a new additional ExecSomethingSomethingNode() function, or as you
say, perhaps the second thing could be incorporated into
ExecEndNode().  I suspect that the Shutdown callbacks for Hash, Hash
Join, Custom Scan and Foreign Scan might not be needed anymore if we
could keep the parallel context around until after the run
ExecEndNode().

-- 
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-18 Thread Amit Kapila
On Thu, Jul 18, 2019 at 7:15 PM Tom Lane  wrote:
>
> Thomas Munro  writes:
> > Hmm, so something like a new argument "bool final" added to the
> > ExecXXXShutdown() functions, which receives false in this case to tell
> > it that there could be a rescan so keep the parallel context around.
>
> I think this is going in the wrong direction.  Nodes should *always*
> assume that a rescan is possible until ExecEndNode is called.
>

I am thinking: why not remove the part that destroys the parallel
context (and shared memory) from ExecShutdownGather (and
ExecShutdownGatherMerge) and instead do it at the time of ExecEndGather
(and ExecEndGatherMerge)?  This should fix the bug at hand and seems
more consistent with our overall design principles.  I have not tried
to code it to see if there are any other repercussions, but it seems
worth investigating.  What do you think?
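
A rough sketch of that idea, just to show the intended shape; it is
modeled loosely on the 9.6 nodeGather.c functions and is not a tested
patch:

void
ExecShutdownGather(GatherState *node)
{
    /* Wait for the workers and fold in their instrumentation ... */
    ExecShutdownGatherWorkers(node);

    /* ... but keep node->pei (and its DSM) around for a possible rescan. */
}

void
ExecEndGather(GatherState *node)
{
    ExecEndNode(outerPlanState(node));  /* let children clean up first */
    ExecShutdownGather(node);

    /* No rescan can happen after this point, so drop the DSM here. */
    if (node->pei != NULL)
    {
        ExecParallelCleanup(node->pei);
        node->pei = NULL;
    }

    ExecFreeExprContext(&node->ps);
    ExecClearTuple(node->ps.ps_ResultTupleSlot);
}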

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-07-18 Thread Tom Lane
Thomas Munro  writes:
> Hmm, so something like a new argument "bool final" added to the
> ExecXXXShutdown() functions, which receives false in this case to tell
> it that there could be a rescan so keep the parallel context around.

I think this is going in the wrong direction.  Nodes should *always*
assume that a rescan is possible until ExecEndNode is called.  See the
commentary about EXEC_FLAG_REWIND in executor.h:

 * REWIND indicates that the plan node should try to efficiently support
 * rescans without parameter changes.  (Nodes must support ExecReScan calls
 * in any case, but if this flag was not given, they are at liberty to do it
 * through complete recalculation.  Note that a parameter change forces a
 * full recalculation in any case.)

If nodeLimit is doing something that's incompatible with that, it's
nodeLimit's fault; and similarly for the parallel machinery.

If you want to do otherwise, you are going to be inventing a whole
bunch of complicated and doubtless-initially-buggy control logic
to pass down information about whether a rescan might be possible.
That doesn't sound like a recipe for a back-patchable fix.  Perhaps
we could consider redesigning the rules around REWIND in a future
version, but that's not where to focus the bug fix effort.

regards, tom lane




Re: SegFault on 9.6.14

2019-07-18 Thread Thomas Munro
On Thu, Jul 18, 2019 at 6:40 PM Amit Kapila  wrote:
> On Wed, Jul 17, 2019 at 4:10 PM Amit Kapila  wrote:
> > Yeah, that is a problem.  Actually, what we need here is to
> > wait-for-workers-to-finish and collect all the instrumentation
> > information.  We don't need to destroy the shared memory at this
> > stage, but we don't have a special purpose function which can just
> > allow us to collect stats.  One idea could be that we create a special
> > purpose function which sounds like a recipe of code duplication,
> > another could be that somehow pass the information through
> > ExecShutdownNode to Gather/GatherMerge that they don't destroy shared
> > memory.  Immediately, I can't think of better ideas, but it is
> > possible that there is some better way to deal with this.
>
> I am not able to come up with anything better.  Robert, Thomas, do you
> see any problem with this idea or do you have any better ideas to fix
> this issue?

Hmm, so something like a new argument "bool final" added to the
ExecXXXShutdown() functions, which receives false in this case to tell
it that there could be a rescan so keep the parallel context around.
Or alternatively a separate function with another end-of-scan type of
name that I'm having trouble inventing, which is basically the same
but a bigger patch.  If you add a new argument you might in theory
want to pass that on to the ShutdownForeignScan and ShutdownCustomScan
callbacks, but we obviously can't change those APIs in the back
branches.  If you add a new function instead you might theoretically
want to add it to those APIs too, which you also can't really do in
the back branches either (well even if you could, existing extensions
won't register anything).  I think the new argument version is
probably better because I suspect only Gather would really ever have
any reason to treat the two cases differently, and all existing cases
in or out of core would just keep doing what they're doing.  So I
think adding "bool final" (or better name) would probably work out OK,
and I don't have a better idea.
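
In other words, something like the following, where both the extra
parameter and the adjusted call sites are hypothetical and only meant
to show the shape of the idea:

void
ExecShutdownGather(GatherState *node, bool final)
{
    /* Always wait for the workers and collect their statistics. */
    ExecShutdownGatherWorkers(node);

    /* Destroy the parallel context (and its DSM) only if no rescan can follow. */
    if (final && node->pei != NULL)
    {
        ExecParallelCleanup(node->pei);
        node->pei = NULL;
    }
}

/* callers would then distinguish the two cases, e.g.: */
(void) ExecShutdownNode(planstate, true);   /* ExecutePlan(): done for good */
(void) ExecShutdownNode(outerPlan, false);  /* nodeLimit.c: a rescan may follow */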

--
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-18 Thread Amit Kapila
On Wed, Jul 17, 2019 at 4:10 PM Amit Kapila  wrote:
>
> On Wed, Jul 17, 2019 at 6:28 AM Thomas Munro  wrote:
> >
> > On Wed, Jul 17, 2019 at 12:44 PM Thomas Munro  
> > wrote:
> > > > #11 0x55666e0359df in ExecShutdownNode 
> > > > (node=node@entry=0x55667033a6c8)
> > > > at 
> > > > /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
> > > > #12 0x55666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
> > > > at 
> > > > /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
> > >
> > > https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139
> > >
> > > Limit thinks it's OK to "shut down" the subtree, but if you shut down a
> > > Gather node you can't rescan it later because it destroys its shared
> > > memory.  Oops.  Not sure what to do about that yet.
> >
>
> Yeah, that is a problem.  Actually, what we need here is to
> wait-for-workers-to-finish and collect all the instrumentation
> information.  We don't need to destroy the shared memory at this
> stage, but we don't have a special purpose function which can just
> allow us to collect stats.  One idea could be that we create a special
> purpose function which sounds like a recipe of code duplication,
> another could be that somehow pass the information through
> ExecShutdownNode to Gather/GatherMerge that they don't destroy shared
> memory.  Immediately, I can't think of better ideas, but it is
> possible that there is some better way to deal with this.
>

I am not able to come up with anything better.  Robert, Thomas, do you
see any problem with this idea or do you have any better ideas to fix
this issue?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-07-17 Thread Amit Kapila
On Wed, Jul 17, 2019 at 6:28 AM Thomas Munro  wrote:
>
> On Wed, Jul 17, 2019 at 12:44 PM Thomas Munro  wrote:
> > > #11 0x55666e0359df in ExecShutdownNode 
> > > (node=node@entry=0x55667033a6c8)
> > > at 
> > > /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
> > > #12 0x55666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
> > > at 
> > > /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
> >
> > https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139
> >
> > Limit thinks it's OK to "shut down" the subtree, but if you shut down a
> > Gather node you can't rescan it later because it destroys its shared
> > memory.  Oops.  Not sure what to do about that yet.
>

Yeah, that is a problem.  Actually, what we need here is to
wait-for-workers-to-finish and collect all the instrumentation
information.  We don't need to destroy the shared memory at this
stage, but we don't have a special-purpose function that just lets us
collect the stats.  One idea could be to create such a special-purpose
function, which sounds like a recipe for code duplication; another
could be to somehow pass information through ExecShutdownNode to
Gather/GatherMerge so that they don't destroy the shared memory.
Immediately, I can't think of better ideas, but it is possible that
there is some better way to deal with this.
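
For reference, the two halves live together today; roughly (abbreviated
from the 9.6 nodeGather.c, so not an exact copy of the source):

void
ExecShutdownGather(GatherState *node)
{
    /*
     * First half: wait for the workers to finish and fold their
     * instrumentation into the leader (ExecParallelFinish() is reached
     * from inside ExecShutdownGatherWorkers()).  This is the part the
     * Limit node actually needs.
     */
    ExecShutdownGatherWorkers(node);

    /*
     * Second half: destroy the parallel context, which detaches the DSM
     * segment.  This is what breaks a later rescan, because the parallel
     * scan descriptor shared by leader and workers lives in that segment.
     */
    if (node->pei != NULL)
    {
        ExecParallelCleanup(node->pei);
        node->pei = NULL;
    }
}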

> CCing Amit and Robert, authors of commits 19df1702 and 69de1718.
>

Thanks for diagnosing the issue.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SegFault on 9.6.14

2019-07-16 Thread Thomas Munro
On Wed, Jul 17, 2019 at 12:57 PM Thomas Munro  wrote:
> On Wed, Jul 17, 2019 at 12:44 PM Thomas Munro  wrote:
> > > #11 0x55666e0359df in ExecShutdownNode 
> > > (node=node@entry=0x55667033a6c8)
> > > at 
> > > /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
> > > #12 0x55666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
> > > at 
> > > /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
> >
> > https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139
> >
> > Limit thinks it's OK to "shut down" the subtree, but if you shut down a
> > Gather node you can't rescan it later because it destroys its shared
> > memory.  Oops.  Not sure what to do about that yet.
>
> CCing Amit and Robert, authors of commits 19df1702 and 69de1718.

Here's a repro (I'm sure you can find a shorter one, this one's hacked
up from join_hash.sql, basically just adding LIMIT):

create table join_foo as select generate_series(1, 3000) as id,
'x'::text as t;
alter table join_foo set (parallel_workers = 0);
create table join_bar as select generate_series(0, 1) as id,
'x'::text as t;
alter table join_bar set (parallel_workers = 2);

set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set max_parallel_workers_per_gather = 2;
set enable_material = off;
set enable_mergejoin = off;
set work_mem = '1GB';

select count(*) from join_foo
left join (select b1.id, b1.t from join_bar b1 join join_bar b2
using (id) limit 1000) ss
on join_foo.id < ss.id + 1 and join_foo.id > ss.id - 1;

-- 
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-16 Thread Thomas Munro
On Wed, Jul 17, 2019 at 12:44 PM Thomas Munro  wrote:
> > #11 0x55666e0359df in ExecShutdownNode (node=node@entry=0x55667033a6c8)
> > at 
> > /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
> > #12 0x55666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
> > at 
> > /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
>
> https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139
>
> Limit thinks it's OK to "shut down" the subtree, but if you shut down a
> Gather node you can't rescan it later because it destroys its shared
> memory.  Oops.  Not sure what to do about that yet.

CCing Amit and Robert, authors of commits 19df1702 and 69de1718.

-- 
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-16 Thread Jerry Sievers
Thomas Munro  writes:

> On Wed, Jul 17, 2019 at 12:26 PM Jerry Sievers  wrote:
>
>> Is this the right sequencing?
>>
>> 1. Start client and get backend pid
>> 2. GDB;  handle SIGUSR1, break, cont
>> 3. Run query
>> 4. bt
>
> Perfect, thanks.  I think I just spotted something:

Dig that!  Great big thanks to you and Tomas, et al for jumping on this.

Please let know if there's anything else I can submit that would be
helpful.


>
>> #11 0x55666e0359df in ExecShutdownNode (node=node@entry=0x55667033a6c8)
>> at 
>> /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
>> #12 0x55666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
>> at 
>> /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
>
> https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139
>
> Limit thinks it's OK to "shut down" the subtree, but if you shut down a
> Gather node you can't rescan it later because it destroys its shared
> memory.  Oops.  Not sure what to do about that yet.
>
>
> --
> Thomas Munro
> https://enterprisedb.com
>
>
>

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net




Re: SegFault on 9.6.14

2019-07-16 Thread Thomas Munro
On Wed, Jul 17, 2019 at 12:26 PM Jerry Sievers  wrote:
> Is this the right sequencing?
>
> 1. Start client and get backend pid
> 2. GDB;  handle SIGUSR1, break, cont
> 3. Run query
> 4. bt

Perfect, thanks.  I think I just spotted something:

> #11 0x55666e0359df in ExecShutdownNode (node=node@entry=0x55667033a6c8)
> at 
> /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
> #12 0x55666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
> at 
> /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139

https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139

Limit thinks it's OK to "shut down" the subtree, but if you shut down a
Gather node you can't rescan it later because it destroys its shared
memory.  Oops.  Not sure what to do about that yet.
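
To make the hazard concrete, here is a paraphrased sketch of the call
site linked above (REL9_6_STABLE nodeLimit.c around line 139); the
names and the surrounding control flow are approximate, not an exact
copy of the source:

/* nodeLimit.c, ExecLimit(): once the LIMIT count has been satisfied */
if (!node->noCount &&
    node->position - node->offset >= node->count)
{
    node->lstate = LIMIT_WINDOWEND;

    /*
     * Tell the subtree we are done with it.  For a Gather node this
     * reaches ExecShutdownGather(), which not only collects worker
     * statistics but also destroys the parallel context, i.e. the DSM
     * segment that a later rescan of this subtree still expects to
     * find mapped.
     */
    (void) ExecShutdownNode(outerPlanState(node));

    return NULL;
}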


--
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-16 Thread Jerry Sievers
Thomas Munro  writes:

> On Wed, Jul 17, 2019 at 12:05 PM Jerry Sievers  wrote:
>
>> Program received signal SIGUSR1, User defined signal 1.
>
> Oh, we need to ignore those pesky signals with "handle SIGUSR1 noprint 
> nostop".

Is this the right sequencing?

1. Start client and get backend pid
2. GDB;  handle SIGUSR1, break, cont
3. Run query
4. bt

Thanks

Don't think I am doing this correctly.  Please advise.

handle SIGUSR1 noprint nostop
Signal        Stop      Print   Pass to program Description
SIGUSR1       No        No      Yes             User defined signal 1
(gdb) break munmap
Breakpoint 1 at 0x7f3d09331740: file ../sysdeps/unix/syscall-template.S, line 
84.
(gdb) cont
Continuing.

Breakpoint 1, munmap () at ../sysdeps/unix/syscall-template.S:84
84  ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  munmap () at ../sysdeps/unix/syscall-template.S:84
#1  0x55666e12d7f4 in dsm_impl_posix (impl_private=0x22, elevel=19, 
mapped_size=0x556670205890, mapped_address=0x556670205888, request_size=0, 
handle=, op=DSM_OP_DETACH)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/ipc/dsm_impl.c:259
#2  dsm_impl_op (op=op@entry=DSM_OP_DETACH, handle=, 
request_size=request_size@entry=0, 
impl_private=impl_private@entry=0x556670205880, 
mapped_address=mapped_address@entry=0x556670205888, 
mapped_size=mapped_size@entry=0x556670205890, elevel=19)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/ipc/dsm_impl.c:176
#3  0x55666e12efb1 in dsm_detach (seg=0x556670205860)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/ipc/dsm.c:738
#4  0x55666df31369 in DestroyParallelContext (pcxt=0x556670219b68)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/transam/parallel.c:750
#5  0x55666e0357bb in ExecParallelCleanup (pei=0x7f3d012218b0)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execParallel.c:575
#6  0x55666e047ca2 in ExecShutdownGather (node=node@entry=0x55667033bed0)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:443
#7  0x55666e0359f5 in ExecShutdownNode (node=0x55667033bed0)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:820
#8  0x55666e0777e1 in planstate_tree_walker (planstate=0x55667033b2b0, 
walker=0x55666e0359a0 , context=0x0)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/nodes/nodeFuncs.c:3636
#9  0x55666e0777e1 in planstate_tree_walker (planstate=0x55667033b040, 
walker=0x55666e0359a0 , context=0x0)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/nodes/nodeFuncs.c:3636
#10 0x55666e0777e1 in planstate_tree_walker 
(planstate=planstate@entry=0x55667033a6c8, 
walker=walker@entry=0x55666e0359a0 , 
context=context@entry=0x0)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/nodes/nodeFuncs.c:3636
#11 0x55666e0359df in ExecShutdownNode (node=node@entry=0x55667033a6c8)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
#12 0x55666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
#13 0x55666e035d28 in ExecProcNode (node=node@entry=0x55667033a428)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:531
#14 0x55666e051f69 in ExecNestLoop (node=node@entry=0x55667031c660)
---Type  to continue, or q  to quit---at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeNestloop.c:174
#15 0x55666e035e28 in ExecProcNode (node=node@entry=0x55667031c660)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:476
#16 0x55666e054989 in ExecSort (node=node@entry=0x55667031c3f0)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSort.c:103
#17 0x55666e035de8 in ExecProcNode (node=0x55667031c3f0)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:495
#18 0x55666e041fe9 in fetch_input_tuple 
(aggstate=aggstate@entry=0x55667031ba18)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:598
#19 0x55666e043bb3 in agg_retrieve_direct (aggstate=0x55667031ba18)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:2078
#20 ExecAgg (node=node@entry=0x55667031ba18)
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:1903
#21 0x55666e035dc8 in ExecProcNode 

Re: SegFault on 9.6.14

2019-07-16 Thread Thomas Munro
On Wed, Jul 17, 2019 at 12:05 PM Jerry Sievers  wrote:
> Program received signal SIGUSR1, User defined signal 1.

Oh, we need to ignore those pesky signals with "handle SIGUSR1 noprint nostop".

-- 
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-16 Thread Jerry Sievers
Thomas Munro  writes:

> On Wed, Jul 17, 2019 at 11:33 AM Jerry Sievers  wrote:
>
>>  ->  Nested Loop Left Join  (cost=251621.81..12300177.37 rows=48 
>> width=44)
>>->  Gather  (cost=1001.55..270403.27 rows=48 width=40)
>
>>->  Limit  (cost=250620.25..250620.27 rows=1 width=20)
>
>>->  Gather  (cost=1000.00..250452.00 
>> rows=3706 width=4)
>
> One observation is that it's a rescan a bit like the one in the
> unsuccessful repro attempt I posted, but it has *two* Gather nodes in
> it (and thus two parallel query DSM segments), and only one of them
> should be rescanned, and from the backtrace we see that it is indeed
> the expected one, the one under the Limit operator.  Neither of them
> should be getting unmapped in the leader though and AFAIK nothing
> happening in the workers could cause this effect, the leader would
> have to explicitly unmap the thing AFAIK.
>
> On Wed, Jul 17, 2019 at 11:42 AM Jerry Sievers  wrote:
>> mmap(NULL, 287624, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011f2000
>> mmap(NULL, 262504, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011b1000
>> munmap(0x7f3d011b1000, 262504)  = 0
>
> Ok, there go our two parallel query DSM segments, and there it is
> being unmapped.  Hmm.  Any chance you could attach a debugger, and
> "break munmap", "cont", and then show us the backtrace "bt" when that
> is reached?

gdb /usr/lib/postgresql/9.6/bin/postgres 21640
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/9.6/bin/postgres...Reading symbols 
from 
/usr/lib/debug/.build-id/04/6f55a5ce6ce05064edfc8feee61c6cb039d296.debug...done.
done.
Attaching to program: /usr/lib/postgresql/9.6/bin/postgres, process 21640
Reading symbols from /usr/lib/x86_64-linux-gnu/libxml2.so.2...Reading symbols 
from 
/usr/lib/debug/.build-id/d3/57ce1dba1fab803eddf48922123ffd0a303676.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libpam.so.0...(no debugging symbols 
found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libssl.so.1.0.0...Reading symbols 
from 
/usr/lib/debug/.build-id/ff/69ea60ebe05f2dd689d2b26fc85a73e5fbc3a0.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0...Reading symbols 
from 
/usr/lib/debug/.build-id/15/ffeb43278726b025f020862bf51302822a40ec.debug...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2...(no 
debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from 
/usr/lib/debug//lib/x86_64-linux-gnu/librt-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from 
/usr/lib/debug//lib/x86_64-linux-gnu/libdl-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...Reading symbols from 
/usr/lib/debug//lib/x86_64-linux-gnu/libm-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2...Reading 
symbols from 
/usr/lib/debug/.build-id/38/90d33727391e4a85dc0f819ab0aa29bb5dfc86.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libsystemd.so.0...(no debugging 
symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from 
/usr/lib/debug//lib/x86_64-linux-gnu/libc-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libicuuc.so.55...Reading symbols 
from 
/usr/lib/debug/.build-id/46/3d8b610702d64ae0803c7dfcaa02cfb4c6477b.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libz.so.1...Reading symbols from 
/usr/lib/debug/.build-id/8d/9bd4ce26e45ef16075c67d5f5eeafd8b562832.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/liblzma.so.5...(no debugging symbols 
found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libaudit.so.1...(no debugging 
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libkrb5.so.3...(no debugging 
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libk5crypto.so.3...(no debugging 
symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libcom_err.so.2...Reading symbols 
from /usr/lib/debug//lib/x86_64-linux-gnu/libcom_err.so.2.1...done.
done.
Reading symbols from 

Re: SegFault on 9.6.14

2019-07-16 Thread Thomas Munro
On Wed, Jul 17, 2019 at 11:33 AM Jerry Sievers  wrote:
>  ->  Nested Loop Left Join  (cost=251621.81..12300177.37 rows=48 
> width=44)
>->  Gather  (cost=1001.55..270403.27 rows=48 width=40)

>->  Limit  (cost=250620.25..250620.27 rows=1 width=20)

>->  Gather  (cost=1000.00..250452.00 
> rows=3706 width=4)

One observation is that it's a rescan a bit like the one in the
unsuccessful repro attempt I posted, but it has *two* Gather nodes in
it (and thus two parallel query DSM segments), and only one of them
should be rescanned, and from the backtrace we see that it is indeed
the expected one, the one under the Limit operator.  Neither of them
should be getting unmapped in the leader though and AFAIK nothing
happening in the workers could cause this effect, the leader would
have to explicitly unmap the thing AFAIK.

On Wed, Jul 17, 2019 at 11:42 AM Jerry Sievers  wrote:
> mmap(NULL, 287624, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011f2000
> mmap(NULL, 262504, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011b1000
> munmap(0x7f3d011b1000, 262504)  = 0

Ok, there go our two parallel query DSM segments, and there it is
being unmapped.  Hmm.  Any chance you could attach a debugger, and
"break munmap", "cont", and then show us the backtrace "bt" when that
is reached?




--
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-16 Thread Jerry Sievers
Thomas Munro  writes:

> On Wed, Jul 17, 2019 at 11:06 AM Jerry Sievers  wrote:
>
>> (gdb) p *scan->rs_parallel
>> Cannot access memory at address 0x7fa673a54108
>
> So I guess one question is: was it a valid address that's been
> unexpectedly unmapped, or is the pointer corrupted?  Any chance you
> can strace the backend and pull out the map, unmap calls?

There were about 60k lines from strace including these few...


mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f3d0127a000
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f3d01239000
mmap(NULL, 287624, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011f2000
mmap(NULL, 262504, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011b1000
munmap(0x7f3d011b1000, 262504)  = 0

Thx





-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net




Re: SegFault on 9.6.14

2019-07-16 Thread Jerry Sievers
Thomas Munro  writes:

> On Wed, Jul 17, 2019 at 11:06 AM Jerry Sievers  wrote:
>
>> (gdb) p *scan->rs_parallel
>> Cannot access memory at address 0x7fa673a54108
>
> So I guess one question is: was it a valid address that's been
> unexpectedly unmapped, or is the pointer corrupted?  Any chance you
> can strace the backend and pull out the map, unmap calls?

I'll dig further.

Here is a sanitized look at the query and explain plan...

The segfault happens $immediately upon issuance of the query.





begin;

-- This setting makes the segfault go away
--set local max_parallel_workers_per_gather to 0;

explain
select v.account_id, COUNT(cnt.clicks), te.description,
l.product_id
from thing3.thing10 te
join thing3.thing9 pv on pv.page_view_id = te.page_view_id
join thing3.thing11 v on v.visit_id = pv.visit_id
left join thing6.thing12 l on v.account_id=l.account_id
  left join lateral (
select MAX(v.visit_id)
 ,COUNT(*) as clicks
 from thing3.thing10 te
 join thing3.thing9 pv on pv.page_view_id =
te.page_view_id
 join thing3.thing11 v on v.visit_id = pv.visit_id
 where te.description in ('thing7',
'thing8')
   and v.account_id=l.account_id
 GROUP BY v.account_id, v.visit_id
 order by v.account_id, v.visit_id desc
 limit 1
)cnt on true
where (te.description in ('thing4',
'thing5')
 or te.description like'%auto%')
  and te.created_at > '2019-06-24 00:00:00'
--and l.loan_status_id in (5,6)
group by v.account_id, te.description,
l.product_id;

abort;
BEGIN

QUERY PLAN  

   
---
 GroupAggregate  (cost=12300178.71..12300179.79 rows=48 width=44)
   Group Key: v.account_id, te.description, l.product_id
   ->  Sort  (cost=12300178.71..12300178.83 rows=48 width=44)
 Sort Key: v.account_id, te.description, l.product_id
 ->  Nested Loop Left Join  (cost=251621.81..12300177.37 rows=48 
width=44)
   ->  Gather  (cost=1001.55..270403.27 rows=48 width=40)
 Workers Planned: 3
 ->  Nested Loop Left Join  (cost=1.56..269398.47 rows=15 
width=40)
   ->  Nested Loop  (cost=1.13..269391.71 rows=14 
width=32)
 ->  Nested Loop  (cost=0.57..269368.66 rows=39 
width=32)
   ->  Parallel Seq Scan on thing10 te  
(cost=0.00..269228.36 rows=39 width=32)
 Filter: ((created_at > '2019-06-24 
00:00:00'::timestamp without time zone) AND (((description)::text = ANY 
('{thing4,thing5}'::text[])) OR ((description)::text ~~ '%auto%'::text)))
   ->  Index Scan using page_views_pkey on 
thing9 pv  (cost=0.57..3.59 rows=1 width=8)
 Index Cond: (page_view_id = 
te.page_view_id)
 ->  Index Scan using visits_pkey on thing11 v  
(cost=0.56..0.58 rows=1 width=8)
   Index Cond: (visit_id = pv.visit_id)
   ->  Index Scan using index_loans_on_account_id on 
thing12 l  (cost=0.42..0.46 rows=2 width=8)
 Index Cond: (v.account_id = account_id)
   ->  Limit  (cost=250620.25..250620.27 rows=1 width=20)
 ->  GroupAggregate  (cost=250620.25..250620.27 rows=1 
width=20)
   Group Key: v_1.visit_id
   ->  Sort  (cost=250620.25..250620.26 rows=1 width=8)
 Sort Key: v_1.visit_id DESC
 ->  Hash Join  (cost=1154.34..250620.24 rows=1 
width=8)
   Hash Cond: (te_1.page_view_id = 
pv_1.page_view_id)
   ->  Gather  (cost=1000.00..250452.00 
rows=3706 width=4)
 Workers Planned: 3
 ->  Parallel Seq Scan on thing10 
te_1  (cost=0.00..249081.40 rows=1195 width=4)
   Filter: ((description)::text 
= ANY ('{thing7,thing8}'::text[]))
   ->  Hash  (cost=152.85..152.85 rows=119 
width=12)
 ->  Nested Loop  
(cost=1.01..152.85 rows=119 width=12)
   ->  Index Scan using 
index_visits_on_account_id on thing11 v_1  (cost=0.43..15.63 rows=18 

Re: SegFault on 9.6.14

2019-07-16 Thread Thomas Munro
On Wed, Jul 17, 2019 at 11:11 AM Thomas Munro  wrote:
> map, unmap

mmap, munmap

-- 
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-16 Thread Thomas Munro
On Wed, Jul 17, 2019 at 11:06 AM Jerry Sievers  wrote:
> (gdb) p *scan->rs_parallel
> Cannot access memory at address 0x7fa673a54108

So I guess one question is: was it a valid address that's been
unexpectedly unmapped, or is the pointer corrupted?  Any chance you
can strace the backend and pull out the map, unmap calls?

-- 
Thomas Munro
https://enterprisedb.com




Re: SegFault on 9.6.14

2019-07-16 Thread Jerry Sievers
Tomas Vondra  writes:

> On Mon, Jul 15, 2019 at 08:20:00PM -0500, Jerry Sievers wrote:
>
>>Tomas Vondra  writes:
>>
>>> On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:
>>>
Tomas Vondra  writes:

> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>
>>Greetings Hackers.
>>
>>We have a reproduceable case of $subject that issues a backtrace such as
>>seen below.
>>
>>The query that I'd prefer to sanitize before sending is <30 lines of at
>>a glance, not terribly complex logic.
>>
>>It nonetheless dies hard after a few seconds of running and as expected,
>>results in an automatic all-backend restart.
>>
>>Please advise on how to proceed.  Thanks!
>>
>>bt
>>#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, 
>>keep_startblock=keep_startblock@entry=1 '\001')
>>at 
>> /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>>#1  0x55d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, 
>>key=key@entry=0x0) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>>#2  0x55d7a7451fef in ExecReScanSeqScan 
>>(node=node@entry=0x55d7a7d85100) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>>#3  0x55d7a742d36e in ExecReScan (node=0x55d7a7d85100) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>>#4  0x55d7a7445d38 in ExecReScanGather 
>>(node=node@entry=0x55d7a7d84d30) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>>#5  0x55d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>>#6  0x55d7a7448673 in ExecReScanHashJoin 
>>(node=node@entry=0x55d7a7d84110) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>>#7  0x55d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
>>
>>
>
> Hmmm, that means it's crashing here:
>
>if (scan->rs_parallel != NULL)
>scan->rs_nblocks = scan->rs_parallel->phs_nblocks; <--- here
>else
>scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>
> But clearly, scan is valid (otherwise it'd crash on the if condition),
> and scan->rs_parallel must me non-NULL. Which probably means the pointer
> is (no longer) valid.
>
> Could it be that the rs_parallel DSM disappears on rescan, or something
> like that?

No clue but something I just tried was to disable parallelism by setting
max_parallel_workers_per_gather to 0 and however the query has not
finished after a few minutes, there is no crash.

>>>
>>> That might be a hint my rough analysis was somewhat correct. The
>>> question is whether the non-parallel plan does the same thing. Maybe it
>>> picks a plan that does not require rescans, or something like that.
>>>
Please advise.

>>>
>>> It would be useful to see (a) exacution plan of the query, (b) full
>>> backtrace and (c) a bit of context for the place where it crashed.
>>>
>>> Something like (in gdb):
>>>
>>>bt full
>>>list
>>>p *scan
>>
>>The p *scan did nothing unless I ran it first however my gdb $foo isn't
>>strong presently.
>
> Hmm, the rs_parallel pointer looks sane (it's not obvious garbage). Can
> you try this?
>
>   p *scan->rs_parallel


$ gdb /usr/lib/postgresql/9.6/bin/postgres core
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/9.6/bin/postgres...Reading symbols 
from 
/usr/lib/debug/.build-id/04/6f55a5ce6ce05064edfc8feee61c6cb039d296.debug...done.
done.
[New LWP 31654]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: foo_eis_segfault: jsievers staging 
10.220.22.26(57948) SELECT   '.
Program terminated 

Re: SegFault on 9.6.14

2019-07-16 Thread Thomas Munro
On Tue, Jul 16, 2019 at 8:22 PM Tomas Vondra
 wrote:
> On Mon, Jul 15, 2019 at 08:20:00PM -0500, Jerry Sievers wrote:
> >We have a reproduceable case of $subject that issues a backtrace such as
> >seen below.

> >#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, 
> >keep_startblock=keep_startblock@entry=1 '\001')
> >at 
> > /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
> >#1  0x55d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, 
> >key=key@entry=0x0) at 
> >/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
> >#2  0x55d7a7451fef in ExecReScanSeqScan 
> >(node=node@entry=0x55d7a7d85100) at 
> >/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
> >#3  0x55d7a742d36e in ExecReScan (node=0x55d7a7d85100) at 
> >/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
> >#4  0x55d7a7445d38 in ExecReScanGather 
> >(node=node@entry=0x55d7a7d84d30) at 
> >/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
> >#5  0x55d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at 
> >/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166

Here's a query that rescans a gather node repeatedly on 9.6 in case it
helps someone build a repro, but it works fine here.

-- 
Thomas Munro
https://enterprisedb.com


test.sql
Description: application/sql


Re: SegFault on 9.6.14

2019-07-16 Thread Tomas Vondra

On Mon, Jul 15, 2019 at 08:20:00PM -0500, Jerry Sievers wrote:

Tomas Vondra  writes:


On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:


Tomas Vondra  writes:


On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:


Greetings Hackers.

We have a reproduceable case of $subject that issues a backtrace such as
seen below.

The query that I'd prefer to sanitize before sending is <30 lines of at
a glance, not terribly complex logic.

It nonetheless dies hard after a few seconds of running and as expected,
results in an automatic all-backend restart.

Please advise on how to proceed.  Thanks!

bt
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, 
keep_startblock=keep_startblock@entry=1 '\001')
   at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
#1  0x55d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) 
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
#2  0x55d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
#3  0x55d7a742d36e in ExecReScan (node=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
#4  0x55d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
#5  0x55d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
#6  0x55d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) 
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
#7  0x55d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226




Hmmm, that means it's crashing here:

   if (scan->rs_parallel != NULL)
   scan->rs_nblocks = scan->rs_parallel->phs_nblocks; <--- here
   else
   scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);

But clearly, scan is valid (otherwise it'd crash on the if condition),
and scan->rs_parallel must be non-NULL. Which probably means the pointer
is (no longer) valid.

Could it be that the rs_parallel DSM disappears on rescan, or something
like that?


No clue but something I just tried was to disable parallelism by setting
max_parallel_workers_per_gather to 0 and however the query has not
finished after a few minutes, there is no crash.



That might be a hint my rough analysis was somewhat correct. The
question is whether the non-parallel plan does the same thing. Maybe it
picks a plan that does not require rescans, or something like that.


Please advise.



It would be useful to see (a) execution plan of the query, (b) full
backtrace and (c) a bit of context for the place where it crashed.

Something like (in gdb):

   bt full
   list
   p *scan


The p *scan did nothing unless I ran it first however my gdb $foo isn't
strong presently.


Hmm, the rs_parallel pointer looks sane (it's not obvious garbage). Can
you try this?

  p *scan->rs_parallel

Another question - are you sure this is not an OOM issue? That might
sometimes look like SIGSEGV due to overcommit. What's the memory
consumption / is there anything in dmesg?

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: SegFault on 9.6.14

2019-07-15 Thread Jerry Sievers
Tomas Vondra  writes:

> On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:
>
>>Tomas Vondra  writes:
>>
>>> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>>>
Greetings Hackers.

We have a reproduceable case of $subject that issues a backtrace such as
seen below.

The query that I'd prefer to sanitize before sending is <30 lines of at
a glance, not terribly complex logic.

It nonetheless dies hard after a few seconds of running and as expected,
results in an automatic all-backend restart.

Please advise on how to proceed.  Thanks!

bt
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, 
keep_startblock=keep_startblock@entry=1 '\001')
at 
 /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
#1  0x55d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, 
key=key@entry=0x0) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
#2  0x55d7a7451fef in ExecReScanSeqScan 
(node=node@entry=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
#3  0x55d7a742d36e in ExecReScan (node=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
#4  0x55d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) 
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
#5  0x55d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
#6  0x55d7a7448673 in ExecReScanHashJoin 
(node=node@entry=0x55d7a7d84110) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
#7  0x55d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226


>>>
>>> Hmmm, that means it's crashing here:
>>>
>>>if (scan->rs_parallel != NULL)
>>>scan->rs_nblocks = scan->rs_parallel->phs_nblocks; <--- here
>>>else
>>>scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>>>
>>> But clearly, scan is valid (otherwise it'd crash on the if condition),
>>> and scan->rs_parallel must be non-NULL. Which probably means the pointer
>>> is (no longer) valid.
>>>
>>> Could it be that the rs_parallel DSM disappears on rescan, or something
>>> like that?
>>
>>No clue, but something I just tried was to disable parallelism by setting
>>max_parallel_workers_per_gather to 0; the query has not finished after a
>>few minutes, but there is no crash.
>>
>
> That might be a hint my rough analysis was somewhat correct. The
> question is whether the non-parallel plan does the same thing. Maybe it
> picks a plan that does not require rescans, or something like that.
>
>>Please advise.
>>
>
> It would be useful to see (a) the execution plan of the query, (b) a full
> backtrace and (c) a bit of context for the place where it crashed.
>
> Something like (in gdb):
>
>bt full
>list
>p *scan

The p *scan did nothing unless I ran it first; however, my gdb $foo isn't
strong presently.

I'll need to sanitize the explain output but can do so ASAP and send it
along.

Thx!


$ gdb /usr/lib/postgresql/9.6/bin/postgres core
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/9.6/bin/postgres...Reading symbols 
from 
/usr/lib/debug/.build-id/04/6f55a5ce6ce05064edfc8feee61c6cb039d296.debug...done.
done.
[New LWP 31654]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: foo_eis_segfault: jsievers staging 
10.220.22.26(57948) SELECT   '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, 
keep_startblock=keep_startblock@entry=1 '\001')
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
233 

Re: SegFault on 9.6.14

2019-07-15 Thread Tomas Vondra

On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:

Tomas Vondra  writes:


On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:


Greetings Hackers.

We have a reproducible case of $subject that issues a backtrace such as
seen below.

The query, which I'd prefer to sanitize before sending, is <30 lines of
what is, at a glance, not terribly complex logic.

It nonetheless dies hard after a few seconds of running and, as expected,
results in an automatic all-backend restart.

Please advise on how to proceed.  Thanks!

bt
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, 
keep_startblock=keep_startblock@entry=1 '\001')
   at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
#1  0x55d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) 
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
#2  0x55d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
#3  0x55d7a742d36e in ExecReScan (node=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
#4  0x55d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
#5  0x55d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
#6  0x55d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) 
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
#7  0x55d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226




Hmmm, that means it's crashing here:

   if (scan->rs_parallel != NULL)
   scan->rs_nblocks = scan->rs_parallel->phs_nblocks; <--- here
   else
   scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);

But clearly, scan is valid (otherwise it'd crash on the if condition),
and scan->rs_parallel must be non-NULL. Which probably means the pointer
is (no longer) valid.

Could it be that the rs_parallel DSM disappears on rescan, or something
like that?


No clue, but something I just tried was to disable parallelism by setting
max_parallel_workers_per_gather to 0; the query has not finished after a
few minutes, but there is no crash.



That might be a hint my rough analysis was somewhat correct. The
question is whether the non-parallel plan does the same thing. Maybe it
picks a plan that does not require rescans, or something like that.


Please advise.



It would be useful to see (a) the execution plan of the query, (b) a full
backtrace and (c) a bit of context for the place where it crashed.

Something like (in gdb):

   bt full
   list
   p *scan



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: SegFault on 9.6.14

2019-07-15 Thread Jerry Sievers
Tomas Vondra  writes:

> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>
>>Greetings Hackers.
>>
>>We have a reproducible case of $subject that issues a backtrace such as
>>seen below.
>>
>>The query, which I'd prefer to sanitize before sending, is <30 lines of
>>what is, at a glance, not terribly complex logic.
>>
>>It nonetheless dies hard after a few seconds of running and, as expected,
>>results in an automatic all-backend restart.
>>
>>Please advise on how to proceed.  Thanks!
>>
>>bt
>>#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, 
>>keep_startblock=keep_startblock@entry=1 '\001')
>>at 
>> /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>>#1  0x55d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, 
>>key=key@entry=0x0) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>>#2  0x55d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) 
>>at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>>#3  0x55d7a742d36e in ExecReScan (node=0x55d7a7d85100) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>>#4  0x55d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) 
>>at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>>#5  0x55d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>>#6  0x55d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) 
>>at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>>#7  0x55d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at 
>>/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
>>
>>
>
> Hmmm, that means it's crashing here:
>
>if (scan->rs_parallel != NULL)
>scan->rs_nblocks = scan->rs_parallel->phs_nblocks; <--- here
>else
>scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>
> But clearly, scan is valid (otherwise it'd crash on the if condition),
> and scan->rs_parallel must be non-NULL. Which probably means the pointer
> is (no longer) valid.
>
> Could it be that the rs_parallel DSM disappears on rescan, or something
> like that?

No clue, but something I just tried was to disable parallelism by setting
max_parallel_workers_per_gather to 0; the query has not finished after a
few minutes, but there is no crash.

Please advise.

Thx

>
>
> regards

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net




Re: SegFault on 9.6.14

2019-07-15 Thread Tomas Vondra

On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:

Greetings Hackers.

We have a reproducible case of $subject that issues a backtrace such as
seen below.

The query, which I'd prefer to sanitize before sending, is <30 lines of
what is, at a glance, not terribly complex logic.

It nonetheless dies hard after a few seconds of running and, as expected,
results in an automatic all-backend restart.

Please advise on how to proceed.  Thanks!

bt
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, 
keep_startblock=keep_startblock@entry=1 '\001')
   at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
#1  0x55d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) 
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
#2  0x55d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
#3  0x55d7a742d36e in ExecReScan (node=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
#4  0x55d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
#5  0x55d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
#6  0x55d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) 
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
#7  0x55d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226




Hmmm, that means it's crashing here:

   if (scan->rs_parallel != NULL)
   scan->rs_nblocks = scan->rs_parallel->phs_nblocks; <--- here
   else
   scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);

But clearly, scan is valid (otherwise it'd crash on the if condition),
and scan->rs_parallel must be non-NULL. Which probably means the pointer
is (no longer) valid.

Could it be that the rs_parallel DSM disappears on rescan, or something
like that?
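
If that is what happens, the failure mode can be shown outside the backend.
Below is a minimal, self-contained C sketch (illustrative only, not
PostgreSQL code; FakeParallelDesc is a made-up stand-in for the parallel
scan descriptor): a struct lives inside a shared mapping, the mapping is
torn down, and a later dereference of the retained pointer normally dies
with SIGSEGV, much like initscan() touching a stale rs_parallel after the
DSM segment is gone.

    #include <stdio.h>
    #include <sys/mman.h>

    /* Made-up stand-in for a parallel scan descriptor kept in shared memory. */
    typedef struct
    {
        unsigned int phs_nblocks;
    } FakeParallelDesc;

    int
    main(void)
    {
        size_t      len = sizeof(FakeParallelDesc);
        FakeParallelDesc *desc;

        /* "DSM segment": an anonymous shared mapping */
        desc = mmap(NULL, len, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (desc == MAP_FAILED)
        {
            perror("mmap");
            return 1;
        }
        desc->phs_nblocks = 42;

        /* the segment is detached, but the pointer is still held somewhere */
        munmap(desc, len);

        /* "rescan": dereferencing the stale pointer normally faults here */
        printf("nblocks = %u\n", desc->phs_nblocks);
        return 0;
    }

Compiled with plain gcc on Linux, the final printf is where it should blow up.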


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





SegFault on 9.6.14

2019-07-15 Thread Jerry Sievers
Greetings Hackers.

We have a reproducible case of $subject that issues a backtrace such as
seen below.

The query, which I'd prefer to sanitize before sending, is <30 lines of
what is, at a glance, not terribly complex logic.

It nonetheless dies hard after a few seconds of running and, as expected,
results in an automatic all-backend restart.

Please advise on how to proceed.  Thanks!

bt
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, 
keep_startblock=keep_startblock@entry=1 '\001')
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
#1  0x55d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) 
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
#2  0x55d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
#3  0x55d7a742d36e in ExecReScan (node=0x55d7a7d85100) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
#4  0x55d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
#5  0x55d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
#6  0x55d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) 
at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
#7  0x55d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at 
/build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226


-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net