Re: TopoSort() fix

2019-07-31 Thread Robert Haas
On Tue, Jul 30, 2019 at 2:10 PM Tom Lane  wrote:
> Sure.  But I think what we can foresee is that if there are any bugs
> reachable this way, they'd be reachable and need fixing regardless.
> We've already established that parallel workers can take and release locks
> that their leader isn't holding.  Apparently, they won't take anything
> stronger than RowExclusiveLock; but even AccessShare is enough to let a
> process participate in all interesting behaviors of the lock manager,
> including blocking, being blocked, and being released early by deadlock
> resolution.  And the advisory-lock functions are pretty darn thin wrappers
> around the lock manager.  So I'm finding it hard to see where there's
> incremental risk, even if a user does intentionally bypass the parallel
> safety markings.  And what we get in return is an easier way to add tests
> for this area.

Sure, I was basically just asking whether you could foresee any
crash-risk of the proposed change.  It sounds like the answer is "no,"
so I'm fine with it on that basis.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: TopoSort() fix

2019-07-30 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jul 30, 2019 at 1:44 PM Tom Lane  wrote:
>> Well, there'd be an actual isolation test that they work ;-), if you
>> override the marking.  Admittedly, one test case does not prove that
>> there's no way to crash the system, but that can be said of most
>> parts of Postgres.

> True.  I'm just talking about what we can foresee.

Sure.  But I think what we can foresee is that if there are any bugs
reachable this way, they'd be reachable and need fixing regardless.
We've already established that parallel workers can take and release locks
that their leader isn't holding.  Apparently, they won't take anything
stronger than RowExclusiveLock; but even AccessShare is enough to let a
process participate in all interesting behaviors of the lock manager,
including blocking, being blocked, and being released early by deadlock
resolution.  And the advisory-lock functions are pretty darn thin wrappers
around the lock manager.  So I'm finding it hard to see where there's
incremental risk, even if a user does intentionally bypass the parallel
safety markings.  And what we get in return is an easier way to add tests
for this area.

regards, tom lane




Re: TopoSort() fix

2019-07-30 Thread Robert Haas
On Tue, Jul 30, 2019 at 1:44 PM Tom Lane  wrote:
> Well, there'd be an actual isolation test that they work ;-), if you
> override the marking.  Admittedly, one test case does not prove that
> there's no way to crash the system, but that can be said of most
> parts of Postgres.

True.  I'm just talking about what we can foresee.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: TopoSort() fix

2019-07-30 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jul 30, 2019 at 1:36 PM Tom Lane  wrote:
>> In any case, my question at the moment is whether we need the belt-and-
>> suspenders-too approach of having both non-parallel-safe marking and an
>> explicit check inside these functions.  We've largely moved away from
>> hard-wired checks for e.g. superuserness, and surely these things are
>> less dangerous than most formerly-superuser-only functions.

> If we can't think of a way that the lack of these checks could crash
> it, then I think it's OK to remove the hardwired checks.  If we can,
> I'd favor keeping them.

Well, there'd be an actual isolation test that they work ;-), if you
override the marking.  Admittedly, one test case does not prove that
there's no way to crash the system, but that can be said of most
parts of Postgres.

regards, tom lane




Re: TopoSort() fix

2019-07-30 Thread Robert Haas
On Tue, Jul 30, 2019 at 1:36 PM Tom Lane  wrote:
> No, there's a sufficient reason why we should force advisory locks
> to be taken in the leader process, namely that the behavior is totally
> different if we don't: they will disappear at the end of the parallel
> worker run, not at end of transaction or session as documented.

Oh, good point.  I forgot about that.

> However, that argument doesn't seem to be a reason why the advisory-lock
> functions couldn't be parallel-restricted rather than parallel-unsafe.

Agreed.

> In any case, my question at the moment is whether we need the belt-and-
> suspenders-too approach of having both non-parallel-safe marking and an
> explicit check inside these functions.  We've largely moved away from
> hard-wired checks for e.g. superuserness, and surely these things are
> less dangerous than most formerly-superuser-only functions.

If we can't think of a way that the lack of these checks could crash
it, then I think it's OK to remove the hardwired checks.  If we can,
I'd favor keeping them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: TopoSort() fix

2019-07-30 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jul 30, 2019 at 10:27 AM Tom Lane  wrote:
>> (BTW, why aren't these functions just "parallel restricted"?)

> ...
> But it is really pretty arguable whether we should feel responsible
> for that problem.  We could just decide that if you're doing that, and
> you don't want the scenario described above to happen, you oughta mark
> the function that contains this logic at least PARALLEL RESTRICTED,
> and if you don't, then it's your fault for doing a dumb thing.  I
> believe when we were early on in the development of this we wanted to
> be very conservative lest, ah, someone accuse us of not locking things
> down well enough, but maybe at this point parallel query is a
> sufficiently well-established concept that we should lighten up on
> some cases where we took an overly-stringent line.  If we take that
> view, then I'm not sure why these functions couldn't be just marked
> PARALLEL SAFE.

No, there's a sufficient reason why we should force advisory locks
to be taken in the leader process, namely that the behavior is totally
different if we don't: they will disappear at the end of the parallel
worker run, not at end of transaction or session as documented.

However, that argument doesn't seem to be a reason why the advisory-lock
functions couldn't be parallel-restricted rather than parallel-unsafe.

In any case, my question at the moment is whether we need the belt-and-
suspenders-too approach of having both non-parallel-safe marking and an
explicit check inside these functions.  We've largely moved away from
hard-wired checks for e.g. superuserness, and surely these things are
less dangerous than most formerly-superuser-only functions.

regards, tom lane




Re: TopoSort() fix

2019-07-30 Thread Robert Haas
On Tue, Jul 30, 2019 at 10:27 AM Tom Lane  wrote:
> I also looked into whether one could use SELECT FOR UPDATE/SHARE to get
> stronger locks at a tuple level, but that's been blocked off as well.
> You guys really did a pretty good job of locking that down.

Thanks.  We learned from the master.

> After thinking about this for awhile, though, I believe it might be
> reasonable to just remove PreventAdvisoryLocksInParallelMode()
> altogether.  The "parallel unsafe" markings on the advisory-lock
> functions seem like adequate protection against somebody running
> them in a parallel worker.  If you defeat that by calling them from
> a mislabeled-parallel-safe wrapper (as the proposed test case does),
> then any negative consequences are on your own head.  AFAICT the
> only actual negative consequence is that the locks disappear the
> moment the parallel worker exits, so we'd not be opening any large
> holes even to people who rip off the safety cover.
>
> (BTW, why aren't these functions just "parallel restricted"?)

I don't exactly remember why we installed all of these restrictions
any more.  You might be able to find some discussion of it by
searching the archives.  I believe we may have been concerned about
the fact that group locking would cause advisory locks taken in one
process not to conflict with the same advisory lock taken in some
cooperating process, and maybe that would be unwelcome behavior for
someone.  For example, suppose the user defines a function that takes
an advisory lock on the number 1, does a bunch of stuff that should
never happen multiply at the same time, and then releases the lock.
Without parallel query, that will work.  With parallel query, it
won't, because several workers running the same query might run the
same function simultaneously and their locks won't conflict.

But it is really pretty arguable whether we should feel responsible
for that problem.  We could just decide that if you're doing that, and
you don't want the scenario described above to happen, you oughta mark
the function that contains this logic at least PARALLEL RESTRICTED,
and if you don't, then it's your fault for doing a dumb thing.  I
believe when we were early on in the development of this we wanted to
be very conservative lest, ah, someone accuse us of not locking things
down well enough, but maybe at this point parallel query is a
sufficiently well-established concept that we should lighten up on
some cases where we took an overly-stringent line.  If we take that
view, then I'm not sure why these functions couldn't be just marked
PARALLEL SAFE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: TopoSort() fix

2019-07-30 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jul 29, 2019 at 9:48 PM Tom Lane  wrote:
>> I believe the only accessible route to taking any sort of new lock
>> in a parallel worker is catalog lookups causing AccessShareLock on
>> a catalog.

> Can't the worker just query a previously-untouched table, maybe by
> constructing a string and then using EXECUTE to execute it?

Hm, yeah, looks like you could get a new AccessShareLock that way too.
But not any exclusive lock.

I also looked into whether one could use SELECT FOR UPDATE/SHARE to get
stronger locks at a tuple level, but that's been blocked off as well.
You guys really did a pretty good job of locking that down.

After thinking about this for awhile, though, I believe it might be
reasonable to just remove PreventAdvisoryLocksInParallelMode()
altogether.  The "parallel unsafe" markings on the advisory-lock
functions seem like adequate protection against somebody running
them in a parallel worker.  If you defeat that by calling them from
a mislabeled-parallel-safe wrapper (as the proposed test case does),
then any negative consequences are on your own head.  AFAICT the
only actual negative consequence is that the locks disappear the
moment the parallel worker exits, so we'd not be opening any large
holes even to people who rip off the safety cover.

(BTW, why aren't these functions just "parallel restricted"?)

regards, tom lane




Re: TopoSort() fix

2019-07-29 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Jul-29, Tom Lane wrote:
>> FYI, I just got done inventing a way to reach that code, and I have
>> to suspect that it's impossible to do so in production, because under
>> ordinary circumstances no parallel worker will take any exclusive lock
>> that isn't already held by its leader.

> Hmm, okay, so this wasn't a bug that would have bit anyone in practice,
> yeah?  That's reassuring.

At the least, you'd have to go well out of your way to make it happen.

regards, tom lane




Re: TopoSort() fix

2019-07-29 Thread Alvaro Herrera
On 2019-Jul-29, Tom Lane wrote:

> FYI, I just got done inventing a way to reach that code, and I have
> to suspect that it's impossible to do so in production, because under
> ordinary circumstances no parallel worker will take any exclusive lock
> that isn't already held by its leader.

Hmm, okay, so this wasn't a bug that would have bit anyone in practice,
yeah?  That's reassuring.

Thanks,

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: TopoSort() fix

2019-07-29 Thread Michael Paquier
On Mon, Jul 29, 2019 at 10:56:05AM -0400, Tom Lane wrote:
> Michael Paquier  writes:
>> +1.  Any volunteers?
> 
> If Robert doesn't weigh in pretty soon, I'll take responsibility for it.

Thanks Tom for taking care of it!
--
Michael


signature.asc
Description: PGP signature


Re: TopoSort() fix

2019-07-29 Thread Robert Haas
On Mon, Jul 29, 2019 at 9:48 PM Tom Lane  wrote:
> I tried that first.  There are backstops preventing doing LOCK TABLE
> in a worker, just like for advisory locks.
>
> I believe the only accessible route to taking any sort of new lock
> in a parallel worker is catalog lookups causing AccessShareLock on
> a catalog.

Can't the worker just query a previously-untouched table, maybe by
constructing a string and then using EXECUTE to execute it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: TopoSort() fix

2019-07-29 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jul 29, 2019 at 5:55 PM Tom Lane  wrote:
>> FYI, I just got done inventing a way to reach that code, and I have
>> to suspect that it's impossible to do so in production, because under
>> ordinary circumstances no parallel worker will take any exclusive lock
>> that isn't already held by its leader.  (If you happen to know an
>> easy counterexample, let's see it.)

> I think the way you could make that happen would be to run a parallel
> query that calls a user-defined function which does LOCK TABLE.

I tried that first.  There are backstops preventing doing LOCK TABLE
in a worker, just like for advisory locks.

I believe the only accessible route to taking any sort of new lock
in a parallel worker is catalog lookups causing AccessShareLock on
a catalog.  In principle, maybe you could make a deadlock situation
by combining parallel workers with something that takes
AccessExclusiveLock on a catalog ... but making that into a reliable
test case sounds about impossible, because AEL on a catalog will
have all sorts of unpleasant side-effects, such as blocking
isolationtester's own queries.  (Getting it to work in a
CLOBBER_CACHE_ALWAYS build seems right out, for instance.)

regards, tom lane




Re: TopoSort() fix

2019-07-29 Thread Robert Haas
On Mon, Jul 29, 2019 at 5:55 PM Tom Lane  wrote:
> FYI, I just got done inventing a way to reach that code, and I have
> to suspect that it's impossible to do so in production, because under
> ordinary circumstances no parallel worker will take any exclusive lock
> that isn't already held by its leader.  (If you happen to know an
> easy counterexample, let's see it.)

I think the way you could make that happen would be to run a parallel
query that calls a user-defined function which does LOCK TABLE.

> Anyway, armed with this, I was able to prove that HEAD just hangs up
> on this test case; apparently the deadlock checker never detects that
> the additional holders of the advisory lock need to be rearranged.
> And removing that "break" fixes it.

Nice!

> So I'll go commit the break-ectomy, but what do people think about
> testing this better?

I think it's a great idea.  I was never very happy with the amount of
exercise I was able to give this code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: TopoSort() fix

2019-07-29 Thread Tom Lane
[ removing , as that mailing address seems to be MIA ]

Robert Haas  writes:
> On Mon, Jul 29, 2019 at 10:56 AM Tom Lane  wrote:
>> If Robert doesn't weigh in pretty soon, I'll take responsibility for it.

> That's fine, or if you prefer that I commit it, I will.

FYI, I just got done inventing a way to reach that code, and I have
to suspect that it's impossible to do so in production, because under
ordinary circumstances no parallel worker will take any exclusive lock
that isn't already held by its leader.  (If you happen to know an
easy counterexample, let's see it.)

The attached heavily-hacked version of deadlock-soft.spec makes it go by
forcing duplicate advisory locks to be taken in worker processes, which
of course first requires disabling PreventAdvisoryLocksInParallelMode().
I kind of wonder if we should provide some debug-only, here-be-dragons
way to disable that restriction so that we could make this an official
regression test, because I'm now pretty suspicious that none of this code
has ever executed before.

Anyway, armed with this, I was able to prove that HEAD just hangs up
on this test case; apparently the deadlock checker never detects that
the additional holders of the advisory lock need to be rearranged.
And removing that "break" fixes it.

So I'll go commit the break-ectomy, but what do people think about
testing this better?

regards, tom lane

diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index ffd1970..1e815a2 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -658,10 +658,6 @@ pg_isolation_test_session_is_blocked(PG_FUNCTION_ARGS)
 static void
 PreventAdvisoryLocksInParallelMode(void)
 {
-	if (IsInParallelMode())
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
- errmsg("cannot use advisory locks during a parallel operation")));
 }
 
 /*
diff --git a/src/test/isolation/expected/deadlock-soft.out b/src/test/isolation/expected/deadlock-soft.out
index 24a35da..7abbd19 100644
--- a/src/test/isolation/expected/deadlock-soft.out
+++ b/src/test/isolation/expected/deadlock-soft.out
@@ -1,17 +1,47 @@
 Parsed test spec with 4 sessions
 
 starting permutation: d1a1 d2a2 e1l e2l d1a2 d2a1 d1c e1c d2c e2c
-step d1a1: LOCK TABLE a1 IN ACCESS SHARE MODE;
-step d2a2: LOCK TABLE a2 IN ACCESS SHARE MODE;
-step e1l: LOCK TABLE a1 IN ACCESS EXCLUSIVE MODE; 
-step e2l: LOCK TABLE a2 IN ACCESS EXCLUSIVE MODE; 
-step d1a2: LOCK TABLE a2 IN ACCESS SHARE MODE; 
-step d2a1: LOCK TABLE a1 IN ACCESS SHARE MODE; 
+step d1a1: select lock_share(1,x) from bigt limit 1;
+lock_share 
+
+1  
+step d2a2: select lock_share(2,x) from bigt limit 1;
+lock_share 
+
+1  
+step e1l: select lock_excl(1,x) from bigt limit 1; 
+step e2l: select lock_excl(2,x) from bigt limit 1; 
+step d1a2: SET force_parallel_mode = on;
+set parallel_setup_cost=0;
+set parallel_tuple_cost=0;
+set min_parallel_table_scan_size=0;
+set parallel_leader_participation=off;
+set max_parallel_workers_per_gather=4;
+ select sum(lock_share(2,x)) from bigt; 
+step d2a1: SET force_parallel_mode = on;
+set parallel_setup_cost=0;
+set parallel_tuple_cost=0;
+set min_parallel_table_scan_size=0;
+set parallel_leader_participation=off;
+set max_parallel_workers_per_gather=4;
+ select sum(lock_share(1,x)) from bigt; 
 step d1a2: <... completed>
+sum
+
+1  
 step d1c: COMMIT;
 step e1l: <... completed>
-step e1c: COMMIT;
+lock_excl  
+
+1  
 step d2a1: <... completed>
+sum
+
+1  
+step e1c: COMMIT;
 step d2c: COMMIT;
 step e2l: <... completed>
+lock_excl  
+
+1  
 step e2c: COMMIT;
diff --git a/src/test/isolation/specs/deadlock-soft.spec b/src/test/isolation/specs/deadlock-soft.spec
index 49d16e0..b03f0c7 100644
--- a/src/test/isolation/specs/deadlock-soft.spec
+++ b/src/test/isolation/specs/deadlock-soft.spec
@@ -1,3 +1,12 @@
+# Test deadlock resolution with parallel process groups.
+
+# It's fairly hard to get parallel worker processes to block on locks,
+# since generally they don't want any locks their leader didn't already
+# take.  We cheat like mad here by making a function that takes a lock,
+# and is incorrectly marked immutable so that it can execute in a worker.
+# (This also requires disabling the lockfuncs.c code that prevents that.)
+
+# Otherwise, this is morally equivalent to standard deadlock-soft.spec:
 # Four-process deadlock with two hard edges and two soft edges.
 # d2 waits for e1 (soft edge), e1 waits for d1 (hard edge),
 # d1 waits for e2 (soft edge), e2 waits for d2 (hard edge).
@@ -6,35 +15,67 @@
 
 setup
 {
-  CREATE TABLE a1 ();
-  CREATE TABLE a2 ();
+  create function lock_share(int,int) returns int language sql as
+  'select pg_advisory_xact_lock_shared($1); select 1;' immutable parallel safe;
+
+  create function lock_excl(int,int) returns int language sql as
+  'select pg_advisory_xact_lock($1); 

Re: TopoSort() fix

2019-07-29 Thread Robert Haas
On Mon, Jul 29, 2019 at 10:56 AM Tom Lane  wrote:
> Michael Paquier  writes:
> > On Fri, Jul 26, 2019 at 08:24:16PM -0400, Tom Lane wrote:
> >> I think this is a sufficiently obvious bug, and a sufficiently
> >> obvious fix, that we should just fix it and not insist on getting
> >> a reproducible test case first.  I think a test case would soon
> >> bit-rot anyway, and no longer exercise the problem.
> >> I certainly do *not* want to wait so long that we miss the
> >> upcoming minor releases.
>
> > +1.  Any volunteers?
>
> If Robert doesn't weigh in pretty soon, I'll take responsibility for it.

That's fine, or if you prefer that I commit it, I will.

I just got back from a week's vacation and am only very gradually
unburying myself from mounds of email.  (Of course, the way
pgsql-hackers is getting, there's sort of always a mound of email
these days.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: TopoSort() fix

2019-07-29 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Jul 26, 2019 at 08:24:16PM -0400, Tom Lane wrote:
>> I think this is a sufficiently obvious bug, and a sufficiently
>> obvious fix, that we should just fix it and not insist on getting
>> a reproducible test case first.  I think a test case would soon
>> bit-rot anyway, and no longer exercise the problem.
>> I certainly do *not* want to wait so long that we miss the
>> upcoming minor releases.

> +1.  Any volunteers?

If Robert doesn't weigh in pretty soon, I'll take responsibility for it.

regards, tom lane




Re: TopoSort() fix

2019-07-28 Thread Michael Paquier
On Fri, Jul 26, 2019 at 08:24:16PM -0400, Tom Lane wrote:
> I think this is a sufficiently obvious bug, and a sufficiently
> obvious fix, that we should just fix it and not insist on getting
> a reproducible test case first.  I think a test case would soon
> bit-rot anyway, and no longer exercise the problem.
> 
> I certainly do *not* want to wait so long that we miss the
> upcoming minor releases.

+1.  Any volunteers?
--
Michael


signature.asc
Description: PGP signature


Re: TopoSort() fix

2019-07-26 Thread Tom Lane
Andres Freund  writes:
> On 2019-07-26 18:05:38 -0400, Alvaro Herrera wrote:
>> Hello, is anybody looking into this issue?

> I guess this is on Robert's docket otherwise. He's on vacation till
> early next week...

I think this is a sufficiently obvious bug, and a sufficiently
obvious fix, that we should just fix it and not insist on getting
a reproducible test case first.  I think a test case would soon
bit-rot anyway, and no longer exercise the problem.

I certainly do *not* want to wait so long that we miss the
upcoming minor releases.

regards, tom lane




Re: TopoSort() fix

2019-07-26 Thread Andres Freund
Hi,

On 2019-07-26 18:05:38 -0400, Alvaro Herrera wrote:
> On 2019-Jul-04, Rui Hai Jiang wrote:
> 
> > I'll try to figure out some  scenarios to do the test. A parallel process
> > group is needed for the test.

Rui, have you made any progress on this?


> > Actually I was trying to do some testing against the locking mechanism. I
> > happened to see this issue.
> 
> Hello, is anybody looking into this issue?

I guess this is on Robert's docket otherwise. He's on vacation till
early next week...

Greetings,

Andres Freund




Re: TopoSort() fix

2019-07-26 Thread Alvaro Herrera
On 2019-Jul-04, Rui Hai Jiang wrote:

> I'll try to figure out some  scenarios to do the test. A parallel process
> group is needed for the test.
> 
> Actually I was trying to do some testing against the locking mechanism. I
> happened to see this issue.

Hello, is anybody looking into this issue?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: TopoSort() fix

2019-07-03 Thread Rui Hai Jiang
I'll try to figure out some  scenarios to do the test. A parallel process
group is needed for the test.

Actually I was trying to do some testing against the locking mechanism. I
happened to see this issue.

On Wed, Jul 3, 2019 at 9:38 PM Robert Haas  wrote:

> On Tue, Jul 2, 2019 at 11:23 AM Tom Lane  wrote:
> > Rui Hai Jiang  writes:
> > > I think I found an issue in the TopoSort() function.
> >
> > This indeed seems like a live bug introduced by a1c1af2a.
> > Robert?
>
> This is pretty thoroughly swapped out of my head, but it looks like
> that analysis might be correct.
>
> Is it practical to come up with a test case that demonstrates the problem?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: TopoSort() fix

2019-07-03 Thread Robert Haas
On Tue, Jul 2, 2019 at 11:23 AM Tom Lane  wrote:
> Rui Hai Jiang  writes:
> > I think I found an issue in the TopoSort() function.
>
> This indeed seems like a live bug introduced by a1c1af2a.
> Robert?

This is pretty thoroughly swapped out of my head, but it looks like
that analysis might be correct.

Is it practical to come up with a test case that demonstrates the problem?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: TopoSort() fix

2019-07-03 Thread Michael Paquier
On Wed, Jul 03, 2019 at 10:41:59AM +0800, Rui Hai Jiang wrote:
> Could the attached patch fix this issue? Or does any one else plan to fix
> it?
> 
> If people are busy and have not  time, I can go ahead to fix it.  To fix
> this issue, do we need a patch for each official branch?

Only a committer could merge any fix you produce.  What you have sent
looks fine to me, so let's wait for Robert, who has visiblu broken
this part to comment.  Back-patched versions are usually taken care of
by the committer merging the fix, and by experience it is better to
agree about the shape of a patch on HEAD before working on other
branches.  Depending on the review done, the patch's shape may change
slightly...
--
Michael


signature.asc
Description: PGP signature


Re: TopoSort() fix

2019-07-02 Thread Rui Hai Jiang
Could the attached patch fix this issue? Or does any one else plan to fix
it?

If people are busy and have not  time, I can go ahead to fix it.  To fix
this issue, do we need a patch for each official branch?


Regards,
Ruihai

On Tue, Jul 2, 2019 at 11:23 PM Tom Lane  wrote:

> Rui Hai Jiang  writes:
> > I think I found an issue in the TopoSort() function.
>
> This indeed seems like a live bug introduced by a1c1af2a.
> Robert?
>
> regards, tom lane
>


Re: TopoSort() fix

2019-07-02 Thread Tom Lane
Rui Hai Jiang  writes:
> I think I found an issue in the TopoSort() function.

This indeed seems like a live bug introduced by a1c1af2a.
Robert?

regards, tom lane




TopoSort() fix

2019-07-02 Thread Rui Hai Jiang
Hi Hackers,

I think I found an issue in the TopoSort() function.

As the comments say,

/* .
 * .. If there are any other processes
 * in the same lock group on the queue, set their number of
 * beforeConstraints to -1 to indicate that they should be
emitted
 * with their groupmates rather than considered separately.
 */

If the line "break;" exists, there is no chance to set beforeConstraints to
-1 for other processes in the same lock group.

So, I think we need delete the line "break;" . See the patch.

I just took a look, and I found all the following versions have this line .


postgresql-12beta2, postgresql-12beta1, postgresql-11.4,
postgresql-11.3,postgresql-11.0,
postgresql-10.9,postgresql-10.5, postgresql-10.0


Thanks,
Ruihai


TopoSort.patch
Description: Binary data