Re: [HACKERS] foreign key locks

2013-01-23 Thread Alvaro Herrera
I just pushed this patch to the master branch.  There was a
corresponding catversion bump and pg_control version bump.  I have
verified that make check-world passes on my machine, as well as
isolation tests and pg_upgrade.

Tom Lane said at one point this is too complex to maintain.  Several
times during the development I feared he might well be right.  I am sure
he will be discovering many oversights and poor design choices, when
gets around to reviewing the code; hopefully some extra effort will be
all that's needed to make the whole thing work sanely and not eat
anyone's data.  I just hope that nothing so serious comes up that the
patch will need to be reverted completely.

This patch is the result of the work of many people.  I am not allowed
to mention the patch sponsors in the commit message, so I'm doing it
here: first and foremost I need to thank my previous and current
employers Command Prompt and 2ndQuadrant -- they were extremely kind in
allowing me to work on this for days on end (and not all of it was
supported by financial sponsors).  Joel Jacobson started the whole
effort by posting a screencast of a problem his company was having; I
hope they found a workaround in the meantime, because his post was in
mid 2010.  The key idea of this patch' design came from Simon Riggs;
Noah Misch provided additional design advice that allowed the project
torun to completion.  Noah and Andres Freund spent considerable time
reviewing early versions of this patch; they uncovered many embarrasing
bugs in my implementation.  Kevin Grittner, Robert Haas, and others,
provided useful comments as well.  Noah Misch, Andres Freund, Marti
Raudsepp and Alexander Shulgin also provided bits of code.

Any bugs that remain are, of course, my own fault only.

Financial support came from

* Command Prompt Inc. of Washington, US
* 2ndQuadrant Ltd. of United Kingdom
* Trustly (then Glue Finance) of Sweden
* Novozymes A/S of Denmark
* MailerMailer LLC of Maryland, US
* PostgreSQL Experts, Inc. of California, US.

My sincerest thanks to everyone.


I seriously hope that no patch of mine ever becomes this monstruous
again.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-19 Thread Simon Riggs
On 18 January 2013 21:01, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
 I doubt it ever came up before.  What use is logging only the content of
 a buffer page?  Surely you'd need to know, for example, which relation
 and page number it is from.

 It only got to be a length of 0 because the the data got removed due to
 a logged full page write. And the backup block contains the data about
 which blocks it is logging in itself.

 And if the full-page-image case *hadn't* been invoked, what then?  I
 still don't see a very good argument for xlog records with no fixed
 data.

 I wonder if the check shouldn't just check write_len instead, directly
 below the loop that ads backup blocks.

 We're not changing this unless you can convince me that the read-side
 error check mentioned in the comment is useless.


There's some confusion here, I think. Alvaro wasn't proposing a WAL
record that had no fixed data.

The current way XLogRecData works is that if you put data and buffer
together on the same chunk then we optimise the data away if we take a
backup block.

Alvaro chose what looks like the right way to do this when you have
simple data: use one XLogRecData chunk and let the data part get
optimised away. But doing that results in a WAL record that has a
backup block, plus data of 0 length, which then fails the later check.

All current code gets around this by including multiple XLogRecData
chunks, which results in including additional data that is superfluous
when the backup block is present. Alvaro was questioning this; I
didn't understand that either when he said it, but I do now.

The zero length check should stay, definitely.

What this looks like is that further compression of the WAL is
possible, but given its alongside backup blocks, that's on the order
of a 1% saving, so probably isn't worth considering right now. The way
to do that would to include a small token to show record has been
optimised, rather than being zero length.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-18 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Here's version 28 of this patch.  The only substantive change here from
 v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
 or LockTupleNoKeyExclusive, depending on whether the key columns are
 being modified by the update.  This needs to go through EvalPlanQual, so
 that function is now getting the lock mode as a parameter instead of
 hardcoded LockTupleExclusive.  (All other users of GetTupleForTrigger
 still use LockTupleExclusive, so there's no change for anybody other
 than FOR EACH ROW BEFORE UPDATE triggers).
 
 At this point, I think I've done all I wanted to do in connection with
 this patch, and I intend to commit.  I think this is a good time to get
 it committed, so that people can play with it more extensively and
 report any breakage I might have caused before we even get into beta.

While trying this out this morning I noticed a bug in the XLog code:
trying to lock the updated version of a tuple when the page that
contains the updated version requires a backup block, would cause this:

PANIC:  invalid xlog record length 0

The reason is that there is an (unknown to me) rule that there must be
some data not associated with a buffer:

/*
 * NOTE: We disallow len == 0 because it provides a useful bit of extra
 * error checking in ReadRecord.  This means that all callers of
 * XLogInsert must supply at least some not-in-a-buffer data. [...]
 */

This seems pretty strange to me.  And having the rule be spelled out
only in a comment within XLogInsert and not at its top, and not nearby
the XLogRecData struct definition either, seems pretty strange to me.
I wonder how come every PG hacker except me knows this.

For the curious, I attach an isolationtester spec file I used to
reproduce the problem (and verify the fix).

To fix the crash I needed to do this:

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 99fced1..9762890 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4838,7 +4838,7 @@ l4:
{
xl_heap_lock_updated xlrec;
XLogRecPtr  recptr;
-   XLogRecData rdata;
+   XLogRecData rdata[2];
Pagepage = BufferGetPage(buf);
 
xlrec.target.node = rel-rd_node;
@@ -4846,13 +4846,18 @@ l4:
xlrec.xmax = new_xmax;
xlrec.infobits_set = compute_infobits(new_infomask, 
new_infomask2);
 
-   rdata.data = (char *) xlrec;
-   rdata.len = SizeOfHeapLockUpdated;
-   rdata.buffer = buf;
-   rdata.buffer_std = true;
-   rdata.next = NULL;
+   rdata[0].data = (char *) xlrec;
+   rdata[0].len = SizeOfHeapLockUpdated;
+   rdata[0].buffer = InvalidBuffer;
+   rdata[0].next = (rdata[1]);
+
+   rdata[1].data = NULL;
+   rdata[1].len = 0;
+   rdata[1].buffer = buf;
+   rdata[1].buffer_std = true;
+   rdata[1].next = NULL;
 
-   recptr = XLogInsert(RM_HEAP2_ID, 
XLOG_HEAP2_LOCK_UPDATED, rdata);
+   recptr = XLogInsert(RM_HEAP2_ID, 
XLOG_HEAP2_LOCK_UPDATED, rdata);
 
PageSetLSN(page, recptr);
PageSetTLI(page, ThisTimeLineID);

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
setup
{
  CREATE TABLE foo (
key serial PRIMARY KEY,
value   int
  );

  INSERT INTO foo SELECT value FROM generate_series(1, 226) AS value;
}

teardown
{
  DROP TABLE foo;
}

session s1
step s1b  { BEGIN ISOLATION LEVEL REPEATABLE READ; }
step s1s  { SELECT * FROM foo LIMIT 0; }  # obtain snapshot
step s1l  { SELECT * FROM foo WHERE key = 1 FOR KEY SHARE; } # obtain lock
step s1c  { COMMIT; }

session s2
step s2b  { BEGIN; }
step s2u  { UPDATE foo SET value = 2 WHERE key = 1; }
step s2c  { COMMIT; }

session s3
step s3ck { CHECKPOINT; }

permutation s1b s1s s2b s2u s3ck s1l s1c s2c

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-18 Thread Simon Riggs
On 18 January 2013 20:02, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 XLOG_HEAP2_LOCK_UPDATED

Every xlog record needs to know where to put the block.

Compare with XLOG_HEAP_LOCK

xlrec.target.node = relation-rd_node;

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-18 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 The reason is that there is an (unknown to me) rule that there must be
 some data not associated with a buffer:

   /*
* NOTE: We disallow len == 0 because it provides a useful bit of extra
* error checking in ReadRecord.  This means that all callers of
* XLogInsert must supply at least some not-in-a-buffer data. [...]
*/

 This seems pretty strange to me.  And having the rule be spelled out
 only in a comment within XLogInsert and not at its top, and not nearby
 the XLogRecData struct definition either, seems pretty strange to me.
 I wonder how come every PG hacker except me knows this.

I doubt it ever came up before.  What use is logging only the content of
a buffer page?  Surely you'd need to know, for example, which relation
and page number it is from.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-18 Thread Andres Freund
On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  The reason is that there is an (unknown to me) rule that there must be
  some data not associated with a buffer:
 
  /*
   * NOTE: We disallow len == 0 because it provides a useful bit of extra
   * error checking in ReadRecord.  This means that all callers of
   * XLogInsert must supply at least some not-in-a-buffer data. [...]
   */
 
  This seems pretty strange to me.  And having the rule be spelled out
  only in a comment within XLogInsert and not at its top, and not nearby
  the XLogRecData struct definition either, seems pretty strange to me.
  I wonder how come every PG hacker except me knows this.
 
 I doubt it ever came up before.  What use is logging only the content of
 a buffer page?  Surely you'd need to know, for example, which relation
 and page number it is from.

It only got to be a length of 0 because the the data got removed due to
a logged full page write. And the backup block contains the data about
which blocks it is logging in itself.
I wonder if the check shouldn't just check write_len instead, directly
below the loop that ads backup blocks.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-18 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
 I doubt it ever came up before.  What use is logging only the content of
 a buffer page?  Surely you'd need to know, for example, which relation
 and page number it is from.

 It only got to be a length of 0 because the the data got removed due to
 a logged full page write. And the backup block contains the data about
 which blocks it is logging in itself.

And if the full-page-image case *hadn't* been invoked, what then?  I
still don't see a very good argument for xlog records with no fixed
data.

 I wonder if the check shouldn't just check write_len instead, directly
 below the loop that ads backup blocks.

We're not changing this unless you can convince me that the read-side
error check mentioned in the comment is useless.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-18 Thread Andres Freund
On 2013-01-18 16:01:18 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
  I doubt it ever came up before.  What use is logging only the content of
  a buffer page?  Surely you'd need to know, for example, which relation
  and page number it is from.
 
  It only got to be a length of 0 because the the data got removed due to
  a logged full page write. And the backup block contains the data about
  which blocks it is logging in itself.
 
 And if the full-page-image case *hadn't* been invoked, what then?  I
 still don't see a very good argument for xlog records with no fixed
 data.

In that case data would have been logged?

The code in question was:

xl_heap_lock_updated xlrec;
xlrec.target.node = rel-rd_node;
...
xlrec.xmax = new_xmax;

-   rdata.data = (char *) xlrec;
-   rdata.len = SizeOfHeapLockUpdated;
-   rdata.buffer = buf;
-   rdata.buffer_std = true;
-   rdata.next = NULL;
-   recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_LOCK_UPDATED, rdata);

Other wal logging code (and fklocks now as well) just put those into
two XLogRecData blocks to avoid the issue.

  I wonder if the check shouldn't just check write_len instead, directly
  below the loop that ads backup blocks.
 
 We're not changing this unless you can convince me that the read-side
 error check mentioned in the comment is useless.

Youre right, the read side check is worth quite a bit. I think I am
retracting my suggestion.
I guess the amount of extra data thats uselessly logged although never
used in in the redo functions doesn't really amass to anything
significant in comparison to the backup block data.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-11 Thread Alvaro Herrera
Andres Freund wrote:
 On 2013-01-10 18:00:40 -0300, Alvaro Herrera wrote:
  Here's version 28 of this patch.  The only substantive change here from
  v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
  or LockTupleNoKeyExclusive, depending on whether the key columns are
  being modified by the update.  This needs to go through EvalPlanQual, so
  that function is now getting the lock mode as a parameter instead of
  hardcoded LockTupleExclusive.  (All other users of GetTupleForTrigger
  still use LockTupleExclusive, so there's no change for anybody other
  than FOR EACH ROW BEFORE UPDATE triggers).
 
 Is that enough in case of a originally non-key update in read committed
 mode that turns into a key update due to a concurrent key update?

Hm, let me try to work through your example.  You say that a transaction
T1 does a non-key update, and is working through the BEFORE UPDATE
trigger; then transaction T2 does a key update and changes the key
underneath T1?  So at that point T1 becomes a key update, because it's
now using the original key values which are no longer the key?

I don't think this can really happen, because T2 (which is requesting
TupleLockExclusive) would block on the lock that the trigger is grabbing
(TupleLockNoKeyExclusive) on the tuple.  So T2 would sleep until T1 is
committed.


Now, maybe you meant that the BEFORE UPDATE trigger changes the key
value but the user-supplied UPDATE query does not.  So the trigger turns
the no-key update into a key update.  What would happen here is that
GetTupleForTrigger would acquire TupleLockNoKeyExclusive on the tuple,
and later heap_update would acquire TupleLockExclusive.  So there is
lock escalation happening.  This could cause a deadlock against someone
else grabbing a TupleLockKeyShare on the tuple.  I think the answer is
don't do that, i.e. don't update the key columns in a BEFORE UPDATE
trigger.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-11 Thread Andres Freund
On 2013-01-11 12:11:47 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:
  On 2013-01-10 18:00:40 -0300, Alvaro Herrera wrote:
   Here's version 28 of this patch.  The only substantive change here from
   v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
   or LockTupleNoKeyExclusive, depending on whether the key columns are
   being modified by the update.  This needs to go through EvalPlanQual, so
   that function is now getting the lock mode as a parameter instead of
   hardcoded LockTupleExclusive.  (All other users of GetTupleForTrigger
   still use LockTupleExclusive, so there's no change for anybody other
   than FOR EACH ROW BEFORE UPDATE triggers).
 
  Is that enough in case of a originally non-key update in read committed
  mode that turns into a key update due to a concurrent key update?

 Hm, let me try to work through your example.  You say that a transaction
 T1 does a non-key update, and is working through the BEFORE UPDATE
 trigger; then transaction T2 does a key update and changes the key
 underneath T1?  So at that point T1 becomes a key update, because it's
 now using the original key values which are no longer the key?

 I don't think this can really happen, because T2 (which is requesting
 TupleLockExclusive) would block on the lock that the trigger is grabbing
 (TupleLockNoKeyExclusive) on the tuple.  So T2 would sleep until T1 is
 committed.

No, I was thinking about an update without triggers present.

T0: CREATE TABLE tbl(id serial pk, name text unique, data text);
T1: BEGIN; -- read committed
T1: UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; /* key update of row id = 
1 */
T2: BEGIN; -- read committed
T2: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; /* no key 
update, waiting */
T1: COMMIT;
T2: /* UPDATE follows to updated row, due to the changed name its a key update 
now */

Does that make sense?

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-11 Thread Alvaro Herrera
Andres Freund wrote:

 No, I was thinking about an update without triggers present.
 
 T0: CREATE TABLE tbl(id serial pk, name text unique, data text);
 T1: BEGIN; -- read committed
 T1: UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; /* key update of row id 
 = 1 */
 T2: BEGIN; -- read committed
 T2: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; /* no key 
 update, waiting */
 T1: COMMIT;
 T2: /* UPDATE follows to updated row, due to the changed name its a key 
 update now */
 
 Does that make sense?

So I guess your question is is T2 now holding a TupleLockExclusive
lock?  To answer it, I turned your example into a isolationtester spec:

setup
{
CREATE TABLE tbl(id serial primary key, name text unique, data text);
INSERT INTO tbl VALUES (1, 'blarg', 'no data');
}

teardown
{
DROP TABLE tbl;
}

session s1
step s1b { BEGIN; }
step s1u { UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; }
step s1c { COMMIT; }

session s2
step s2b { BEGIN; }
step s2u { UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; }
step s2c { COMMIT; }

session s3
step s3l { SELECT * FROM tbl FOR KEY SHARE; }

permutation s1b s1u s2b s2u s1c s3l s2c


And the results:
Parsed test spec with 3 sessions

starting permutation: s1b s1u s2b s2u s1c s3l s2c
step s1b: BEGIN;
step s1u: UPDATE tbl SET name = 'foo' WHERE name = 'blarg';
step s2b: BEGIN;
step s2u: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; waiting 
...
step s1c: COMMIT;
step s2u: ... completed
step s3l: SELECT * FROM tbl FOR KEY SHARE; waiting ...
step s2c: COMMIT;
step s3l: ... completed
id name   data   

1  blarg  blarg  


So session 3 is correctly waiting for session 2 to finish before being
ablt to grab its FOR KEY SHARE lock, indicating that session 2 is
holding a FOR UPDATE lock.  Good.

If I change session 1 to update the data column instead of name, session
3 no longer needs to wait for session 2, meaning session 2 now only
grabs a FOR NO KEY UPDATE lock.  Also good.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-11 Thread Andres Freund
On 2013-01-11 13:10:49 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:

  No, I was thinking about an update without triggers present.
 
  T0: CREATE TABLE tbl(id serial pk, name text unique, data text);
  T1: BEGIN; -- read committed
  T1: UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; /* key update of row 
  id = 1 */
  T2: BEGIN; -- read committed
  T2: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; /* no key 
  update, waiting */
  T1: COMMIT;
  T2: /* UPDATE follows to updated row, due to the changed name its a key 
  update now */
 
  Does that make sense?

 So I guess your question is is T2 now holding a TupleLockExclusive
 lock?  To answer it, I turned your example into a isolationtester spec:

Great! I reread the code and it does make sense the way its implemented
now. I misremembered something...

I vote for adding that spectest including some appropriate permutations.

FWIW: Looks good to me. It could use another pair of eyes, but I guess
that will have to come by being used.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2013-01-10 Thread Andres Freund
On 2013-01-10 18:00:40 -0300, Alvaro Herrera wrote:
 Here's version 28 of this patch.  The only substantive change here from
 v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
 or LockTupleNoKeyExclusive, depending on whether the key columns are
 being modified by the update.  This needs to go through EvalPlanQual, so
 that function is now getting the lock mode as a parameter instead of
 hardcoded LockTupleExclusive.  (All other users of GetTupleForTrigger
 still use LockTupleExclusive, so there's no change for anybody other
 than FOR EACH ROW BEFORE UPDATE triggers).

Is that enough in case of a originally non-key update in read committed
mode that turns into a key update due to a concurrent key update?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-12-22 Thread Erik Rijkers
On Thu, November 29, 2012 17:16, Alvaro Herrera wrote:
 Here it is.

   fklocks-26.patch.gz

This applies today after removing, not only the infamous catversion.h chunk, 
but also a file_fdw
chunk. (It's not really a problem.)

I also wondered if anyone has any perl/python/bash programs handy that uses 
these new options.  I
can hack something together myself, but I just thought I'd ask.


Thanks,


Erik Rijkers




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-12-22 Thread Andres Freund
On 2012-12-22 10:53:47 +0100, Erik Rijkers wrote:
 On Thu, November 29, 2012 17:16, Alvaro Herrera wrote:
  Here it is.
 
fklocks-26.patch.gz

 This applies today after removing, not only the infamous catversion.h chunk, 
 but also a file_fdw
 chunk. (It's not really a problem.)

 I also wondered if anyone has any perl/python/bash programs handy that uses 
 these new options.  I
 can hack something together myself, but I just thought I'd ask.

I have mostly done code review and some very targeted testing (pgbench,
gdb + psql). So no ready scripts, sorry.
There are imo some unfinished pieces (the whole EPQ integration) that
will require significant retesting once Alvaro has time to work on this
again..

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-29 Thread Alvaro Herrera
Kevin Grittner wrote:
 Kevin Grittner wrote:
  Alvaro Herrera wrote:
  
  Here's version 24.
  
  This no longer applies after the rmgr rm_desc patch.
 
 I took a shot at merging those so I could review the patch against
 HEAD; attached in hopes that it will be useful.
 
 Hopefully this isn't too far off from what .

Uh, sorry for remaining silent -- I'm about to post an updated patch,
having just pushed my merge to github.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks - EvalPlanQual interactions

2012-11-29 Thread Andres Freund
On 2012-11-27 12:07:34 -0300, Alvaro Herrera wrote:
 Alvaro Herrera wrote:
  Here's version 24.

 Old review emails still contains some items that didn't lead to any
 changes.  I tried to keep close track of those.  To that list I add a
 couple of things of my own.  Here they are, for those following along.
 I welcome suggestions.

 - EvalPlanQual and ExecLockRows maybe need a different overall locking
   strategy.  Right now follow_updates=false is used to avoid deadlocks.

I think this really might need some work. Afaics the EPQ code now needs
to actually determine what locklevel needs to be passed to
EvalPlanQualFetch via EvalPlanQual otherwise some of the locking issues
that triggered all this remain. That sucks a bit from a modularity
perspective, but I don't see another way.

Greetings,

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-28 Thread Kevin Grittner
Alvaro Herrera wrote:

 Here's version 24.

This no longer applies after the rmgr rm_desc patch.

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-27 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Here's version 24.

Old review emails still contains some items that didn't lead to any
changes.  I tried to keep close track of those.  To that list I add a
couple of things of my own.  Here they are, for those following along.
I welcome suggestions.


- Fix the multixact cache in multixact.c -- see mXactCacheEnt.

- ResetHintBitMulti() was removed; need to remove old XMAX_IS_MULTI
  somewhere; perhaps during HOT page pruning?

- EvalPlanQual and ExecLockRows maybe need a different overall locking
  strategy.  Right now follow_updates=false is used to avoid deadlocks.

- Ensure multixact.c behaves sanely on wraparound.

- Is the case of a a non-key-update followed by a key-update actually handled
  when doing a heap_lock_tuple with mode = LockTupleKeyShare and follow_updates
  = false? I don't really see how, so it seems to deserve at least a comment.

- if oldestMultiXactId + db is set and then that database is dropped we seem to
  have a problem because MultiXactAdvanceOldest won't overwrite those
  values. Should probably use SetMultiXactIdLimit directly.

- what stop multixacts only being filled out (i.e RecordNewMultiXact()) *after*
  the XLogInsert() *and* after a MultiXactGetCheckptMulti()? Afaics
  MultiXactGenLock is not hold in CreateMultiXactId(). If we crash in that
  moment we loose the multixact data which now means potential data loss...

- multixact member group data crossing 512 sector boundaries makes me uneasy
  (as its 5 bytes). I don't really see a scenario where its dangerous, but
  ... Does anybody see a problem here?

- there are quite some places that do
multiStopLimit = multiWrapLimit - 100;
if (multiStopLimit  FirstMultiXactId)
multiStopLimit -= FirstMultiXactId;

  perhaps MultiXactIdAdvance and MultiXactIdRetreat macros are in order?

- not really padding, MultiXactStatus is 4bytes...
/*
 * XXX Note: there's a lot of padding space in MultiXactMember.  We 
could
 * find a more compact representation of this Xlog record -- perhaps 
all the
 * status flags in one XLogRecData, then all the xids in another one?  
Not
 * clear that it's worth the trouble though.
 */

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-20 Thread Alvaro Herrera
Andres Freund wrote:


 - I find using a default: clause in switches with enum types where everything
   is expected to be handled like the following a bad idea, this way the
   compiler won't warn you if youve missed case's which makes 
 changing/extending code harder:
   switch (rc-strength)
   {

I eliminated some of these default clauses, but the compiler is not
happy about not having some of them, so they remain.

 -
 #if 0
   /*
* The idea here is to remove the IS_MULTI bit, and 
 replace the
* xmax with the updater's Xid.  However, we can't 
 really do it:
* modifying the Xmax is not allowed under our buffer 
 locking
* rules, unless we have an exclusive lock; but we 
 don't know that
* we have it.  So the multi needs to remain in place 
 :-(
*/
   ResetMultiHintBit(tuple, buffer, xmax, true);
 #endif
 
  Three things:
 - HeapTupleSatisfiesUpdate is actually always called exclusively locked ;)
 - Extending something like LWLockHeldByMe to also return the current
   lockmode doesn't sound hard
 - we seem to be using #ifdef NOT_YET for such cases

After spending some time trying to make this work usefully, I observed
that it's pointless, at least if we apply it only in
HeapTupleSatisfiesUpdate, because the IS_MULTI bit is going to be
removed by compute_new_xmax_infomask if the multi is gone.  Something
like this would be useful if we could do it in other places; say when
we're merely scanning page without the specific intention of modifying
that particular tuple.  Maybe in heap_prune_page, for example.  ISTM we
can see that as an optimization we can add later.

For the record, the implementation of ResetMultiHintBit looks like this:

/*
 * When a tuple is found to be marked by a MultiXactId, all whose members are
 * completed transactions, the HEAP_XMAX_IS_MULTI bit can be removed.
 * Additionally, at the request of caller, we can set the Xmax to the given
 * Xid, and set HEAP_XMAX_COMMITTED.
 *
 * Note that per buffer access rules, the caller to this function must hold
 * exclusive content lock on the affected buffer.
 */
static inline void
ResetMultiHintBit(HeapTupleHeader tuple, Buffer buffer,
  TransactionId xid, bool mark_committed)
{
Assert(LWLockModeHeld((BufferDescriptors[buffer])-content_lock ==
  LW_EXCLUSIVE));
Assert(tuple-t_infomask  HEAP_XMAX_IS_MULTI);
Assert(!(tuple-t_infomask  HEAP_XMAX_INVALID));
Assert(!MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple)));

tuple-t_infomask = ~HEAP_XMAX_BITS;
HeapTupleHeaderSetXmax(tuple, xid);
if (!TransactionIdIsValid(xid))
tuple-t_infomask |= HEAP_XMAX_INVALID;
/* note that HEAP_KEYS_UPDATED persists, if set */

if (mark_committed)
SetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED, xid);
else
SetBufferCommitInfoNeedsSave(buffer);
}

(This is removed by commit 52f86f82fb3e01a6ed0b8106bac20f319bb3ad0a in
my github tree, but that commit might be lost in the future, hence this
paste.)


Some of your other observations have been fixed by commits that I have
pushed to my github tree.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-20 Thread Andres Freund
On 2012-11-20 17:36:14 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:


  - I find using a default: clause in switches with enum types where 
  everything
is expected to be handled like the following a bad idea, this way the
compiler won't warn you if youve missed case's which makes 
  changing/extending code harder:
  switch (rc-strength)
  {

 I eliminated some of these default clauses, but the compiler is not
 happy about not having some of them, so they remain.

You have removed the ones that looked removable to me...


  -
  #if 0
  /*
   * The idea here is to remove the IS_MULTI bit, and 
  replace the
   * xmax with the updater's Xid.  However, we can't 
  really do it:
   * modifying the Xmax is not allowed under our buffer 
  locking
   * rules, unless we have an exclusive lock; but we 
  don't know that
   * we have it.  So the multi needs to remain in place 
  :-(
   */
  ResetMultiHintBit(tuple, buffer, xmax, true);
  #endif
 
   Three things:
  - HeapTupleSatisfiesUpdate is actually always called exclusively locked 
  ;)
  - Extending something like LWLockHeldByMe to also return the current
lockmode doesn't sound hard
  - we seem to be using #ifdef NOT_YET for such cases

 After spending some time trying to make this work usefully, I observed
 that it's pointless, at least if we apply it only in
 HeapTupleSatisfiesUpdate, because the IS_MULTI bit is going to be
 removed by compute_new_xmax_infomask if the multi is gone.  Something
 like this would be useful if we could do it in other places; say when
 we're merely scanning page without the specific intention of modifying
 that particular tuple.  Maybe in heap_prune_page, for example.  ISTM we
 can see that as an optimization we can add later.

heap_prune_page sounds like a good fit, yes. One other place would be
when following hot chains, but the locking would be far more critical in
that path, so its less likely to be beneficial.
Pushing it off sounds good to me.

 Some of your other observations have been fixed by commits that I have
 pushed to my github tree.

A short repetition of the previous pgbench run of many SELECT FOR
SHARE's:

10s test:
master:  22673 24637 23874 25527
fklocks: 24835 24601 24606 24868

60s test:
master:  32609 33087
fklocks: 33350 33359

Very nice!

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-19 Thread Andres Freund

On 2012-11-14 13:27:26 -0300, Alvaro Herrera wrote:
 Alvaro Herrera wrote:

   * In heap_lock_tuple's  XMAX_IS_MULTI case
  
   [snip]
  
   why is it membermode  mode and not membermode = mode?
 
  Uh, that's a bug.  Fixed.  As noticed in the comment above that snippet,
  there was a deadlock possible here.  Maybe I should add a test to ensure
  this doesn't happen.

 Done:
 https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770

Some more review bits, based on ffd6250d1d393f2ecb9bfc55c2c6f715dcece557

- if oldestMultiXactId + db is set and then that database is dropped we seem to
  have a problem because MultiXactAdvanceOldest won't overwrite those
  values. Should probably use SetMultiXactIdLimit directly.

- what stop multixacts only being filled out (i.e RecordNewMultiXact()) *after*
  the XLogInsert() *and* after a MultiXactGetCheckptMulti()? Afaics
  MultiXactGenLock is not hold in CreateMultiXactId(). If we crash in that
  moment we loose the multixact data which now means potential data loss...

- multixact member group data crossing 512 sector boundaries makes me uneasy
  (as its 5 bytes). I don't really see a scenario where its dangerous, but
  ... Does anybody see a problem here?

- there are quite some places that do
multiStopLimit = multiWrapLimit - 100;
if (multiStopLimit  FirstMultiXactId)
multiStopLimit -= FirstMultiXactId;

  perhaps MultiXactIdAdvance and MultiXactIdRetreat macros are in order?

- I find using a default: clause in switches with enum types where everything
  is expected to be handled like the following a bad idea, this way the
  compiler won't warn you if youve missed case's which makes changing/extending 
code harder:
switch (rc-strength)
{
case LCS_FORNOKEYUPDATE:
newrc-markType = ROW_MARK_EXCLUSIVE;
break;
case LCS_FORSHARE:
newrc-markType = ROW_MARK_SHARE;
break;
case LCS_FORKEYSHARE:
newrc-markType = ROW_MARK_KEYSHARE;
break;
case LCS_FORUPDATE:
newrc-markType = ROW_MARK_KEYEXCLUSIVE;
break;
default:
elog(ERROR, unsupported rowmark type %d, 
rc-strength);
}
-
#if 0
/*
 * The idea here is to remove the IS_MULTI bit, and 
replace the
 * xmax with the updater's Xid.  However, we can't 
really do it:
 * modifying the Xmax is not allowed under our buffer 
locking
 * rules, unless we have an exclusive lock; but we 
don't know that
 * we have it.  So the multi needs to remain in place 
:-(
 */
ResetMultiHintBit(tuple, buffer, xmax, true);
#endif

 Three things:
- HeapTupleSatisfiesUpdate is actually always called exclusively locked ;)
- Extending something like LWLockHeldByMe to also return the current
  lockmode doesn't sound hard
- we seem to be using #ifdef NOT_YET for such cases

- Using a separate production for the lockmode seems to be nicer imo, not
  really important though
for_locking_item:
FOR UPDATE locked_rels_list opt_nowait
...
| FOR NO KEY UPDATE locked_rels_list opt_nowait
...
| FOR SHARE locked_rels_list opt_nowait
...
| FOR KEY SHARE locked_rels_list opt_nowait
;

- not really padding, MultiXactStatus is 4bytes...
/*
 * XXX Note: there's a lot of padding space in MultiXactMember.  We 
could
 * find a more compact representation of this Xlog record -- perhaps 
all the
 * status flags in one XLogRecData, then all the xids in another one?  
Not
 * clear that it's worth the trouble though.
 */
- why
#define SizeOfMultiXactCreate (offsetof(xl_multixact_create, nmembers) + 
sizeof(int32))
and not
#define SizeOfMultiXactCreate offsetof(xl_multixact_create, members)
- starting a critical section in GetNewMultiXactId but not ending it is ugly,
  but not new

Greetings,

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-19 Thread Andres Freund
On 2012-11-14 13:27:26 -0300, Alvaro Herrera wrote:
 Alvaro Herrera wrote:

   * In heap_lock_tuple's  XMAX_IS_MULTI case
  
   [snip]
  
   why is it membermode  mode and not membermode = mode?
 
  Uh, that's a bug.  Fixed.  As noticed in the comment above that snippet,
  there was a deadlock possible here.  Maybe I should add a test to ensure
  this doesn't happen.

 Done:
 https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770

 (I don't think this is worth a v24 submission).

One more observation:
/*
 * Get and lock the updated version of the row; if fail, return
NULL.
 */
-   copyTuple = EvalPlanQualFetch(estate, relation, LockTupleExclusive,
+   copyTuple = EvalPlanQualFetch(estate, relation, LockTupleNoKeyExclusive,

That doesn't seem to be correct to me. Why is it ok to acquire a
potentially too low locklevel here?

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-17 Thread Andres Freund
On 2012-11-16 22:31:51 -0500, Noah Misch wrote:
 On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote:
  On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
   Andres is on the verge of convincing me that we need to support
   singleton FOR SHARE without multixacts due to performance concerns.
 
  I don't really see any arguments against doing so. We aren't in a that
  big shortage of flags and if we need more than available I think we can
  free some (e.g. XMAX/XMIN_INVALID).

 The patch currently leaves two unallocated bits.  Reclaiming currently-used
 bits means a binary compatibility break.  Until we plan out such a thing,
 reclaimable bits are not as handy as never-allocated bits.  I don't think we
 should lightly expend one of the final two.

Not sure what you mean with a binary compatibility break?
pg_upgrade'ability?

What I previously suggested somewhere was:

#define HEAP_XMAX_SHR_LOCK0x0010
#define HEAP_XMAX_EXCL_LOCK   0x0040
#define HEAP_XMAX_KEYSHR_LOCK (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)
/*
 * Different from _LOCK_BITS because it doesn't include LOCK_ONLY
 */
#define HEAP_LOCK_MASK(HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)

#define HEAP_XMAX_IS_SHR_LOCKED(tup) \
(((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_SHR_LOCK)
#define HEAP_XMAX_IS_EXCL_LOCKED(tup) \
(((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_EXCL_LOCK)
#define HEAP_XMAX_IS_KEYSHR_LOCKED(tup) \
(((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_KEYSHR_LOCK)

It makes checking for locks sightly more more complicated, but its not
too bad...

   It
   would be useful for more people to chime in here: is FOR SHARE an
   important case to cater for?  I wonder if using FOR KEY SHARE (keep
   performance characteristics, but would need to revise application code)
   would satisfy Andres' users, for example.
 
  It definitely wouldn't help in the cases I have seen because the point
  is to protect against actual content changes of the rows, not just the
  keys.
  Note that you actually need to use explicit FOR SHARE/UPDATE for
  correctness purposes in many scenarios unless youre running in 9.1+
  serializable mode. And that cannot be used in some cases (try queuing
  for example) because the rollback rates would be excessive.

 I agree that tripling FOR SHARE cost is risky.  Where is the added cost
 concentrated?  Perchance that multiple belies optimization opportunities.

Good question, let me play a bit.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-17 Thread Noah Misch
On Sat, Nov 17, 2012 at 03:20:20PM +0100, Andres Freund wrote:
 On 2012-11-16 22:31:51 -0500, Noah Misch wrote:
  On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote:
   On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
Andres is on the verge of convincing me that we need to support
singleton FOR SHARE without multixacts due to performance concerns.
  
   I don't really see any arguments against doing so. We aren't in a that
   big shortage of flags and if we need more than available I think we can
   free some (e.g. XMAX/XMIN_INVALID).
 
  The patch currently leaves two unallocated bits.  Reclaiming currently-used
  bits means a binary compatibility break.  Until we plan out such a thing,
  reclaimable bits are not as handy as never-allocated bits.  I don't think we
  should lightly expend one of the final two.
 
 Not sure what you mean with a binary compatibility break?
 pg_upgrade'ability?

Yes.  If we decide HEAP_XMIN_INVALID isn't helping, we can stop adding it to
tuples anytime.  Old tuples may continue to carry the bit, with no particular
deadline for their demise.  To reuse that bit in the mean time, we'll need to
prove that no tuple writable by PostgreSQL 8.3+ will get an unacceptable
interpretation under the new meaning of the bit.  Alternately, build the
mechanism to prove that all such old bits are gone before using the bit in the
new way.  This keeps HEAP_MOVED_IN and HEAP_MOVED_OFF unavailable today.

 What I previously suggested somewhere was:
 
 #define HEAP_XMAX_SHR_LOCK0x0010
 #define HEAP_XMAX_EXCL_LOCK   0x0040
 #define HEAP_XMAX_KEYSHR_LOCK (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)
 /*
  * Different from _LOCK_BITS because it doesn't include LOCK_ONLY
  */
 #define HEAP_LOCK_MASK(HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)
 
 #define HEAP_XMAX_IS_SHR_LOCKED(tup) \
 (((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_SHR_LOCK)
 #define HEAP_XMAX_IS_EXCL_LOCKED(tup) \
 (((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_EXCL_LOCK)
 #define HEAP_XMAX_IS_KEYSHR_LOCKED(tup) \
 (((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_KEYSHR_LOCK)
 
 It makes checking for locks sightly more more complicated, but its not
 too bad...

Agreed; that seems reasonable.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-17 Thread Andres Freund
  I agree that tripling FOR SHARE cost is risky.  Where is the added cost
  concentrated?  Perchance that multiple belies optimization opportunities.

 Good question, let me play a bit.

Ok, I benchmarked around and from what I see there is no single easy
target.
The biggest culprits I could find are:
1. higher amount of XLogInsert calls per transaction (visible
in pgbench -t instead of -T mode while watching the WAL volume)
2. Memory allocations in GetMultiXactIdMembers
3. Memory allocations in mXactCachePut
 a) cache entry itself
 b) the cache context
4. More lwlocks acquisitions

We can possibly optimize a bit with 2) by using a static buffer for
common member sizes, but thats not going to buy us too much...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-17 Thread Noah Misch
On Sat, Nov 17, 2012 at 05:07:18PM +0100, Andres Freund wrote:
   I agree that tripling FOR SHARE cost is risky.  Where is the added cost
   concentrated?  Perchance that multiple belies optimization opportunities.
 
  Good question, let me play a bit.
 
 Ok, I benchmarked around and from what I see there is no single easy
 target.
 The biggest culprits I could find are:
 1. higher amount of XLogInsert calls per transaction (visible
 in pgbench -t instead of -T mode while watching the WAL volume)
 2. Memory allocations in GetMultiXactIdMembers
 3. Memory allocations in mXactCachePut
  a) cache entry itself
  b) the cache context
 4. More lwlocks acquisitions
 
 We can possibly optimize a bit with 2) by using a static buffer for
 common member sizes, but thats not going to buy us too much...

In that case, +1 for your proposal to prop up FOR SHARE.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-16 Thread Andres Freund
On 2012-11-14 13:27:26 -0300, Alvaro Herrera wrote:
 Alvaro Herrera wrote:

   * In heap_lock_tuple's  XMAX_IS_MULTI case
  
   [snip]
  
   why is it membermode  mode and not membermode = mode?
 
  Uh, that's a bug.  Fixed.  As noticed in the comment above that snippet,
  there was a deadlock possible here.  Maybe I should add a test to ensure
  this doesn't happen.

 Done:
 https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770

 (I don't think this is worth a v24 submission).

I have started doing some performance testing and I fear I was right in
being suspicious about the performance difference for FOR SHARE locks:

Tested with
pgbench -p 5442 -U andres \
 -c 30  -j 30 \
 -M prepared -f ~/tmp/postgres-fklocks/select-for-share.sql \
 -T 20 postgres

on a pgbench -i -s 10 database, where select-for-share.sql is:

BEGIN;
\set naccounts 100
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid FOR SHARE;
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid FOR SHARE;
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid FOR SHARE;
COMMIT;

which very roughly resembles workloads I have seen in reality (locking
some records your rely on while you do some work).

With
52b4729fcfc20f056f17531a6670d8c4b9696c39 (alvherre/fklocks)
vs
273986bf0d39e5166eb15ba42ebff4803e23a688 (latest merged master)

I get
tps = 8986.133496 (excluding connections establishing)
vs
tps = 25307.861193 (excluding connections establishing)

Thats nearly a factor of three which seems to be too big to be
acceptable to me.
So I really think we need to bring FOR SHARE locks back as a flag.

I have done some benchmarking of other cases (plain pgbench, pgbench
with foreign keys, large insertions, large amounts of FOR SHARE locks)
and haven't found anything really outstanding so far.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-16 Thread Alvaro Herrera
Noah Misch wrote:
 On Wed, Nov 14, 2012 at 01:27:26PM -0300, Alvaro Herrera wrote:
  https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770
  
  (I don't think this is worth a v24 submission).
 
 Are you aware of any defects in or unanswered questions of this version that
 would stall your commit thereof?

Yeah, I am revisiting the list of XXX/FIXME comments you pointed out
previously.

And I would still like someone with EPQ understanding to review the
ExecLockRows / EvalPlanQual / heap_lock_tuple interactions.

Andres is on the verge of convincing me that we need to support
singleton FOR SHARE without multixacts due to performance concerns.  It
would be useful for more people to chime in here: is FOR SHARE an
important case to cater for?  I wonder if using FOR KEY SHARE (keep
performance characteristics, but would need to revise application code)
would satisfy Andres' users, for example.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-16 Thread Andres Freund
On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
 Noah Misch wrote:
  On Wed, Nov 14, 2012 at 01:27:26PM -0300, Alvaro Herrera wrote:
   https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770
  
   (I don't think this is worth a v24 submission).
 
  Are you aware of any defects in or unanswered questions of this version that
  would stall your commit thereof?

 Yeah, I am revisiting the list of XXX/FIXME comments you pointed out
 previously.

 And I would still like someone with EPQ understanding to review the
 ExecLockRows / EvalPlanQual / heap_lock_tuple interactions.

I am in the process of looking at those atm, but we need somebody that
actually understands the intricacies here...

 Andres is on the verge of convincing me that we need to support
 singleton FOR SHARE without multixacts due to performance concerns.

I don't really see any arguments against doing so. We aren't in a that
big shortage of flags and if we need more than available I think we can
free some (e.g. XMAX/XMIN_INVALID).

 It
 would be useful for more people to chime in here: is FOR SHARE an
 important case to cater for?  I wonder if using FOR KEY SHARE (keep
 performance characteristics, but would need to revise application code)
 would satisfy Andres' users, for example.

It definitely wouldn't help in the cases I have seen because the point
is to protect against actual content changes of the rows, not just the
keys.
Note that you actually need to use explicit FOR SHARE/UPDATE for
correctness purposes in many scenarios unless youre running in 9.1+
serializable mode. And that cannot be used in some cases (try queuing
for example) because the rollback rates would be excessive.

Even if applications could be fixed, requiring changes to applications
locking behaviour, which possibly is far from trivial, seems like a too
big upgrade barrier.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-16 Thread Noah Misch
On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote:
 On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
  Andres is on the verge of convincing me that we need to support
  singleton FOR SHARE without multixacts due to performance concerns.
 
 I don't really see any arguments against doing so. We aren't in a that
 big shortage of flags and if we need more than available I think we can
 free some (e.g. XMAX/XMIN_INVALID).

The patch currently leaves two unallocated bits.  Reclaiming currently-used
bits means a binary compatibility break.  Until we plan out such a thing,
reclaimable bits are not as handy as never-allocated bits.  I don't think we
should lightly expend one of the final two.

  It
  would be useful for more people to chime in here: is FOR SHARE an
  important case to cater for?  I wonder if using FOR KEY SHARE (keep
  performance characteristics, but would need to revise application code)
  would satisfy Andres' users, for example.
 
 It definitely wouldn't help in the cases I have seen because the point
 is to protect against actual content changes of the rows, not just the
 keys.
 Note that you actually need to use explicit FOR SHARE/UPDATE for
 correctness purposes in many scenarios unless youre running in 9.1+
 serializable mode. And that cannot be used in some cases (try queuing
 for example) because the rollback rates would be excessive.

I agree that tripling FOR SHARE cost is risky.  Where is the added cost
concentrated?  Perchance that multiple belies optimization opportunities.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-15 Thread Noah Misch
On Wed, Nov 14, 2012 at 01:27:26PM -0300, Alvaro Herrera wrote:
 https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770
 
 (I don't think this is worth a v24 submission).

Are you aware of any defects in or unanswered questions of this version that
would stall your commit thereof?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-14 Thread Alvaro Herrera
Alvaro Herrera wrote:

  * In heap_lock_tuple's  XMAX_IS_MULTI case
  
  [snip]
  
  why is it membermode  mode and not membermode = mode?
 
 Uh, that's a bug.  Fixed.  As noticed in the comment above that snippet,
 there was a deadlock possible here.  Maybe I should add a test to ensure
 this doesn't happen.

Done:
https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770

(I don't think this is worth a v24 submission).

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-13 Thread Alvaro Herrera
Noah Misch wrote:
 On Wed, Oct 31, 2012 at 05:22:10PM -0300, Alvaro Herrera wrote:

  Not really sure about the proposed syntax, but yes clearly we need some
  other syntax to mean FOR NON KEY UPDATE.  I would rather keep FOR
  UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
  the other (weaker) lock level welcome (but if you love FOR NON KEY
  UPDATE, please chime in too)
 
 Agree on having FOR UPDATE without any FOR KEY UPDATE synonym.  For the
 weaker lock, I mildly preferred the proposal of FOR NO KEY UPDATE.  NON KEY
 captures the idea better in English, but NO is close enough and already part
 of the SQL lexicon.

This is the proposal I like best; however there is an asymmetry, because
the locking options now are

FOR KEY SHARE
FOR SHARE
FOR NO KEY UPDATE
FOR UPDATE

I used to have comments such as

/* Currently, SELECT ... FOR [KEY] UPDATE/SHARE requires UPDATE privileges */
#define ACL_SELECT_FOR_UPDATE   ACL_UPDATE

but now they are slightly incorrect because the NO is not illustrated.
I guess I could use SELECT ... FOR [NO KEY] UPDATE/SHARE but this leaves
out the FOR KEY SHARE case (and can be thought to introduce a FOR NO
KEY SHARE case).  And getting much more verbose than that is probably
not warranted.  In some places I would like the use a phrase like the
locking clause, but I'm not sure that it's clear enough.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-05 Thread Alvaro Herrera
FWIW I have gotten a lot of feedback about this patch, and since I don't
have time right now to produce an updated version, that I'm going to
close this as Returned with Feedback, and submit an updated version to
the upcoming commitfest.

This patch still needs much more review -- for example, as far as I
know, the multixact.c changes have gone largely unreviewed; they have
changed somewhat since Noah reviewed them (back in version 15 or so, I
think).  Of course, to me it all makes sense, but then I'm only its
author.

-- 
Alvaro Herrera http://www.flickr.com/photos/alvherre/
Everybody understands Mickey Mouse. Few understand Hermann Hesse.
Hardly anybody understands Einstein. And nobody understands Emperor Norton.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-05 Thread Andres Freund
On Monday, November 05, 2012 02:37:15 PM Alvaro Herrera wrote:
 FWIW I have gotten a lot of feedback about this patch, and since I don't
 have time right now to produce an updated version, that I'm going to
 close this as Returned with Feedback, and submit an updated version to
 the upcoming commitfest.
 
 This patch still needs much more review -- for example, as far as I
 know, the multixact.c changes have gone largely unreviewed; they have
 changed somewhat since Noah reviewed them (back in version 15 or so, I
 think).  Of course, to me it all makes sense, but then I'm only its
 author.

There also hasn't been any recent performance testing. I am not sure if I can 
get the time to do so before the next commitfest...

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-05 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 FOR NON KEY UPDATE
 FOR KEY UPDATE
 
 KEY is the default, so FOR UPDATE is a synonym of FOR KEY UPDATE

 Not really sure about the proposed syntax, but yes clearly we need some
 other syntax to mean FOR NON KEY UPDATE.  I would rather keep FOR
 UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
 the other (weaker) lock level welcome (but if you love FOR NON KEY
 UPDATE, please chime in too)

FOR ANY UPDATE, synonym of FOR UPDATE
FOR KEY UPDATE, optimized version, when it applies to your case

I also tend to think that we should better not change the current
meaning of FOR UPDATE and have it default to FOR ANY UPDATE.

Unless it's easy to upgrade from ANY to KEY, and do that automatically
at the right time, but I fear there lie dragons (or something).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-31 Thread Noah Misch
On Mon, Oct 29, 2012 at 08:18:33AM -0400, Kevin Grittner wrote:
 Andres Freund wrote:
  On Sunday, October 28, 2012 11:47:16 PM Simon Riggs wrote:
  Andres Freund and...@2ndquadrant.com wrote:
 
  * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2
  and earlier you could be sure that if you FOR UPDATE'ed a row you
  could delete it. Unless I miss something now this will not block
  somebody else acquiring a FOR KEY SHARE lock, so this guarantee
  is gone.
  
  Yes, that is exactly the point of this.
  
  The point is the introduction of a weaker lock level which can be 
  used by the ri triggers. I don't see any imperative that the
  semantics of the old lock level need to be redefined. That just
  seems dangerous to me.
 
 I agree with Andres here -- the new lock level is needed within RI
 triggers, and we might as well expose it for application programmer
 use (with proper documentations), but there's no reason to break
 existing application code by making the semantics of SELECT FOR
 UPDATE less strict than they were before. Let's keep that term
 meaning exactly the same thing if we can, and use new names for the
 new levels.

+1.  I had not considered this angle during previous reviews, but I agree with
Andres's position.  Since this patch does not strengthen the strongest tuple
lock relative to its PostgreSQL 9.2 counterpart, that lock type should
continue to use the syntax FOR UPDATE.  It will come to mean the lock type
sufficient for all possible UPDATEs of the row.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-31 Thread Simon Riggs
On 31 October 2012 19:35, Noah Misch n...@leadboat.com wrote:
 On Mon, Oct 29, 2012 at 08:18:33AM -0400, Kevin Grittner wrote:
 Andres Freund wrote:
  On Sunday, October 28, 2012 11:47:16 PM Simon Riggs wrote:
  Andres Freund and...@2ndquadrant.com wrote:

  * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2
  and earlier you could be sure that if you FOR UPDATE'ed a row you
  could delete it. Unless I miss something now this will not block
  somebody else acquiring a FOR KEY SHARE lock, so this guarantee
  is gone.
 
  Yes, that is exactly the point of this.
 
  The point is the introduction of a weaker lock level which can be
  used by the ri triggers. I don't see any imperative that the
  semantics of the old lock level need to be redefined. That just
  seems dangerous to me.

 I agree with Andres here -- the new lock level is needed within RI
 triggers, and we might as well expose it for application programmer
 use (with proper documentations), but there's no reason to break
 existing application code by making the semantics of SELECT FOR
 UPDATE less strict than they were before. Let's keep that term
 meaning exactly the same thing if we can, and use new names for the
 new levels.

 +1.  I had not considered this angle during previous reviews, but I agree with
 Andres's position.  Since this patch does not strengthen the strongest tuple
 lock relative to its PostgreSQL 9.2 counterpart, that lock type should
 continue to use the syntax FOR UPDATE.  It will come to mean the lock type
 sufficient for all possible UPDATEs of the row.

So we have syntax

FOR NON KEY UPDATE
FOR KEY UPDATE

KEY is the default, so FOR UPDATE is a synonym of FOR KEY UPDATE

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-31 Thread Alvaro Herrera
Simon Riggs wrote:
 On 31 October 2012 19:35, Noah Misch n...@leadboat.com wrote:
  On Mon, Oct 29, 2012 at 08:18:33AM -0400, Kevin Grittner wrote:
  Andres Freund wrote:

   The point is the introduction of a weaker lock level which can be
   used by the ri triggers. I don't see any imperative that the
   semantics of the old lock level need to be redefined. That just
   seems dangerous to me.
 
  I agree with Andres here -- the new lock level is needed within RI
  triggers, and we might as well expose it for application programmer
  use (with proper documentations), but there's no reason to break
  existing application code by making the semantics of SELECT FOR
  UPDATE less strict than they were before. Let's keep that term
  meaning exactly the same thing if we can, and use new names for the
  new levels.
 
  +1.  I had not considered this angle during previous reviews, but I agree 
  with
  Andres's position.  Since this patch does not strengthen the strongest tuple
  lock relative to its PostgreSQL 9.2 counterpart, that lock type should
  continue to use the syntax FOR UPDATE.  It will come to mean the lock 
  type
  sufficient for all possible UPDATEs of the row.

Yeah, I agree with this too.

 So we have syntax
 
 FOR NON KEY UPDATE
 FOR KEY UPDATE
 
 KEY is the default, so FOR UPDATE is a synonym of FOR KEY UPDATE

Not really sure about the proposed syntax, but yes clearly we need some
other syntax to mean FOR NON KEY UPDATE.  I would rather keep FOR
UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
the other (weaker) lock level welcome (but if you love FOR NON KEY
UPDATE, please chime in too)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-31 Thread Noah Misch
On Wed, Oct 31, 2012 at 05:22:10PM -0300, Alvaro Herrera wrote:
 Simon Riggs wrote:
  On 31 October 2012 19:35, Noah Misch n...@leadboat.com wrote:
   +1.  I had not considered this angle during previous reviews, but I agree 
   with
   Andres's position.  Since this patch does not strengthen the strongest 
   tuple
   lock relative to its PostgreSQL 9.2 counterpart, that lock type should
   continue to use the syntax FOR UPDATE.  It will come to mean the lock 
   type
   sufficient for all possible UPDATEs of the row.
 
 Yeah, I agree with this too.
 
  So we have syntax
  
  FOR NON KEY UPDATE
  FOR KEY UPDATE
  
  KEY is the default, so FOR UPDATE is a synonym of FOR KEY UPDATE
 
 Not really sure about the proposed syntax, but yes clearly we need some
 other syntax to mean FOR NON KEY UPDATE.  I would rather keep FOR
 UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
 the other (weaker) lock level welcome (but if you love FOR NON KEY
 UPDATE, please chime in too)

Agree on having FOR UPDATE without any FOR KEY UPDATE synonym.  For the
weaker lock, I mildly preferred the proposal of FOR NO KEY UPDATE.  NON KEY
captures the idea better in English, but NO is close enough and already part
of the SQL lexicon.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-29 Thread Andres Freund
On Sunday, October 28, 2012 11:47:16 PM Simon Riggs wrote:
 On 27 October 2012 00:06, Andres Freund and...@2ndquadrant.com wrote:
  On Thursday, October 18, 2012 09:58:20 PM Alvaro Herrera wrote:
  Here is version 22 of this patch.  This version contains fixes to issues
  reported by Andres, as well as a rebase to latest master.
  
  Ok, I now that pgconf.eu has ended I am starting to do a real review:
  
  * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2 and
  earlier you could be sure that if you FOR UPDATE'ed a row you could
  delete it. Unless I miss something now this will not block somebody else
  acquiring a FOR KEY SHARE lock, so this guarantee is gone.
 
 Yes, that is exactly the point of this.

Yes, sure. The point is the introduction of a weaker lock level which can be 
used by the ri triggers. I don't see any imperative that the semantics of the 
old lock level need to be redefined. That just seems dangerous to me.

We need to take care to reduce the complications of upgrades not introduce 
changes that require complex code reviews.

  This seems likely to introduce subtle problems in user applications.
 
 Yes, it could. So we need some good docs on explaining this.
 
 Which applications use FOR UPDATE?

Any that want to protect themselves against deadlocks and/or visibility 
problems due to READ COMMITTED. Thats quite a bunch.

 Any analysis of particular situations would be very helpful in doing
 the correct thing here.

Usual things include

* avoiding problems due to lock upgrades in combination with foreign keys (as 
far as I can see some of those still persist).
* prevent rows being deleted by other transactions
* prepare for updating if computation of the new values take some time
* guarantee order of locking to make sure rows are DELETE/UPDATEed in the same 
order (still no ORDER BY in UPDATE/DELETE)
...

 I think introducing FOR DELETE would be much clearer as an addition/
 synonym for FOR KEY UPDATE.

Hm. Not really convinced. For me that seems to just make the overall situation 
even more complex.

  I propose renaming FOR UPDATE = FOR NO KEY UPDATE, FOR KEY UPDATE = FOR
  UPDATE or similar (or PREVENT KEY UPDATE?). That keeps the old meaning of
  FOR UPDATE.
 
 Which is essentially unwinding the feature, to some extent.

Why? The overall features available are just the same? The only thing is that 
existing semantics aren't changed.

 Does FOR UPDATE throw an error if the user later issues an UPDATE of
 the PK or a DELETE? That sequence of actions would cause lock
 escalation in the application, which could also lead to
 deadlock/contention.

Unless I miss something it precisely does *not* result in lock escalation with 
the 9.2 semanticsbut it *would* with fklocks applied. Thats exactly my point.

 This sounds like we need a GUC or table-level default to control
 whether FOR UPDATE means FOR UPDATE or FOR DELETE

I don't like adding a new guc for something that should be solveable with some 
creative naming. If a new user doesn't get a bit more concurrency due to 
manually issued 9.2 FOR UPDATE implicitly being converted into a FOR NO KEY 
UPDATE its not too bad. The code needs to be checked whether thats valid 
anyway. The reverse is not true...

 More thought/input required on this point, it seems important.

Yep, more input welcome.

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-29 Thread Simon Riggs
On 29 October 2012 12:27, Andres Freund and...@2ndquadrant.com wrote:

 This sounds like we need a GUC or table-level default to control
 whether FOR UPDATE means FOR UPDATE or FOR DELETE

 I don't like adding a new guc for something that should be solveable with some
 creative naming. If a new user doesn't get a bit more concurrency due to
 manually issued 9.2 FOR UPDATE implicitly being converted into a FOR NO KEY
 UPDATE its not too bad. The code needs to be checked whether thats valid
 anyway. The reverse is not true...

The main point here is that introducing new non-optional behaviour is
a compatibility problem and we shouldn't rush that.

If we decide to change FOR UPDATE to mean FOR NO KEY UPDATE that needs
separate consideration and shouldn't happen until sometime after the
feature goes in (months, or perhaps releases).

We're introducing a new feature that will allow us to avoid lock
problems, by taking the current FOR UPDATE behaviour and splitting it
into two options: one weaker and one stronger. We need explicit names
for both of those options. The most obvious naming is
FOR [[NO] KEY] UPDATE

Don't much like that, but its pretty unambiguous and fairly neat.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-29 Thread Kevin Grittner
Andres Freund wrote:
 On Sunday, October 28, 2012 11:47:16 PM Simon Riggs wrote:
 Andres Freund and...@2ndquadrant.com wrote:

 * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2
 and earlier you could be sure that if you FOR UPDATE'ed a row you
 could delete it. Unless I miss something now this will not block
 somebody else acquiring a FOR KEY SHARE lock, so this guarantee
 is gone.
 
 Yes, that is exactly the point of this.
 
 The point is the introduction of a weaker lock level which can be 
 used by the ri triggers. I don't see any imperative that the
 semantics of the old lock level need to be redefined. That just
 seems dangerous to me.

I agree with Andres here -- the new lock level is needed within RI
triggers, and we might as well expose it for application programmer
use (with proper documentations), but there's no reason to break
existing application code by making the semantics of SELECT FOR
UPDATE less strict than they were before. Let's keep that term
meaning exactly the same thing if we can, and use new names for the
new levels.

We already present a barrier to people migrating from Oracle because
SELECT FOR UPDATE in PostgreSQL is weaker than it is in Oracle (where
it is just as strong as if an actual UPDATE were done -- write
conflicts and all); let's not increase the barrier by making SELECT
FOR UPDATE even weaker.

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-28 Thread Simon Riggs
On 27 October 2012 00:06, Andres Freund and...@2ndquadrant.com wrote:
 On Thursday, October 18, 2012 09:58:20 PM Alvaro Herrera wrote:
 Here is version 22 of this patch.  This version contains fixes to issues
 reported by Andres, as well as a rebase to latest master.

 Ok, I now that pgconf.eu has ended I am starting to do a real review:

 * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2 and earlier
 you could be sure that if you FOR UPDATE'ed a row you could delete it. Unless
 I miss something now this will not block somebody else acquiring a FOR KEY
 SHARE lock, so this guarantee is gone.

Yes, that is exactly the point of this.

 This seems likely to introduce subtle problems in user applications.

Yes, it could. So we need some good docs on explaining this.

Which applications use FOR UPDATE?
Any analysis of particular situations would be very helpful in doing
the correct thing here.

I think introducing FOR DELETE would be much clearer as an addition/
synonym for FOR KEY UPDATE.


 I propose renaming FOR UPDATE = FOR NO KEY UPDATE, FOR KEY UPDATE = FOR
 UPDATE or similar (or PREVENT KEY UPDATE?). That keeps the old meaning of FOR
 UPDATE.

Which is essentially unwinding the feature, to some extent.

Does FOR UPDATE throw an error if the user later issues an UPDATE of
the PK or a DELETE? That sequence of actions would cause lock
escalation in the application, which could also lead to
deadlock/contention.

This sounds like we need a GUC or table-level default to control
whether FOR UPDATE means FOR UPDATE or FOR DELETE

More thought/input required on this point, it seems important.


 You write SELECT FOR UPDATE is a standards-compliant exclusive lock. I
 didn't really find all that much about the semantics of FOR UPDATE on cursors
 in the standard other than The operations of update and delete are permitted
 for updatable cursors, subject to constraining Access Rules..

 * I think some of the longer comments could use the attention of a native
 speaker, unfortunately thats not me.



-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-19 Thread Noah Misch
On Thu, Oct 18, 2012 at 04:58:20PM -0300, Alvaro Herrera wrote:
 Here is version 22 of this patch.  This version contains fixes to issues
 reported by Andres, as well as a rebase to latest master.

I scanned this for obvious signs of work left to do.  It contains numerous XXX
and FIXME comments.  Many highlight micro-optimization opportunities and the
like; those can stay.  Others preclude commit, either highlighting an unsolved
problem or wrongly highlighting a non-problem:

 + /*
 +  * XXX we do not lock this tuple here; the theory is that it's 
 sufficient
 +  * with the buffer lock we're about to grab.  Any other code must be 
 able
 +  * to cope with tuple lock specifics changing while they don't hold 
 buffer
 +  * lock anyway.
 +  */

* so they will be uninteresting by the time our next transaction starts.
* (XXX not clear that this is correct --- other members of the MultiXact
* could hang around longer than we did.  However, it's not clear what a
 !  * better policy for flushing old cache entries would be.)  FIXME actually
 !  * this is plain wrong now that multixact's may contain update Xids.

 ! nmembers = GetMultiXactIdMembers(multi, members, true);
 ! /*
 !  * XXX we don't have the infomask to run the required consistency check
 !  * here; the required notational overhead seems excessive.
 !  */

   /* We assume the cache entries are sorted */
 ! /* XXX we assume the unused bits in status are zeroed */
 ! if (memcmp(members, entry-members, nmembers * 
 sizeof(MultiXactMember)) == 0)

 !  * XXX do we have any issues with needing to checkpoint here?
*/
 ! static void
 ! TruncateMultiXact(void)
   {

 + /* FIXME what should we initialize this to? */
 + newFrozenMulti = ReadNextMultiXactId();

 +  * FIXME -- the XMAX_IS_MULTI test is a bit wrong .. it's possible to
 +  * have tuples with that bit set that are dead.  However, if that's
 +  * changed, the RawXmax() call below should probably be researched as 
 well.
*/
   if (tuple-t_infomask 
 ! (HEAP_XMAX_INVALID | HEAP_XMAX_LOCK_ONLY | HEAP_XMAX_IS_MULTI))
   return false;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-12 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Andres Freund wrote:

  * heap_lock_tuple with mode == LockTupleKeyShare  nowait looks like it 
  would 
  wait anyway in heap_lock_updated_tuple_rec
 
 Oops.

I took a stab at fixing this.  However, it is not easy.  First you need
a way to reproduce the problem, which involves setting breakpoints in
GDB.  (Since a REPEATABLE READ transaction will fail to follow an update
chain due to tuple concurrently updated, you need to use a READ
COMMITTED transaction; but obviously the timing to insert the bunch of
updates in the middle is really short.  Hence I inserted a breakpoint at
the end of GetSnapshotData, had a SELECT FOR KEY SHARE NOWAIT get stuck
in it, and then launched a couple of updates in another session).  I was
able to reproduce the undesirable wait.

I quickly patched heap_lock_updated_tuple to pass down the nowait flag,
but this is actually not reached, because the update chain is first
followed by EvalPlanQual in ExecLockRows, and not by
heap_lock_updated_tuple directly.  And EPQ does not have the nowait
behavior.  So it still blocks.

Maybe what we need to do is prevent ExecLockRows from following the
update chain altogether -- after all, if heap_lock_tuple is going to do
it by itself maybe it's wholly redundant.

Not really sure what's the best way to approach this.  At this stage I'm
inclined to ignore the problem, unless some EPQ expert shows up and
tells me that (1) it's okay to patch EPQ in that way, or (2) we should
hack ExecLockRows (and remove EPQ?).


I pushed (to github) patches for a couple of other issues you raised.
Some others still need a bit more of my attention.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-11 Thread Andres Freund
On Friday, August 31, 2012 06:59:51 AM Alvaro Herrera wrote:
 v21 is merged to latest master.
Ok, I am starting to look at this.

(working with a git merge of alvherre/fklocks into todays master)

In a very first pass as somebody who hasn't followed the discussions in the 
past I took notice of the following things:

* compiles and survives make check
* README.tuplock jumps in quite fast without any introduction

* the reasons for the redesign aren't really included in the patch but just in 
the - very nice - pgcon paper.

* This is a bit of a performance regression, but since we expect FOR SHARE 
locks to be seldom used, we don’t feel this is a serious problem. makes me a 
bit uneasy, will look at performance separately

* Should RelationGetIndexattrBitmap check whether a unique index is referenced 
from a pg_constraint row? Currently we pay part of the overhead independent 
from the existance of foreign keys.

* mode == LockTupleKeyShare, == LockTupleShare, == LockTupleUpdate in 
heap_lock_tuple's BeingUpdated branch look like they should be an if/else if

* I don't really like HEAP_UPDATE_KEY_REVOKED as a name, what about 
HEAP_KEYS_UPDATED (akin to HEAP_HOT_UPDATED)

* how can we get infomask2  HEAP_UPDATE_KEY_REVOKED  infomask  
HEAP_XMAX_LOCK_ONLY?

* heap_lock_tuple with mode == LockTupleKeyShare  nowait looks like it would 
wait anyway in heap_lock_updated_tuple_rec

* rename HeapSatisfiesHOTUpdate, adjust comments?

* I wonder whether HeapSatisfiesHOTUpdate could be made to compute the result 
for key_attrs and hot_attrs at the same time, often enough they will do the 
same thing on the same column

* you didn't start it but I find that all those l$i jump labels make the code 
harder to follow

* shouldn't XmaxGetUpdateXid be rather called MultiXactIdGetUpdateXid or such? 

*
/*
 * If the existing lock mode is identical to or weaker than the 
new
 * one, we can act as though there is no existing lock and have 
the
 * upper block handle this.
 */
block?

* I am not yet sure whether the (xmax == add_to_xmax) optimization in 
compute_new_xmax_infomask is actually safe

* ConditionalMultiXactIdWait and MultiXactIdWait should probably rely on a 
common implementation

* I am not that convinced that moving the waiting functions away from 
multixact.c is a good idea, but I guess the required knowledge about lockmodes 
makes it hard not to do so

* Haven't looked yet, but I have a slightly uneasy feeling about Hot Standby 
interactions. Did you look at that?

* Hm?
HeapTupleSatisfiesVacuum
#if 0
ResetMultiHintBit(tuple, buffer, xmax, true);
#endif

This obviously is not a real review, but just learning what the problem  patch 
actually try to do. This is quite a bit to take in ;). I will let it sink in 
ant try to have a look at the architecture and performance next.


Greetings,

Andres

.oO(and people call catalog timetravel complicated)

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-10-11 Thread Alvaro Herrera
Andres Freund wrote:
 On Friday, August 31, 2012 06:59:51 AM Alvaro Herrera wrote:
  v21 is merged to latest master.
 Ok, I am starting to look at this.
 
 (working with a git merge of alvherre/fklocks into todays master)
 
 In a very first pass as somebody who hasn't followed the discussions in the 
 past I took notice of the following things:
 
 * compiles and survives make check

Please verify src/test/isolation's make installcheck as well.

 * README.tuplock jumps in quite fast without any introduction

Hmm .. I'll give that a second look.  Maybe some material from the pgcon
paper should be added as introduction.

 * This is a bit of a performance regression, but since we expect FOR SHARE 
 locks to be seldom used, we don’t feel this is a serious problem. makes me a 
 bit uneasy, will look at performance separately

Thanks.  Keep in mind though that the bits we really want good
performance for is FOR KEY SHARE, which is going to be used in foreign
keys.  FOR SHARE is staying just for hypothetical backwards
compatibility.  I just found that people is using it, see for example
http://stackoverflow.com/a/3243083

 * Should RelationGetIndexattrBitmap check whether a unique index is 
 referenced 
 from a pg_constraint row? Currently we pay part of the overhead independent 
 from the existance of foreign keys.

Noah also suggested something more or less in the along the same lines.
My answer is that I don't want to do it currently; maybe we can improve
on what indexes we use later, but since this can be seen as a modularity
violation, I prefer to keep it as simple as possible.

 * mode == LockTupleKeyShare, == LockTupleShare, == LockTupleUpdate in 
 heap_lock_tuple's BeingUpdated branch look like they should be an if/else if

Hm, will look at that.

 * I don't really like HEAP_UPDATE_KEY_REVOKED as a name, what about 
 HEAP_KEYS_UPDATED (akin to HEAP_HOT_UPDATED)

Thanks.

 * how can we get infomask2  HEAP_UPDATE_KEY_REVOKED  infomask  
 HEAP_XMAX_LOCK_ONLY?

SELECT FOR KEY UPDATE?  This didn't exist initially, but previous
reviewers (Noah) felt that it made sense to have it for consistency.  It
makes the lock conflict table make more sense, anyway.

 * heap_lock_tuple with mode == LockTupleKeyShare  nowait looks like it 
 would 
 wait anyway in heap_lock_updated_tuple_rec

Oops.

 * rename HeapSatisfiesHOTUpdate, adjust comments?

Yeah.

 * I wonder whether HeapSatisfiesHOTUpdate could be made to compute the result 
 for key_attrs and hot_attrs at the same time, often enough they will do the 
 same thing on the same column

Hmm, I will look at that.

 * you didn't start it but I find that all those l$i jump labels make the code 
 harder to follow

You mean you suggest to have them renamed?  That can be arranged.

 * shouldn't XmaxGetUpdateXid be rather called MultiXactIdGetUpdateXid or 
 such? 

Yes.

   /*
* If the existing lock mode is identical to or weaker than the 
 new
* one, we can act as though there is no existing lock and have 
 the
* upper block handle this.
*/
 block?

Yeah, as in the if {} block above.  Since this is not clear maybe it
needs rewording.

 * I am not yet sure whether the (xmax == add_to_xmax) optimization in 
 compute_new_xmax_infomask is actually safe

Will think about it and add a/some comment(s).

 * ConditionalMultiXactIdWait and MultiXactIdWait should probably rely on a 
 common implementation

Will look at that.

 * I am not that convinced that moving the waiting functions away from 
 multixact.c is a good idea, but I guess the required knowledge about 
 lockmodes 
 makes it hard not to do so

Yeah.  The line between multixact.c and heapam.c is a zigzag currently,
I think.  Maybe it needs to be more clear; I think the multixact modes
really belong into heapam.c, and they are just uint32 to multixact.c.  I
wasn't brave enough to attempt this; maybe I should.

 * Haven't looked yet, but I have a slightly uneasy feeling about Hot Standby 
 interactions. Did you look at that?
 
 * Hm?
 HeapTupleSatisfiesVacuum
 #if 0
   ResetMultiHintBit(tuple, buffer, xmax, true);
 #endif

Yeah.  I had a couple of ResetMultiHintBit calls sprinkled in some of
the tqual routines, with the idea that if they saw multis that were no
longer live they could be removed, and replaced by just the remaining
updater.  However, this doesn't really work because it's not safe to
change the Xmax when not holding exclusive lock, and the tqual routines
don't know that.  So we're stuck with keeping the multi in there, even
when we know they are no longer interesting.  This is a bit sad, because
it would be a performance benefit to remove them; but it's not strictly
necessary for correctness, so we can leave the optimization for a later
phase.  I let those #ifdefed out calls in place so that it's easy to see
where the reset needs to happen.


 This obviously is not a real review, but just learning what the 

Re: [HACKERS] foreign key locks

2012-08-24 Thread Robert Haas
On Wed, Aug 22, 2012 at 5:31 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Patch v19 contains some tweaks.  Most notably,

 1. if an Xid requests a lock A, and then a lock B which is stronger than
 A, then record only lock B and forget lock A.  This is important for
 performance, because EvalPlanQual obtains ForUpdate locks on the tuples
 that it chases on an update chain.  If EPQ was called by an update or
 delete, previously a MultiXact was being created.

 In a stock pgbench run this was causing lots of multis to be created,
 even when there are no FKs.

 This was most likely involved in the 9% performance decrease that was
 previously reported.

Ah-ha!  Neat.  I'll try to find some time to re-benchmark this during
the next CF, unless you did that already.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-07-05 Thread Alvaro Herrera

Excerpts from Alvaro Herrera's message of vie jun 29 19:17:02 -0400 2012:

 I was only testing migrating from an old version into patched, not
 same-version upgrades.
 
 I think I see what's happening here.

Okay, I have pushed the fix to github -- as I suspected, code-wise the
fix was simple.  I will post an updated consolidated patch later today.

make check of pg_upgrade now passes for me.

It would be nice that make installcheck would notify me that some
modules' tests are being skipped ...

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-06-29 Thread Alvaro Herrera

Excerpts from Kevin Grittner's message of mié jun 27 08:40:58 -0400 2012:
 Alvaro Herrera  wrote:
  
  here's fklocks v14, which also merges to new master as there were
  several other conflicts. It passes make installcheck-world now.
  
 Recent commits broke it again, so here's a rebased v15.  (No changes
 other than attempting to merge your work with the pg_controldata and
 pg_resetxlog changes and wrapping that FIXME in a comment.)

Here's v16, merged to latest stuff, before attempting to fix this:

 Using that patch, pg_upgrade completes, but pg_dumpall fails:

 + pg_ctl start -l
 /home/kevin/pg/master/contrib/pg_upgrade/log/postmaster2.log -w
 waiting for server to start done
 server started
 + pg_dumpall
 pg_dump: Dumping the contents of table hslot failed: PQgetResult()
 failed.
 pg_dump: Error message from server: ERROR:  MultiXactId 115 does no
 longer exist -- apparent wraparound

I was only testing migrating from an old version into patched, not
same-version upgrades.

I think I see what's happening here.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-06-27 Thread Alvaro Herrera

Excerpts from Kevin Grittner's message of mié jun 27 08:40:58 -0400 2012:
 Alvaro Herrera  wrote:
  
  here's fklocks v14, which also merges to new master as there were
  several other conflicts. It passes make installcheck-world now.
  
 Recent commits broke it again, so here's a rebased v15.  (No changes
 other than attempting to merge your work with the pg_controldata and
 pg_resetxlog changes and wrapping that FIXME in a comment.)

Oooh, so sorry -- I was supposed to have git-stashed that before
producing the patch.  This change cannot have caused the pg_dumpall
problem below though, because it only touched the multixact cache code.

 Using that patch, pg_upgrade completes, but pg_dumpall fails:

Hmm, something changed enough that requires more than just a code merge.
I'll look into it.

Thanks for the merge.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-06-24 Thread Alvaro Herrera

Excerpts from Kevin Grittner's message of sáb jun 23 18:38:10 -0400 2012:
 Alvaro Herrera  wrote:
  
  So here's the rebased version.
  
 I found a couple problems on `make check-world`.  Attached is a patch
 to fix one of them.  The other is on pg_upgrade, pasted below.

Ah, I mismerged pg_upgrade.  The fix is trivial, AFAICS -- a function
call is missing a log file name to be specified.  Strange that the
compiler didn't warn me of a broken printf format specifier.  However,
pg_upgrade itself has been broken by an unrelated commit and that fix is
not so trivial, so I'm stuck waiting for that to be fixed.  (We really
need automated pg_upgrade testing.)

Thanks for the patch for the other problem, I'll push it out.

 I'm marking it as Waiting for Author since this seems like a must
 fix, but I'll keep looking at it,

Great, thanks.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-06-23 Thread Kevin Grittner
Alvaro Herrera  wrote:
 
 So here's the rebased version.
 
I found a couple problems on `make check-world`.  Attached is a patch
to fix one of them.  The other is on pg_upgrade, pasted below.
 
+ pg_upgrade -d
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/data.old -D
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/data -b
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/install//home/
kevin/pg/master/Debug/bin -B
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/install//home/
kevin/pg/master/Debug/bin
Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions   ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok
Checking for reg* system OID user data typesok
Checking for contrib/isn with bigint-passing mismatch   ok
Creating catalog dump   ok
Checking for presence of required libraries ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
--
Analyzing all rows in the new cluster   ok
Freezing all rows on the new clusterok
Deleting files from new pg_clog ok
Copying old pg_clog to new server   ok
Setting next transaction ID for new cluster ok
Deleting files from new pg_multixact/offsetsok
Copying old pg_multixact/offsets to new server  ok
Deleting files from new pg_multixact/membersok
Copying old pg_multixact/members to new server  ok
Setting next multixact ID and offset for new clustersh:
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/install//home/
kevin/pg/master/Debug/bin: Permission denied
*failure*

Consult the last few lines of %s/pg_resetxlog -O %u -m %u,%u %s
 /dev/null for the probable cause of the failure.
Failure, exiting
make[2]: *** [check] Error 1
make[2]: Leaving directory `/home/kevin/pg/master/contrib/pg_upgrade'
make[1]: *** [check-pg_upgrade-recurse] Error 2
make[1]: Leaving directory `/home/kevin/pg/master/contrib'
make: *** [check-world-contrib-recurse] Error 2
 
I'm marking it as Waiting for Author since this seems like a must
fix, but I'll keep looking at it,
 
-Kevin




fklocks-13a.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-06-20 Thread Jaime Casanova
On Thu, Jun 14, 2012 at 11:41 AM, Alvaro Herrera
alvhe...@alvh.no-ip.org wrote:
 Hi,

 This is v12 of the foreign key locks patch.


Hi Álvaro,

Just noticed that this patch needs a rebase because of the refactoring
Tom did in ri_triggers.c

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-06-20 Thread Tom Lane
Jaime Casanova ja...@2ndquadrant.com writes:
 On Thu, Jun 14, 2012 at 11:41 AM, Alvaro Herrera
 alvhe...@alvh.no-ip.org wrote:
 This is v12 of the foreign key locks patch.

 Just noticed that this patch needs a rebase because of the refactoring
 Tom did in ri_triggers.c

Hold on a bit before you work on that code --- I've got one more bit of
hacking I want to try before walking away from it.  I did some oprofile
work on Dean's example from
caezatcwm8m00ra814o4dc2cd_aj44gqlb0tddxmha312qg7...@mail.gmail.com
and noticed that it looks like ri_FetchConstraintInfo is eating a
noticeable fraction of the runtime, which is happening because it is
called to deconstruct the relevant pg_constraint row for each tuple
we consider firing the trigger for (and then again, if we do fire the
trigger).  I'm thinking it'd be worth maintaining a backend-local cache
for the deconstructed data, and am going to go try that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-04-05 Thread Peter Geoghegan
On 25 March 2012 09:17, Simon Riggs si...@2ndquadrant.com wrote:
 The main thing we're waiting on are the performance tests to confirm
 the lack of regression.

I have extensively benchmarked the latest revision of the patch
(tpc-b.sql), which I pulled from Alvaro's github account. The
benchmark was of the feature branch's then-and-current HEAD, Don't
follow update chains unless caller requests it.

I've had to split these numbers out into two separate reports.
Incidentally, at some future point I hope that pgbench-tools can
handling testing across feature branches, initdb'ing and suchlike
automatically and as needed. That's something that's likely to happen
sooner rather than later.

The server used was kindly supplied by the University of Oregon open
source lab.

Server (It's virtualised, but apparently this is purely for sandboxing
purposes and the virtualisation technology is rather good):

IBM,8231-E2B POWER7 processor (8 cores).
Fedora 16
8GB Ram
Dedicated RAID1 disks. Exact configuration unknown.

postgresql.conf (this information is available when you drill down
into each test too, fwiw):
max_connections = 200
shared_buffers = 2GB
checkpoint_segments = 30
checkpoint_completion_target = 0.8
effective_cache_size = 6GB

Reports:

http://results_fklocks.staticloud.com/
http://results_master_for_fks.staticloud.com/

Executive summary: There is a clear regression of less than 10%. There
also appears to be a new source of contention at higher client counts.

I realise that the likely upshot of this, and other concerns that are
generally held at this late stage is that this patch will not make it
into 9.2 . For what it's worth, that comes as a big disappointment to
me. I would like to thank both Alvaro and Noah for their hard work
here.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-25 Thread Simon Riggs
On Sat, Mar 17, 2012 at 10:45 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:

 Here is v11.  This version is mainly updated to add pg_upgrade support,
 as discussed.  It also contains the README file that was posted earlier
 (plus wording fixes per Bruce), a couple of bug fixes, and some comment
 updates.

The main thing we're waiting on are the performance tests to confirm
the lack of regression.

You are working on that, right?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-17 Thread Alvaro Herrera

Excerpts from Simon Riggs's message of lun mar 05 15:28:59 -0300 2012:
 
 On Mon, Feb 27, 2012 at 2:47 AM, Robert Haas robertmh...@gmail.com wrote:

  Regarding performance, the good thing about this patch is that if you
  have an operation that used to block, it might now not block.  So maybe
  multixact-related operation is a bit slower than before, but if it
  allows you to continue operating rather than sit waiting until some
  other transaction releases you, it's much better.
 
  That's probably true, although there is some deferred cost that is
  hard to account for.  You might not block immediately, but then later
  somebody might block either because the mxact SLRU now needs fsyncs or
  because they've got to decode an mxid long after the relevant segment
  has been evicted from the SLRU buffers.  In general, it's hard to
  bound that latter cost, because you only avoid blocking once (when the
  initial update happens) but you might pay the extra cost of decoding
  the mxid as many times as the row is read, which could be arbitrarily
  many.  How much of a problem that is in practice, I'm not completely
  sure, but it has worried me before and it still does.  In the worst
  case scenario, a handful of frequently-accessed rows with MXIDs all of
  whose members are dead except for the UPDATE they contain could result
  in continual SLRU cache-thrashing.
 
 Cases I regularly see involve wait times of many seconds.
 
 When this patch helps, it will help performance by algorithmic gains,
 so perhaps x10-100.
 
 That can and should be demonstrated though, I agree.

BTW, the isolation tester cases have a few places that in the unpatched
code die with deadlocks and with the patched code continue without
dying.  Others are cases that block in unpatched master, and continue
without blocking in patched.  This should be enough proof that there are
algorithmic gains here.

There's also a test case that demostrates a fix for the problem (pointed
out by docs) that if you acquire a row lock, then a subxact upgrades it
(say by deleting the row) and the subxact aborts, the original row lock
is lost.  With the patched code, the original lock is no longer lost.

I completely agree with the idea that we need some mitigation against
repeated lookups of mxids that contain committed updates.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-17 Thread Alvaro Herrera

Excerpts from Simon Riggs's message of mar mar 06 18:33:13 -0300 2012:

 The lock modes are correct, appropriate and IMHO have meaningful
 names. No redesign required here.
 
 Not sure about the naming of some of the flag bits however.

Feel free to suggest improvements ... I've probably seen them for too
long to find them anything but what I intended them to mean.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-17 Thread Alvaro Herrera

Excerpts from Simon Riggs's message of mar mar 06 17:28:12 -0300 2012:
 On Tue, Mar 6, 2012 at 7:39 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 
  We provide four levels of tuple locking strength: SELECT FOR KEY UPDATE is
  super-exclusive locking (used to delete tuples and more generally to update
  tuples modifying the values of the columns that make up the key of the 
  tuple);
  SELECT FOR UPDATE is a standards-compliant exclusive lock; SELECT FOR SHARE
  implements shared locks; and finally SELECT FOR KEY SHARE is a super-weak 
  mode
  that does not conflict with exclusive mode, but conflicts with SELECT FOR 
  KEY
  UPDATE.  This last mode implements a mode just strong enough to implement RI
  checks, i.e. it ensures that tuples do not go away from under a check, 
  without
  blocking when some other transaction that want to update the tuple without
  changing its key.
 
 So there are 4 lock types, but we only have room for 3 on the tuple
 header, so we store the least common/deprecated of the 4 types as a
 multixactid. Some rewording would help there.

Hmm, I rewrote that paragraph two times.  I'll try to adjust it a bit
more.

 My understanding is that all of theses workloads will change
 
 * Users of explicit SHARE lockers will be slightly worse in the case
 of the 1st locker, but then after that they'll be the same as before.

Right.  (We're assuming that there *are* users of SHARE locks, which I'm
not sure to be a given.)

 * Updates against an RI locked table will be dramatically faster
 because of reduced lock waits

Correct.

 ...and that these previous workloads are effectively unchanged:
 
 * Stream of RI checks causes mxacts

Yes.

 * Multi row deadlocks still possible

Yes.

 * Queues of writers still wait in the same way

Yes.

 * Deletes don't cause mxacts unless by same transaction

Yeah .. there's no way for anyone to not conflict with a FOR KEY UPDATE
lock (the strength grabbed by a delete) unless you're the same
transaction.

  The possibility of having an update within a MultiXact means that they must
  persist across crashes and restarts: a future reader of the tuple needs to
  figure out whether the update committed or aborted.  So we have a 
  requirement
  that pg_multixact needs to retain pages of its data until we're certain that
  the MultiXacts in them are no longer of interest.
 
 I think the no longer of interest aspect needs to be tracked more
 closely because it will necessarily lead to more I/O.

Not sure what you mean here.

 If we store the LSN on each mxact page, as I think we need to, we can
 get rid of pages more quickly if we know they don't have an LSN set.
 So its possible we can optimise that more.

Hmm, I had originally thought that this was rather pointless because it
was unlikely that a segment would *never* have *all* multis not
containing updates.  But then, maybe Robert is right and there are users
out there that run a lot of RI checks and never update the masters ...
Hm.  I'm not sure that LSN tracking is the right tool to do that
optimization, however -- I mean, a single multi containing an update in
a whole segment will prevent that segment from being considered useless.

  VACUUM is in charge of removing old MultiXacts at the time of tuple 
  freezing.
 
 You mean mxact segments?

Well, both.  When a tuple is frozen, we both remove its Xmin/Xmax and
any possible multi that it might have in Xmax.  That's what I really
meant above.  But also, vacuum will remove pg_multixact segments just as
it will remove pg_clog segments.

(It is possible, and probably desirable, to remove a Multi much earlier
than freezing the tuple.  The patch does not (yet) do that, however.)

 Surely we set hint bits on tuples same as now? Hope so.

We set hint bits, but if a multi contains an update, we don't set
HEAP_XMAX_COMMITTED even when the update is known committed.  I think
we could do this in some cases.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-16 Thread Alvaro Herrera

Excerpts from Bruce Momjian's message of vie mar 16 00:04:06 -0300 2012:
 
 On Tue, Mar 13, 2012 at 02:35:02PM -0300, Alvaro Herrera wrote:
  
  Excerpts from Bruce Momjian's message of mar mar 13 14:00:52 -0300 2012:
   
   On Tue, Mar 06, 2012 at 04:39:32PM -0300, Alvaro Herrera wrote:
  
When there is a single locker in a tuple, we can just store the locking 
info
in the tuple itself.  We do this by storing the locker's Xid in XMAX, 
and
setting hint bits specifying the locking strength.  There is one 
exception
here: since hint bit space is limited, we do not provide a separate 
hint bit
for SELECT FOR SHARE, so we have to use the extended info in a 
MultiXact in
that case.  (The other cases, SELECT FOR UPDATE and SELECT FOR KEY 
SHARE, are
presumably more commonly used due to being the standards-mandated 
locking
mechanism, or heavily used by the RI code, so we want to provide fast 
paths
for those.)
   
   Are those tuple bits actually hint bits?  They seem quite a bit more
   powerful than a hint.
  
  I'm not sure what's your point.  We've had a hint bit for SELECT FOR
  UPDATE for ages.  Even 8.2 had HEAP_XMAX_EXCL_LOCK and
  HEAP_XMAX_SHARED_LOCK.  Maybe they are misnamed and aren't really
  hints, but it's not the job of this patch to fix that problem.
 
 Now I am confused.  Where do you see the word hint used by
 HEAP_XMAX_EXCL_LOCK and HEAP_XMAX_SHARED_LOCK.  These are tuple infomask
 bits, not hints, meaning they are not optional or there just for
 performance.

Okay, I think this is just a case of confusing terminology.  I have
always assumed (because I have not seen any evidence to the contrary)
that anything in t_infomask and t_infomask2 is a hint bit --
regardless of it being actually a hint or something with a stronger
significance.  HEAP_XMAX_EXCL_LOCK and HEAP_XMAX_SHARED_LOCK are
certainly not optional in the sense that if they are missing, the
meaning of the Xmax field is completely different.  So in all
correctness they are not hints, though we call them that.

Now, if we want to differentiate infomask bits that are just hints from
those that are something else, we can do that, but I'm not sure it's
useful -- AFAICS only XMAX_COMMITTED and XMIN_COMMITTED are proper
hints.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-16 Thread Alvaro Herrera

Excerpts from Alvaro Herrera's message of vie mar 16 10:36:11 -0300 2012:

  Now I am confused.  Where do you see the word hint used by
  HEAP_XMAX_EXCL_LOCK and HEAP_XMAX_SHARED_LOCK.  These are tuple infomask
  bits, not hints, meaning they are not optional or there just for
  performance.
 
 Okay, I think this is just a case of confusing terminology.  I have
 always assumed (because I have not seen any evidence to the contrary)
 that anything in t_infomask and t_infomask2 is a hint bit --
 regardless of it being actually a hint or something with a stronger
 significance.

Maybe this is just my mistake.  I see in
http://wiki.postgresql.org/wiki/Hint_Bits that we only call the
COMMITTED/INVALID infomask bits hints.

I think it's easy enough to correct the README to call them infomask
bits rather than hints .. I'll go do that.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-16 Thread Robert Haas
On Thu, Mar 15, 2012 at 11:09 PM, Bruce Momjian br...@momjian.us wrote:
 On Tue, Mar 13, 2012 at 01:46:24PM -0400, Robert Haas wrote:
 On Mon, Mar 12, 2012 at 3:28 PM, Simon Riggs si...@2ndquadrant.com wrote:
  I agree with you that some worst case performance tests should be
  done. Could you please say what you think the worst cases would be, so
  those can be tested? That would avoid wasting time or getting anything
  backwards.

 I've thought about this some and here's what I've come up with so far:

 I question whether we are in a position to do the testing necessary to
 commit this for 9.2.

Is anyone even working on testing it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-16 Thread Bruce Momjian
On Fri, Mar 16, 2012 at 10:40:01AM -0300, Alvaro Herrera wrote:
 
 Excerpts from Alvaro Herrera's message of vie mar 16 10:36:11 -0300 2012:
 
   Now I am confused.  Where do you see the word hint used by
   HEAP_XMAX_EXCL_LOCK and HEAP_XMAX_SHARED_LOCK.  These are tuple infomask
   bits, not hints, meaning they are not optional or there just for
   performance.
  
  Okay, I think this is just a case of confusing terminology.  I have
  always assumed (because I have not seen any evidence to the contrary)
  that anything in t_infomask and t_infomask2 is a hint bit --
  regardless of it being actually a hint or something with a stronger
  significance.
 
 Maybe this is just my mistake.  I see in
 http://wiki.postgresql.org/wiki/Hint_Bits that we only call the
 COMMITTED/INVALID infomask bits hints.
 
 I think it's easy enough to correct the README to call them infomask
 bits rather than hints .. I'll go do that.

OK, thanks.  I only brought it up so people would not be confused by
thinking these were optional pieces of information, and that the real
information is stored somewhere else.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-16 Thread Bruce Momjian
On Fri, Mar 16, 2012 at 10:08:07AM -0400, Robert Haas wrote:
 On Thu, Mar 15, 2012 at 11:09 PM, Bruce Momjian br...@momjian.us wrote:
  On Tue, Mar 13, 2012 at 01:46:24PM -0400, Robert Haas wrote:
  On Mon, Mar 12, 2012 at 3:28 PM, Simon Riggs si...@2ndquadrant.com wrote:
   I agree with you that some worst case performance tests should be
   done. Could you please say what you think the worst cases would be, so
   those can be tested? That would avoid wasting time or getting anything
   backwards.
 
  I've thought about this some and here's what I've come up with so far:
 
  I question whether we are in a position to do the testing necessary to
  commit this for 9.2.
 
 Is anyone even working on testing it?

No one I know of.  I am just trying to set expectations that this still
has a long way to go.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-16 Thread Alvaro Herrera

Excerpts from Bruce Momjian's message of vie mar 16 15:22:05 -0300 2012:
 
 On Fri, Mar 16, 2012 at 10:08:07AM -0400, Robert Haas wrote:
  On Thu, Mar 15, 2012 at 11:09 PM, Bruce Momjian br...@momjian.us wrote:
   On Tue, Mar 13, 2012 at 01:46:24PM -0400, Robert Haas wrote:
   On Mon, Mar 12, 2012 at 3:28 PM, Simon Riggs si...@2ndquadrant.com 
   wrote:
I agree with you that some worst case performance tests should be
done. Could you please say what you think the worst cases would be, so
those can be tested? That would avoid wasting time or getting anything
backwards.
  
   I've thought about this some and here's what I've come up with so far:
  
   I question whether we are in a position to do the testing necessary to
   commit this for 9.2.
  
  Is anyone even working on testing it?
 
 No one I know of.  I am just trying to set expectations that this still
 has a long way to go.

A Command Prompt colleague is on it.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Simon Riggs
On Wed, Mar 14, 2012 at 5:23 PM, Robert Haas robertmh...@gmail.com wrote:

 You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on mxid
 lookups, so I think something more sophisticated is needed to exercise that
 cost.  Not sure what.

 I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
 all-visible.

So because committed does not equal all visible there will be
additional lookups on mxids? That's complete rubbish.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Simon Riggs
On Thu, Mar 15, 2012 at 2:15 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Mar 14, 2012 at 6:10 PM, Noah Misch n...@leadboat.com wrote:
 Well, post-release, the cat is out of the bag: we'll be stuck with
 this whether the performance characteristics are acceptable or not.
 That's why we'd better be as sure as possible before committing to
 this implementation that there's nothing we can't live with.  It's not
 like there's any reasonable way to turn this off if you don't like it.

 I disagree; we're only carving in stone the FOR KEY SHARE and FOR KEY UPDATE
 syntax additions.  We could even avoid doing that by not documenting them.  A
 later major release could implement them using a completely different
 mechanism or even reduce them to aliases, KEY SHARE = SHARE and KEY UPDATE =
 UPDATE.  To be sure, let's still do a good job the first time.

 What I mean is really that, once the release is out, we don't get to
 take it back.  Sure, the next release can fix things, but any
 regressions will become obstacles to upgrading and pain points for new
 users.

This comment is completely superfluous. It's a complete waste of time
to turn up on a thread and remind people that if they commit something
and it doesn't actually work that it would be a bad thing. Why, we
might ask do you think that thought needs to be expressed here?
Please, don't answer, lets spend the time on actually reviewing the
patch.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Simon Riggs
On Thu, Mar 15, 2012 at 2:26 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Mar 14, 2012 at 9:17 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
  Agreed.  But speaking of that, why exactly do we fsync the multixact SLRU 
  today?

 Good question.  So far, I can't think of a reason.  nextMulti is critical,
 but we already fsync it with pg_control.  We could delete the other 
 multixact
 state data at every startup and set OldestVisibleMXactId accordingly.

 Hmm, yeah.

 In a way, the fact that we don't do that is kind of fortuitous in this
 situation.  I had just assumed that we were not fsyncing it because
 there seems to be no reason to do so.  But since we are, we already
 know that the fsyncs resulting from frequent mxid allocation aren't a
 huge pain point.  If they were, somebody would have presumably
 complained about it and fixed it before now.  So that means that what
 we're really worrying about here is the overhead of fsyncing a little
 more often, which is a lot less scary than starting to do it when we
 weren't previously.

Good

 Now, we could look at this as an opportunity to optimize the existing
 implementation by removing the fsyncs, rather than adding the new
 infrastructure Alvaro is proposing.

This is not an exercise in tuning mxact code. There is a serious
algorithmic problem that is causing real world problems.

Removing the fsync will *not* provide a solution to the problem, so
there is no opportunity here.

 But that would only make sense if
 we thought that getting rid of the fsyncs would be more valuable than
 avoiding the blocking here, and I don't.

You're right that the existing code could use some optimisation.

I'm a little tired, but I can't see a reason to fsync this except at checkpoint.

Also seeing that we issue 2 WAL records for each RI check. We issue
one during MultiXactIdCreate/MultiXactIdExpand and then immediately
afterwards issue a XLOG_HEAP_LOCK record. The comments on both show
that each thinks it is doing it for the same reason and is the only
place its being done. Alvaro, any ideas why that is.


 I still think that someone needs to do some benchmarking here, because
 this is a big complicated performance patch, and we can't predict the
 impact of it on real-world scenarios without testing.  There is
 clearly some additional overhead, and it makes sense to measure it and
 hopefully discover that it isn't excessive.  Still, I'm a bit
 relieved.

Very much agreed.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Simon Riggs
On Thu, Mar 15, 2012 at 1:17 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:

 As things stand today

Can I confirm where we are now? Is there another version of the patch
coming out soon?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Alvaro Herrera

Excerpts from Simon Riggs's message of jue mar 15 18:38:53 -0300 2012:
 On Thu, Mar 15, 2012 at 2:26 AM, Robert Haas robertmh...@gmail.com wrote:

  But that would only make sense if
  we thought that getting rid of the fsyncs would be more valuable than
  avoiding the blocking here, and I don't.
 
 You're right that the existing code could use some optimisation.
 
 I'm a little tired, but I can't see a reason to fsync this except at 
 checkpoint.

Hang on.  What fsyncs are we talking about?  I don't see that the
multixact code calls any fsync except that checkpoint and shutdown.

 Also seeing that we issue 2 WAL records for each RI check. We issue
 one during MultiXactIdCreate/MultiXactIdExpand and then immediately
 afterwards issue a XLOG_HEAP_LOCK record. The comments on both show
 that each thinks it is doing it for the same reason and is the only
 place its being done. Alvaro, any ideas why that is.

AFAIR the XLOG_HEAP_LOCK log entry only records the fact that the row is
being locked by a multixact -- it doesn't record the contents (member
xids) of said multixact, which is what the other log entry records.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Alvaro Herrera

Excerpts from Simon Riggs's message of jue mar 15 18:46:44 -0300 2012:
 
 On Thu, Mar 15, 2012 at 1:17 AM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 
  As things stand today
 
 Can I confirm where we are now? Is there another version of the patch
 coming out soon?

Yes, another version is coming soon.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Simon Riggs
On Thu, Mar 15, 2012 at 9:54 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:

 Excerpts from Simon Riggs's message of jue mar 15 18:38:53 -0300 2012:
 On Thu, Mar 15, 2012 at 2:26 AM, Robert Haas robertmh...@gmail.com wrote:

  But that would only make sense if
  we thought that getting rid of the fsyncs would be more valuable than
  avoiding the blocking here, and I don't.

 You're right that the existing code could use some optimisation.

 I'm a little tired, but I can't see a reason to fsync this except at 
 checkpoint.

 Hang on.  What fsyncs are we talking about?  I don't see that the
 multixact code calls any fsync except that checkpoint and shutdown.

If a dirty page is evicted it will fsync.

 Also seeing that we issue 2 WAL records for each RI check. We issue
 one during MultiXactIdCreate/MultiXactIdExpand and then immediately
 afterwards issue a XLOG_HEAP_LOCK record. The comments on both show
 that each thinks it is doing it for the same reason and is the only
 place its being done. Alvaro, any ideas why that is.

 AFAIR the XLOG_HEAP_LOCK log entry only records the fact that the row is
 being locked by a multixact -- it doesn't record the contents (member
 xids) of said multixact, which is what the other log entry records.

Agreed. But issuing two records when we could issue just one seems a
little strange, especially when the two record types follow one
another so closely - so we end up queuing for the lock twice while
holding the lock on the data block.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Alvaro Herrera

Excerpts from Simon Riggs's message of jue mar 15 19:04:41 -0300 2012:
 
 On Thu, Mar 15, 2012 at 9:54 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 
  Excerpts from Simon Riggs's message of jue mar 15 18:38:53 -0300 2012:
  On Thu, Mar 15, 2012 at 2:26 AM, Robert Haas robertmh...@gmail.com wrote:
 
   But that would only make sense if
   we thought that getting rid of the fsyncs would be more valuable than
   avoiding the blocking here, and I don't.
 
  You're right that the existing code could use some optimisation.
 
  I'm a little tired, but I can't see a reason to fsync this except at 
  checkpoint.
 
  Hang on.  What fsyncs are we talking about?  I don't see that the
  multixact code calls any fsync except that checkpoint and shutdown.
 
 If a dirty page is evicted it will fsync.

Ah, right.

  Also seeing that we issue 2 WAL records for each RI check. We issue
  one during MultiXactIdCreate/MultiXactIdExpand and then immediately
  afterwards issue a XLOG_HEAP_LOCK record. The comments on both show
  that each thinks it is doing it for the same reason and is the only
  place its being done. Alvaro, any ideas why that is.
 
  AFAIR the XLOG_HEAP_LOCK log entry only records the fact that the row is
  being locked by a multixact -- it doesn't record the contents (member
  xids) of said multixact, which is what the other log entry records.
 
 Agreed. But issuing two records when we could issue just one seems a
 little strange, especially when the two record types follow one
 another so closely - so we end up queuing for the lock twice while
 holding the lock on the data block.

Hmm, that seems optimization that could be done separately.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Simon Riggs
On Thu, Mar 15, 2012 at 10:13 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:

 Excerpts from Simon Riggs's message of jue mar 15 19:04:41 -0300 2012:

 On Thu, Mar 15, 2012 at 9:54 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 
  Excerpts from Simon Riggs's message of jue mar 15 18:38:53 -0300 2012:
  On Thu, Mar 15, 2012 at 2:26 AM, Robert Haas robertmh...@gmail.com 
  wrote:
 
   But that would only make sense if
   we thought that getting rid of the fsyncs would be more valuable than
   avoiding the blocking here, and I don't.
 
  You're right that the existing code could use some optimisation.
 
  I'm a little tired, but I can't see a reason to fsync this except at 
  checkpoint.
 
  Hang on.  What fsyncs are we talking about?  I don't see that the
  multixact code calls any fsync except that checkpoint and shutdown.

 If a dirty page is evicted it will fsync.

 Ah, right.

  Also seeing that we issue 2 WAL records for each RI check. We issue
  one during MultiXactIdCreate/MultiXactIdExpand and then immediately
  afterwards issue a XLOG_HEAP_LOCK record. The comments on both show
  that each thinks it is doing it for the same reason and is the only
  place its being done. Alvaro, any ideas why that is.
 
  AFAIR the XLOG_HEAP_LOCK log entry only records the fact that the row is
  being locked by a multixact -- it doesn't record the contents (member
  xids) of said multixact, which is what the other log entry records.

 Agreed. But issuing two records when we could issue just one seems a
 little strange, especially when the two record types follow one
 another so closely - so we end up queuing for the lock twice while
 holding the lock on the data block.

 Hmm, that seems optimization that could be done separately.

Oh yes, definitely not something for you to add to the main patch.

Just some additional tuning to alleviate Robert's concerns.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Robert Haas
On Thu, Mar 15, 2012 at 5:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Wed, Mar 14, 2012 at 5:23 PM, Robert Haas robertmh...@gmail.com wrote:
 You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on mxid
 lookups, so I think something more sophisticated is needed to exercise that
 cost.  Not sure what.

 I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
 all-visible.

 So because committed does not equal all visible there will be
 additional lookups on mxids? That's complete rubbish.

Noah seemed to be implying that once the updating transaction
committed, HEAP_XMAX_COMMITTED would get set and save the mxid lookup.
 But I think that's not true, because anyone who looks at the tuple
afterward will still need to know the exact xmax, to test it against
their snapshot.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Alvaro Herrera

Excerpts from Robert Haas's message of jue mar 15 21:37:36 -0300 2012:
 
 On Thu, Mar 15, 2012 at 5:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
  On Wed, Mar 14, 2012 at 5:23 PM, Robert Haas robertmh...@gmail.com wrote:
  You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on 
  mxid
  lookups, so I think something more sophisticated is needed to exercise 
  that
  cost.  Not sure what.
 
  I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
  all-visible.
 
  So because committed does not equal all visible there will be
  additional lookups on mxids? That's complete rubbish.
 
 Noah seemed to be implying that once the updating transaction
 committed, HEAP_XMAX_COMMITTED would get set and save the mxid lookup.
  But I think that's not true, because anyone who looks at the tuple
 afterward will still need to know the exact xmax, to test it against
 their snapshot.

Yeah, we don't set HEAP_XMAX_COMMITTED on multis, even when there's an
update in it and it committed.  I think we could handle it, at least
some of the cases, but that'd require careful re-examination of all the
tqual.c code, which is not something I want to do right now.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Noah Misch
On Thu, Mar 15, 2012 at 08:37:36PM -0400, Robert Haas wrote:
 On Thu, Mar 15, 2012 at 5:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
  On Wed, Mar 14, 2012 at 5:23 PM, Robert Haas robertmh...@gmail.com wrote:
  You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on 
  mxid
  lookups, so I think something more sophisticated is needed to exercise 
  that
  cost. ?Not sure what.
 
  I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
  all-visible.
 
  So because committed does not equal all visible there will be
  additional lookups on mxids? That's complete rubbish.
 
 Noah seemed to be implying that once the updating transaction
 committed, HEAP_XMAX_COMMITTED would get set and save the mxid lookup.
  But I think that's not true, because anyone who looks at the tuple
 afterward will still need to know the exact xmax, to test it against
 their snapshot.

Yeah, my comment above was wrong.  I agree that we'll need to retrieve the
mxid members during every MVCC scan until we either mark the page all-visible
or have occasion to simplify the mxid xmax to the updater xid.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 02:35:02PM -0300, Alvaro Herrera wrote:
 
 Excerpts from Bruce Momjian's message of mar mar 13 14:00:52 -0300 2012:
  
  On Tue, Mar 06, 2012 at 04:39:32PM -0300, Alvaro Herrera wrote:
 
   When there is a single locker in a tuple, we can just store the locking 
   info
   in the tuple itself.  We do this by storing the locker's Xid in XMAX, and
   setting hint bits specifying the locking strength.  There is one exception
   here: since hint bit space is limited, we do not provide a separate hint 
   bit
   for SELECT FOR SHARE, so we have to use the extended info in a MultiXact 
   in
   that case.  (The other cases, SELECT FOR UPDATE and SELECT FOR KEY SHARE, 
   are
   presumably more commonly used due to being the standards-mandated locking
   mechanism, or heavily used by the RI code, so we want to provide fast 
   paths
   for those.)
  
  Are those tuple bits actually hint bits?  They seem quite a bit more
  powerful than a hint.
 
 I'm not sure what's your point.  We've had a hint bit for SELECT FOR
 UPDATE for ages.  Even 8.2 had HEAP_XMAX_EXCL_LOCK and
 HEAP_XMAX_SHARED_LOCK.  Maybe they are misnamed and aren't really
 hints, but it's not the job of this patch to fix that problem.

Now I am confused.  Where do you see the word hint used by
HEAP_XMAX_EXCL_LOCK and HEAP_XMAX_SHARED_LOCK.  These are tuple infomask
bits, not hints, meaning they are not optional or there just for
performance.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Bruce Momjian
On Thu, Mar 15, 2012 at 11:04:06PM -0400, Bruce Momjian wrote:
 On Tue, Mar 13, 2012 at 02:35:02PM -0300, Alvaro Herrera wrote:
  
  Excerpts from Bruce Momjian's message of mar mar 13 14:00:52 -0300 2012:
   
   On Tue, Mar 06, 2012 at 04:39:32PM -0300, Alvaro Herrera wrote:
  
When there is a single locker in a tuple, we can just store the locking 
info
in the tuple itself.  We do this by storing the locker's Xid in XMAX, 
and
setting hint bits specifying the locking strength.  There is one 
exception
here: since hint bit space is limited, we do not provide a separate 
hint bit
for SELECT FOR SHARE, so we have to use the extended info in a 
MultiXact in
that case.  (The other cases, SELECT FOR UPDATE and SELECT FOR KEY 
SHARE, are
presumably more commonly used due to being the standards-mandated 
locking
mechanism, or heavily used by the RI code, so we want to provide fast 
paths
for those.)
   
   Are those tuple bits actually hint bits?  They seem quite a bit more
   powerful than a hint.
  
  I'm not sure what's your point.  We've had a hint bit for SELECT FOR
  UPDATE for ages.  Even 8.2 had HEAP_XMAX_EXCL_LOCK and
  HEAP_XMAX_SHARED_LOCK.  Maybe they are misnamed and aren't really
  hints, but it's not the job of this patch to fix that problem.
 
 Now I am confused.  Where do you see the word hint used by
 HEAP_XMAX_EXCL_LOCK and HEAP_XMAX_SHARED_LOCK.  These are tuple infomask
 bits, not hints, meaning they are not optional or there just for
 performance.

Are you saying that the bit is only a guide and is there only for
performance?  If so, I understand why it is called hint.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-15 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 01:46:24PM -0400, Robert Haas wrote:
 On Mon, Mar 12, 2012 at 3:28 PM, Simon Riggs si...@2ndquadrant.com wrote:
  I agree with you that some worst case performance tests should be
  done. Could you please say what you think the worst cases would be, so
  those can be tested? That would avoid wasting time or getting anything
  backwards.
 
 I've thought about this some and here's what I've come up with so far:

I question whether we are in a position to do the testing necessary to
commit this for 9.2.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Robert Haas
On Tue, Mar 13, 2012 at 11:42 PM, Noah Misch n...@leadboat.com wrote:
 More often than that; each 2-member mxid takes 4 bytes in an offsets file and
 10 bytes in a members file.  So, more like one fsync per ~580 mxids.  Note
 that we already fsync the multixact SLRUs today, so any increase will arise
 from the widening of member entries from 4 bytes to 5.  The realism of this
 test is attractive.  Nearly-static parent tables are plenty common, and this
 test will illustrate the impact on those designs.

Agreed.  But speaking of that, why exactly do we fsync the multixact SLRU today?

 You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on mxid
 lookups, so I think something more sophisticated is needed to exercise that
 cost.  Not sure what.

I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
all-visible.  HEAP_XMAX_INVALID will obviously help, when it happens.

 This isn't exactly a test case, but from Noah's previous comments I
 gather that there is a theoretical risk of mxid consumption running
 ahead of xid consumption.  We should try to think about whether there
 are any realistic workloads where that might actually happen.  I'm
 willing to believe that there aren't, but not just because somebody
 asserts it.  The reason I'm concerned about this is because, if it
 should happen, the result will be more frequent anti-wraparound
 vacuums on every table in the cluster.  Those are already quite
 painful for some users.

 Yes.  Pre-release, what can we really do here other than have more people
 thinking about ways it might happen in practice?  Post-release, we could
 suggest monitoring methods or perhaps have VACUUM emit a WARNING when a table
 is using more mxid space than xid space.

Well, post-release, the cat is out of the bag: we'll be stuck with
this whether the performance characteristics are acceptable or not.
That's why we'd better be as sure as possible before committing to
this implementation that there's nothing we can't live with.  It's not
like there's any reasonable way to turn this off if you don't like it.

 Also consider a benchmark that does plenty of non-key updates on a parent
 table with no activity on the child table.  We'll pay the overhead of
 determining that the key column(s) have not changed, but it will never pay off
 by preventing a lock wait.

Good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Noah Misch
On Wed, Mar 14, 2012 at 01:23:14PM -0400, Robert Haas wrote:
 On Tue, Mar 13, 2012 at 11:42 PM, Noah Misch n...@leadboat.com wrote:
  More often than that; each 2-member mxid takes 4 bytes in an offsets file 
  and
  10 bytes in a members file. ?So, more like one fsync per ~580 mxids. ?Note
  that we already fsync the multixact SLRUs today, so any increase will arise
  from the widening of member entries from 4 bytes to 5. ?The realism of this
  test is attractive. ?Nearly-static parent tables are plenty common, and this
  test will illustrate the impact on those designs.
 
 Agreed.  But speaking of that, why exactly do we fsync the multixact SLRU 
 today?

Good question.  So far, I can't think of a reason.  nextMulti is critical,
but we already fsync it with pg_control.  We could delete the other multixact
state data at every startup and set OldestVisibleMXactId accordingly.

  You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on mxid
  lookups, so I think something more sophisticated is needed to exercise that
  cost. ?Not sure what.
 
 I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
 all-visible.  HEAP_XMAX_INVALID will obviously help, when it happens.

True.  The patch (see ResetMultiHintBit()) also replaces a multixact xmax with
the updater xid when all transactions of the multixact have ended.  You would
need a test workload with long-running multixacts that delay such replacement.
However, the workloads that come to mind are the very workloads for which this
patch eliminates lock waits; they wouldn't illustrate a worst-case.

  This isn't exactly a test case, but from Noah's previous comments I
  gather that there is a theoretical risk of mxid consumption running
  ahead of xid consumption. ?We should try to think about whether there
  are any realistic workloads where that might actually happen. ?I'm
  willing to believe that there aren't, but not just because somebody
  asserts it. ?The reason I'm concerned about this is because, if it
  should happen, the result will be more frequent anti-wraparound
  vacuums on every table in the cluster. ?Those are already quite
  painful for some users.
 
  Yes. ?Pre-release, what can we really do here other than have more people
  thinking about ways it might happen in practice? ?Post-release, we could
  suggest monitoring methods or perhaps have VACUUM emit a WARNING when a 
  table
  is using more mxid space than xid space.
 
 Well, post-release, the cat is out of the bag: we'll be stuck with
 this whether the performance characteristics are acceptable or not.
 That's why we'd better be as sure as possible before committing to
 this implementation that there's nothing we can't live with.  It's not
 like there's any reasonable way to turn this off if you don't like it.

I disagree; we're only carving in stone the FOR KEY SHARE and FOR KEY UPDATE
syntax additions.  We could even avoid doing that by not documenting them.  A
later major release could implement them using a completely different
mechanism or even reduce them to aliases, KEY SHARE = SHARE and KEY UPDATE =
UPDATE.  To be sure, let's still do a good job the first time.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Alvaro Herrera

Excerpts from Noah Misch's message of mié mar 14 19:10:00 -0300 2012:
 
 On Wed, Mar 14, 2012 at 01:23:14PM -0400, Robert Haas wrote:
  On Tue, Mar 13, 2012 at 11:42 PM, Noah Misch n...@leadboat.com wrote:
   More often than that; each 2-member mxid takes 4 bytes in an offsets file 
   and
   10 bytes in a members file. ?So, more like one fsync per ~580 mxids. ?Note
   that we already fsync the multixact SLRUs today, so any increase will 
   arise
   from the widening of member entries from 4 bytes to 5. ?The realism of 
   this
   test is attractive. ?Nearly-static parent tables are plenty common, and 
   this
   test will illustrate the impact on those designs.
  
  Agreed.  But speaking of that, why exactly do we fsync the multixact SLRU 
  today?
 
 Good question.  So far, I can't think of a reason.  nextMulti is critical,
 but we already fsync it with pg_control.  We could delete the other multixact
 state data at every startup and set OldestVisibleMXactId accordingly.

Hmm, yeah.

   You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on 
   mxid
   lookups, so I think something more sophisticated is needed to exercise 
   that
   cost. ?Not sure what.
  
  I don't think HEAP_XMAX_COMMITTED is much help, because committed !=
  all-visible.  HEAP_XMAX_INVALID will obviously help, when it happens.
 
 True.  The patch (see ResetMultiHintBit()) also replaces a multixact xmax with
 the updater xid when all transactions of the multixact have ended.

I have noticed that this code is not correct, because we don't know that
we're holding an appropriate lock on the page, so we can't simply change
the Xmax and reset those hint bits.  As things stand today, mxids
persist longer.  (We could do some cleanup at HOT-style page prune, for
example, though the lock we need is even weaker than that.)  Overall
this means that coming up with a test case demonstrating this pressure
probably isn't that hard.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 6:10 PM, Noah Misch n...@leadboat.com wrote:
 Well, post-release, the cat is out of the bag: we'll be stuck with
 this whether the performance characteristics are acceptable or not.
 That's why we'd better be as sure as possible before committing to
 this implementation that there's nothing we can't live with.  It's not
 like there's any reasonable way to turn this off if you don't like it.

 I disagree; we're only carving in stone the FOR KEY SHARE and FOR KEY UPDATE
 syntax additions.  We could even avoid doing that by not documenting them.  A
 later major release could implement them using a completely different
 mechanism or even reduce them to aliases, KEY SHARE = SHARE and KEY UPDATE =
 UPDATE.  To be sure, let's still do a good job the first time.

What I mean is really that, once the release is out, we don't get to
take it back.  Sure, the next release can fix things, but any
regressions will become obstacles to upgrading and pain points for new
users.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-14 Thread Robert Haas
On Wed, Mar 14, 2012 at 9:17 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
  Agreed.  But speaking of that, why exactly do we fsync the multixact SLRU 
  today?

 Good question.  So far, I can't think of a reason.  nextMulti is critical,
 but we already fsync it with pg_control.  We could delete the other multixact
 state data at every startup and set OldestVisibleMXactId accordingly.

 Hmm, yeah.

In a way, the fact that we don't do that is kind of fortuitous in this
situation.  I had just assumed that we were not fsyncing it because
there seems to be no reason to do so.  But since we are, we already
know that the fsyncs resulting from frequent mxid allocation aren't a
huge pain point.  If they were, somebody would have presumably
complained about it and fixed it before now.  So that means that what
we're really worrying about here is the overhead of fsyncing a little
more often, which is a lot less scary than starting to do it when we
weren't previously.

Now, we could look at this as an opportunity to optimize the existing
implementation by removing the fsyncs, rather than adding the new
infrastructure Alvaro is proposing.  But that would only make sense if
we thought that getting rid of the fsyncs would be more valuable than
avoiding the blocking here, and I don't.

I still think that someone needs to do some benchmarking here, because
this is a big complicated performance patch, and we can't predict the
impact of it on real-world scenarios without testing.  There is
clearly some additional overhead, and it makes sense to measure it and
hopefully discover that it isn't excessive.  Still, I'm a bit
relieved.

 I have noticed that this code is not correct, because we don't know that
 we're holding an appropriate lock on the page, so we can't simply change
 the Xmax and reset those hint bits.  As things stand today, mxids
 persist longer.  (We could do some cleanup at HOT-style page prune, for
 example, though the lock we need is even weaker than that.)  Overall
 this means that coming up with a test case demonstrating this pressure
 probably isn't that hard.

What would such a test case look like?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-13 Thread Bruce Momjian
On Tue, Mar 06, 2012 at 04:39:32PM -0300, Alvaro Herrera wrote:
 Here's a first attempt at a README illustrating this.  I intend this to
 be placed in src/backend/access/heap/README.tuplock; the first three
 paragraphs are stolen from the comment in heap_lock_tuple, so I'd remove
 those from there, directing people to this new file instead.  Is there
 something that you think should be covered more extensively (or at all)
 here?
...
 
 When there is a single locker in a tuple, we can just store the locking info
 in the tuple itself.  We do this by storing the locker's Xid in XMAX, and
 setting hint bits specifying the locking strength.  There is one exception
 here: since hint bit space is limited, we do not provide a separate hint bit
 for SELECT FOR SHARE, so we have to use the extended info in a MultiXact in
 that case.  (The other cases, SELECT FOR UPDATE and SELECT FOR KEY SHARE, are
 presumably more commonly used due to being the standards-mandated locking
 mechanism, or heavily used by the RI code, so we want to provide fast paths
 for those.)

Are those tuple bits actually hint bits?  They seem quite a bit more
powerful than a hint.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-13 Thread Robert Haas
On Mon, Mar 12, 2012 at 9:24 PM, Noah Misch n...@leadboat.com wrote:
 When we lock an update-in-progress row, we walk the t_ctid chain and lock all
 descendant tuples.  They may all have uncommitted xmins.  This is essential to
 ensure that the final outcome of the updating transaction does not affect
 whether the locking transaction has its KEY SHARE lock.  Similarly, when we
 update a previously-locked tuple, we copy any locks (always KEY SHARE locks)
 to the new version.  That new tuple is both uncommitted and has locks, and we
 cannot easily sacrifice either property.  Do you see a way to extend your
 scheme to cover these needs?

No, I think that sinks it.  Good analysis.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-13 Thread Alvaro Herrera

Excerpts from Bruce Momjian's message of mar mar 13 14:00:52 -0300 2012:
 
 On Tue, Mar 06, 2012 at 04:39:32PM -0300, Alvaro Herrera wrote:

  When there is a single locker in a tuple, we can just store the locking info
  in the tuple itself.  We do this by storing the locker's Xid in XMAX, and
  setting hint bits specifying the locking strength.  There is one exception
  here: since hint bit space is limited, we do not provide a separate hint bit
  for SELECT FOR SHARE, so we have to use the extended info in a MultiXact in
  that case.  (The other cases, SELECT FOR UPDATE and SELECT FOR KEY SHARE, 
  are
  presumably more commonly used due to being the standards-mandated locking
  mechanism, or heavily used by the RI code, so we want to provide fast paths
  for those.)
 
 Are those tuple bits actually hint bits?  They seem quite a bit more
 powerful than a hint.

I'm not sure what's your point.  We've had a hint bit for SELECT FOR
UPDATE for ages.  Even 8.2 had HEAP_XMAX_EXCL_LOCK and
HEAP_XMAX_SHARED_LOCK.  Maybe they are misnamed and aren't really
hints, but it's not the job of this patch to fix that problem.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-13 Thread Robert Haas
On Mon, Mar 12, 2012 at 3:28 PM, Simon Riggs si...@2ndquadrant.com wrote:
 I agree with you that some worst case performance tests should be
 done. Could you please say what you think the worst cases would be, so
 those can be tested? That would avoid wasting time or getting anything
 backwards.

I've thought about this some and here's what I've come up with so far:

1. SELECT FOR SHARE on a large table on a system with no write cache.

2. A small parent table (say 30 rows or so) and a larger child table
with a many-to-one FK relationship to the parent (say 100 child rows
per parent row), with heavy update activity on the child table, on a
system where fsyncs are very slow.  This should generate lots of mxid
consumption, and every 1600 or so mxids (I think) we've got to fsync;
does that generate a noticeable performance hit?

3. It would be nice to test the impact of increased mxid lookups in
the parent, but I've realized that the visibility map will probably
mask a good chunk of that effect, which is a good thing.  Still, maybe
something like this: a fairly large parent table, say a million rows,
but narrow rows, so that many of them fit on a page, with frequent
reads and occasional updates (if there are only reads, autovacuum
might end with all the visibility map bits set); plus a child table
with one or a few rows per parent which is heavily updated.  In theory
this ought to be good for the patch, since the the more fine-grained
locking will avoid blocking, but in this case the parent table is
large enough that you shouldn't get much blocking anyway, yet you'll
still pay the cost of mxid lookups because the occasional updates on
the parent will clear VM bits.  This might not be the exactly right
workload to measure this effect, but if it's not maybe someone can
devote a little time to thinking about what would be.

4. A plain old pgbench run or two, to see whether there's any
regression when none of this matters at all...

This isn't exactly a test case, but from Noah's previous comments I
gather that there is a theoretical risk of mxid consumption running
ahead of xid consumption.  We should try to think about whether there
are any realistic workloads where that might actually happen.  I'm
willing to believe that there aren't, but not just because somebody
asserts it.  The reason I'm concerned about this is because, if it
should happen, the result will be more frequent anti-wraparound
vacuums on every table in the cluster.  Those are already quite
painful for some users.

It would be nice if Noah or someone else who has reviewed this patch
in detail could comment further.  I am shooting from the hip here, a
bit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-13 Thread Noah Misch
On Tue, Mar 13, 2012 at 01:46:24PM -0400, Robert Haas wrote:
 On Mon, Mar 12, 2012 at 3:28 PM, Simon Riggs si...@2ndquadrant.com wrote:
  I agree with you that some worst case performance tests should be
  done. Could you please say what you think the worst cases would be, so
  those can be tested? That would avoid wasting time or getting anything
  backwards.
 
 I've thought about this some and here's what I've come up with so far:
 
 1. SELECT FOR SHARE on a large table on a system with no write cache.

Easy enough that we may as well check it.  Share-locking an entire large table
is impractical in a real application, so I would not worry if this shows a
substantial regression.

 2. A small parent table (say 30 rows or so) and a larger child table
 with a many-to-one FK relationship to the parent (say 100 child rows
 per parent row), with heavy update activity on the child table, on a
 system where fsyncs are very slow.  This should generate lots of mxid
 consumption, and every 1600 or so mxids (I think) we've got to fsync;
 does that generate a noticeable performance hit?

More often than that; each 2-member mxid takes 4 bytes in an offsets file and
10 bytes in a members file.  So, more like one fsync per ~580 mxids.  Note
that we already fsync the multixact SLRUs today, so any increase will arise
from the widening of member entries from 4 bytes to 5.  The realism of this
test is attractive.  Nearly-static parent tables are plenty common, and this
test will illustrate the impact on those designs.

 3. It would be nice to test the impact of increased mxid lookups in
 the parent, but I've realized that the visibility map will probably
 mask a good chunk of that effect, which is a good thing.  Still, maybe
 something like this: a fairly large parent table, say a million rows,
 but narrow rows, so that many of them fit on a page, with frequent
 reads and occasional updates (if there are only reads, autovacuum
 might end with all the visibility map bits set); plus a child table
 with one or a few rows per parent which is heavily updated.  In theory
 this ought to be good for the patch, since the the more fine-grained
 locking will avoid blocking, but in this case the parent table is
 large enough that you shouldn't get much blocking anyway, yet you'll
 still pay the cost of mxid lookups because the occasional updates on
 the parent will clear VM bits.  This might not be the exactly right
 workload to measure this effect, but if it's not maybe someone can
 devote a little time to thinking about what would be.

You still have HEAP_XMAX_{INVALID,COMMITTED} to reduce the pressure on mxid
lookups, so I think something more sophisticated is needed to exercise that
cost.  Not sure what.

 4. A plain old pgbench run or two, to see whether there's any
 regression when none of this matters at all...

Might as well.

 This isn't exactly a test case, but from Noah's previous comments I
 gather that there is a theoretical risk of mxid consumption running
 ahead of xid consumption.  We should try to think about whether there
 are any realistic workloads where that might actually happen.  I'm
 willing to believe that there aren't, but not just because somebody
 asserts it.  The reason I'm concerned about this is because, if it
 should happen, the result will be more frequent anti-wraparound
 vacuums on every table in the cluster.  Those are already quite
 painful for some users.

Yes.  Pre-release, what can we really do here other than have more people
thinking about ways it might happen in practice?  Post-release, we could
suggest monitoring methods or perhaps have VACUUM emit a WARNING when a table
is using more mxid space than xid space.


Also consider a benchmark that does plenty of non-key updates on a parent
table with no activity on the child table.  We'll pay the overhead of
determining that the key column(s) have not changed, but it will never pay off
by preventing a lock wait.  Granted, this is barely representative of
application behavior.  Perhaps, too, we already have a good sense of this cost
from the HOT benchmarking efforts and have no cause to revisit it.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-12 Thread Robert Haas
On Sun, Feb 26, 2012 at 9:47 PM, Robert Haas robertmh...@gmail.com wrote:
 Regarding performance, the good thing about this patch is that if you
 have an operation that used to block, it might now not block.  So maybe
 multixact-related operation is a bit slower than before, but if it
 allows you to continue operating rather than sit waiting until some
 other transaction releases you, it's much better.

 That's probably true, although there is some deferred cost that is
 hard to account for.  You might not block immediately, but then later
 somebody might block either because the mxact SLRU now needs fsyncs or
 because they've got to decode an mxid long after the relevant segment
 has been evicted from the SLRU buffers.  In general, it's hard to
 bound that latter cost, because you only avoid blocking once (when the
 initial update happens) but you might pay the extra cost of decoding
 the mxid as many times as the row is read, which could be arbitrarily
 many.  How much of a problem that is in practice, I'm not completely
 sure, but it has worried me before and it still does.  In the worst
 case scenario, a handful of frequently-accessed rows with MXIDs all of
 whose members are dead except for the UPDATE they contain could result
 in continual SLRU cache-thrashing.

 From a performance standpoint, we really need to think not only about
 the cases where the patch wins, but also, and maybe more importantly,
 the cases where it loses.  There are some cases where the current
 mechanism, use SHARE locks for foreign keys, is adequate.  In
 particular, it's adequate whenever the parent table is not updated at
 all, or only very lightly.  I believe that those people will pay
 somewhat more with this patch, and especially in any case where
 backends end up waiting for fsyncs in order to create new mxids, but
 also just because I think this patch will have the effect of
 increasing the space consumed by each individual mxid, which imposes a
 distributed cost of its own.

I spent some time thinking about this over the weekend, and I have an
observation, and an idea.  Here's the observation: I believe that
locking a tuple whose xmin is uncommitted is always a noop, because if
it's ever possible for a transaction to wait for an XID that is part
of its own transaction (exact same XID, or sub-XIDs of the same top
XID), then a transaction could deadlock against itself.  I believe
that this is not possible: if a transaction were to wait for an XID
assigned to that same backend, then the lock manager would observe
that an ExclusiveLock on the xid is already held, so the request for a
ShareLock would be granted immediately.  I also don't believe there's
any situation in which the existence of an uncommitted tuple fails to
block another backend, but a lock on that same uncommitted tuple would
have caused another backend to block.  If any of that sounds wrong,
you can stop reading here (but please tell me why it sounds wrong).

If it's right, then here's the idea: what if we stored mxids using
xmin rather than xmax?  This would mean that, instead of making mxids
contain the tuple's original xmax, they'd need to instead contain the
tuple's original xmin.  This might seem like rearranging the deck
chairs on the titanic, but I think it actually works out substantially
better, because if we can assume that the xmin is committed, then we
only need to know its exact value until it becomes older than
RecentGlobalXmin.  This means that a tuple can be both updated and
locked at the same time without the MultiXact SLRU needing to be
crash-safe, because if we crash and restart, any mxids that are still
around from before the crash are known to contain only xmins that are
now all-visible.  We therefore don't need their exact values, so it
doesn't matter if that data actually made it to disk.  Furthermore, in
the case where a previously-locked tuple is read repeatedly, we only
need to keep doing SLRU lookups until the xmin becomes all-visible;
after that, we can set a hint bit indicating that the tuple's xmin is
all-visible, and any future readers (or writers) can use that to skip
the SLRU lookup.  In the degenerate (and probably common) case where a
tuple is already all-visible at the time it's locked, we don't really
need to record the original xmin at all; we can still do so if
convenient, but we can set the xmin-all-visible hint right away, so
nobody needs to probe the SLRU just to get xmin.

In other words, we'd entirely avoid needing to make mxacts crash-safe,
and we'd avoid most of the extra SLRU lookups that the current
implementation requires; they'd only be needed when (and for as long
as) the locked tuple was not yet all-visible.

This also seems like it would make the anti-wraparound issues simpler
to handle - once an mxid is old enough that any xmin it contains must
be all-visible, we can simply overwrite the tuple's xmin with
FrozenXID, which is pretty much what we're already doing anyway.  It's
already the case that a 

Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-12 Thread Simon Riggs
On Mon, Mar 12, 2012 at 5:28 PM, Robert Haas robertmh...@gmail.com wrote:

 In other words, we'd entirely avoid needing to make mxacts crash-safe,
 and we'd avoid most of the extra SLRU lookups that the current
 implementation requires; they'd only be needed when (and for as long
 as) the locked tuple was not yet all-visible.

The current implementation only requires additional lookups in the
update/check case, which is the case that doesn't do anything other
than block right now. Since we're replacing lock contention with
physical access contention even the worst case situation is still
better than what we have now. Please feel free to point out worst case
situations and show that isn't true.

I've also pointed out how to avoid overhead of making mxacts crash
safe when the new facilities are not in use, so I don't see problems
with the proposed mechanism. Given that I am still myself reviewing
the actual code.

So those things are not something we need to avoid.

My feeling is that overwriting xmin is a clever idea, but arrives too
late to require sensible analysis in this stage of the CF. It's not
solving a problem, its just an alternate mechanism and at best an
optimisation of the mechanism. Were we to explore it now, it seems
certain that another person would observe that design were taking
place and so the patch should be rejected, which would be unnecessary
and wasteful. I also think it would alter our ability to diagnose
problems, not least the normal test that xmax matches xmin across an
update.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-12 Thread Robert Haas
On Mon, Mar 12, 2012 at 1:50 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Mar 12, 2012 at 5:28 PM, Robert Haas robertmh...@gmail.com wrote:
 In other words, we'd entirely avoid needing to make mxacts crash-safe,
 and we'd avoid most of the extra SLRU lookups that the current
 implementation requires; they'd only be needed when (and for as long
 as) the locked tuple was not yet all-visible.

 The current implementation only requires additional lookups in the
 update/check case, which is the case that doesn't do anything other
 than block right now. Since we're replacing lock contention with
 physical access contention even the worst case situation is still
 better than what we have now. Please feel free to point out worst case
 situations and show that isn't true.

I think I already have:

http://archives.postgresql.org/pgsql-hackers/2012-02/msg01258.php

The case I'm worried about is where we are allocating mxids quickly,
and we end up having to wait for fsyncs on mxact segments.  That might
be very slow, but you could argue that it could *possibly* be still
worthwhile if it avoids blocking.  That doesn't strike me as a
slam-dunk, though, because we've already seen and fixed cases where
too many fsyncs causes the performance of the entire system to go down
the tubes (cf. commit 7f242d880b5b5d9642675517466d31373961cf98).  But
it's really bad if there are no updates on the parent table - then,
whatever extra overhead there is will be all for naught, since the
more fine-grained locking doesn't help anyway.

 I've also pointed out how to avoid overhead of making mxacts crash
 safe when the new facilities are not in use, so I don't see problems
 with the proposed mechanism. Given that I am still myself reviewing
 the actual code.

The closest thing I can find to a proposal from you in that regard is
this comment:

# I was really thinking we could skip the fsync of a page if we've not
# persisted anything important on that page, since that was one of
# Robert's performance points.

It might be possible to do something with that idea, but at the moment
I'm not seeing how to make it work.

 So those things are not something we need to avoid.

 My feeling is that overwriting xmin is a clever idea, but arrives too
 late to require sensible analysis in this stage of the CF. It's not
 solving a problem, its just an alternate mechanism and at best an
 optimisation of the mechanism. Were we to explore it now, it seems
 certain that another person would observe that design were taking
 place and so the patch should be rejected, which would be unnecessary
 and wasteful.

Considering that nobody's done any work to resolve the uncertainty
about whether the worst-case performance characteristics of this patch
are acceptable, and considering further that it was undergoing massive
code churn for more than a month after the final CommitFest, I think
it's not that unreasonable to think it might not be ready for prime
time at this point.  In any event, your argument is exactly backwards:
we need to first decide whether the patch needs a redesign and then,
if it does, postpone it.  Deciding that we don't want to postpone it
first, and therefore we're not going to redesign it even if that is
what's really needed makes no sense.

 I also think it would alter our ability to diagnose
 problems, not least the normal test that xmax matches xmin across an
 update.

There's nothing stopping the new tuple from being frozen before the
old one, even today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-12 Thread Simon Riggs
On Mon, Mar 12, 2012 at 6:14 PM, Robert Haas robertmh...@gmail.com wrote:

 Considering that nobody's done any work to resolve the uncertainty
 about whether the worst-case performance characteristics of this patch
 are acceptable, and considering further that it was undergoing massive
 code churn for more than a month after the final CommitFest, I think
 it's not that unreasonable to think it might not be ready for prime
 time at this point.

Thank you for cutting to the chase.

The uncertainty of which you speak is a theoretical point you
raised. It has been explained, but nobody has yet shown the
performance numbers to illustrate the point but only because they
seemed so clear. I would point out that you haven't demonstrated the
existence of a problem either, so redesigning something without any
proof of a problem seems a strange.

Let me explain again what this patch does and why it has such major
performance benefit.

This feature give us a step change in lock reductions from FKs. A real
world best case might be to examine the benefit this patch has on a
large batch load that inserts many new orders for existing customers.
In my example case the orders table has a FK to the customer table. At
the same time as the data load, we attempt to update a customer's
additional details, address or current balance etc. The large load
takes locks on the customer table and keeps them for the whole
transaction. So the customer updates are locked out for multiple
seconds, minutes or maybe hours, depending upon how far you want to
stretch the example. With this patch the customer updates don't cause
lock conflicts but they require mxact lookups in *some* cases, so they
might take 1-10ms extra, rather than 1-10 minutes more. 1000x faster.
The only case that causes the additional lookups is the case that
otherwise would have been locked. So producing best case results is
trivial and can be as enormous as you like.

I agree with you that some worst case performance tests should be
done. Could you please say what you think the worst cases would be, so
those can be tested? That would avoid wasting time or getting anything
backwards.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-12 Thread Noah Misch
On Mon, Mar 12, 2012 at 01:28:11PM -0400, Robert Haas wrote:
 I spent some time thinking about this over the weekend, and I have an
 observation, and an idea.  Here's the observation: I believe that
 locking a tuple whose xmin is uncommitted is always a noop, because if
 it's ever possible for a transaction to wait for an XID that is part
 of its own transaction (exact same XID, or sub-XIDs of the same top
 XID), then a transaction could deadlock against itself.  I believe
 that this is not possible: if a transaction were to wait for an XID
 assigned to that same backend, then the lock manager would observe
 that an ExclusiveLock on the xid is already held, so the request for a
 ShareLock would be granted immediately.  I also don't believe there's
 any situation in which the existence of an uncommitted tuple fails to
 block another backend, but a lock on that same uncommitted tuple would
 have caused another backend to block.  If any of that sounds wrong,
 you can stop reading here (but please tell me why it sounds wrong).

When we lock an update-in-progress row, we walk the t_ctid chain and lock all
descendant tuples.  They may all have uncommitted xmins.  This is essential to
ensure that the final outcome of the updating transaction does not affect
whether the locking transaction has its KEY SHARE lock.  Similarly, when we
update a previously-locked tuple, we copy any locks (always KEY SHARE locks)
to the new version.  That new tuple is both uncommitted and has locks, and we
cannot easily sacrifice either property.  Do you see a way to extend your
scheme to cover these needs?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-07 Thread Gokulakannan Somasundaram
I feel sad, that i followed this topic very late. But i still want to put
forward my views.
Have we thought on the lines of how Robert has implemented relation level
locks. In short it should go like this

a) The locks for enforcing Referential integrity should be taken only when
the rarest of the events( that would cause the integrity failure) occur.
That would be the update of the referenced column. Other cases of update,
delete and insert should not be required to take locks. In this way, we can
reduce a lot of lock traffic.

So if we have a table like employee( empid, empname, ... depid references
dept(deptid)) and table dept(depid depname).

Currently we are taking shared locks on referenced rows in dept table,
whenever we are updating something in the employee table. This should not
happen. Instead any insert / update of referenced column / delete should
check for some lock in its PGPROC structure, which will only get created
when the depid gets updated / deleted( rare event )

b) But the operation of update of the referenced column will be made more
costly. May be it can create something like a predicate lock(used for
enforcing serializable) and keep it in all the PG_PROC structures.

I know this is a abstract idea, but just wanted to know, whether we have
thought on those lines.

Thanks,
Gokul.


Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-07 Thread Simon Riggs
On Wed, Mar 7, 2012 at 9:24 AM, Gokulakannan Somasundaram
gokul...@gmail.com wrote:
 I feel sad, that i followed this topic very late. But i still want to put
 forward my views.
 Have we thought on the lines of how Robert has implemented relation level
 locks. In short it should go like this

 a) The locks for enforcing Referential integrity should be taken only when
 the rarest of the events( that would cause the integrity failure) occur.
 That would be the update of the referenced column. Other cases of update,
 delete and insert should not be required to take locks. In this way, we can
 reduce a lot of lock traffic.

Insert, Update and Delete don't take locks they simply mark the tuples
they change with an xid. Anybody else wanting to wait on the lock
just waits on the xid. We do insert a lock row for each xid, but not
one per row changed.

 So if we have a table like employee( empid, empname, ... depid references
 dept(deptid)) and table dept(depid depname).

 Currently we are taking shared locks on referenced rows in dept table,
 whenever we are updating something in the employee table. This should not
 happen. Instead any insert / update of referenced column / delete should
 check for some lock in its PGPROC structure, which will only get created
 when the depid gets updated / deleted( rare event )

It's worked that way for 5 years, so its too late to modify it now and
this patch won't change that.

The way we do RI locking is designed to prevent holding that in memory
and then having the lock table overflow, which then either requires us
to revert to the current design or upgrade to table level locks to
save space in the lock table - which is a total disaster, if you've
ever worked with DB2.

What you're suggesting is that we store the locks in memory only as a
way of avoiding updating the row.

My understanding is we have two optimisation choices. A single set of
xids can be used in many places, since the same set of transactions
may do roughly the same thing.

1. We could assign a new mxactid every time we touch a new row. That
way there is no correspondence between sets of xids, and we may hold
the same set many times. OTOH since each set is unique we can expand
it easily and we don't need to touch each row once for each lock. That
saves on row touches but it also greatly increases the mxactid
creation rate, which causes cache scrolling.

2. We assign a new mxactid each time we create a new unique set of
rows. We have a separate cache for local sets. This way reduces the
mxactid creation rate but causes row updates each time we lock the
row, which then needs WAL.

(2) is how we currently handle the very difficult decision of how to
optimise this for the general case. I'm not sure that is right in all
cases, but it is at least scalable and it is the devil we know.

 b) But the operation of update of the referenced column will be made more
 costly. May be it can create something like a predicate lock(used for
 enforcing serializable) and keep it in all the PG_PROC structures.

No, updates of referenced columns are exactly the same as now when no
RI checks happening.

If the update occurs when an RI check takes place there is more work
to do, but previously it would have just blocked and done nothing. So
that path is relatively heavyweight but much better than nothing.


 I know this is a abstract idea, but just wanted to know, whether we have
 thought on those lines.

Thanks for your thoughts.

The most useful way to help with this patch right now is to run
performance investigations and see if there are non-optimal cases. We
can then see how the patch handles those. Theory is good, but it needs
to drive experimentation, as I myself re-discover continually.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >