Re: [HACKERS] synchronized snapshots

2011-10-23 Thread Thom Brown
On 23 October 2011 03:15, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown t...@linux.com writes:
 Can I ask why it doesn't return the same snapshot ID each time?
 Surely it can't change since you can only export the snapshot of a
 serializable or repeatable read transaction?

 No, that's incorrect.  You can export from a READ COMMITTED transaction;
 indeed, you'd more or less have to, if you want the control transaction
 to be able to see what the slaves do.

My bad.  I didn't read the documentation carefully enough.  I can make
sense of it now.

Thanks

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-22 Thread Simon Riggs
On Fri, Oct 21, 2011 at 4:36 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 I can see a few alternatives, none of them very pleasant:

 1. Restrict exported snapshots to be loaded only by transactions running
 in the same database as the exporter.  This would fix the problem, but
 it cuts out one of the main use-cases for sync snapshots, namely getting
 cluster-wide-consistent dumps in pg_dumpall.

 4. Somehow mark the xmin of a process that has exported a snapshot so
 that it will be honored in all DBs not just the current one.  The
 difficulty here is that we'd need to know *at the time the snap is
 taken* that it's going to be exported.  (Consider the scenario above,
 except that A doesn't get around to exporting the snapshot it took in
 step 3 until between steps 5 and 6.  If the xmin wasn't already marked
 as globally applicable when vacuum looked at it in step 5, we lose.)
 This is do-able but it will contort the user-visible API of the sync
 snapshots feature.  One way we could do it is to require that
 transactions that want to export snapshots set a transaction mode
 before they take their first snapshot.

1 *and* 4 please.

So, unless explicitly requested, an exported snapshot is limited to
just one database. If explicitly requested to be transportable, we can
use the snapshot in other databases.

This allows us to do parallel pg_dump in both 1+ databases, as well as
allowing pg_dumpall to be fully consistent across all dbs.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-22 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 1 *and* 4 please.

Given the lack of enthusiasm I'm not going to do anything about #4 now.
Somebody else can add it later.

 So, unless explicitly requested, an exported snapshot is limited to
 just one database. If explicitly requested to be transportable, we can
 use the snapshot in other databases.

Yeah, we could make it work like that when it gets added.

regards, tom lane



Re: [HACKERS] synchronized snapshots

2011-10-22 Thread Tom Lane
Joachim Wieland j...@mcknight.de writes:
 [ synchronized snapshots patch ]

Applied with, um, rather extensive editorialization.

I'm not convinced that the SSI case is bulletproof yet, but it'll be
easier to test with the code committed.

regards, tom lane



Re: [HACKERS] synchronized snapshots

2011-10-22 Thread Thom Brown
On 23 October 2011 00:25, Tom Lane t...@sss.pgh.pa.us wrote:
 Joachim Wieland j...@mcknight.de writes:
 [ synchronized snapshots patch ]

 Applied with, um, rather extensive editorialization.

 I'm not convinced that the SSI case is bulletproof yet, but it'll be
 easier to test with the code committed.

Can I ask why it doesn't return the same snapshot ID each time?
Surely it can't change since you can only export the snapshot of a
serializable or repeatable read transaction?  A SELECT
count(pg_export_snapshot()) FROM generate_series(1,1000); would
quickly bork the pg_snapshots directory, and any user can run it.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] synchronized snapshots

2011-10-22 Thread Tom Lane
Thom Brown t...@linux.com writes:
 Can I ask why it doesn't return the same snapshot ID each time?
 Surely it can't change since you can only export the snapshot of a
 serializable or repeatable read transaction?

No, that's incorrect.  You can export from a READ COMMITTED transaction;
indeed, you'd more or less have to, if you want the control transaction
to be able to see what the slaves do.
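
(A minimal sketch of this pattern, with an illustrative snapshot ID; the
importers use REPEATABLE READ, since a READ COMMITTED importer would discard
the snapshot again at its next command:)

  -- control session, default READ COMMITTED
  BEGIN;
  SELECT pg_export_snapshot();              -- returns e.g. '000003A1-1'

  -- each worker session
  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  SET TRANSACTION SNAPSHOT '000003A1-1';    -- must run before any query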

 A SELECT
 count(pg_export_snapshot()) FROM generate_series(1,1000); would
 quickly bork the pg_snapshots directory, and any user can run it.

Shrug ... you can create a much more severe DOS problem by making
zillions of tables, if the filesystem doesn't handle lots-o-files
well.

regards, tom lane



[HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Tom Lane
I've thought of another nasty problem for the sync-snapshots patch.
Consider the following sequence of events:

1. Transaction A, which is about to export a snapshot, is running in
   database X.
2. Transaction B is making some changes in database Y.
3. A takes and exports a snapshot showing B's xid as running.
4. Transaction B ends.
5. Autovacuum launches in database Y.  It sees nothing running in Y,
   so it decides it can vacuum dead rows right up to nextXid, including
   anything B deleted.
6. Transaction C starts in database Y, and imports the snapshot from A.
   Now it thinks it can see rows deleted by B ... but vacuum is busy
   removing them, or maybe already finished doing so.

The problem here is that A's xmin is ignored by GetOldestXmin when
calculating cutoff XIDs for non-shared tables in database Y, so it
doesn't protect would-be adoptees of the exported snapshot.
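
(To make the race concrete, a rough SQL sketch of the same sequence; the
table, databases, and snapshot ID are illustrative:)

  -- step 2: session B, in database Y
  BEGIN; DELETE FROM t WHERE id = 1;

  -- step 3: session A, in database X, while B is still running
  BEGIN;
  SELECT pg_export_snapshot();            -- exported snapshot shows B's xid as running

  -- step 4: session B commits
  COMMIT;

  -- step 5: autovacuum in Y sees nothing running locally and removes B's dead rows

  -- step 6: session C, in database Y
  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  SET TRANSACTION SNAPSHOT '000003A1-1';  -- the snapshot exported by A
  SELECT * FROM t WHERE id = 1;           -- should still see the deleted row, but
                                          -- vacuum may already have removed it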

I can see a few alternatives, none of them very pleasant:

1. Restrict exported snapshots to be loaded only by transactions running
in the same database as the exporter.  This would fix the problem, but
it cuts out one of the main use-cases for sync snapshots, namely getting
cluster-wide-consistent dumps in pg_dumpall.

2. Allow a snapshot exported from another database to be loaded so long
as this doesn't cause the DB-local value of GetOldestXmin to go
backwards.  However, in scenarios such as the above, C is certain to
fail such a test.  To make it work, pg_dumpall would have to start
advance guard transactions in each database before it takes the
intended-to-be-shared snapshot, and probably even wait for these to be
oldest.  Ick.

3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
the current database.  This sounds bad, but OTOH I don't think there's
ever been any proof that this optimization is worth much in real-world
usage.  We've already had to lobotomize that optimization for walsender
processes, anyway.

4. Somehow mark the xmin of a process that has exported a snapshot so
that it will be honored in all DBs not just the current one.  The
difficulty here is that we'd need to know *at the time the snap is
taken* that it's going to be exported.  (Consider the scenario above,
except that A doesn't get around to exporting the snapshot it took in
step 3 until between steps 5 and 6.  If the xmin wasn't already marked
as globally applicable when vacuum looked at it in step 5, we lose.)
This is do-able but it will contort the user-visible API of the sync
snapshots feature.  One way we could do it is to require that
transactions that want to export snapshots set a transaction mode
before they take their first snapshot.

Thoughts, better ideas?

regards, tom lane



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Florian Pflug
On Oct21, 2011, at 17:36 , Tom Lane wrote:
 1. Restrict exported snapshots to be loaded only by transactions running
 in the same database as the exporter.  This would fix the problem, but
 it cuts out one of the main use-cases for sync snapshots, namely getting
 cluster-wide-consistent dumps in pg_dumpall.

Isn't the use-case getting consistent *parallel* dumps of a single database
rather than consistent dump of multiple databases? Since we don't have atomic
cross-database commits, what does using the same snapshot to dump multiple
databases buy us?

On that grounds, +1 for option 1 here.

 3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
 the current database.  This sounds bad, but OTOH I don't think there's
 ever been any proof that this optimization is worth much in real-world
 usage.  We've already had to lobotomize that optimization for walsender
 processes, anyway.

Hm, we've told people who wanted cross-database access to tables in the
past to either

  * use dblink or

  * not split their tables over multiple databases in the first place,
and to use schemas instead

If we remove the GetOldestXmin optimization, we're essentially reversing
course on this. Do we really wanna go there?

best regards,
Florian Pflug




Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Andrew Dunstan



On 10/21/2011 12:05 PM, Florian Pflug wrote:

On Oct21, 2011, at 17:36 , Tom Lane wrote:

1. Restrict exported snapshots to be loaded only by transactions running
in the same database as the exporter.  This would fix the problem, but
it cuts out one of the main use-cases for sync snapshots, namely getting
cluster-wide-consistent dumps in pg_dumpall.

Isn't the use-case getting consistent *parallel* dumps of a single database
rather than consistent dump of multiple databases? Since we don't have atomic
cross-database commits, what does using the same snapshot to dump multiple
databases buy us?


That was my understanding of the use case.

cheers

andrew



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 10/21/2011 12:05 PM, Florian Pflug wrote:
 On Oct21, 2011, at 17:36 , Tom Lane wrote:
 1. Restrict exported snapshots to be loaded only by transactions running
 in the same database as the exporter.  This would fix the problem, but
 it cuts out one of the main use-cases for sync snapshots, namely getting
 cluster-wide-consistent dumps in pg_dumpall.

 Isn't the use-case getting consistent *parallel* dumps of a single database
 rather than consistent dump of multiple databases? Since we don't have atomic
 cross-database commits, what does using the same snapshot to dump multiple
 databases buy us?

 That was my understanding of the use case.

Um, which one are you supporting?

Anyway, the value of using the same snapshot across all of a pg_dumpall
run would be that you could be sure that what you'd dumped concerning
role and tablespace objects was consistent with what you then dump about
database-local objects.  (In principle, anyway --- I'm not sure how
much of that happens under SnapshotNow rules because of use of backend
functions.  But you'll most certainly never be able to guarantee it if
pg_dumpall can't export its snapshot to each subsidiary pg_dump run.)

regards, tom lane



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 On Oct21, 2011, at 17:36 , Tom Lane wrote:
 3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
 the current database.  This sounds bad, but OTOH I don't think there's
 ever been any proof that this optimization is worth much in real-world
 usage.  We've already had to lobotomize that optimization for walsender
 processes, anyway.

 Hm, we've told people who wanted cross-database access to tables in the
 past to either

   * use dblink or

   * not split their tables over multiple databases in the first place,
 and to use schemas instead

 If we remove the GetOldestXmin optimization, we're essentially reversing
 course on this. Do we really wanna go there?

Huh?  The behavior of GetOldestXmin is purely a backend-internal matter.
I don't see how it's related to cross-database access --- or at least,
changing this would not represent a significant move towards supporting
that.

regards, tom lane



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Robert Haas
On Fri, Oct 21, 2011 at 11:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I've thought of another nasty problem for the sync-snapshots patch.

 1. Restrict exported snapshots to be loaded only by transactions running
 in the same database as the exporter.  This would fix the problem, but
 it cuts out one of the main use-cases for sync snapshots, namely getting
 cluster-wide-consistent dumps in pg_dumpall.

 2. Allow a snapshot exported from another database to be loaded so long
 as this doesn't cause the DB-local value of GetOldestXmin to go
 backwards.  However, in scenarios such as the above, C is certain to
 fail such a test.  To make it work, pg_dumpall would have to start
 advance guard transactions in each database before it takes the
 intended-to-be-shared snapshot, and probably even wait for these to be
 oldest.  Ick.

 3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
 the current database.  This sounds bad, but OTOH I don't think there's
 ever been any proof that this optimization is worth much in real-world
 usage.  We've already had to lobotomize that optimization for walsender
 processes, anyway.

 4. Somehow mark the xmin of a process that has exported a snapshot so
 that it will be honored in all DBs not just the current one.  The
 difficulty here is that we'd need to know *at the time the snap is
 taken* that it's going to be exported.  (Consider the scenario above,
 except that A doesn't get around to exporting the snapshot it took in
 step 3 until between steps 5 and 6.  If the xmin wasn't already marked
 as globally applicable when vacuum looked at it in step 5, we lose.)
 This is do-able but it will contort the user-visible API of the sync
 snapshots feature.  One way we could do it is to require that
 transactions that want to export snapshots set a transaction mode
 before they take their first snapshot.

I am unexcited by #2 on usability grounds.  I agree with you that #3
might end up being a fairly small pessimization in practice, but I'd
be inclined to just do #1 for now and revisit the issue when and if
someone shows an interest in revamping pg_dumpall to do what you're
proposing (and hopefully a bunch of other cleanup too).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Florian Pflug
On Oct21, 2011, at 19:09 , Tom Lane wrote:
 Florian Pflug f...@phlo.org writes:
 On Oct21, 2011, at 17:36 , Tom Lane wrote:
 3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
 the current database.  This sounds bad, but OTOH I don't think there's
 ever been any proof that this optimization is worth much in real-world
 usage.  We've already had to lobotomize that optimization for walsender
 processes, anyway.
 
 Hm, we've told people who wanted cross-database access to tables in the
 past to either
 
  * use dblink or
 
  * not split their tables over multiple databases in the first place,
and to use schemas instead
 
 If we remove the GetOldestXmin optimization, we're essentially reversing
 course on this. Do we really wanna go there?
 
 Huh?  The behavior of GetOldestXmin is purely a backend-internal matter.
 I don't see how it's related to cross-database access --- or at least,
 changing this would not represent a significant move towards supporting
 that.

AFAIR, the performance hit we'd take by making the vacuum cutoff point
(i.e. GetOldestXmin()) global instead of database-local has been repeatedly
used in the past as an argument against cross-database queries. I have to
admit that I currently cannot seem to find an entry in the archives to
back that up, though.

best regards,
Florian Pflug




Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Robert Haas
On Fri, Oct 21, 2011 at 1:40 PM, Florian Pflug f...@phlo.org wrote:
 AFAIR, the performance hit we'd take by making the vacuum cutoff point
 (i.e. GetOldestXmin()) global instead of database-local has been repeatedly
 used in the past as an argument against cross-database queries. I have to
 admit that I currently cannot seem to find an entry in the archives to
 back that up, though.

I think the main argument against cross-database queries is that every
place in the backend that, for example, uses an OID to identify a
table would need to be modified to use a database OID and a table OID.
 Even if the distributed performance penalty of such a change doesn't
bother you, the amount of code churn that it would take to make such a
change is mind-boggling.

I haven't seen anyone explain why they really need this feature
anyway, and I think it's going in the wrong direction.  IMHO, anyone
who wants to be doing cross-database queries should be using schemas
instead, and if that's not workable for some reason, then we should
improve the schema implementation until it becomes workable.  I think
that the target use case for separate databases ought to be
multi-tenancy, but what is needed there is actually more isolation
(e.g. wrt/role names, cluster-wide visibility of pg_database contents,
etc.), not less.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Andrew Dunstan



On 10/21/2011 01:06 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 10/21/2011 12:05 PM, Florian Pflug wrote:

On Oct21, 2011, at 17:36 , Tom Lane wrote:

1. Restrict exported snapshots to be loaded only by transactions running
in the same database as the exporter.  This would fix the problem, but
it cuts out one of the main use-cases for sync snapshots, namely getting
cluster-wide-consistent dumps in pg_dumpall.

Isn't the use-case getting consistent *parallel* dumps of a single database
rather than consistent dump of multiple databases? Since we don't have atomic
cross-database commits, what does using the same snapshot to dump multiple
databases buy us?

That was my understanding of the use case.

Um, which one are you supporting?



#1 seemed OK from this POV. Everything else looks ickier and/or more 
fragile, at first glance anyway.



Anyway, the value of using the same snapshot across all of a pg_dumpall
run would be that you could be sure that what you'd dumped concerning
role and tablespace objects was consistent with what you then dump about
database-local objects.  (In principle, anyway --- I'm not sure how
much of that happens under SnapshotNow rules because of use of backend
functions.  But you'll most certainly never be able to guarantee it if
pg_dumpall can't export its snapshot to each subsidiary pg_dump run.)




For someone who is concerned with that, maybe pg_dumpall could have an 
option to take an EXCLUSIVE lock on the shared catalogs?


cheers

andrew



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Florian Pflug
On Oct21, 2011, at 19:47 , Robert Haas wrote:
 On Fri, Oct 21, 2011 at 1:40 PM, Florian Pflug f...@phlo.org wrote:
 AFAIR, the performance hit we'd take by making the vacuum cutoff point
 (i.e. GetOldestXmin()) global instead of database-local has been repeatedly
 used in the past as an argument against cross-database queries. I have to
 admit that I currently cannot seem to find an entry in the archives to
 back that up, though.

 I haven't seen anyone explain why they really need this feature
 anyway, and I think it's going in the wrong direction.  IMHO, anyone
 who wants to be doing cross-database queries should be using schemas
 instead, and if that's not workable for some reason, then we should
 improve the schema implementation until it becomes workable.  I think
 that the target use case for separate databases ought to be
 multi-tenancy, but what is needed there is actually more isolation
 (e.g. wrt/role names, cluster-wide visibility of pg_database contents,
 etc.), not less.

Agreed. I wasn't trying to argue for cross-database queries - quite the 
opposite,
actually. My point was more that since we've used database isolation as an
argument against cross-database queries in the past, we shouldn't sacrifice
it now for synchronized snapshots.

best regards,
Florian Pflug




Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 AFAIR, the performance hit we'd take by making the vacuum cutoff point
 (i.e. GetOldestXmin()) global instead of database-local has been repeatedly
 used in the past as an argument against cross-database queries. I have to
 admit that I currently cannot seem to find an entry in the archives to
 back that up, though.

To my mind, the main problem with cross-database queries is that none of
the backend is set up to deal with more than one set of system catalogs.

regards, tom lane



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Robert Haas
On Fri, Oct 21, 2011 at 2:06 PM, Florian Pflug f...@phlo.org wrote:
 On Oct21, 2011, at 19:47 , Robert Haas wrote:
 On Fri, Oct 21, 2011 at 1:40 PM, Florian Pflug f...@phlo.org wrote:
 AFAIR, the performance hit we'd take by making the vacuum cutoff point
 (i.e. GetOldestXmin()) global instead of database-local has been repeatedly
 used in the past as an argument against cross-database queries. I have to
 admit that I currently cannot seem to find an entry in the archives to
 back that up, though.

 I haven't seen anyone explain why they really need this feature
 anyway, and I think it's going in the wrong direction.  IMHO, anyone
 who wants to be doing cross-database queries should be using schemas
 instead, and if that's not workable for some reason, then we should
 improve the schema implementation until it becomes workable.  I think
 that the target use case for separate databases ought to be
 multi-tenancy, but what is needed there is actually more isolation
 (e.g. wrt/role names, cluster-wide visibility of pg_database contents,
 etc.), not less.

 Agreed. I wasn't trying to argue for cross-database queries - quite the 
 opposite,
 actually. My point was more that since we've used database isolation as an
 argument against cross-database queries in the past, we shouldn't sacrifice
 it now for synchronized snapshots.

Right, I agree.  It might be nice to take a cluster-wide dump that is
guaranteed to be transactionally consistent, but I bet a lot of people
would actually be happier to see us go the opposite direction - e.g.
give each database its own XID space, so that activity in one database
doesn't accelerate the need for anti-wraparound vacuums in another
database.  Not sure that could ever actually happen, but the point is
that people probably should not be relying on serializability across
databases too much, because the whole point of the multiple databases
feature is to have multiple, independent databases in one cluster that
are thoroughly isolated from each other, and any future changes we
make should probably lean in that direction.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Synchronized snapshots versus multiple databases

2011-10-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Oct 21, 2011 at 11:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 1. Restrict exported snapshots to be loaded only by transactions running
 in the same database as the exporter.  This would fix the problem, but
 it cuts out one of the main use-cases for sync snapshots, namely getting
 cluster-wide-consistent dumps in pg_dumpall.

 I am unexcited by #2 on usability grounds.  I agree with you that #3
 might end up being a fairly small pessimization in practice, but I'd
 be inclined to just do #1 for now and revisit the issue when and if
 someone shows an interest in revamping pg_dumpall to do what you're
 proposing (and hopefully a bunch of other cleanup too).

Seems like that is the consensus view, so that's what I'll do.

regards, tom lane



Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Tom Lane
Joachim Wieland j...@mcknight.de writes:
 [ synchronized-snapshots patch ]

Looking through this code, it strikes me that SET TRANSACTION SNAPSHOT
is fundamentally incompatible with SERIALIZABLE READ ONLY DEFERRABLE
mode.  That mode assumes that you should be able to just take a new
snapshot, repeatedly, until you get one that's safe.  With the patch
as written, if the supplied snapshot is unsafe, GetSafeSnapshot()
will just go into an infinite loop.

AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
in a transaction with those properties.  Has anyone got another
interpretation?  Would it be better to silently ignore the DEFERRABLE
property?
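
(Concretely, the combination in question; the snapshot ID is illustrative:)

  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
  SET TRANSACTION SNAPSHOT '000003A1-1';
  -- if the supplied snapshot is not "safe", GetSafeSnapshot() as patched
  -- keeps retrying the same snapshot and never returns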

regards, tom lane



Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Heikki Linnakangas

On 19.10.2011 19:17, Tom Lane wrote:

Joachim Wieland j...@mcknight.de writes:

[ synchronized-snapshots patch ]


Looking through this code, it strikes me that SET TRANSACTION SNAPSHOT
is fundamentally incompatible with SERIALIZABLE READ ONLY DEFERRABLE
mode.  That mode assumes that you should be able to just take a new
snapshot, repeatedly, until you get one that's safe.  With the patch
as written, if the supplied snapshot is unsafe, GetSafeSnapshot()
will just go into an infinite loop.

AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
in a transaction with those properties.  Has anyone got another
interpretation?  Would it be better to silently ignore the DEFERRABLE
property?


An error seems appropriate to me.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Florian Pflug
On Oct19, 2011, at 18:17 , Tom Lane wrote:
 Joachim Wieland j...@mcknight.de writes:
 [ synchronized-snapshots patch ]
 
 Looking through this code, it strikes me that SET TRANSACTION SNAPSHOT
 is fundamentally incompatible with SERIALIZABLE READ ONLY DEFERRABLE
 mode.  That mode assumes that you should be able to just take a new
 snapshot, repeatedly, until you get one that's safe.  With the patch
 as written, if the supplied snapshot is unsafe, GetSafeSnapshot()
 will just go into an infinite loop.
 
 AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
 in a transaction with those properties.  Has anyone got another
 interpretation?  Would it be better to silently ignore the DEFERRABLE
 property?

Hm, both features are meant to be used by pg_dump, so I think we should
make the combination work. I'd say SET TRANSACTION SNAPSHOT should throw
an error only if the transaction is marked READ ONLY DEFERRABLE *and*
the provided snapshot isn't safe.

This allows a deferrable snapshot to be used on a second connection (
by e.g. pg_dump), and still be marked as DEFERRABLE. If we throw an
error unconditionally, the second connection has to import the snapshot
without marking it DEFERRABLE, which I think has consequences for
performance. AFAIR, the SERIALIZABLE implementation is able to skip
almost all (or all? Kevin?) SIREAD lock acquisitions in DEFERRABLE READ
ONLY transaction, because those cannot participate in dangerous (i.e.
non-serializable) dependency structures.

best regards,
Florian Pflug





Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 On Oct19, 2011, at 18:17 , Tom Lane wrote:
 AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
 in a transaction with those properties.  Has anyone got another
 interpretation?  Would it be better to silently ignore the DEFERRABLE
 property?

 Hm, both features are meant to be used by pg_dump, so I think we should
 make the combination work. I'd say SET TRANSACTION SNAPSHOT should throw
 an error only if the transaction is marked READ ONLY DEFERRABLE *and*
 the provided snapshot isn't safe.

Um, no, I don't think so.  It would be sensible for the leader
transaction to use READ ONLY DEFERRABLE and then export the snapshot it
got (possibly after waiting).  It doesn't follow that the child
transactions should use DEFERRABLE too.  They're not going to wait.

 This allows a deferrable snapshot to be used on a second connection (
 by e.g. pg_dump), and still be marked as DEFERRABLE. If we throw an
 error unconditionally, the second connection has to import the snapshot
 without marking it DEFERRABLE, which I think has consequences for
 performance.

No, I don't believe that either.  AIUI the performance benefit comes if
the snapshot is recognized as safe.  DEFERRABLE only means to keep
retrying until you get a safe one.  This is nonsense when you're
importing the snapshot.
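
(In other words, a plausible division of labor, with an illustrative
snapshot ID:)

  -- leader: wait, if necessary, for a safe snapshot, then publish it
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
  SELECT pg_export_snapshot();            -- e.g. '000003A1-1'

  -- each child: adopt the already-safe snapshot; no DEFERRABLE, no waiting
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
  SET TRANSACTION SNAPSHOT '000003A1-1';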

regards, tom lane



Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 1:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 On Oct19, 2011, at 18:17 , Tom Lane wrote:
 AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
 in a transaction with those properties.  Has anyone got another
 interpretation?  Would it be better to silently ignore the DEFERRABLE
 property?

 Hm, both features are meant to be used by pg_dump, so I think we should
 make the combination work. I'd say SET TRANSACTION SNAPSHOT should throw
 an error only if the transaction is marked READ ONLY DEFERRABLE *and*
 the provided snapshot isn't safe.

 Um, no, I don't think so.  It would be sensible for the leader
 transaction to use READ ONLY DEFERRABLE and then export the snapshot it
 got (possibly after waiting).  It doesn't follow that the child
 transactions should use DEFERRABLE too.  They're not going to wait.

 This allows a deferrable snapshot to be used on a second connection (
 by e.g. pg_dump), and still be marked as DEFERRABLE. If we throw an
 error unconditionally, the second connection has to import the snapshot
 without marking it DEFERRABLE, which I think has consequences for
 performance.

 No, I don't believe that either.  AIUI the performance benefit comes if
 the snapshot is recognized as safe.  DEFERRABLE only means to keep
 retrying until you get a safe one.  This is nonsense when you're
 importing the snapshot.

I think the requirement is that we need to do the appropriate push-ups
so that the people who import the snapshot know that it's safe, and
that the SSI stuff can all be skipped.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 
 This allows a deferrable snapshot to be used on a second
 connection (by e.g. pg_dump), and still be marked as DEFERRABLE.
 If we throw an error unconditionally, the second connection has
 to import the snapshot without marking it DEFERRABLE, which I
 think has consequences for performance.

 No, I don't believe that either.  AIUI the performance benefit
 comes if the snapshot is recognized as safe.  DEFERRABLE only
 means to keep retrying until you get a safe one.
 
Right, there are other circumstances in which a READ ONLY
transaction's snapshot may be recognized as safe, and it can opt out
of all the additional SSI logic.  As you say, DEFERRABLE means we
*wait* for that.
 
 This is nonsense when you're importing the snapshot.
 
Agreed.
 
 I think the requirement is that we need to do the appropriate
 push-ups so that the people who import the snapshot know that it's
 safe, and that the SSI stuff can all be skipped.
 
If the snapshot was safe in the first process, it will be safe for
any others with which it is shared.  Basically, a SERIALIZABLE READ
ONLY DEFERRABLE transaction waits for a snapshot which, as a READ
ONLY transaction, can't see any serialization anomalies.  It can run
exactly like a REPEATABLE READ transaction.  In fact, it would be OK
from a functional perspective if the first transaction in pg_dump
got a safe snapshot through DEFERRABLE techniques and then shared it
with REPEATABLE READ transactions.
 
I don't know which is the best way to implement this, but it should
be fine to skip the DEFERRABLE logic for secondary users, as long as
they are READ ONLY.
 
-Kevin



Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Florian Pflug
On Oct19, 2011, at 19:49 , Kevin Grittner wrote:
 Robert Haas robertmh...@gmail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 
 This allows a deferrable snapshot to be used on a second
 connection (by e.g. pg_dump), and still be marked as DEFERRABLE.
 If we throw an error unconditionally, the second connection has
 to import the snapshot without marking it DEFERRABLE, which I
 think has consequences for performance.
 
 No, I don't believe that either.  AIUI the performance benefit
 comes if the snapshot is recognized as safe.  DEFERRABLE only
 means to keep retrying until you get a safe one.
 
 Right, there are other circumstances in which a READ ONLY
 transaction's snapshot may be recognized as safe, and it can opt out
 of all the additional SSI logic.  As you say, DEFERRABLE means we
 *wait* for that.

Oh, cool. I thought the opt-out only works for explicitly DEFERRABLE
transactions.

best regards,
Florian Pflug




Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 
 Oh, cool. I thought the opt-out only works for explicitly
 DEFERRABLE transactions.
 
Basically, if there is no serializable read-write transaction active
which overlaps the read-only transaction and also overlaps a
serializable transaction which wrote something and committed in time
to be visible to the read-only transaction, then the read-only
transaction's snapshot is safe and it can stop worrying about SSI
logic.  If these conditions happen to exist when a read-only
transaction is starting, it never needs to set up for SSI; it can
run just like a REPEATABLE READ transaction and still be safe from
serialization anomalies.  We make some effort to spot the transition
to this state while a read-only transaction is running, allowing it
to drop out of SSI while running.
 
The fact that a read-only transaction can often skip some or all of
the SSI overhead (beyond determining that opting out is safe) is why
declaring transactions to be READ ONLY when possible is #1 on my
list of performance considerations for SSI.
 
-Kevin



Re: [HACKERS] synchronized snapshots

2011-10-18 Thread Simon Riggs
On Mon, Oct 3, 2011 at 7:09 AM, Marko Tiikkaja
marko.tiikk...@2ndquadrant.com wrote:
 On 2011-09-28 15:25, Joachim Wieland wrote:

 Yes, that's the desired behaviour, the patch adds this paragraph to the
 documentation already:

 I can't believe I missed that.  My apologies.

 On 2011-09-29 05:16, Joachim Wieland wrote:

 The attached patch addresses the reported issues.

 Thanks, this one looks good to me.  Going to mark this patch as ready for
 committer.


I don't see any tests with this patch, so I personally won't be the
committer on this just yet.

Also, not sure why the snapshot id syntax has leading zeroes on the first
part of the number, but not on the second part. It will still sort
incorrectly if that's what we were trying to achieve. Either leave off
the leading zeroes on the first part or add them to the second.

We probably need some more discussion added to the README about this.

I'm also concerned that we are adding this to the BEGIN statement as
the only option. I don't have a problem with it being there, but I do
think it is a problem to make it the *only* way to use this. Altering
BEGIN gives clear problems with any API that does the begin and commit
for you, such as Perl DBI or Java JDBC, to name just two. I can't really
see it's a good implementation if we say this won't work until client
APIs follow our new non-standard syntax. I wouldn't block commit on
this point, but I think we should work on alternative ways to invoke
this feature as well.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] synchronized snapshots

2011-10-18 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Mon, Oct 3, 2011 at 7:09 AM, Marko Tiikkaja
 Thanks, this one looks good to me.  Going to mark this patch as ready for
 committer.

 I don't see any tests with this patch, so I personally won't be the
 committer on this just yet.

I've already taken it according to the commitfest app.  There's a lot of
things I don't like stylistically, but they all seem fixable, and I'm
working through them now.  The only actual bug I've found so far is a
race condition while setting MyProc->xmin (you can't do that separately
from verifying that the source transaction is still running, else
somebody else could see a global xmin that's gone backwards).

 Also, not sure why the snapshot id syntax has leading zeroes on the first
 part of the number, but not on the second part. It will still sort
 incorrectly if that's what we were trying to achieve. Either leave off
 the leading zeroes on the first part or add them to the second.

The first part is of fixed length, the second not so much.  I'm not
wedded to the syntax but I don't see anything wrong with it either.
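
(For reference, an exported snapshot ID looks something like this; the value
shown is illustrative:)

  SELECT pg_export_snapshot();
   pg_export_snapshot
  ---------------------
   000003A1-1
  (1 row)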

 I'm also concerned that we are adding this to the BEGIN statement as
 the only option.

Huh?  The last version of the patch has it only as SET TRANSACTION
SNAPSHOT, which I think is the right way.

regards, tom lane



Re: [HACKERS] synchronized snapshots

2011-10-18 Thread Simon Riggs
On Tue, Oct 18, 2011 at 6:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 I'm also concerned that we are adding this to the BEGIN statement as
 the only option.

 Huh?  The last version of the patch has it only as SET TRANSACTION
 SNAPSHOT, which I think is the right way.

Sorry Tom, didn't see your name on it earlier; that's not shown on the
main CF display and I didn't think to check on the detail. Please
continue.

I misread the SET TRANSACTION docs changes. Happy with that.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] synchronized snapshots

2011-10-03 Thread Marko Tiikkaja

On 2011-09-28 15:25, Joachim Wieland wrote:

Yes, that's the desired behaviour, the patch adds this paragraph to the
documentation already:


I can't believe I missed that.  My apologies.

On 2011-09-29 05:16, Joachim Wieland wrote:

The attached patch addresses the reported issues.


Thanks, this one looks good to me.  Going to mark this patch as ready 
for committer.



--
Marko Tiikkaja   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] synchronized snapshots

2011-09-28 Thread Marko Tiikkaja

Hi Joachim,

On 14/09/2011 05:37, Joachim Wieland wrote:

Here is a new version of this patch


In a sequence such as this:

  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  INSERT INTO foo VALUES (-1);
  SELECT pg_export_snapshot();

the row added to foo is not visible in the exported snapshot.  If 
that's the desired behaviour, I think it should be mentioned in the 
documentation.
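
(For instance, in a second session that imports the snapshot exported above,
using the SET TRANSACTION SNAPSHOT form ultimately committed and an
illustrative snapshot ID:)

  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  SET TRANSACTION SNAPSHOT '000003A1-1';
  SELECT count(*) FROM foo;   -- does not include the exporter's uncommitted row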


I can make a patched backend die with an assertion failure by trying to 
export a snapshot after rolling back a transaction which exported a 
snapshot.  Seems like no cleanup is done at transaction abort.


I think that trying to import a snapshot that doesn't exist deserves a 
better error message.  There's currently no way for the user to know 
that the snapshot didn't exist, other than looking at the SQLSTATE 
(22023), and even that doesn't tell me a whole lot without looking at 
the manual.


Finally, the comment in ImportSnapshot() still mentions the old syntax.

Other than these four problems, the patch looks good.


--
Marko Tiikkaja   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] synchronized snapshots

2011-09-28 Thread Joachim Wieland
Hi Marko,

On Wed, Sep 28, 2011 at 2:29 AM, Marko Tiikkaja
marko.tiikk...@2ndquadrant.com wrote:
 In a sequence such as this:

  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  INSERT INTO foo VALUES (-1);
  SELECT pg_export_snapshot();

 the row added to foo is not visible in the exported snapshot.  If that's
 the desired behaviour, I think it should be mentioned in the documentation.

Yes, that's the desired behaviour, the patch adds this paragraph to the
documentation already:

Also note that even after the synchronization both clients still run
their own independent transactions. As a consequence, even though
synchronized with respect to reading pre-existing data, both
transactions won't be able to see each other's uncommitted data.

I'll take a look at the other issues and update the patch either
tonight or tomorrow.


Thank you,
Joachim



Re: [HACKERS] synchronized snapshots

2011-08-24 Thread Peter Eisentraut
On lör, 2011-08-20 at 09:56 -0400, Bruce Momjian wrote:
 Peter Eisentraut wrote:
  On tis, 2011-08-16 at 20:35 -0400, Tom Lane wrote:
   In fact, now that I think about it, setting the transaction snapshot
   from a utility statement would be functionally useful because then you
   could take locks beforehand.
  
  Another issue is that in some client interfaces, BEGIN and COMMIT are
  hidden behind API calls, which cannot easily be changed or equipped with
  new parameters.  So in order to have this functionality available
  through those interfaces, we'd need a separately callable command.
 
 How do they set a transaction to SERIALIZABLE?  Seems the same syntax
 should be used here.

The API typically has parameters to set the isolation level, since
that's a standardized property.





Re: [HACKERS] synchronized snapshots

2011-08-20 Thread Bruce Momjian
Peter Eisentraut wrote:
 On tis, 2011-08-16 at 20:35 -0400, Tom Lane wrote:
  In fact, now that I think about it, setting the transaction snapshot
  from a utility statement would be functionally useful because then you
  could take locks beforehand.
 
 Another issue is that in some client interfaces, BEGIN and COMMIT are
 hidden behind API calls, which cannot easily be changed or equipped with
 new parameters.  So in order to have this functionality available
 through those interfaces, we'd need a separately callable command.

How do they set a transaction to SERIALIZABLE?  Seems the same syntax
should be used here.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] synchronized snapshots

2011-08-17 Thread Peter Eisentraut
On tis, 2011-08-16 at 20:35 -0400, Tom Lane wrote:
 In fact, now that I think about it, setting the transaction snapshot
 from a utility statement would be functionally useful because then you
 could take locks beforehand.

Another issue is that in some client interfaces, BEGIN and COMMIT are
hidden behind API calls, which cannot easily be changed or equipped with
new parameters.  So in order to have this functionality available
through those interfaces, we'd need a separately callable command.




Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Robert Haas
On Mon, Aug 15, 2011 at 6:46 PM, Joachim Wieland j...@mcknight.de wrote:
 On Mon, Aug 15, 2011 at 6:09 PM, Jim Nasby j...@nasby.net wrote:
 I suspect that all the other cases of BEGIN failing would be syntax errors,
 so you would immediately know in testing that something was wrong. A missing
 file is definitely not a syntax error, so we can't really depend on user
 testing to ensure this is handled correctly. IMO, that makes it critical
 that that error puts us in an aborted transaction.

 Why can we not just require the user to verify if his BEGIN query
 failed or succeeded?
 Is that really too much to ask for?

 Also see what Robert wrote about proxies in between that keep track of
 the transaction state. Consider they see a BEGIN query that fails. How
 would they know if the session is now in an aborted transaction or not
 in a transaction at all?

I think the point here is that we should be consistent.  Currently,
you can make BEGIN fail by doing it on the standby, and asking for
READ WRITE mode:

rhaas=# begin transaction read write;
ERROR:  cannot set transaction read-write mode during recovery

After doing that, you are NOT in a transaction context:

rhaas=# select 1;
 ?column?
----------
        1
(1 row)

So whatever this does should be consistent with that, at least IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mar ago 16 09:59:04 -0400 2011:
 On Mon, Aug 15, 2011 at 6:46 PM, Joachim Wieland j...@mcknight.de wrote:

  Also see what Robert wrote about proxies in between that keep track
  of the transaction state. Consider they see a BEGIN query that
  fails. How would they know if the session is now in an aborted
  transaction or not in a transaction at all?
 
 I think the point here is that we should be consistent.  Currently,
 you can make BEGIN fail by doing it on the standby, and asking for
 READ WRITE mode:
 
 rhaas=# begin transaction read write;
 ERROR:  cannot set transaction read-write mode during recovery
 
 After doing that, you are NOT in a transaction context:
 
 rhaas=# select 1;
  ?column?
 ----------
         1
 (1 row)
 
 So whatever this does should be consistent with that, at least IMHO.

I think we argued about a very similar problem years ago and the outcome
was that you should be left in an aborted transaction block; otherwise
running a dumb SQL script (which has no way to abort if it fails)
could wreak serious havoc (?).  I think this failure to behave in that
fashion on the standby is something to be fixed, not imitated.

What this says is that a driver or app seeing BEGIN fail should issue
ROLLBACK before going further -- which seems the intuitive way to behave
to me.  No?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Robert Haas
On Tue, Aug 16, 2011 at 10:43 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of mar ago 16 09:59:04 -0400 2011:
 On Mon, Aug 15, 2011 at 6:46 PM, Joachim Wieland j...@mcknight.de wrote:

  Also see what Robert wrote about proxies in between that keep track
  of the transaction state. Consider they see a BEGIN query that
  fails. How would they know if the session is now in an aborted
  transaction or not in a transaction at all?

 I think the point here is that we should be consistent.  Currently,
 you can make BEGIN fail by doing it on the standby, and asking for
 READ WRITE mode:

 rhaas=# begin transaction read write;
 ERROR:  cannot set transaction read-write mode during recovery

 After doing that, you are NOT in a transaction context:

 rhaas=# select 1;
  ?column?
 ----------
         1
 (1 row)

 So whatever this does should be consistent with that, at least IMHO.

 I think we argued about a very similar problem years ago and the outcome
 was that you should be left in an aborted transaction block; otherwise
 running a dumb SQL script (which has no way to abort if it fails)
 could wreak serious havoc (?).  I think this failure to behave in that
 fashion on the standby is something to be fixed, not imitated.

 What this says is that a driver or app seeing BEGIN fail should issue
 ROLLBACK before going further -- which seems the intuitive way to behave
 to me.  No?

Maybe.  But if we're going to change the behavior of BEGIN, then (1)
we need to think about backward compatibility and (2) we should change
it across the board.  It's not for this patch to go invent something
that's inconsistent with what we're already doing elsewhere.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Jim Nasby
On Aug 15, 2011, at 5:46 PM, Joachim Wieland wrote:
 On Mon, Aug 15, 2011 at 6:09 PM, Jim Nasby j...@nasby.net wrote:
 I suspect that all the other cases of BEGIN failing would be syntax errors,
 so you would immediately know in testing that something was wrong. A missing
 file is definitely not a syntax error, so we can't really depend on user
 testing to ensure this is handled correctly. IMO, that makes it critical
 that that error puts us in an aborted transaction.
 
 Why can we not just require the user to verify if his BEGIN query
 failed or succeeded?
 Is that really too much to ask for?

It's something else that you have to remember to get right. psql, for example, 
will blindly continue on unless you remembered to tell it to exit on an error.

Also, an invalid transaction seems to be the result of least surprise... if you 
cared enough to begin a transaction, you're going to expect that either 
everything between that and the COMMIT succeeds or fails, not something 
in-between.

 Also see what Robert wrote about proxies in between that keep track of
 the transaction state. Consider they see a BEGIN query that fails. How
 would they know if the session is now in an aborted transaction or not
 in a transaction at all?

AFAIK a proxy can tell if a transaction is in progress or not via libpq. 
Worst-case, it just needs to send an extra ROLLBACK.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Jeff Davis
On Tue, 2011-08-16 at 11:01 -0500, Jim Nasby wrote:
 Also, an invalid transaction seems to be the result of least
 surprise... if you cared enough to begin a transaction, you're going
 to expect that either everything between that and the COMMIT succeeds
 or fails, not something in-between.

Agreed.

Perhaps we need a new utility command to set the snapshot to make the
error handling a little more obvious?

Regards,
Jeff Davis




Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Jim Nasby
On Aug 16, 2011, at 5:40 PM, Jeff Davis wrote:
 On Tue, 2011-08-16 at 11:01 -0500, Jim Nasby wrote:
 Also, an invalid transaction seems to be the result of least
 surprise... if you cared enough to begin a transaction, you're going
 to expect that either everything between that and the COMMIT succeeds
 or fails, not something in-between.
 
 Agreed.
 
 Perhaps we need a new utility command to set the snapshot to make the
 error handling a little more obvious?

Well, it appears we have a larger problem, as Robert pointed out that trying to 
start a writable transaction on a hot standby leaves you not in a transaction 
(which I feel is a problem).

So IMHO the right thing to do here is make it so that runtime errors in BEGIN 
leave you in an invalid transaction. Then we can decide on the API for 
synchronized snapshots that makes sense instead of working around the behavior 
of BEGIN.

I guess the big question to answer now is: what's the backwards compatibility 
impact of changing how BEGIN deals with runtime errors?
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Tom Lane
Jim Nasby j...@nasby.net writes:
 Well, it appears we have a larger problem, as Robert pointed out that trying 
 to start a writable transaction on a hot standby leaves you not in a 
 transaction (which I feel is a problem).

 So IMHO the right thing to do here is make it so that runtime errors in BEGIN 
 leave you in an invalid transaction. Then we can decide on the API for 
 synchronized snapshots that makes sense instead of working around the 
 behavior of BEGIN.

I'm not convinced by the above argument, because it requires that
you pretend there's a significant difference between syntax errors and
run time errors (whatever those are).  Syntax errors in a BEGIN
command are not going to leave you in an aborted transaction, because
the backend is not going to recognize the command as a BEGIN at all.
This means that frontends *must* be capable of dealing with the case
that a failed BEGIN didn't start a transaction.  (Either that, or
they just assume their commands are always syntactically perfect,
which seems like pretty fragile programming to me; and the more weird
nonstandard options we load onto BEGIN, the less tenable the position
becomes.  For example, if you feed BEGIN option-foo to a server that's
a bit older than you thought it was, you will get a syntax error.)
If we have some failure cases that start a transaction and some that do
not, we just have a mess, IMO.

I think we'd be far better off to maintain the position that a failed
BEGIN does not start a transaction, under any circumstances.  To do
that, we cannot have this new option attached to the BEGIN, which is a
good thing anyway IMO from a standards compatibility point of view.
It'd be better to make it a separate utility statement.  There is no
logical problem in doing that, and we already have a precedent for
utility statements that do something special before the transaction
snapshot is taken: see LOCK.

In fact, now that I think about it, setting the transaction snapshot
from a utility statement would be functionally useful because then you
could take locks beforehand.
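
To sketch what I mean (the statement name below is only a placeholder, and
the table is made up):

BEGIN ISOLATION LEVEL REPEATABLE READ;
LOCK TABLE orders IN SHARE MODE;   -- utility command, no snapshot taken yet
IMPORT SNAPSHOT '03A1-1';          -- hypothetical utility statement
SELECT count(*) FROM orders;       -- first real query, uses that snapshot
COMMIT;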

And as a bonus, we don't have a backwards compatibility problem to solve.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Robert Haas
On Tue, Aug 16, 2011 at 8:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jim Nasby j...@nasby.net writes:
 Well, it appears we have a larger problem, as Robert pointed out that trying 
 to start a writable transaction on a hot standby leaves you not in a 
 transaction (which I feel is a problem).

 So IMHO the right thing to do here is make it so that runtime errors in 
 BEGIN leave you in an invalid transaction. Then we can decide on the API for 
 synchronized snapshots that makes sense instead of working around the 
 behavior of BEGIN.

 I'm not convinced by the above argument, because it requires that
 you pretend there's a significant difference between syntax errors and
 run time errors (whatever those are).  Syntax errors in a BEGIN
 command are not going to leave you in an aborted transaction, because
 the backend is not going to recognize the command as a BEGIN at all.
 This means that frontends *must* be capable of dealing with the case
 that a failed BEGIN didn't start a transaction.  (Either that, or
 they just assume their commands are always syntactically perfect,
 which seems like pretty fragile programming to me; and the more weird
 nonstandard options we load onto BEGIN, the less tenable the position
 becomes.  For example, if you feed BEGIN option-foo to a server that's
 a bit older than you thought it was, you will get a syntax error.)
 If we have some failure cases that start a transaction and some that do
 not, we just have a mess, IMO.

More or less agreed.

 I think we'd be far better off to maintain the position that a failed
 BEGIN does not start a transaction, under any circumstances.

Also agreed.

 To do
 that, we cannot have this new option attached to the BEGIN, ...

Eh, why not?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Aug 16, 2011 at 8:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think we'd be far better off to maintain the position that a failed
 BEGIN does not start a transaction, under any circumstances.

 Also agreed.

 To do
 that, we cannot have this new option attached to the BEGIN, ...

 Eh, why not?

Maybe I wasn't paying close enough attention to the thread, but I had
the idea that there was some implementation reason why not.  If not,
we could still load the option onto BEGIN ... but I still find myself
liking the idea of a separate command better, because of the locking
issue.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Robert Haas
On Tue, Aug 16, 2011 at 8:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Aug 16, 2011 at 8:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think we'd be far better off to maintain the position that a failed
 BEGIN does not start a transaction, under any circumstances.

 Also agreed.

 To do
 that, we cannot have this new option attached to the BEGIN, ...

 Eh, why not?

 Maybe I wasn't paying close enough attention to the thread, but I had
 the idea that there was some implementation reason why not.  If not,
 we could still load the option onto BEGIN ... but I still find myself
 liking the idea of a separate command better, because of the locking
 issue.

Why does it matter whether you take the locks before or after the snapshot?

If you're concerned with minimizing the race, what you should do is
take all relevant locks in the parent before exporting the snapshot.

I am not wild about adding another toplevel command for this.  It
seems a rather narrow use case, and attaching it to BEGIN feels
natural to me.  There may be some small benefit also in terms of
minimizing the amount of sanity checking that must be done - for
example, at BEGIN time, you don't have to check for the case where a
snapshot has already been set.

If we did add another toplevel command, what would we call it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Jeff Davis
On Tue, 2011-08-16 at 20:35 -0400, Tom Lane wrote:
 I'm not convinced by the above argument, because it requires that
 you pretend there's a significant difference between syntax errors and
 run time errors (whatever those are).

After a syntax error like COMMMIT the transaction will remain inside
the failed transaction block, but an error during COMMIT (e.g. deferred
constraint check failure) will exit the transaction block.
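
For example (a sketch; the table and its deferred unique constraint are made
up):

-- a parse error leaves the session inside the now-failed transaction block:
BEGIN;
COMMMIT;
ERROR:  syntax error at or near "COMMMIT"

-- a commit-time error exits the transaction block:
BEGIN;
INSERT INTO t VALUES (1), (1);   -- duplicates are only checked at commit
COMMIT;
ERROR:  duplicate key value violates unique constraint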

 I think we'd be far better off to maintain the position that a failed
 BEGIN does not start a transaction, under any circumstances.  To do
 that, we cannot have this new option attached to the BEGIN, which is a
 good thing anyway IMO from a standards compatibility point of view.
 It'd be better to make it a separate utility statement.

+1 for a utility statement. Much clearer from the user's standpoint what
kind of errors they might expect, and whether the session will remain in
a transaction block.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Jeff Davis
On Tue, 2011-08-16 at 21:08 -0400, Robert Haas wrote: 
 attaching it to BEGIN feels natural to me.

My only objection is that people have different expectations about
whether the session will remain in a transaction block when they
encounter an error. So, it's hard to make this work without surprising
about half the users.

And there are some fairly significant consequences to users who guess
that they will remain in a transaction block in case of an error; or who
are just careless and don't consider that an error may occur.

 If we did add another toplevel command, what would we call it?

SET TRANSACTION SNAPSHOT perhaps?
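
Roughly like this, say (a sketch only; the exact syntax and quoting are of
course up for debate):

BEGIN ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '03A1-1';
-- queries from here on would see the exporting session's snapshot
COMMIT;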

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Robert Haas
On Tue, Aug 16, 2011 at 9:54 PM, Jeff Davis pg...@j-davis.com wrote:
 If we did add another toplevel command, what would we call it?

 SET TRANSACTION SNAPSHOT perhaps?

Hmm, that's not bad, but I think we'd have to partially reserve
TRANSACTION to make it work, which doesn't seem worth the pain it
would cause.

We could do something like TRANSACTION SNAPSHOT IS 'xyz', but that's a
bit awkward.

I still like BEGIN SNAPSHOT 'xyz' -- or even adding a generic options
list like we use for some other commands, i.e. BEGIN (snapshot 'xyz'),
which would leave the door open to arbitrary amounts of future
nonstandard fiddling without the need for any more keywords.  I
understand the point about the results of a BEGIN failure leaving you
outside a transaction, but that really only matters if you're doing
psql < dumb_script.sql, which is impractical for this feature
anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Simon Riggs
On Mon, Aug 15, 2011 at 2:31 AM, Joachim Wieland j...@mcknight.de wrote:

 In short, this is how it works:

 SELECT pg_export_snapshot();
  pg_export_snapshot
 --------------------
  03A1-1
 (1 row)


 (and then in a different session)

 BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT = '03A1-1');

I don't see the need to change the BEGIN command, which is SQL
Standard. We don't normally do that.

If we have pg_export_snapshot() why not pg_import_snapshot() as well?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Heikki Linnakangas

On 15.08.2011 04:31, Joachim Wieland wrote:

The one thing that it does not implement is leaving the transaction in
an aborted state if the BEGIN TRANSACTION command failed for an
invalid snapshot identifier.


So what if the snapshot is invalid, the SNAPSHOT clause silently 
ignored? That sounds really bad.



I can certainly see that this would be
useful but I am not sure if it justifies introducing this
inconsistency. We would have a BEGIN TRANSACTION command that left the
session in a different state depending on why it failed...


I don't understand what inconsistency you're talking about. What else 
can cause BEGIN TRANSACTION to fail? Is there currently any failure mode 
that doesn't leave the transaction in aborted state?



I am wondering if pg_export_snapshot() is still the right name, since
the snapshot is no longer exported to the user. It is exported to a
file but that's an implementation detail.


It's still exporting the snapshot to other sessions, that name still 
seems appropriate to me.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Heikki Linnakangas

On 15.08.2011 10:40, Simon Riggs wrote:

On Mon, Aug 15, 2011 at 2:31 AM, Joachim Wieland j...@mcknight.de wrote:


In short, this is how it works:

SELECT pg_export_snapshot();
  pg_export_snapshot
--------------------
  03A1-1
(1 row)


(and then in a different session)

BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT = '03A1-1');


I don't see the need to change the BEGIN command, which is SQL
Standard. We don't normally do that.

If we have pg_export_snapshot() why not pg_import_snapshot() as well?


It would be a nice symmetry, but you'd need a limitation that 
pg_import_snapshot() must be the first thing you do in the session. And 
it might be hard to enforce that, as once you get control into the 
function, you've already acquired another snapshot in the transaction to 
run the SELECT pg_import_snapshot() query with. Specifying the 
snapshot in the BEGIN command makes sense.
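
To spell out the chicken-and-egg problem (a sketch; pg_import_snapshot() is
the hypothetical function discussed above):

BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT pg_import_snapshot('03A1-1');
-- the SELECT itself already acquired this transaction's snapshot before the
-- function body got to run, so the import would come too late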


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Andres Freund
On Monday, August 15, 2011 08:40:34 Simon Riggs wrote:
 On Mon, Aug 15, 2011 at 2:31 AM, Joachim Wieland j...@mcknight.de wrote:
  In short, this is how it works:
  
  SELECT pg_export_snapshot();
   pg_export_snapshot
  --------------------
   03A1-1
  (1 row)
  
  
  (and then in a different session)
  
  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT =
  '03A1-1');
 I don't see the need to change the BEGIN command, which is SQL
 Standard. We don't normally do that.
Uhm, there already are several extensions to BEGIN TRANSACTION, like the just
added DEFERRABLE.

 If we have pg_export_snapshot() why not pg_import_snapshot() as well?
Using BEGIN has the advantage of making it explicit that it cannot be used 
inside an existing transaction. Which I do find advantageous.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread PostgreSQL - Hans-Jürgen Schönig

On Aug 15, 2011, at 9:40 AM, Simon Riggs wrote:

 On Mon, Aug 15, 2011 at 2:31 AM, Joachim Wieland j...@mcknight.de wrote:
 
 In short, this is how it works:
 
 SELECT pg_export_snapshot();
  pg_export_snapshot
 --------------------
  03A1-1
 (1 row)
 
 
 (and then in a different session)
 
 BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT = '03A1-1');
 
 I don't see the need to change the BEGIN command, which is SQL
 Standard. We don't normally do that.
 
 If we have pg_export_snapshot() why not pg_import_snapshot() as well?
 
 -- 
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



I would definitely argue for a syntax like the one proposed by Joachim... it
could stay the same if this is turned into some sort of flashback
implementation some day.

regards,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Florian Weimer
* Simon Riggs:

 I don't see the need to change the BEGIN command, which is SQL
 Standard. We don't normally do that.

Some language bindings treat BEGIN specially, so it might be difficult
to use this feature.

-- 
Florian Weimer    fwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Joachim Wieland
On Mon, Aug 15, 2011 at 3:47 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 15.08.2011 04:31, Joachim Wieland wrote:

 The one thing that it does not implement is leaving the transaction in
 an aborted state if the BEGIN TRANSACTION command failed for an
 invalid snapshot identifier.

 So what if the snapshot is invalid, the SNAPSHOT clause silently ignored?
 That sounds really bad.

No, the command would fail, but since it fails, it doesn't change the
transaction state.

What was proposed originally was to start a transaction but throw an
error that leaves the transaction in the aborted state. But then the
command had some effect because it started a transaction block, even
though it failed.


 I can certainly see that this would be
 useful but I am not sure if it justifies introducing this
 inconsistency. We would have a BEGIN TRANSACTION command that left the
 session in a different state depending on why it failed...

 I don't understand what inconsistency you're talking about. What else can
 cause BEGIN TRANSACTION to fail? Is there currently any failure mode that
 doesn't leave the transaction in aborted state?

Granted, it might only fail for parse errors so far, but that would
include for example sending BEGIN DEFERRABLE to a pre-9.1 server. It
wouldn't start a transaction and leave it in an aborted state, but it
would just fail.
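
E.g. a sketch of what that looks like against a pre-9.1 server:

BEGIN DEFERRABLE;
ERROR:  syntax error at or near "DEFERRABLE"
SELECT 1;   -- runs as a plain autocommit statement; no transaction was started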


 I am wondering if pg_export_snapshot() is still the right name, since
 the snapshot is no longer exported to the user. It is exported to a
 file but that's an implementation detail.

 It's still exporting the snapshot to other sessions, that name still seems
 appropriate to me.

ok.


Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Joachim Wieland
On Mon, Aug 15, 2011 at 6:41 AM, Florian Weimer fwei...@bfk.de wrote:
 * Simon Riggs:

 I don't see the need to change the BEGIN command, which is SQL
 Standard. We don't normally do that.

 Some language bindings treat BEGIN specially, so it might be difficult
 to use this feature.

It's true, the command might require explicit support from language
bindings. However, I used some Perl test scripts, where you can also
send a START TRANSACTION command in a $dbh->do(...).

The intended use case of this feature is still pg_dump btw...


Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Robert Haas
On Mon, Aug 15, 2011 at 3:51 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 It would be a nice symmetry, but you'd need a limitation that
 pg_import_snapshot() must be the first thing you do in the session. And it
 might be hard to enforce that, as once you get control into the function,
 you've already acquired another snapshot in the transaction to run the
 SELECT pg_import_snapshot() query with. Specifying the snapshot in the
 BEGIN command makes sense.

+1.  Also, I am pretty sure that there are drivers out there, and
connection poolers, that keep track of the transaction state by
watching commands go by.  Right now you can tell by the first word of
the command whether it's something that might change the transaction
state; I wouldn't like to make that harder.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 Joachim Wieland j...@mcknight.de wrote:
 
 BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT =
 '03A1-1');
 
 I don't see the need to change the BEGIN command, which is SQL
 Standard.
 
No, it's not standard.
 
To quote from our docs at:
 
http://www.postgresql.org/docs/9.0/interactive/sql-begin.html#AEN58214
 
| BEGIN is a PostgreSQL language extension. It is equivalent to the
| SQL-standard command START TRANSACTION, whose reference page
| contains additional compatibility information.
| 
| Incidentally, the BEGIN key word is used for a different purpose
| in embedded SQL. You are advised to be careful about the
| transaction semantics when porting database applications. 
 
In checking the most recent standards draft I have available, it
appears that besides embedded SQL, this keyword is also used in the
standard trigger declaration syntax.  Using BEGIN to start a
transaction is a PostgreSQL extension to the standard.  That said,
if we support a feature on the nonstandard BEGIN statement, we
typically add it as an extension to the standard START TRANSACTION
and SET TRANSACTION statements.  Through 9.0 that consisted of
having a non-standard default for isolation level and the ability to
omit commas required by the standard.  In 9.1 we added another
optional transaction property which defaults to standard behavior:
DEFERRABLE.
 
If we're talking about a property of a transaction, like the
transaction snapshot, it seems to me to be best to support it using
the same statements we use for other transaction properties.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Jim Nasby
On Aug 15, 2011, at 6:23 AM, Joachim Wieland wrote:
 On Mon, Aug 15, 2011 at 3:47 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 On 15.08.2011 04:31, Joachim Wieland wrote:
 
 The one thing that it does not implement is leaving the transaction in
 an aborted state if the BEGIN TRANSACTION command failed for an
 invalid snapshot identifier.
 
 So what if the snapshot is invalid, the SNAPSHOT clause silently ignored?
 That sounds really bad.
 
 No, the command would fail, but since it fails, it doesn't change the
 transaction state.
 
 What was proposed originally was to start a transaction but throw an
 error that leaves the transaction in the aborted state. But then the
 command had some effect because it started a transaction block, even
 though it failed.

It certainly seems safer to me to set the transaction to an aborted state; you 
were expecting a set of commands to run with one snapshot, but if we don't 
abort the transaction they'll end up running anyway and doing so with the 
*wrong* snapshot. That could certainly lead to data corruption.

I suspect that all the other cases of BEGIN failing would be syntax errors, so 
you would immediately know in testing that something was wrong. A missing file 
is definitely not a syntax error, so we can't really depend on user testing to 
ensure this is handled correctly. IMO, that makes it critical that that error 
puts us in an aborted transaction.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-08-15 Thread Joachim Wieland
On Mon, Aug 15, 2011 at 6:09 PM, Jim Nasby j...@nasby.net wrote:
 I suspect that all the other cases of BEGIN failing would be syntax errors, so
 you would immediately know in testing that something was wrong. A missing file
 is definitely not a syntax error, so we can't really depend on user testing
 to ensure this is handled correctly. IMO, that makes it critical that that
 error puts us in an aborted transaction.

Why can we not just require the user to verify if his BEGIN query
failed or succeeded?
Is that really too much to ask for?

Also see what Robert wrote about proxies in between that keep track of
the transaction state. Consider they see a BEGIN query that fails. How
would they know if the session is now in an aborted transaction or not
in a transaction at all?


Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2010-02-10 Thread Markus Wanner

Hi Joachim,

On Wed, 10 Feb 2010 11:36:41 +0100, Joachim Wieland j...@mcknight.de
wrote:

http://www.postgresql.org/docs/8.4/static/backup-dump.html already
states about pg_dump: "In particular, it must have read access to all
tables that you want to back up, so in practice you almost always have
to run it as a database superuser." so I think there is not a big loss
here...


Hm.. I doubt somewhat that's common practice. After all, read access to 
all tables is still a *lot* less than superuser privileges. But yeah, 
the documentation currently states that.



They more or less get it by chance :-)  They acquire a snapshot when
they call pg_synchronize_snapshot_taken()


Oh, I see, calling the function by itself already acquires a snapshot.
Even in case of a fast path call, it seems. Then your approach is correct.

(I'd still feel more comfortable, if I had seen a 
GetTransactionSnapshot() or something akin in there).



and if all the backends do
it while the other backend holds the lock in shared mode, we know that
the snapshot won't change, so they all get the same snapshot.


Agreed, that works.

(Ab)using the ProcArrayLock for synchronization is probably acceptable 
for pg_dump; however, I'd rather take another approach for a more 
general implementation.



Also, you should probably ensure the calling transactions don't have a
snapshot already (let alone a transaction id).


True...


Hm.. realizing that a function call per-se acquires a snapshot, I fail 
to see how we could check if we really acquired a snapshot. Consider the 
following (admittedly stupid) example:


BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT version();
 ... time goes by ...
SELECT pg_synchronize_snapshot_taken(..);

As it stands, your function would silently fail to synchronize the
snapshots, if other transactions committed in between the two function 
calls.



It seemed more robust and convenient to have an expiration in the
backend itself. What would happen if you called
pg_synchronize_snapshots() and if right after that your network
connection dropped? Without the server noticing, it would continue to
hold the lock and you could not log in anymore...


Hm.. that's a point. Given this approach uses the ProcArrayLock, it's
probably better to use an explicit timeout.


But you are right: The proposed feature is a pragmatic and quick
solution for pg_dump and similar but we might want to have a more
general snapshot cloning procedure instead. Not having a delay for
other activities at all and not requiring superuser privileges would
be a big advantage over what I have proposed.


Agreed.

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2010-02-10 Thread Heikki Linnakangas
Markus Wanner wrote:
 On Wed, 10 Feb 2010 11:36:41 +0100, Joachim Wieland j...@mcknight.de
 wrote:
 http://www.postgresql.org/docs/8.4/static/backup-dump.html already
 states about pg_dump: "In particular, it must have read access to all
 tables that you want to back up, so in practice you almost always have
 to run it as a database superuser." so I think there is not a big loss
 here...
 
 Hm.. I doubt somewhat that's common practice. After all, read access to
 all tables is still a *lot* less than superuser privileges. But yeah,
 the documentation currently states that.

I think running as database owner gets you pretty far as far as pg_dump
goes. It would be good to lift the limitation that you have to be superuser.

 But you are right: The proposed feature is a pragmatic and quick
 solution for pg_dump and similar but we might want to have a more
 general snapshot cloning procedure instead. Not having a delay for
 other activities at all and not requiring superuser privileges would
 be a big advantage over what I have proposed.
 
 Agreed.

Yeah, a big advantage of the proposed approach is that it's pretty
simple to implement as an external module, allowing you to write scripts
using it for older versions too.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2010-02-05 Thread Markus Wanner

Hello Joachim,

a little daughter eats lots of spare cycles - among other things. Sorry 
it took that long to review.


On Fri, 8 Jan 2010 20:36:44 +0100, Joachim Wieland j...@mcknight.de
wrote:

The attached patch implements the idea of Heikki / Simon published in

http://archives.postgresql.org/pgsql-hackers/2009-11/msg00271.php


I must admit I didn't read that up front, but thought your patch could 
be useful for implementing parallel querying.


So, let's first concentrate on the intended use case: allowing parallel 
pg_dump. To me it seems like a pragmatic and quick solution; however, 
I'm not sure if requiring superuser privileges is acceptable.


The patch currently compiles (modulo some OID changes in pg_proc.h to 
prevent duplicates) and the test suite runs through fine. I haven't 
tested the new functions, though.



Reading the code, I'm missing the part that actually acquires the 
snapshot for the transaction(s). After setting up multiple transactions 
with pg_synchronize_snapshot and pg_synchronize_snapshot_taken, they 
still don't have a snapshot, do they?


Also, you should probably ensure the calling transactions don't have a 
snapshot already (let alone a transaction id).


In a similar vein, and answering your question in a comment: yes, I'd 
say you want to ensure your transactions are in SERIALIZABLE isolation 
mode. There's no other isolation level for which that kind of snapshot 
serialization makes sense, is there?



Using the exposed functions in a more general sense, I think it's 
important to note that the patch only intends to synchronize snapshots 
at the start of the transaction, not continuously. Thus, normal 
transaction isolation applies for concurrent writes and each of the 
transactions can commit or rollback independently.


The timeout is nice, but is it really required? Isn't the normal query
cancellation infrastructure sufficient?

Hope that helps. Thanks for working on this issue.

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2010-01-10 Thread Simon Riggs
On Fri, 2010-01-08 at 20:36 +0100, Joachim Wieland wrote:
 The attached patch implements the idea of Heikki / Simon published in
 
 http://archives.postgresql.org/pgsql-hackers/2009-11/msg00271.php
 
 Since nobody objected to the idea in general, I have implemented it.
 
 As this is not currently used anywhere it doesn't give immediate benefit; it
 is, however, a prerequisite for a parallel version of pg_dump that quite a
 few people (including myself) seem to be interested in.

I'm interested in this, but realistically won't have time to review this
personally in this release. Sorry about that.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2010-01-09 Thread Markus Wanner
Hi

Joachim Wieland wrote:
 Since nobody objected to the idea in general, I have implemented it.

Great! I hope to get some spare cycles within the next few days to
review it.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2010-01-09 Thread Marcin Mańk



On 2010-01-09 at 20:37, Markus Wanner mar...@bluegap.ch wrote:



Hi

Joachim Wieland wrote:

Since nobody objected to the idea in general, I have implemented it.


How cool would it be if we could synchronize snapshots between the
master and the (SR) standby?


The connection poolers could use that to send read-only queries to the  
standby, and when the first dml/ddl statement in a transaction comes  
up, they could switch to the master.


If it is hard to tell from the statement whether it writes anything, the
pooler could catch the error and retry on the master.


Regards
Marcin Mańk 
--

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] synchronized snapshots

2010-01-08 Thread Joachim Wieland
The attached patch implements the idea of Heikki / Simon published in

http://archives.postgresql.org/pgsql-hackers/2009-11/msg00271.php

Since nobody objected to the idea in general, I have implemented it.

As this is not currently used anywhere it doesn't give immediate benefit; it
is, however, a prerequisite for a parallel version of pg_dump that quite a
few people (including myself) seem to be interested in.

Here's the comment from the patch explaining it in more detail:

/*
 * This function is for synchronization of snapshots: It can be called by
 * new transactions to get the same snapshots. Its signature is
 *
 * pg_synchronize_snapshots(text, int, int);
 *
 * The first parameter is an identifier so that several groups can request
 * synchronized snapshots concurrently.
 *
 * The second parameter is the number of backends that are expected to connect
 * in the current group (i.e.  same identifier).
 *
 * The third parameter is the timeout in milliseconds.
 *
 * Note that once we are holding the ProcArrayLock in shared mode we are
 * severely hitting the usability of the database server: for example, nobody
 * can commit nontrivial transactions during that time nor can you establish a
 * new connection! This is why you need to be superuser to use this function.
 *
 * The idea is that from one connection you call for example
 *
 * pg_synchronize_snapshot('7bd0320c4ff9252716972e160fb33b8a', 4, 1000)
 *
 * and then have 1000ms to call with some other four (already connected)
 * sessions
 *
 * BEGIN TRANSACTION;
 * SELECT pg_synchronize_snapshot_taken('7bd0320c4ff9252716972e160fb33b8a');
 *
 * If all four pg_synchronize_snapshot_taken() calls return true and the
 * function pg_synchronize_snapshot() returns true as well, you can go on and
 * all four transactions now see the same snapshot (which in general is not the
 * snapshot that the transaction saw that has initially called
 * pg_synchronize_snapshot()).
 */

Thoughts?


Joachim
diff -cr cvs.head/src/backend/storage/ipc/procarray.c cvs.build/src/backend/storage/ipc/procarray.c
*** cvs.head/src/backend/storage/ipc/procarray.c	2010-01-05 12:39:28.0 +0100
--- cvs.build/src/backend/storage/ipc/procarray.c	2010-01-08 20:50:31.0 +0100
***
*** 81,86 
--- 81,96 
  
  static ProcArrayStruct *procArray;
  
+ typedef struct SyncSnapshotGroupStruct
+ {
+ 	char		id[NAMEDATALEN];
+ 	int			refCount;
+ } SyncSnapshotGroupStruct;
+ 
+ static SyncSnapshotGroupStruct *syncSnapshotGroups;
+ #define SYNC_SNAPSHOT_GROUPS_MAX	5
+ static void freeSyncSnapshotGroup(int idx);
+ 
  /*
   * Bookkeeping for tracking emulated transactions in recovery
   */
***
*** 182,187 
--- 192,199 
  CreateSharedProcArray(void)
  {
  	bool		found;
+ 	bool		foundSyncSnapshot;
+ 	Size		syncSnapshotSize;
  
  	/* Create or attach to the ProcArray shared structure */
  	procArray = (ProcArrayStruct *)
***
*** 189,194 
--- 201,212 
  			mul_size(sizeof(PGPROC *), PROCARRAY_MAXPROCS),
  			&found);
  
+ 	syncSnapshotSize = mul_size(sizeof(SyncSnapshotGroupStruct),
+ SYNC_SNAPSHOT_GROUPS_MAX);
+ 	syncSnapshotGroups = (SyncSnapshotGroupStruct *)
+ 		 ShmemInitStruct("Sync Snapshot Groups",
+ 		 syncSnapshotSize, &foundSyncSnapshot);
+ 
  	if (!found)
  	{
  		/*
***
*** 201,206 
--- 219,230 
  		if (XLogRequestRecoveryConnections)
  			KnownAssignedXidsInit(TOTAL_MAX_CACHED_SUBXIDS);
  	}
+ 	if (!foundSyncSnapshot)
+ 	{
+ 		int i;
+ 		for (i = 0; i < SYNC_SNAPSHOT_GROUPS_MAX; i++)
+ 			freeSyncSnapshotGroup(i);
+ 	}
  }
  
  /*
***
*** 2499,2501 
--- 2523,2755 
  
  	pfree(buf.data);
  }
+ 
+ static void
+ freeSyncSnapshotGroup(int idx)
+ {
+ 	/* caller needs to take care of proper locking */
+ 	syncSnapshotGroups[idx].id[0] = '\0';
+ 	syncSnapshotGroups[idx].refCount = 0;
+ }
+ 
+ /*
+  * This function is for synchronization of snapshots: It can be called by
+  * new transactions to get the same snapshots. Its signature is
+  *
+  * pg_synchronize_snapshots(text, int, int);
+  *
+  * The first parameter is an identifier so that several groups can request
+  * synchronized snapshots concurrently.
+  *
+  * The second parameter is the number of backends that are expected to connect
+  * in the current group (i.e.  same identifier).
+  *
+  * The third parameter is the timeout in milliseconds.
+  *
+  * Note that once we are holding the ProcArrayLock in shared mode we are
+  * severely hitting the usability of the database server: for example, nobody
+  * can commit nontrivial transactions during that time nor can you establish a
+  * new connection! This is why you need to be superuser to use this function.
+  *
+  * The idea is that from one connection you call for example
+  *
+  * pg_synchronize_snapshot('7bd0320c4ff9252716972e160fb33b8a', 4, 1000)
+  *
+  * and then have 1000ms to call with some other four (already connected)
+  * sessions
+  *
+  * BEGIN 

Re: [HACKERS] synchronized snapshots

2010-01-08 Thread Greg Stark
On Fri, Jan 8, 2010 at 7:36 PM, Joachim Wieland j...@mcknight.de wrote:
  * If all four pg_synchronize_snapshot_taken() calls return true and the


If we must have a timeout I think you should throw an error if the
timeout expires.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers