Re: [HACKERS] serializable read only deferrable
Dan Ports wrote: On Tue, Dec 07, 2010 at 10:14:24AM -0600, Kevin Grittner wrote: The only thing I'm worried about here is how much risk of starvation remains. You'd need to wait until there are no running r/w transactions accessing overlapping data sets; for some applications that might not be any better than waiting for the system to be idle. But I think there's no way around that, it's just the price you have to pay to get a snapshot that can never see an anomaly. Right -- this can't be a default behavior because of that, but it rounds out the options for backups and big reports. Without it you have the choice between the potential for other transactions to cancel because a cycle was completed by the READ ONLY transaction or getting a view of data which may not be consistent with the later state of the database[1]. This guarantees consistency without causing rollbacks, with the additional benefit of faster runtime by skipping SSI logic. Pseudo-code of idea (conveniently ignoring locking issues and non-serializable transactions): This seems reasonable to me. Let me know if you need help implementing it; I have some spare cycles right now. That would be great. I'm getting on a train today to go spend a week on vacation in New Orleans, and I've been fretting about where this patch is at compared to the release cycle. :-( I can suck down my hurricanes with a calmer mind if I know you're on this. :-) In conjunction with this feature, it would be great if you could take a look at how to recognize these conditions for a READ ONLY transaction which is running under SSI, and back it out of SSI when it hits that condition. SIRead predicate locks, conflicts, and other structures can be released, and we can stop checking the MVCC data on reads. Basically, we should be able to get to the DEFERRABLE type of state while running -- it's just that we might cause some number of transactions to cancel along the way before we hit that state. (These two seem likely to be less work if done at the same time.) -Kevin [1] It has struck me that the receipting example is one case of a more general pattern which I've frequently seen in business software which is vulnerable to this sort of anomaly -- batch processing. Basically, any time you have a control record which controls the batch into which detail is placed, if the control information is updated and that is committed while detail is still in flight, you can have this class of anomaly. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
On Tue, Dec 07, 2010 at 10:14:24AM -0600, Kevin Grittner wrote: Essentially, instead of adding dependencies as you go along and abort once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED transactions would assume the worst case from the start and thus be able to bypass the more detailed checks later on. Right -- such a transaction, having acquired a good snapshot, could release all SSI resources and run without any of the SSI overhead. Yes, this makes sense. If no running transaction has ever read, and will never read before COMMIT, any value that's modified by a concurrent transaction, then they will not create snapshot anomalies, and the current snapshot has a place in the serial ordering. With this scheme, you'd at least stand some chance of eventually acquiring a consistent snapshot, even in the case of an endless stream of overlapping READ WRITE transactions. Yeah, I'd been twisting ideas around trying to find a good way to do this; you've got it right at the conceptual level, I think. The only thing I'm worried about here is how much risk of starvation remains. You'd need to wait until there are no running r/w transactions accessing overlapping data sets; for some applications that might not be any better than waiting for the system to be idle. But I think there's no way around that, it's just the price you have to pay to get a snapshot that can never see an anomaly. Pseudo-code of idea (conveniently ignoring locking issues and non-serializable transactions): This seems reasonable to me. Let me know if you need help implementing it; I have some spare cycles right now. Dan -- Dan R. K. Ports MIT CSAILhttp://drkp.net/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Tom Lane t...@sss.pgh.pa.us wrote: I agree that letting it be changed back to read/write after that is surprising and unnecessary. Perhaps locking down the setting at the time of first grabbing a snapshot would be appropriate. IIRC that's how it works for transaction isolation level, and this seems like it ought to work the same. Agreed. I can create a patch today to implement this. The thing which jumps out first is that assign_transaction_read_only probably needs to move to variable.c so that it can reference FirstSnapshotSet as the transaction isolation code does. The alternative would be to include snapmgr.h in guc.c, which seems less appealing. Agreed? Other ideas? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
I wrote: Tom Lane t...@sss.pgh.pa.us wrote: I agree that letting it be changed back to read/write after that is surprising and unnecessary. Perhaps locking down the setting at the time of first grabbing a snapshot would be appropriate. IIRC that's how it works for transaction isolation level, and this seems like it ought to work the same. Agreed. I can create a patch today to implement this. Attached. Accomplished more through mimicry (based on setting transaction isolation level) than profound understanding of the code involved; but it passes all regression tests on both `make check` and `make installcheck-world`. This includes a new regression test that an attempt to change it after a query fails. I've poked at it with various ad hoc tests, and it is behaving as expected in those. I wasn't too confident how to word the new failure messages. -Kevin *** a/src/backend/commands/variable.c --- b/src/backend/commands/variable.c *** *** 544,549 show_log_timezone(void) --- 544,580 /* + * SET TRANSACTION READ ONLY and SET TRANSACTION READ WRITE + * + * These should be transaction properties which can be set in exactly the + * same points in time that transaction isolation may be set. + */ + bool + assign_transaction_read_only(bool value, bool doit, GucSource source) + { + if (FirstSnapshotSet) + { + ereport(GUC_complaint_elevel(source), + (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), +errmsg(read-only property must be set before any query))); + /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ + if (source != PGC_S_OVERRIDE) + return false; + } + else if (IsSubTransaction()) + { + ereport(GUC_complaint_elevel(source), + (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), +errmsg(read-only propery may not be changed in a subtransaction))); + /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ + if (source != PGC_S_OVERRIDE) + return false; + } + + return true; + } + + /* * SET TRANSACTION ISOLATION LEVEL */ *** a/src/backend/utils/misc/guc.c --- b/src/backend/utils/misc/guc.c *** *** 169,175 static bool assign_bonjour(bool newval, bool doit, GucSource source); static bool assign_ssl(bool newval, bool doit, GucSource source); static bool assign_stage_log_stats(bool newval, bool doit, GucSource source); static bool assign_log_stats(bool newval, bool doit, GucSource source); - static bool assign_transaction_read_only(bool newval, bool doit, GucSource source); static const char *assign_canonical_path(const char *newval, bool doit, GucSource source); static const char *assign_timezone_abbreviations(const char *newval, bool doit, GucSource source); static const char *show_archive_command(void); --- 169,174 *** *** 7837,7870 assign_log_stats(bool newval, bool doit, GucSource source) return true; } - static bool - assign_transaction_read_only(bool newval, bool doit, GucSource source) - { - /* Can't go to r/w mode inside a r/o transaction */ - if (newval == false XactReadOnly IsSubTransaction()) - { - ereport(GUC_complaint_elevel(source), - (errcode(ERRCODE_INVALID_PARAMETER_VALUE), -errmsg(cannot set transaction read-write mode inside a read-only transaction))); - /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ - if (source != PGC_S_OVERRIDE) - return false; - } - - /* Can't go to r/w mode while recovery is still active */ - if (newval == false XactReadOnly RecoveryInProgress()) - { - ereport(GUC_complaint_elevel(source), - (errcode(ERRCODE_INVALID_PARAMETER_VALUE), - errmsg(cannot set transaction read-write mode during recovery))); - /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ - if (source != PGC_S_OVERRIDE) - return false; - } - - return true; - } - static const char * assign_canonical_path(const char *newval, bool doit, GucSource source) { --- 7836,7841 *** a/src/include/commands/variable.h --- b/src/include/commands/variable.h *** *** 21,26 extern const char *show_timezone(void); --- 21,28 extern const char *assign_log_timezone(const char *value, bool doit, GucSource source); extern const char *show_log_timezone(void); + extern bool assign_transaction_read_only(bool value, + bool doit, GucSource source); extern
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov writes: Attached. Accomplished more through mimicry (based on setting transaction isolation level) than profound understanding of the code involved; but it passes all regression tests on both `make check` and `make installcheck-world`. This includes a new regression test that an attempt to change it after a query fails. I've poked at it with various ad hoc tests, and it is behaving as expected in those. Hmm. This patch disallows the case of creating a read-only subtransaction of a read-write parent. That's a step backwards. I'm not sure how we could enforce that the property not change after the first query of a subxact, but maybe we don't care that much? Do your optimizations pay attention to local read-only in a subxact? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Tom Lane t...@sss.pgh.pa.us wrote: Hmm. This patch disallows the case of creating a read-only subtransaction of a read-write parent. That's a step backwards. I'm not sure how we could enforce that the property not change after the first query of a subxact, but maybe we don't care that much? Do your optimizations pay attention to local read-only in a subxact? No, it's all about the top level transaction, as long as the subtransaction doesn't do anything which violates the requirements of the top level. (That is, if the top level is not READ ONLY, I can't do the optimizations, but it would cause no problem if a subtransaction was READ ONLY -- it just wouldn't allow any special optimizations.) I noticed that the standard seems (if I'm reading it correctly) to allow subtransactions to switch to more restrictive settings for both transaction isolation and read only status than the enclosing transaction, but not looser. I don't think it makes sense in PostgreSQL to say (for example) that the top level transaction is READ COMMITTED but the subtransaction is SERIALIZABLE, but it might make sense to say that the top level transaction is READ WRITE but the subtransaction is READ ONLY. And I see where I broke support for that in the patch. I can fix up the patch if to support it again if you like. (I think it's just a matter of replacing a few lines that I replaced in the original patch.) If you'd rather do it, I'll stay out of your way. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov writes: I noticed that the standard seems (if I'm reading it correctly) to allow subtransactions to switch to more restrictive settings for both transaction isolation and read only status than the enclosing transaction, but not looser. Yeah. My recollection is that we've discussed exactly this point with respect to isolation level, and decided that we couldn't (or it wasn't worthwhile to) allow serializable subxacts inside repeatable read. I don't know whether your patch will change that tradeoff. But I don't think it's really been discussed much with respect to read-only, perhaps because nobody's paid all that much attention to read-only at all. In any case, the behavior you state seems obviously correct, so let's see what we can do about getting closer to that. My guess is that a reasonable fix is to remember the read-only setting as of snapshot lockdown, and thereafter to allow changing from read-write to read-only but not vice versa. One thing to watch for is allowing subxact exit to restore the previous read-write state. (BTW it looks like assign_XactIsoLevel emits a rather useless debug message in that case, so that code could stand some cleanup too. Also, that code is set so that it will throw an error even if you're assigning the currently active setting, which maybe is overly strict? Not sure.) I can fix up the patch if to support it again if you like. (I think it's just a matter of replacing a few lines that I replaced in the original patch.) If you'd rather do it, I'll stay out of your way. Feel free to have a go at it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: One thing to watch for is allowing subxact exit to restore the previous read-write state. OK. BTW it looks like assign_XactIsoLevel emits a rather useless debug message in that case, so that code could stand some cleanup too. I'll take a look. Also, that code is set so that it will throw an error even if you're assigning the currently active setting, which maybe is overly strict? Not sure. The standard is tricky to read, but my reading of it is that only LOCAL changes are allowed after the transaction is underway (which I *think* effectively means a subtransaction), and those can't make the setting less strict -- you're allowed to specify the same level or more strict. There would be no harm from the perspective of anything I'm working on to allow an in-progress transaction to be set to what it already has, but that seems to invite confusion and error more than provide a helpful feature, as far as I can tell. I'm inclined not to allow it except at the start of a subtransaction, but don't feel strongly about it. I can fix up the patch if to support it again if you like. Feel free to have a go at it. Will do. I notice that I also removed the check for RecoveryInProgress(), which I will also restore. And maybe a regression test or two for the subtransaction usages -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov writes: Tom Lane t...@sss.pgh.pa.us wrote: Also, that code is set so that it will throw an error even if you're assigning the currently active setting, which maybe is overly strict? Not sure. The standard is tricky to read, but my reading of it is that only LOCAL changes are allowed after the transaction is underway (which I *think* effectively means a subtransaction), and those can't make the setting less strict -- you're allowed to specify the same level or more strict. There would be no harm from the perspective of anything I'm working on to allow an in-progress transaction to be set to what it already has, but that seems to invite confusion and error more than provide a helpful feature, as far as I can tell. I'm inclined not to allow it except at the start of a subtransaction, but don't feel strongly about it. Hmm ... (1) If the standard says that you're allowed to apply a redundant setting, I think we'd better accept that. (2) I'm not thrilled about the idea of tracking the equivalent of FirstSnapshotSet for each subtransaction, which I think would be necessary infrastructure to error-check this as tightly as you seem to have in mind. I'd prefer to be a bit laxer in order to have less overhead for what is in the end just nanny-ism. So the implementation I had in mind would allow SET TRANSACTION operations to occur later in a subxact, as long as they were redundant and weren't actually trying to change the active value. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Tom Lane t...@sss.pgh.pa.us wrote: If the standard says that you're allowed to apply a redundant setting, I think we'd better accept that. OK So the implementation I had in mind would allow SET TRANSACTION operations to occur later in a subxact, as long as they were redundant and weren't actually trying to change the active value. It's easy to see how I can allow changes in the subtransaction as long as they don't specify READ WRITE when the top level is READ ONLY, but it isn't obvious to me how to only allow it at the start of the subtransaction. I'm OK with taking the easy route on this aspect of things, but if someone needs READ ONLY to stick for the duration of a subtransaction, I'm not sure how to do that. (And I'm not sure you were actually suggesting that, either.) To restate, since I'm not sure how clear that is, what I have at the moment is: (1) A top level transaction can only set READ ONLY or READ WRITE until it has acquired its first snapshot. (2) A subtransaction can set it at will, as many times as desired, to match the top level or specify READ ONLY. (3) During recovery the setting cannot be changed from READ ONLY to READ WRITE. I'm not clear whether #2 is OK or how to do it only at the start. I haven't yet looked at the other issues you mentioned. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
On Dec8, 2010, at 20:39 , Kevin Grittner wrote: The standard is tricky to read, but my reading of it is that only LOCAL changes are allowed after the transaction is underway (which I *think* effectively means a subtransaction), and those can't make the setting less strict -- you're allowed to specify the same level or more strict. There would be no harm from the perspective of anything I'm working on to allow an in-progress transaction to be set to what it already has, but that seems to invite confusion and error more than provide a helpful feature, as far as I can tell. I'm inclined not to allow it except at the start of a subtransaction, but don't feel strongly about it. Hm, I think being able to assert that the isolation level really is SERIALIZABLE by simply doing SET TRANSACTION ISOLATION LEVEL SERIALIZABLE would be a great feature for SSI. Say you've written a trigger which enforces some complex constraint, but is correct only for SERIALIZABLE transactions. By simply sticking a SET TRANSACTION ISOLATION LEVEL SERIALIZABLE at the top of the trigger you'd both document that fact it is correct only for SERIALIZABLE transactions *and* prevent corruption should the isolation level be something else due to a pilot error. Nice, simply and quite effective. BTW, I hope to find some time this evening to review your more detailed proposal for serializable read only deferrable best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov writes: Tom Lane t...@sss.pgh.pa.us wrote: So the implementation I had in mind would allow SET TRANSACTION operations to occur later in a subxact, as long as they were redundant and weren't actually trying to change the active value. It's easy to see how I can allow changes in the subtransaction as long as they don't specify READ WRITE when the top level is READ ONLY, but it isn't obvious to me how to only allow it at the start of the subtransaction. I'm OK with taking the easy route on this aspect of things, but if someone needs READ ONLY to stick for the duration of a subtransaction, I'm not sure how to do that. (And I'm not sure you were actually suggesting that, either.) What I suggested was to not allow read-only - read-write state transitions except (1) before first snapshot in the main xact and (2) at subxact exit (the OVERRIDE case). That seems to accomplish the goal. Now it will also allow dropping down to read-only mid-subtransaction, but I don't think forbidding that case is worth extra complexity. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Florian Pflug f...@phlo.org wrote: Hm, I think being able to assert that the isolation level really is SERIALIZABLE by simply doing SET TRANSACTION ISOLATION LEVEL SERIALIZABLE would be a great feature for SSI. Say you've written a trigger which enforces some complex constraint, but is correct only for SERIALIZABLE transactions. By simply sticking a SET TRANSACTION ISOLATION LEVEL SERIALIZABLE at the top of the trigger you'd both document that fact it is correct only for SERIALIZABLE transactions *and* prevent corruption should the isolation level be something else due to a pilot error. Nice, simply and quite effective. It would be great to have a way within a trigger, or possibly other functions, to assert that the transaction isolation level is serializable. What gives me pause here is that the standard allows you to specify a more strict transaction isolation level within a subtransaction without error, so this way of spelling the feature is flirting with rather nonstandard behavior. Is there maybe a better way to check this? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov writes: Florian Pflug f...@phlo.org wrote: Say you've written a trigger which enforces some complex constraint, but is correct only for SERIALIZABLE transactions. By simply sticking a SET TRANSACTION ISOLATION LEVEL SERIALIZABLE at the top of the trigger you'd both document that fact it is correct only for SERIALIZABLE transactions *and* prevent corruption should the isolation level be something else due to a pilot error. Nice, simply and quite effective. It would be great to have a way within a trigger, or possibly other functions, to assert that the transaction isolation level is serializable. What gives me pause here is that the standard allows you to specify a more strict transaction isolation level within a subtransaction without error, so this way of spelling the feature is flirting with rather nonstandard behavior. Yes. This is not the way to provide a feature like that. Is there maybe a better way to check this? You can always read the current setting and throw an error if you don't like it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Tom Lane t...@sss.pgh.pa.us wrote: What I suggested was to not allow read-only - read-write state transitions except (1) before first snapshot in the main xact and (2) at subxact exit (the OVERRIDE case). That seems to accomplish the goal. Now it will also allow dropping down to read-only mid-subtransaction, but I don't think forbidding that case is worth extra complexity. Attached is version 2. I think it does everything we discussed, except that I'm not sure whether I addressed the assign_XactIsoLevel issue you mentioned, since you mentioned a debug message and I only see things that look like errors to me. If I did miss something, I'll be happy to take another look if you can point me to the right place. -Kevin *** a/src/backend/commands/variable.c --- b/src/backend/commands/variable.c *** *** 544,551 show_log_timezone(void) --- 544,595 /* + * SET TRANSACTION READ ONLY and SET TRANSACTION READ WRITE + * + * These should be transaction properties which can be set in exactly the + * same points in time that transaction isolation may be set. + */ + bool + assign_transaction_read_only(bool newval, bool doit, GucSource source) + { + /* Can't go to r/w mode inside a r/o transaction */ + if (newval == false XactReadOnly IsSubTransaction()) + { + ereport(GUC_complaint_elevel(source), + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), +errmsg(cannot set transaction read-write mode inside a read-only transaction))); + /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ + if (source != PGC_S_OVERRIDE) + return false; + } + /* Top level transaction can't change this after first snapshot. */ + else if (FirstSnapshotSet !IsSubTransaction()) + { + ereport(GUC_complaint_elevel(source), + (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), +errmsg(read-only property must be set before any query))); + /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ + if (source != PGC_S_OVERRIDE) + return false; + } + /* Can't go to r/w mode while recovery is still active */ + else if (newval == false XactReadOnly RecoveryInProgress()) + { + ereport(GUC_complaint_elevel(source), + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), +errmsg(cannot set transaction read-write mode during recovery))); + /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ + if (source != PGC_S_OVERRIDE) + return false; + } + + return true; + } + + /* * SET TRANSACTION ISOLATION LEVEL */ + extern char *XactIsoLevel_string; /* in guc.c */ const char * assign_XactIsoLevel(const char *value, bool doit, GucSource source) *** *** 559,565 assign_XactIsoLevel(const char *value, bool doit, GucSource source) if (source != PGC_S_OVERRIDE) return NULL; } ! else if (IsSubTransaction()) { ereport(GUC_complaint_elevel(source), (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), --- 603,610 if (source != PGC_S_OVERRIDE) return NULL; } ! else if (IsSubTransaction() ! strcmp(value, XactIsoLevel_string) != 0) { ereport(GUC_complaint_elevel(source), (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), *** a/src/backend/utils/misc/guc.c --- b/src/backend/utils/misc/guc.c *** *** 169,175 static bool assign_bonjour(bool newval, bool doit, GucSource source); static bool assign_ssl(bool newval, bool doit, GucSource source); static bool assign_stage_log_stats(bool newval, bool doit, GucSource source); static bool assign_log_stats(bool newval, bool doit, GucSource source); - static bool assign_transaction_read_only(bool newval, bool doit, GucSource source); static const char *assign_canonical_path(const char *newval, bool doit, GucSource source); static const char *assign_timezone_abbreviations(const char *newval, bool doit, GucSource source); static const char *show_archive_command(void); --- 169,174 *** *** 426,432 static int server_version_num; static char *timezone_string; static char *log_timezone_string; static char *timezone_abbreviations_string; - static char *XactIsoLevel_string; static char *custom_variable_classes; static intmax_function_args; static intmax_index_keys; --- 425,430 *** *** 441,446 static int effective_io_concurrency; --- 439,445 /* should be
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov writes: except that I'm not sure whether I addressed the assign_XactIsoLevel issue you mentioned, since you mentioned a debug message and I only see things that look like errors to me. If I did miss something, I'll be happy to take another look if you can point me to the right place. GUC_complaint_elevel() can return DEBUGn, and in fact will do so in the PGC_S_OVERRIDE case. I'm thinking that the code ought to look more like /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ if (source != PGC_S_OVERRIDE) { check and report all the complaint-worthy cases; } The original coding was probably sane for initial development, but now it just results in useless debug-log traffic for predictable perfectly valid cases. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Tom Lane t...@sss.pgh.pa.us wrote: GUC_complaint_elevel() can return DEBUGn, and in fact will do so in the PGC_S_OVERRIDE case. I'm thinking that the code ought to look more like /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ if (source != PGC_S_OVERRIDE) { check and report all the complaint-worthy cases; } Done. Version 3 attached. I think I've covered all bases now. -Kevin *** a/src/backend/commands/variable.c --- b/src/backend/commands/variable.c *** *** 544,572 show_log_timezone(void) /* * SET TRANSACTION ISOLATION LEVEL */ const char * assign_XactIsoLevel(const char *value, bool doit, GucSource source) { ! if (FirstSnapshotSet) { ! ereport(GUC_complaint_elevel(source), ! (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), !errmsg(SET TRANSACTION ISOLATION LEVEL must be called before any query))); ! /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ ! if (source != PGC_S_OVERRIDE) return NULL; ! } ! else if (IsSubTransaction()) ! { ! ereport(GUC_complaint_elevel(source), ! (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), !errmsg(SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction))); ! /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ ! if (source != PGC_S_OVERRIDE) return NULL; } if (strcmp(value, serializable) == 0) --- 544,615 /* + * SET TRANSACTION READ ONLY and SET TRANSACTION READ WRITE + * + * These should be transaction properties which can be set in exactly the + * same points in time that transaction isolation may be set. + */ + bool + assign_transaction_read_only(bool newval, bool doit, GucSource source) + { + /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ + if (source != PGC_S_OVERRIDE) + { + /* Can't go to r/w mode inside a r/o transaction */ + if (newval == false XactReadOnly IsSubTransaction()) + { + ereport(GUC_complaint_elevel(source), + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), +errmsg(cannot set transaction read-write mode inside a read-only transaction))); + return false; + } + /* Top level transaction can't change this after first snapshot. */ + else if (FirstSnapshotSet !IsSubTransaction()) + { + ereport(GUC_complaint_elevel(source), + (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), +errmsg(read-only property must be set before any query))); + return false; + } + /* Can't go to r/w mode while recovery is still active */ + else if (newval == false XactReadOnly RecoveryInProgress()) + { + ereport(GUC_complaint_elevel(source), + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), +errmsg(cannot set transaction read-write mode during recovery))); + return false; + } + } + + return true; + } + + /* * SET TRANSACTION ISOLATION LEVEL */ + extern char *XactIsoLevel_string; /* in guc.c */ const char * assign_XactIsoLevel(const char *value, bool doit, GucSource source) { ! /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */ ! if (source != PGC_S_OVERRIDE) { ! if (FirstSnapshotSet) ! { ! ereport(GUC_complaint_elevel(source), ! (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), !errmsg(SET TRANSACTION ISOLATION LEVEL must be called before any query))); return NULL; ! } ! else if (IsSubTransaction() ! strcmp(value, XactIsoLevel_string) != 0) ! { ! ereport(GUC_complaint_elevel(source), ! (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), !errmsg(SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction))); return NULL; + } } if (strcmp(value, serializable) == 0) *** a/src/backend/utils/misc/guc.c --- b/src/backend/utils/misc/guc.c *** *** 169,175 static bool assign_bonjour(bool newval, bool
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov wrote: Done. Version 3 attached. My final tweaks didn't make it into that diff. Attached is 3a as a patch over the version 3 patch. -Kevin *** a/src/backend/commands/variable.c --- b/src/backend/commands/variable.c *** *** 564,570 assign_transaction_read_only(bool newval, bool doit, GucSource source) return false; } /* Top level transaction can't change this after first snapshot. */ ! else if (FirstSnapshotSet !IsSubTransaction()) { ereport(GUC_complaint_elevel(source), (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), --- 564,570 return false; } /* Top level transaction can't change this after first snapshot. */ ! if (FirstSnapshotSet !IsSubTransaction()) { ereport(GUC_complaint_elevel(source), (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), *** *** 572,578 assign_transaction_read_only(bool newval, bool doit, GucSource source) return false; } /* Can't go to r/w mode while recovery is still active */ ! else if (newval == false XactReadOnly RecoveryInProgress()) { ereport(GUC_complaint_elevel(source), (errcode(ERRCODE_INVALID_PARAMETER_VALUE), --- 572,578 return false; } /* Can't go to r/w mode while recovery is still active */ ! if (newval == false XactReadOnly RecoveryInProgress()) { ereport(GUC_complaint_elevel(source), (errcode(ERRCODE_INVALID_PARAMETER_VALUE), *** *** 602,609 assign_XactIsoLevel(const char *value, bool doit, GucSource source) errmsg(SET TRANSACTION ISOLATION LEVEL must be called before any query))); return NULL; } ! else if (IsSubTransaction() ! strcmp(value, XactIsoLevel_string) != 0) { ereport(GUC_complaint_elevel(source), (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), --- 602,609 errmsg(SET TRANSACTION ISOLATION LEVEL must be called before any query))); return NULL; } ! /* We ignore a subtransaction setting it to the existing value. */ ! if (IsSubTransaction() strcmp(value, XactIsoLevel_string) != 0) { ereport(GUC_complaint_elevel(source), (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
On Dec6, 2010, at 22:53 , Kevin Grittner wrote: The alternative seems to be to drop the guarantee that a SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a stream of overlapping non-READ ONLY transactions. Then a flag in the proc array that marks non-READ ONLY transactions should be sufficient, plus a wait-and-retry loop to take snapshots for SERIALIZABLE READ ONLY DEFERRABLE transactions. If I can find a way to pause an active process I already have functions in which I maintain the count of active SERIALIZABLE READ WRITE transactions as they begin and end -- I could release pending DEFERRABLE transactions when the count hits zero without any separate loop. That has the added attraction of being a path to the more complex checking which could allow the deferrable process to start sooner in some circumstances. The simple solution with the heavyweight lock would not have been a good path to that. I'm starting to wonder if you couldn't get a weaker form of the non-starvation guarantee back by doing the waiting *after* you acquire the snapshot of a SERIALIZABLE RAD ONLY transaction instead of before. AFAICS, the main reason for a SERIALIZABLE RAD ONLY transaction's snapshot to be inconsistent that it sees some transaction A as committed and B as uncommitted when on the other hand B must happen before A in any serial schedule. In other words, if there is no dangerous structure even if you add an rw-dependency edge from the SERIALIZABLE RAD ONLY transaction to every concurrent transaction, the SERIALIZABLE RAD ONLY transaction's snapshot is consistent. I'm thus envisioning something along the line of 1) Take a snapshot, flag the transaction as SERIALIZABLE READ ONLY DEFERRED, and add a rw-dependency to every other running READ WRITE transaction 2) Wait for all these concurrent transaction to either COMMIT or ABORT 3) Check if the transaction has been marked INCONSISTENT. If not, let the transaction proceed. If it was, start over with (1) *) During conflict detection, you'd check if one of the participating transaction is flagged as SERIALIZABLE READ ONLY DEFERRED and mark it INCONSISTENT if it is. Essentially, instead of adding dependencies as you go along and abort once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED transactions would assume the worst case from the start and thus be able to bypass the more detailed checks later on. With this scheme, you'd at least stand some chance of eventually acquiring a consistent snapshot, even in the case of an endless stream of overlapping READ WRITE transactions. I have to admit though that I didn't really think this through thoroughly yet, it was more of a quick idea I got after pondering this for a bit before I went to bed yesterday. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Florian Pflug f...@phlo.org wrote: reason for a SERIALIZABLE READ ONLY transaction's snapshot to be inconsistent that it sees some transaction A as committed and B as uncommitted when on the other hand B must happen before A in any serial schedule. Precisely right, and very well stated. I'm thus envisioning something along the line of 1) Take a snapshot, flag the transaction as SERIALIZABLE READ ONLY DEFERRED, and add a rw-dependency to every other running READ WRITE transaction 2) Wait for all these concurrent transaction to either COMMIT or ABORT 3) Check if the transaction has been marked INCONSISTENT. If not, let the transaction proceed. If it was, start over with (1) *) During conflict detection, you'd check if one of the participating transaction is flagged as SERIALIZABLE READ ONLY DEFERRED and mark it INCONSISTENT if it is. That is brilliant. Essentially, instead of adding dependencies as you go along and abort once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED transactions would assume the worst case from the start and thus be able to bypass the more detailed checks later on. Right -- such a transaction, having acquired a good snapshot, could release all SSI resources and run without any of the SSI overhead. With this scheme, you'd at least stand some chance of eventually acquiring a consistent snapshot, even in the case of an endless stream of overlapping READ WRITE transactions. Yeah, I'd been twisting ideas around trying to find a good way to do this; you've got it right at the conceptual level, I think. I have to admit though that I didn't really think this through thoroughly yet, it was more of a quick idea I got after pondering this for a bit before I went to bed yesterday. [reads through it a few more times, sips caffeine, and thinks] Really, what you care about is whether any of the READ WRITE transactions active at the time the snapshot was acquired commit after developing a rw-conflict with a transaction which committed before the READ ONLY DEFERRABLE snapshot was acquired. (The reader would have to appear first in any serial schedule, yet the READ ONLY transaction can see the effects of the writer but not the reader.) Which brings up another point, that reader must also write to a permanent table before it commits in order to become the pivot in the dangerous structure. Pseudo-code of idea (conveniently ignoring locking issues and non-serializable transactions): // serializable read only deferrable xact do { get a snapshot clear inconsistent flag if (no concurrent read write xacts) break; // we got it the easy way associate all active read write xacts with this xact block until told to wake } while (inconsistent); clear xact from any SSI structures its in run with the snapshot // each xact associated with the above on transaction completion if (commit and has written and has conflict out to xact committed before deferrable snapshot) { flag deferrable as inconsistent unblock deferrable xact } else if this is termination of last associated read write xact unblock deferrable xact Seem sane? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Tom Lane t...@sss.pgh.pa.us wrote: I assume this would have to be a hard definition of READ ONLY, not the rather squishy definition we use now? Oh, I just went through the code on setting READ ONLY and discovered that contrary to the standard *and* the PostgreSQL documentation, you can change the status of a transaction between READ ONLY and READ WRITE at will. Yeah, that's a problem for my intended use. Many optimizations would need to go right out the window, and the false positive rate under SSI would be high. How would we manage the compatibility implications? Comply with the standard. The bright side of this is that it wouldn't require any change to our user docs. http://www.postgresql.org/docs/current/interactive/sql-start-transaction.html | This command begins a new transaction block. If the isolation | level or read/write mode is specified, the new transaction has | those characteristics, as if SET TRANSACTION was executed. This is | the same as the BEGIN command. and on the same page: | Compatibility | | In the standard, it is not necessary to issue START TRANSACTION to | start a transaction block: any SQL command implicitly begins a | block. PostgreSQL's behavior can be seen as implicitly issuing a | COMMIT after each command that does not follow START TRANSACTION | (or BEGIN), and it is therefore often called autocommit. Other | relational database systems might offer an autocommit feature as a | convenience. No mention of and you can change back and forth between READ ONLY and READ WRITE any time during the transaction, including between reads and writes, as many times as you like. Was there a justification for this behavior, or was it just not implemented carefully? Does anyone currently depend on the current behavior? test=# create table asdf (id int not null primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index asdf_pkey for table asdf CREATE TABLE test=# set default_transaction_isolation = serializable; SET test=# set transaction read only; SET BEGIN test=# set transaction read only; SET test=# select 1; ?column? -- 1 (1 row) test=# set transaction read write; SET test=# insert into asdf values (1); INSERT 0 1 test=# set transaction read only; SET test=# select * from asdf; id 1 (1 row) test=# set transaction read write; SET test=# insert into asdf values (2); INSERT 0 1 test=# commit; COMMIT I find that to be a huge POLA violation. I will happily prepare a patch to fix this if there is agreement that we want it. I really need READ ONLY *transactions*, not READ ONLY *moments* within transactions to do any optimization based on the property. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov writes: Oh, I just went through the code on setting READ ONLY and discovered that contrary to the standard *and* the PostgreSQL documentation, you can change the status of a transaction between READ ONLY and READ WRITE at will. Yeah, that's a problem for my intended use. Many optimizations would need to go right out the window, and the false positive rate under SSI would be high. I believe you had better support the locution begin; set transaction read only; ... I agree that letting it be changed back to read/write after that is surprising and unnecessary. Perhaps locking down the setting at the time of first grabbing a snapshot would be appropriate. IIRC that's how it works for transaction isolation level, and this seems like it ought to work the same. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
I wrote: Tom Lane wrote: I assume this would have to be a hard definition of READ ONLY, not the rather squishy definition we use now? I'm excluding temporary tables from SSI on the grounds that they are only read and written by a single transaction and therefore can't be a source of rw-dependencies, and I'm excluding system tables on the grounds that they don't follow normal snapshot isolation rules. Hint bit rewrites are not an issue for SSI. Are there any other squishy aspect I might need to consider? I reviewed the documentation and played around with this a bit and can't find any areas where the current PostgreSQL implementation of READ ONLY is incompatible with what is needed for the SSI optimizations where it is used. There are a large number of tests which exercise this, and they're all passing. Did you have something in particular in mind which I should check? An example of code you think might break would be ideal, but anything more concrete than the word squishy would be welcome. Any thoughts on the original question about what to use as a heavyweight lock to support the subject feature? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov writes: I reviewed the documentation and played around with this a bit and can't find any areas where the current PostgreSQL implementation of READ ONLY is incompatible with what is needed for the SSI optimizations where it is used. There are a large number of tests which exercise this, and they're all passing. Did you have something in particular in mind which I should check? I did not, just thought it was a point that merited examination. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
On Dec5, 2010, at 16:11 , Kevin Grittner wrote: The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under SSI would be to have each non-read-only serializable transaction acquire a heavyweight lock which can coexist with other locks at the same level (SHARE looks good) on some common object and hold that for the duration of the transaction, while a SERIALIZABLE READ ONLY DEFERRABLE transaction would need to acquire a conflicting lock (EXCLUSIVE looks good) before it could acquire a snapshot, and release the lock immediately after acquiring the snapshot. Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to acquire the lock, no other transaction would be allowed to start until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been able to acquire its snapshot. For pg_dump's purposes at least, that seems undesirable, since a single long-running transaction at the time you start pg_dump would effectly DoS your system until the long-running transaction finishes. The alternative seems to be to drop the guarantee that a SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a stream of overlapping non-READ ONLY transactions. Then a flag in the proc array that marks non-READ ONLY transactions should be sufficient, plus a wait-and-retry loop to take snapshots for SERIALIZABLE READ ONLY DEFERRABLE transactions. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Florian Pflug f...@phlo.org wrote: On Dec5, 2010, at 16:11 , Kevin Grittner wrote: The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under SSI would be to have each non-read-only serializable transaction acquire a heavyweight lock which can coexist with other locks at the same level (SHARE looks good) on some common object and hold that for the duration of the transaction, while a SERIALIZABLE READ ONLY DEFERRABLE transaction would need to acquire a conflicting lock (EXCLUSIVE looks good) before it could acquire a snapshot, and release the lock immediately after acquiring the snapshot. Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to acquire the lock, no other transaction would be allowed to start until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been able to acquire its snapshot. For pg_dump's purposes at least, that seems undesirable, since a single long-running transaction at the time you start pg_dump would effectly DoS your system until the long-running transaction finishes. Well, when you put it that way, it sounds pretty grim. :-( Since one of the bragging points of SSI is that it doesn't introduce any blocking beyond current snapshot isolation, I don't want to do something here which blocks anything except the transaction which has explicitly requested the DEFERRABLE property. I guess that, simple as that technique might be, it just isn't a good idea. The alternative seems to be to drop the guarantee that a SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a stream of overlapping non-READ ONLY transactions. Then a flag in the proc array that marks non-READ ONLY transactions should be sufficient, plus a wait-and-retry loop to take snapshots for SERIALIZABLE READ ONLY DEFERRABLE transactions. If I can find a way to pause an active process I already have functions in which I maintain the count of active SERIALIZABLE READ WRITE transactions as they begin and end -- I could release pending DEFERRABLE transactions when the count hits zero without any separate loop. That has the added attraction of being a path to the more complex checking which could allow the deferrable process to start sooner in some circumstances. The simple solution with the heavyweight lock would not have been a good path to that. What would be the correct way for a process to put itself to sleep, and for another process to later wake it up? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
On 06.12.2010 22:53, Kevin Grittner wrote: What would be the correct way for a process to put itself to sleep, and for another process to later wake it up? See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 06.12.2010 22:53, Kevin Grittner wrote: What would be the correct way for a process to put itself to sleep, and for another process to later wake it up? See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code. Is there a reason to prefer one over the other? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Kevin Grittner kevin.gritt...@wicourts.gov writes: I'm reviving the discussion on the subject topic because I just had an epiphany which makes it seem simple to implement. The concept of this is that if you start a SERIALIZABLE READ ONLY transaction in an SSI environment when certain conditions are true, it doesn't need to acquire predicate locks or test for rw-conflicts. I assume this would have to be a hard definition of READ ONLY, not the rather squishy definition we use now? How would we manage the compatibility implications? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Tom Lane wrote: Kevin Grittner writes: I'm reviving the discussion on the subject topic because I just had an epiphany which makes it seem simple to implement. The concept of this is that if you start a SERIALIZABLE READ ONLY transaction in an SSI environment when certain conditions are true, it doesn't need to acquire predicate locks or test for rw-conflicts. I assume this would have to be a hard definition of READ ONLY, not the rather squishy definition we use now? How would we manage the compatibility implications? If there are issues with whether READ ONLY covers the right ground, it's likely to affect more than this particular issue -- I've taken advantage of the READ ONLY property of transactions to allow quite a few novel optimizations to SSI beyond what is presented in Cahill's doctoral work. I'm excluding temporary tables from SSI on the grounds that they are only read and written by a single transaction and therefore can't be a source of rw-dependencies, and I'm excluding system tables on the grounds that they don't follow normal snapshot isolation rules. Hint bit rewrites are not an issue for SSI. Are there any other squishy aspect I might need to consider? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers