Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 11 December 2012 03:01, Noah Misch n...@leadboat.com wrote: On Mon, Dec 10, 2012 at 08:04:55PM -0500, Robert Haas wrote: I think the current behavior, where we treat FREEZE as a hint, is just awful. Regardless of whether the behavior is automatic or manually requested, the idea that you might get the optimization or not depending on the timing of relcache flushes seems very much undesirable. I mean, if the optimization is actually important for performance, then you want to get it when you ask for it. If it isn't, then why bother having it at all? Let's say that COPY FREEZE normally doubles performance on a data load that therefore takes 8 hours - somebody who suddenly loses that benefit because of a relcache flush that they can't prevent or control and ends up with a 16 hour data load is going to pop a gasket. Until these threads, I did not know that a relcache invalidation could trip up the WAL avoidance optimization, and now this. I poked at the relevant relcache.c code, and it already takes pains to preserve the needed facts. The header comment of RelationCacheInvalidate() indicates that entries bearing an rd_newRelfilenodeSubid can safely survive the invalidation, but the code does not implement that. I think the comment is right, and this is just an oversight in the code going back to its beginning (fba8113c). I think the comment is right also and so the patch is good. I will apply, barring objections. The information is only ever non-zero inside a single backend. There isn't any reason I can see why we wouldn't be able to remember this information in all cases, perhaps with a few push-ups. I doubt the comment at the declaration of rd_createSubid in rel.h, though I can't presently say what correct comment should replace it. rd_createSubid certainly does *not* get blown away by a message overflow as copy.c claims. I can't see any other way for a message overflow to cause it to be reset. I can no longer see a reason for us to regard the rd_createSubid as merely a hint. So we should change copy.c also. CLUSTER does preserve the old value, at least for the main table relation. CLUSTER probably should *set* rd_newRelfilenodeSubid. I can't see a reason not to do this in terms of correctness. However, I can't see any reason why you'd want to CLUSTER a table and then load more data into it in the same transaction, so I'm tempted to just leave that as is to avoid introducing other bugs. But I dare say people will think we should fix it there also. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Fri, Dec 21, 2012 at 06:47:56PM +, Simon Riggs wrote: On 11 December 2012 03:01, Noah Misch n...@leadboat.com wrote: Until these threads, I did not know that a relcache invalidation could trip up the WAL avoidance optimization, and now this. I poked at the relevant relcache.c code, and it already takes pains to preserve the needed facts. The header comment of RelationCacheInvalidate() indicates that entries bearing an rd_newRelfilenodeSubid can safely survive the invalidation, but the code does not implement that. I think the comment is right, and this is just an oversight in the code going back to its beginning (fba8113c). I think the comment is right also and so the patch is good. I will apply, barring objections. The information is only ever non-zero inside a single backend. There isn't any reason I can see why we wouldn't be able to remember this information in all cases, perhaps with a few push-ups. I doubt the comment at the declaration of rd_createSubid in rel.h, though I can't presently say what correct comment should replace it. rd_createSubid certainly does *not* get blown away by a message overflow as copy.c claims. I can't see any other way for a message overflow to cause it to be reset. I can no longer see a reason for us to regard the rd_createSubid as merely a hint. So we should change copy.c also. I thought of one case where we do currently forget rd_newRelfilenodeSubid: BEGIN; TRUNCATE t; SAVEPOINT save; TRUNCATE t; ROLLBACK TO save; I don't mind that one too much. CLUSTER does preserve the old value, at least for the main table relation. CLUSTER probably should *set* rd_newRelfilenodeSubid. I can't see a reason not to do this in terms of correctness. However, I can't see any reason why you'd want to CLUSTER a table and then load more data into it in the same transaction, so I'm tempted to just leave that as is to avoid introducing other bugs. But I dare say people will think we should fix it there also. I could see using that capability occasionally, but I wouldn't mix such a change in with the goals of this thread. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 21 December 2012 20:10, Noah Misch n...@leadboat.com wrote: I thought of one case where we do currently forget rd_newRelfilenodeSubid: BEGIN; TRUNCATE t; SAVEPOINT save; TRUNCATE t; ROLLBACK TO save; That's a weird one. Aborting a subtransacton that sets it, when it was already set. The loss of rd_newRelfilenodeSubid in that case is deterministic, but tracking the full complexity of multiple relations and multiple nested subxids isn't worth the trouble for such rare cases [assumption]. I'd go for just setting an its_too_complex flag (with better name) that we can use to trigger a message in COPY to say that FREEZE option won't be honoured. That would then be completely consistent, rather than the lack of deterministic behaviour that Robert rightly objects to. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Sat, Dec 22, 2012 at 12:42:43AM +, Simon Riggs wrote: On 21 December 2012 20:10, Noah Misch n...@leadboat.com wrote: I thought of one case where we do currently forget rd_newRelfilenodeSubid: BEGIN; TRUNCATE t; SAVEPOINT save; TRUNCATE t; ROLLBACK TO save; That's a weird one. Aborting a subtransacton that sets it, when it was already set. The loss of rd_newRelfilenodeSubid in that case is deterministic, but tracking the full complexity of multiple relations and multiple nested subxids isn't worth the trouble for such rare cases [assumption]. I'd go for just setting an its_too_complex flag (with better name) that we can use to trigger a message in COPY to say that FREEZE option won't be honoured. That would then be completely consistent, rather than the lack of deterministic behaviour that Robert rightly objects to. I wouldn't bother. The behavior here is deterministic, the cause clearly traceable to the specific commands issued. Stable software won't suddenly miss the optimization for no visible reason. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Mon, Dec 10, 2012 at 08:04:55PM -0500, Robert Haas wrote: You know, I hadn't been taking that option terribly seriously, but maybe we ought to reconsider it. It would certainly be simpler, and as you point out, it's not really any worse from an MVCC point of view than anything else we do. Moreover, it would make this available to clients like pg_dump without further hackery. I think the current behavior, where we treat FREEZE as a hint, is just awful. Regardless of whether the behavior is automatic or manually requested, the idea that you might get the optimization or not depending on the timing of relcache flushes seems very much undesirable. I mean, if the optimization is actually important for performance, then you want to get it when you ask for it. If it isn't, then why bother having it at all? Let's say that COPY FREEZE normally doubles performance on a data load that therefore takes 8 hours - somebody who suddenly loses that benefit because of a relcache flush that they can't prevent or control and ends up with a 16 hour data load is going to pop a gasket. Why was this patch applied when there are obviously so many concerns about its behavior? Was that not clear at commit time? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Simon, * Simon Riggs (si...@2ndquadrant.com) wrote: Agreed, but that is also be a silent change of current behaviour. Sure, proper MVCC for catalog entries would be a change, but I think it's one which would actually reduce the surprises for our users. Today we tell people we have transactional DDL, and that's somewhat true, but it's not MVCC-safe and that can lead to surprises. But the above will only work for CREATE TABLE, not for TRUNCATE. I'm trying to figure out why there are all the constraints around this command to begin with. If we're going to support this, why do we require the user to create or truncate the table in the same transaction? Clearly that's going to reduce the usefulness of this feature and it's not clear to me why that constraint is needed, code-wise. Also, what about adding FREEZE options to INSERT and maybe even UPDATE? Surely it would reduce the overhead associated with those commands as well. I've invested a lot of time in trying to improve the situation and investigated many routes forwards, none of them very satisfying. Until someone comes up with a better plan, FREEZE is a pragmatic way forwards that improves things now rather than waiting for the perfect solution. I agree that the perfect can sometimes be the enemy of the good, but I still feel that this is quite a slippery slope that's going to end up getting ourselves into trouble- regardless of how much we document it or set up constraints around it. And if we want checksums anytime soon we need ways to ameliorate the effect of hints on checksums, which this does, soemwhat. I don't buy into this argument in the least. Better plans, with code, welcome. While I appreciate the mentality that those-who-code-win, I don't believe that it can be used in an argument based on *correctness*. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Mon, Dec 10, 2012 at 08:32:53AM -0500, Stephen Frost wrote: * Noah Misch (n...@leadboat.com) wrote: On Fri, Dec 07, 2012 at 06:51:18PM -0500, Stephen Frost wrote: Now, what I've honestly been hoping for on this thread was for someone to speak up and point out why I'm wrong about this concern and explain how this patch addresses that issue. If that's been done, I've missed it.. [...] So, apparently I'm not wrong about my concern, but no one seems to share my feelings on this change. I continue to hold that this could end up being a slippery slope for us to go down wrt 'correctness' vs. 'do whatever the user wants'. I agree we should be reticent to compromise correctness for convenience. Compromising mere bug-compatibility, trading one incorrect behavior for another incorrect behavior, is not as bad. Furthermore, today's behavior in question is not something I can see applications deliberately and successfully relying upon. If we keep this to only COPY and where the table has to be truncated/created in the same transaction (which requires the user to have sufficient privileges to do non-MVCC-safe things on the table already), perhaps it's alright. Extending it to cases not involving a just-created or just-truncated table really would compromise correctness; errors could leave the table in an otherwise-impossible state. Let's indeed not go there. I see no particular need to restrict this to COPY; that's just the most important case by far. As a side note, the calculus for whether to extend the optimization to INSERT and UPDATE differs from that for WAL avoidance. WAL avoidance can be a substantial loss when the total change is small. For pre-hinting/freezing, the worst case is having needlessly checked a few local variables to rule out the optimization. It'll definitely reduce the interest in finding a real solution though, which is unfortunate. That effect seems likely, but I do not find it unfortunate. The change variant I have advocated does not stand in contrast to some real solution to PostgreSQL's treatment for readers of tables created or truncated by a transaction not in the reader's snapshot. The two topics interact at arm's length. Bundling them into one patch, artificially making them to stand or fall as a pair, is not a win for PostgreSQL. That does raise another disadvantage of making the change syntax-controlled: if we someday implement the other improvement, COPY FREEZE will have minimal reason not to be the default. FREEZE then becomes a relic noise word. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Sun, Dec 9, 2012 at 3:06 PM, Noah Misch n...@leadboat.com wrote: I favor[1] unconditionally letting older snapshots see the new rows after the CREATE+COPY transaction commits. To recap, making affected scans see an empty table is as wrong as making them see those rows. Robert also listed[2] that as a credible option, and I don't recall anyone opining against it in previous discussions. I did perceive an undercurrent preference, all other things being equal, for an optimization free from semantic side-effects. I shared that preference, but investigations showed that we must compromise something. You know, I hadn't been taking that option terribly seriously, but maybe we ought to reconsider it. It would certainly be simpler, and as you point out, it's not really any worse from an MVCC point of view than anything else we do. Moreover, it would make this available to clients like pg_dump without further hackery. I think the current behavior, where we treat FREEZE as a hint, is just awful. Regardless of whether the behavior is automatic or manually requested, the idea that you might get the optimization or not depending on the timing of relcache flushes seems very much undesirable. I mean, if the optimization is actually important for performance, then you want to get it when you ask for it. If it isn't, then why bother having it at all? Let's say that COPY FREEZE normally doubles performance on a data load that therefore takes 8 hours - somebody who suddenly loses that benefit because of a relcache flush that they can't prevent or control and ends up with a 16 hour data load is going to pop a gasket. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Noah, * Noah Misch (n...@leadboat.com) wrote: I agree we should be reticent to compromise correctness for convenience. Compromising mere bug-compatibility, trading one incorrect behavior for another incorrect behavior, is not as bad. Furthermore, today's behavior in question is not something I can see applications deliberately and successfully relying upon. I actually don't agree with the notion that one bad bug should allow us to introduce additional such bugs. I agree that it's unlikely that applications are depending on today's behavior of TRUNCATE making concurrent transactions see an empty table, but it does *not* follow that applications *won't* start depending on this new behavior of COPY FREEZE. Extending it to cases not involving a just-created or just-truncated table really would compromise correctness; errors could leave the table in an otherwise-impossible state. Let's indeed not go there. Even if we could fix that, I'd be against allowing it arbitrairly for any regular user INSERT or UPDATE; I'm still not particularly happy with this approach for COPY. It'll definitely reduce the interest in finding a real solution though, which is unfortunate. That effect seems likely, but I do not find it unfortunate. The change variant I have advocated does not stand in contrast to some real solution to PostgreSQL's treatment for readers of tables created or truncated by a transaction not in the reader's snapshot. The two topics interact at arm's length. Bundling them into one patch, artificially making them to stand or fall as a pair, is not a win for PostgreSQL. Having proper MVCC support for DDL *would* be a win for PostgreSQL and this *does* reduce the chances of that ever happening. That does raise another disadvantage of making the change syntax-controlled: if we someday implement the other improvement, COPY FREEZE will have minimal reason not to be the default. FREEZE then becomes a relic noise word. Indeed, that's certainly unfortunate as well. Really, though, it just goes to show how much of a hack this is rather than a real solution. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
* Robert Haas (robertmh...@gmail.com) wrote: You know, I hadn't been taking that option terribly seriously, but maybe we ought to reconsider it. It would certainly be simpler, and as you point out, it's not really any worse from an MVCC point of view than anything else we do. Moreover, it would make this available to clients like pg_dump without further hackery. I really don't agree with this notion that the behavior of TRUNCATE, a top-level, seperately permissioned command, makes it OK to introduce other busted behavior in existing commands. I think the current behavior, where we treat FREEZE as a hint, is just awful. I agree that it's pretty grotty, but I had assumed it was at least deterministic, ala TRUNCATE/COPY and WAL... If it isn't, then this certainly gets really ugly really quickly. I don't think that means we should go ahead and try to always optimize it though- even when it isn't explicit, there will be an expectation that it's going to work when all the 'right' conditions are met. I know that's certainly how I feel about TRUNCATE/COPY and WAL'ing. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Mon, Dec 10, 2012 at 08:04:55PM -0500, Robert Haas wrote: I think the current behavior, where we treat FREEZE as a hint, is just awful. Regardless of whether the behavior is automatic or manually requested, the idea that you might get the optimization or not depending on the timing of relcache flushes seems very much undesirable. I mean, if the optimization is actually important for performance, then you want to get it when you ask for it. If it isn't, then why bother having it at all? Let's say that COPY FREEZE normally doubles performance on a data load that therefore takes 8 hours - somebody who suddenly loses that benefit because of a relcache flush that they can't prevent or control and ends up with a 16 hour data load is going to pop a gasket. Until these threads, I did not know that a relcache invalidation could trip up the WAL avoidance optimization, and now this. I poked at the relevant relcache.c code, and it already takes pains to preserve the needed facts. The header comment of RelationCacheInvalidate() indicates that entries bearing an rd_newRelfilenodeSubid can safely survive the invalidation, but the code does not implement that. I think the comment is right, and this is just an oversight in the code going back to its beginning (fba8113c). I doubt the comment at the declaration of rd_createSubid in rel.h, though I can't presently say what correct comment should replace it. CLUSTER does preserve the old value, at least for the main table relation. CLUSTER probably should *set* rd_newRelfilenodeSubid. Thanks, nm *** a/src/backend/utils/cache/relcache.c --- b/src/backend/utils/cache/relcache.c *** *** 2149,2156 RelationCacheInvalidate(void) /* Must close all smgr references to avoid leaving dangling ptrs */ RelationCloseSmgr(relation); ! /* Ignore new relations, since they are never cross-backend targets */ ! if (relation-rd_createSubid != InvalidSubTransactionId) continue; relcacheInvalsReceived++; --- 2149,2162 /* Must close all smgr references to avoid leaving dangling ptrs */ RelationCloseSmgr(relation); ! /* !* Ignore new relations; no other backend will manipulate them before !* we commit. Likewise, before replacing a relation's relfilenode, we !* shall have acquired AccessExclusiveLock and drained any applicable !* pending invalidations. !*/ ! if (relation-rd_createSubid != InvalidSubTransactionId || ! relation-rd_newRelfilenodeSubid != InvalidSubTransactionId) continue; relcacheInvalsReceived++; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Mon, Dec 10, 2012 at 08:54:04PM -0500, Stephen Frost wrote: I agree that it's unlikely that applications are depending on today's behavior of TRUNCATE making concurrent transactions see an empty table, but it does *not* follow that applications *won't* start depending on this new behavior of COPY FREEZE. That is a good point. I'm not sure whether that should worry us enough to implement an error in the scenario before letting COPY write frozen tuples. But it's the strongest argument I've seen for imposing that prerequisite. Even if we could fix that, I'd be against allowing it arbitrairly for any regular user INSERT or UPDATE; I'm still not particularly happy with this approach for COPY. Sure; COPY is the 99% important case here. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Mon, Dec 10, 2012 at 7:02 PM, Stephen Frost sfr...@snowman.net wrote: I continue to hold that this could end up being a slippery slope for us to go down wrt 'correctness' vs. 'do whatever the user wants'. If we keep this to only COPY and where the table has to be truncated/created in the same transaction (which requires the user to have sufficient privileges to do non-MVCC-safe things on the table already), perhaps it's alright. It'll definitely reduce the interest in finding a real solution though, which is unfortunate. I wonder if something more complete can be done by forcing COPY FREEZE or whatever we call it to take an exclusive lock on the table and run loading as an append-only operation. By taking a strong lock, we will block out any concurrent read/writes to the table. If an error occurs while loading the data, the table will be truncated at the previously recorded size. We may need some additional book keeping and WAL logging to handle crash recovery. To solve the visibility issue for old snapshots that should not see the new data, we can store some additional visibility information in the pg_class itself. For example, we can store the size of the table before the COPY FREEZE started and the XID of the COPY FREEZE operation. An old snapshot that can not see the XID, can not see the tuples inserted in the new blocks either. Once the COPY FREEZE finishes and the lock on the relation is released, new transactions can start writing to the table and write past the old size of the table. But that should be OK. If an old snapshot can't see the tuples inserted by COPY FREEZE, AFAIK it can't see any of those other tuples as well. I'm sure there will still be challenges with this approach. But I wonder if we can guarantee correctness by proper use of synchronization and still avoid multiple writes for most common data loading scenarios. Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Mon, 2012-12-10 at 08:16 -0500, Stephen Frost wrote: I'm trying to figure out why there are all the constraints around this command to begin with. If we're going to support this, why do we require the user to create or truncate the table in the same transaction? Clearly that's going to reduce the usefulness of this feature and it's not clear to me why that constraint is needed, code-wise. There is a very specific reason: If you insert frozen tuples into a table that already has tuples in it, then you no longer have just an isolation issue, you have an *atomicity* issue (and probably a number of other serious issues) because you can't roll back. Doing it in the same transaction as the table was created leaves you with a way to roll back: just delete the table (which will happen implicitly because the creation is part of the same transaction anyway). Perhaps we can take some partial steps toward MVCC-safe access? For instance, how about trying to recheck the xmin of a pg_class tuple when starting a scan? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 8 December 2012 22:18, Stephen Frost sfr...@snowman.net wrote: So the committed feature does address the visibility issue. Not hardly. It lets a user completely violate the basic rules of the overall database. The *correct* solution to this problem is to actually *fix* it, by setting it up such that tables created after a particular transaction starts aren't visible and similar. Agreed, but that is also be a silent change of current behaviour. But the above will only work for CREATE TABLE, not for TRUNCATE. I've invested a lot of time in trying to improve the situation and investigated many routes forwards, none of them very satisfying. Until someone comes up with a better plan, FREEZE is a pragmatic way forwards that improves things now rather than waiting for the perfect solution. And if we want checksums anytime soon we need ways to ameliorate the effect of hints on checksums, which this does, soemwhat. Better plans, with code, welcome. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Wed, Dec 05, 2012 at 07:43:08PM -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Dec 4, 2012 at 3:38 PM, Jeff Davis pg...@j-davis.com wrote: After reading that thread, I still don't understand why it's unsafe to set HEAP_XMIN_COMMITTED in those conditions. Even if it is, I would think that a sufficiently narrow case -- such as CTAS outside of a transaction block -- would be safe, along with some slightly broader cases (like BEGIN; CREATE TABLE; INSERT/COPY). I haven't looked at the committed patch - which seemed a bit precipitous to me given the stage the discussion was at - but I believe the general issue with HEAP_XMIN_COMMITTED is that there might be other snapshots in the same transaction, for example from open cursors. From memory, the tqual.c code assumes that any tuple with XMIN_COMMITTED couldn't possibly be from its own transaction, and thus it doesn't make the tests that would be appropriate for a tuple that is from the current transaction. Maybe it's all right anyway (i.e. if we should always treat such a tuple as good) but I don't recall exactly what's tested in those paths. I don't see semantics preservable by freezing, yet omitting HEAP_XMIN_COMMITTED. The HeapTupleHeaderGetCmin(tuple) = snapshot-curcid test is the one at risk. HeapTupleSatisfiesMVCC() does skip that test for HEAP_XMIN_COMMITTED tuples, but seeing xmin==FrozenTransactionId hampers it all the more. What if one of the preconditions for the optimization were the equivalent of CheckTableNotInUse()? I cannot immediately think of a older-cmin-scan source not caught thereby. Unmodified HeapTupleSatisfiesMVCC() will then suffice. Happily, it's not a restriction users will regularly encounter. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Fri, Dec 07, 2012 at 06:51:18PM -0500, Stephen Frost wrote: * Jeff Davis (pg...@j-davis.com) wrote: Most of your concerns seem to be related to freezing, and I'm mostly interested in HEAP_XMIN_COMMITTED optimizations. So I think we're talking past each other. My concern is about this patch/capability as a whole. I agree that the hint bit setting may be fine by itself, I'm not terribly concerned with that. Now, if you (and others) aren't concerned about the overall behavior that is being introduced here, that's fine, but it was my understanding from previous discussions that making tuples visible to all transactions, even those started before the committing transaction which are set more strictly than 'read-committed', wasn't 'ok'. Now, what I've honestly been hoping for on this thread was for someone to speak up and point out why I'm wrong about this concern and explain how this patch addresses that issue. If that's been done, I've missed it.. I favor[1] unconditionally letting older snapshots see the new rows after the CREATE+COPY transaction commits. To recap, making affected scans see an empty table is as wrong as making them see those rows. Robert also listed[2] that as a credible option, and I don't recall anyone opining against it in previous discussions. I did perceive an undercurrent preference, all other things being equal, for an optimization free from semantic side-effects. I shared that preference, but investigations showed that we must compromise something. Though I like the idea of making new relations appear nonexistent to older snapshots, let's keep that as an independent patch. Otherwise, the chance of having none of the above in PostgreSQL 9.3 escalates significantly. Thanks, nm [1] http://archives.postgresql.org/message-id/20120302205834.gc29...@tornado.leadboat.com [2] http://archives.postgresql.org/message-id/CA+TgmoYh_KXErp9eOejMV6EJJaczeZZcSj3kRtq=yg1sjim...@mail.gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 7 December 2012 23:51, Stephen Frost sfr...@snowman.net wrote: Jeff, * Jeff Davis (pg...@j-davis.com) wrote: Most of your concerns seem to be related to freezing, and I'm mostly interested in HEAP_XMIN_COMMITTED optimizations. So I think we're talking past each other. My concern is about this patch/capability as a whole. I agree that the hint bit setting may be fine by itself, I'm not terribly concerned with that. Now, if you (and others) aren't concerned about the overall behavior that is being introduced here, that's fine, but it was my understanding from previous discussions that making tuples visible to all transactions, even those started before the committing transaction which are set more strictly than 'read-committed', wasn't 'ok'. Now, what I've honestly been hoping for on this thread was for someone to speak up and point out why I'm wrong about this concern and explain how this patch addresses that issue. If that's been done, I've missed it.. Visibility of pre-hinted data is an issue and because of that we previously agreed that it would be allowed only when explicitly requested by the user, which has been implemented as the FREEZE option on COPY. This then makes it identical to TRUNCATE, where a separate command differentiates MVCC-compliant row removal (DELETE) from non-MVCC row removal (TRUNCATE). So the committed feature does address the visibility issue. Jeff has been speaking about an extension to that behaviour, which would be to allow HEAP_XMIN_COMMITTED to be set in some cases also. The committed feature specifically does not do that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Simon, * Simon Riggs (si...@2ndquadrant.com) wrote: Visibility of pre-hinted data is an issue and because of that we previously agreed that it would be allowed only when explicitly requested by the user, which has been implemented as the FREEZE option on COPY. This then makes it identical to TRUNCATE, where a separate command differentiates MVCC-compliant row removal (DELETE) from non-MVCC row removal (TRUNCATE). To be honest, I really don't buy off on this. Yes, TRUNCATE (a top-level, individually permissioned command) can violate MVCC and make things which might have been visible to a given transaction no longer visible to that transaction. I find that very different from making rows which should *not* be visible suddenly appear. So the committed feature does address the visibility issue. Not hardly. It lets a user completely violate the basic rules of the overall database. The *correct* solution to this problem is to actually *fix* it, by setting it up such that tables created after a particular transaction starts aren't visible and similar. Not by punting to the user with very little control over it (or so I'm guessing). Will we accept a patch to do an INSERT or UPDATE FREEZE...? I realize this might be a bit of a stretch, but why not allow February 31st to be accepted as a date also, should the user request it? This is quite a slippery slope, in my view. Jeff has been speaking about an extension to that behaviour, which would be to allow HEAP_XMIN_COMMITTED to be set in some cases also. The committed feature specifically does not do that. It's not clear to me why you *wouldn't* just go ahead and set everything you can at the same time...? It hardly seems like it could be worse than what is apparently going to happen with this command... Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 4 December 2012 20:38, Jeff Davis pg...@j-davis.com wrote: Even if it is, I would think that a sufficiently narrow case -- such as CTAS outside of a transaction block -- would be safe CTAS would be safe to do that with. CLUSTER and VACUUM FULL are already done. Outside of a transaction block would be automatic. Inside of a transaction block we could use a parameter called initially_frozen. This would act same as FREEZE on COPY. Parameter would not be recorded into the reloptions field on pg_class, since it is a one-time option and has no meaning after end of xact. along with some slightly broader cases (like BEGIN; CREATE TABLE; INSERT/COPY). Historically we've chosen not to allow bulk insert options on INSERT, so that is a separate discussion. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Thu, 2012-12-06 at 22:34 -0500, Stephen Frost wrote: * Jeff Davis (pg...@j-davis.com) wrote: That is documented in the committed patch -- it's a trade, basically saying that you lose isolation but avoid extra writes. It seems reasonable that the user gets this behavior if specifically requested. Strictly speaking, it could actually be two different uesrs.. That's a good point. Somewhat like SERIALIZABLE, unless everyone is on board, it doesn't really work. In the simple approach that only affects loads in the same transaction as the create, this is not an issue. The only issue there is for other reads in the same transaction. I think you already know that, but I am repeating for clarity. Actually, no, I'm not convinced of that. If a seperate transaction starts before the create/insert, and is still open when the create/insert transaction commits, wouldn't that seperate transaction see rows in the newly created table? Not if all you do is set HEAP_XMIN_COMMITTED. Committed visible. That's more-or-less the specific issue I'm bringing up as a potential concern. If it's just a FrozenXID stored in the heap and there isn't anything telling the 2nd transaction to not consider this table that exists through SnapshotNow, how are we going to know to not look at the tuples? Or do we accept that it's ok for those to be visible? I am talking about HEAP_XMIN_COMMITTED. I think it's understood that freezing has much stricter requirements for safety in various situations because it loses information. How I wish we could fix the table visibility and not have to worry about this issue at all, which would also remove the need for some special keyword to be used to tell us it's 'ok' to use this optimization... I think we need to be clear about which one we're discussing: HEAP_XMIN_COMMITTED or FrozenXID. I believe discussing them together has caused a significant amount of confusion in this thread. Most of your concerns seem to be related to freezing, and I'm mostly interested in HEAP_XMIN_COMMITTED optimizations. So I think we're talking past each other. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Jeff, * Jeff Davis (pg...@j-davis.com) wrote: Most of your concerns seem to be related to freezing, and I'm mostly interested in HEAP_XMIN_COMMITTED optimizations. So I think we're talking past each other. My concern is about this patch/capability as a whole. I agree that the hint bit setting may be fine by itself, I'm not terribly concerned with that. Now, if you (and others) aren't concerned about the overall behavior that is being introduced here, that's fine, but it was my understanding from previous discussions that making tuples visible to all transactions, even those started before the committing transaction which are set more strictly than 'read-committed', wasn't 'ok'. Now, what I've honestly been hoping for on this thread was for someone to speak up and point out why I'm wrong about this concern and explain how this patch addresses that issue. If that's been done, I've missed it.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 6 December 2012 00:43, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Dec 4, 2012 at 3:38 PM, Jeff Davis pg...@j-davis.com wrote: After reading that thread, I still don't understand why it's unsafe to set HEAP_XMIN_COMMITTED in those conditions. Even if it is, I would think that a sufficiently narrow case -- such as CTAS outside of a transaction block -- would be safe, along with some slightly broader cases (like BEGIN; CREATE TABLE; INSERT/COPY). I haven't looked at the committed patch - which seemed a bit precipitous to me given the stage the discussion was at - but I believe the general issue with HEAP_XMIN_COMMITTED is that there might be other snapshots in the same transaction, for example from open cursors. From memory, the tqual.c code assumes that any tuple with XMIN_COMMITTED couldn't possibly be from its own transaction, and thus it doesn't make the tests that would be appropriate for a tuple that is from the current transaction. Maybe it's all right anyway (i.e. if we should always treat such a tuple as good) but I don't recall exactly what's tested in those paths. Yes. We'd need to add in a call to TransactionIdIsCurrentTransactionId(xmin), which is basically just wasted path in 99% of use cases. I've looked at optimising TransactionIdIsCurrentTransactionId() with what appears some vague success. Attached patch gives more consistent response. The other thing to do is to have a backend local flag that gets set when we use the HEAP_XMIN_COMMITTED anywhere. When not set we just skip past the TransactionIdIsCurrent test altogether. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services optimize_transactionidiscurrent.v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 4 December 2012 20:38, Jeff Davis pg...@j-davis.com wrote: The simple case of BEGIN; CREATE TABLE ...; COPY ... WITH (FREEZE); doesn't meet the pre-conditions. It only meets the conditions if preceded by a TRUNCATE, which all of the tests do. I looked into it, and I think the test: ... cstate-rel-rd_newRelfilenodeSubid == GetCurrentSubTransactionId() should be: ... (cstate-rel-rd_newRelfilenodeSubid == GetCurrentSubTransactionId() || cstate-rel-rd_createSubid == GetCurrentSubTransactionId()) (haven't tested). Another option to consider is that rd_newRelfilenodeSubid could be set on newly-created tables as well as newly-truncated tables. I'll expand the test above and add new regression. I focused on correctness ahead of a wide use case and missed this. Also, I recommend a hint along with the NOTICE when the pre-conditions aren't met to tell the user what they need to do. Perhaps something like: Create or truncate the table in the same transaction as COPY FREEZE. OK, will add hint. The documentation could expand on that, clarifying that a TRUNCATE in the same transaction (perhaps even ALTER?) allows a COPY FREEZE. The code comment could be improved a little, while we're at it: Note that the stronger test of exactly which subtransaction created it is crucial for correctness of this optimisation. is a little vague, can you explain using the reasoning from the thread? I would say something like: The transaction and subtransaction creating or truncating the table must match that of the COPY FREEZE. Otherwise, it could mix tuples from different transactions together, and it would be impossible to distinguish them for the purposes of atomicity. OK, will try. After reading that thread, I still don't understand why it's unsafe to set HEAP_XMIN_COMMITTED in those conditions. Even if it is, I would think that a sufficiently narrow case -- such as CTAS outside of a transaction block -- would be safe, along with some slightly broader cases (like BEGIN; CREATE TABLE; INSERT/COPY). It *could* be safe, but needs changes to visibility rules, which as explained I had not been able to optimise sufficiently. The code is easy enough to add back in at the time it is sufficiently well optimised and we all agree. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Hi, On 2012-12-03 17:34:01 -0800, Jeff Davis wrote: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8de72b66a2edcf12c812de0a73bd50b6b7d81d62 On the subject of that patch. I am not a big fan of only emitting a NOTICE if FREEZE wasn't properly used: + if (cstate-freeze (hi_options HEAP_INSERT_FROZEN) == 0) + ereport(NOTICE, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), +errmsg(FREEZE option specified but pre-conditions not met))); + Imo it should fail. Imagine adding FREEZE to the loading part of your application and not noticing the optimization broke because somebody else changed something in another part of the loading procedure. Not sure whether that discussed previously. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 6 December 2012 13:12, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2012-12-03 17:34:01 -0800, Jeff Davis wrote: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8de72b66a2edcf12c812de0a73bd50b6b7d81d62 On the subject of that patch. I am not a big fan of only emitting a NOTICE if FREEZE wasn't properly used: + if (cstate-freeze (hi_options HEAP_INSERT_FROZEN) == 0) + ereport(NOTICE, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), +errmsg(FREEZE option specified but pre-conditions not met))); + Imo it should fail. Imagine adding FREEZE to the loading part of your application and not noticing the optimization broke because somebody else changed something in another part of the loading procedure. Not sure whether that discussed previously. It was. Only Robert and I spoke about that. Also imagine having to analyze your code in detail to reevaluate the exact optimisation required each time. This way you get to add FREEZE and have it work always, with feedback if its not optimized. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 2012-12-06 14:07:32 +, Simon Riggs wrote: On 6 December 2012 13:12, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2012-12-03 17:34:01 -0800, Jeff Davis wrote: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8de72b66a2edcf12c812de0a73bd50b6b7d81d62 On the subject of that patch. I am not a big fan of only emitting a NOTICE if FREEZE wasn't properly used: + if (cstate-freeze (hi_options HEAP_INSERT_FROZEN) == 0) + ereport(NOTICE, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), +errmsg(FREEZE option specified but pre-conditions not met))); + Imo it should fail. Imagine adding FREEZE to the loading part of your application and not noticing the optimization broke because somebody else changed something in another part of the loading procedure. Not sure whether that discussed previously. It was. Only Robert and I spoke about that. Also imagine having to analyze your code in detail to reevaluate the exact optimisation required each time. This way you get to add FREEZE and have it work always, with feedback if its not optimized. I remain unconvinced by that argument, but if I am alone with this ok. Could we at least make it a WARNING? Nobody ever reads NOTICEs because it contains so much noise. And this is isn't noise. Its a bug on the client side. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 6 December 2012 14:12, Andres Freund and...@2ndquadrant.com wrote: I remain unconvinced by that argument, but if I am alone with this ok. Could we at least make it a WARNING? Nobody ever reads NOTICEs because it contains so much noise. And this is isn't noise. Its a bug on the client side. It's not a bug. Requesting a useful, but not critical optimisation is just a hint. The preconditions are not easy to understand, so I see no reason to punish people that misunderstand, or cause programs to fail in ways that need detailed understanding to make them work again. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
* Robert Haas (robertmh...@gmail.com) wrote: I haven't looked at the committed patch - which seemed a bit Disclaimer- neither have I, but.. When I last recall this discussion (likely in some bar in Europe), the problem was also that an independent session would be able to: a) see that the table exists (due to SnapshotNow being used for lookups) b) see rows in the table Part 'a' is something which I think would be good to fix (but hard), and it sounds like this patch might make part 'b' happen, which I think makes the part 'a' problem worse. I'm guessing this isn't a problem for the TRUNCATE case because the second session would still be looking at the pre-TRUNCATE files on disk, right? Is that reference to exactly which files on disk to look at being used to address the CREATE problem too..? That said, I love the idea of having a way to drop tuples in w/o having to go back and rewrite them three times.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
* Simon Riggs (si...@2ndquadrant.com) wrote: It's not a bug. Requesting a useful, but not critical optimisation is just a hint. The preconditions are not easy to understand, so I see no reason to punish people that misunderstand, or cause programs to fail in ways that need detailed understanding to make them work again. I tend to agree with Andres on this one. This feels a bit like accepting a command but then not actually following-through on it if it turns out we can't actually do it. If it's truely an optimization (and I suspect my other email/question might provide insight into that), then it should be something we can 'just do' without needing to be asked to do it, along the same lines of not WAL'ing when the appropriate conditions are met (table created in this transaction, etc, etc). Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Thu, 2012-12-06 at 11:55 -0500, Stephen Frost wrote: * Robert Haas (robertmh...@gmail.com) wrote: When I last recall this discussion (likely in some bar in Europe), the problem was also that an independent session would be able to: a) see that the table exists (due to SnapshotNow being used for lookups) b) see rows in the table Part 'a' is something which I think would be good to fix (but hard), and it sounds like this patch might make part 'b' happen, which I think makes the part 'a' problem worse. I'm guessing this isn't a problem for the TRUNCATE case because the second session would still be looking at the pre-TRUNCATE files on disk, right? Is that reference to exactly which files on disk to look at being used to address the CREATE problem too..? That isn't a problem, because the other session won't see the tuple in pg_class until the creating transaction commits, at which point the rows have committed, too (because this would only kick in when the rows are loaded in the same transaction as the CREATE). So, yes, it's like TRUNCATE in that regard. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 6 December 2012 17:02, Stephen Frost sfr...@snowman.net wrote: * Simon Riggs (si...@2ndquadrant.com) wrote: It's not a bug. Requesting a useful, but not critical optimisation is just a hint. The preconditions are not easy to understand, so I see no reason to punish people that misunderstand, or cause programs to fail in ways that need detailed understanding to make them work again. I tend to agree with Andres on this one. This feels a bit like accepting a command but then not actually following-through on it if it turns out we can't actually do it. If it's truely an optimization (and I suspect my other email/question might provide insight into that), then it should be something we can 'just do' without needing to be asked to do it, along the same lines of not WAL'ing when the appropriate conditions are met (table created in this transaction, etc, etc). That depends whether its a command or a do-if-possible hint. Its documented as the latter. Similar to the way VACUUM tries to truncate a relation, but gives up if it can't. And on an even closer example, VACUUM FREEZE itself, which doesn't guarantee that all rows are frozen at the end of it... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Thu, 2012-12-06 at 18:16 +, Simon Riggs wrote: I tend to agree with Andres on this one. This feels a bit like accepting a command but then not actually following-through on it if it turns out we can't actually do it. If it's truely an optimization (and I suspect my other email/question might provide insight into that), then it should be something we can 'just do' without needing to be asked to do it, along the same lines of not WAL'ing when the appropriate conditions are met (table created in this transaction, etc, etc). That depends whether its a command or a do-if-possible hint. Its documented as the latter. Similar to the way VACUUM tries to truncate a relation, but gives up if it can't. And on an even closer example, VACUUM FREEZE itself, which doesn't guarantee that all rows are frozen at the end of it... Also, if the set of conditions changes in the future, we would have a problem if that caused new errors to appear. I think a WARNING might make more sense than a NOTICE, but I don't have a strong opinion about that. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Jeff, * Jeff Davis (pg...@j-davis.com) wrote: That isn't a problem, because the other session won't see the tuple in pg_class until the creating transaction commits, at which point the rows have committed, too (because this would only kick in when the rows are loaded in the same transaction as the CREATE). See, that's what I originally thought but was corrected on it at one point (by Tom, iirc). I just did a simple test against 9.2.1 using serializable mode. In this mode, if you do something like this: session a - begin; session b - begin; create table q (a integer); insert into q values (1); commit; select * from q; You'll get an empty table. That's not great, but it's life- once something is in pg_class, all sessions can see it because the table lookups are done using SnapshotNow and aren't truely transactional, but at least you can't see any rows in the table because the individual rows are marked with the transaction ID which created them and we can't see them in our transaction that started before the table was created. It sounds like, with this patch/change, this behavior would change. Now, I'm not necessairly against this, but it's clearly something different than what we had before and might be an issue for some. If, in the general case, we're actually 'ok' with this, I'd ask why we don't simply do it by default..? If we're *not* 'ok' with it, perhaps we shouldn't be doing it at all... If we fix the bigger issue (which, as I understand it from various discussions with Tom and Robert, is very difficult), then this whole problem goes away entirely. I always figured/expected that to be how we'd fix this, not just punt on the issue and tell the user sure, you can do this, hope you know what you're doing Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Thu, 2012-12-06 at 14:18 -0500, Stephen Frost wrote: begin; You need to do a SELECT here to actually get a snapshot. session b - begin; create table q (a integer); insert into q values (1); commit; select * from q; You'll get an empty table. That's not great, but it's life- once something is in pg_class, all sessions can see it because the table lookups are done using SnapshotNow and aren't truely transactional, but at least you can't see any rows in the table because the individual rows are marked with the transaction ID which created them and we can't see them in our transaction that started before the table was created. It sounds like, with this patch/change, this behavior would change. No, it would not change. Session A would see that the table exists and see that the rows' inserting transaction (in Session B) committed. That is correct because the inserting transaction *did* commit, and it's the same as we have now. However, the rows will *not* be visible, because the serializable snapshot doesn't contain the inserting transaction. Think about the current behavior: right after the commit, another select could come along and set all those hint bits anyway. Even if the hint bits aren't set, it will do a CLOG lookup and find that the transaction committed. The change being proposed is just to set those hint bits preemptively, because the fate of the INSERT is identical to the fate of the CREATE (they are in the same transaction). There will be no observable problem outside of that CREATE+INSERT transaction. The only catch is what to do about visibility of the tuples when still inside the transaction (which is not a problem for transactions doing a simple load). The interesting thing about HEAP_XMIN_COMMITTED is that it can be set preemptively if we know that the transaction will actually commit (aside from the visibility issues within the transaction). Even if the transaction doesn't commit, it would still be possible to clean out the dead tuples with a VACUUM, because no information has really been lost in the process. So there may yet be some kind of safe protocol to set these even during a load into an existing table... Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
* Jeff Davis (pg...@j-davis.com) wrote: However, the rows will *not* be visible, because the serializable snapshot doesn't contain the inserting transaction. That's what we've got now and what would be expected, however... Think about the current behavior: right after the commit, another select could come along and set all those hint bits anyway. Even if the hint bits aren't set, it will do a CLOG lookup and find that the transaction committed. The command is 'FREEZE', which sounded to me like the transaction ID would be set to FrozenXID, meaning that we wouldn't be able to tell if the inserting transaction was before or after ours... Your analysis of the hint bits themselves sounds reasonable but it seems independent of the issue regarding setting the actual transaction ID. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Thu, 2012-12-06 at 20:12 -0500, Stephen Frost wrote: The command is 'FREEZE', which sounded to me like the transaction ID would be set to FrozenXID, meaning that we wouldn't be able to tell if the inserting transaction was before or after ours... Freezing does lose information, but I thought that this sub-thread was about the HEAP_XMIN_COMMITTED optimization that was in the first version of the commit but removed in a later commit. Setting HEAP_XMIN_COMMITTED does not lose information. Your analysis of the hint bits themselves sounds reasonable but it seems independent of the issue regarding setting the actual transaction ID. Upon re-reading, my last paragraph was worded a little too loosely. The interesting thing about HEAP_XMIN_COMMITTED is that it can be set preemptively if we know that the transaction will actually commit (aside from the visibility issues within the transaction). That should really be: aside from the visibility issues before it does commit. Anyway, the HEAP_XMIN_COMMITTED loading optimizations require more discussion, but I think they are worth pursuing. The simpler form is when the table is created and loaded in the same transaction, but there may be some more sophisticated approaches, as well. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Jeff, * Jeff Davis (pg...@j-davis.com) wrote: On Thu, 2012-12-06 at 20:12 -0500, Stephen Frost wrote: The command is 'FREEZE', which sounded to me like the transaction ID would be set to FrozenXID, meaning that we wouldn't be able to tell if the inserting transaction was before or after ours... Freezing does lose information, but I thought that this sub-thread was about the HEAP_XMIN_COMMITTED optimization that was in the first version of the commit but removed in a later commit. Setting HEAP_XMIN_COMMITTED does not lose information. I'm less concerned about the hint bits and more concerned about the implications of the FrozenXID being used, which would make the rows visible to other transactions even if they began before the rows were loaded. That should really be: aside from the visibility issues before it does commit. My concern is more about what happens to transactions which are opened before this transaction commits and that they might be able to see data in the table. As I mentioned before, I'm not *convinced* that this is really a big deal, or even a problem for this patch, but it's something to *consider* and think about because, as much as we like to say that DDL is transaction-safe, it's *not* completely MVCC and things like creating tables and committing them will show up as visible even in transactions that shouldn't be able to see them. Due to that, we need to think about what happens with data in those tables, etc. Anyway, the HEAP_XMIN_COMMITTED loading optimizations require more discussion, but I think they are worth pursuing. The simpler form is when the table is created and loaded in the same transaction, but there may be some more sophisticated approaches, as well. Sure. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Thu, 2012-12-06 at 20:49 -0500, Stephen Frost wrote: I'm less concerned about the hint bits and more concerned about the implications of the FrozenXID being used, which would make the rows visible to other transactions even if they began before the rows were loaded. That is documented in the committed patch -- it's a trade, basically saying that you lose isolation but avoid extra writes. It seems reasonable that the user gets this behavior if specifically requested. My concern is more about what happens to transactions which are opened before this transaction commits and that they might be able to see data in the table. In the simple approach that only affects loads in the same transaction as the create, this is not an issue. The only issue there is for other reads in the same transaction. I think you already know that, but I am repeating for clarity. As I mentioned before, I'm not *convinced* that this is really a big deal, or even a problem for this patch, but it's something to *consider* and think about because, as much as we like to say that DDL is transaction-safe, it's *not* completely MVCC and things like creating tables and committing them will show up as visible even in transactions that shouldn't be able to see them. Due to that, we need to think about what happens with data in those tables, etc. Agreed. I don't think anyone is suggesting that we change the semantics of existing commands, unless it's to improve the serializability of DDL. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
* Jeff Davis (pg...@j-davis.com) wrote: That is documented in the committed patch -- it's a trade, basically saying that you lose isolation but avoid extra writes. It seems reasonable that the user gets this behavior if specifically requested. Strictly speaking, it could actually be two different uesrs.. In the simple approach that only affects loads in the same transaction as the create, this is not an issue. The only issue there is for other reads in the same transaction. I think you already know that, but I am repeating for clarity. Actually, no, I'm not convinced of that. If a seperate transaction starts before the create/insert, and is still open when the create/insert transaction commits, wouldn't that seperate transaction see rows in the newly created table? That's more-or-less the specific issue I'm bringing up as a potential concern. If it's just a FrozenXID stored in the heap and there isn't anything telling the 2nd transaction to not consider this table that exists through SnapshotNow, how are we going to know to not look at the tuples? Or do we accept that it's ok for those to be visible? How I wish we could fix the table visibility and not have to worry about this issue at all, which would also remove the need for some special keyword to be used to tell us it's 'ok' to use this optimization... Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Tue, Dec 4, 2012 at 3:38 PM, Jeff Davis pg...@j-davis.com wrote: After reading that thread, I still don't understand why it's unsafe to set HEAP_XMIN_COMMITTED in those conditions. Even if it is, I would think that a sufficiently narrow case -- such as CTAS outside of a transaction block -- would be safe, along with some slightly broader cases (like BEGIN; CREATE TABLE; INSERT/COPY). I haven't looked at the committed patch - which seemed a bit precipitous to me given the stage the discussion was at - but I believe the general issue with HEAP_XMIN_COMMITTED is that there might be other snapshots in the same transaction, for example from open cursors. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Robert Haas robertmh...@gmail.com writes: On Tue, Dec 4, 2012 at 3:38 PM, Jeff Davis pg...@j-davis.com wrote: After reading that thread, I still don't understand why it's unsafe to set HEAP_XMIN_COMMITTED in those conditions. Even if it is, I would think that a sufficiently narrow case -- such as CTAS outside of a transaction block -- would be safe, along with some slightly broader cases (like BEGIN; CREATE TABLE; INSERT/COPY). I haven't looked at the committed patch - which seemed a bit precipitous to me given the stage the discussion was at - but I believe the general issue with HEAP_XMIN_COMMITTED is that there might be other snapshots in the same transaction, for example from open cursors. From memory, the tqual.c code assumes that any tuple with XMIN_COMMITTED couldn't possibly be from its own transaction, and thus it doesn't make the tests that would be appropriate for a tuple that is from the current transaction. Maybe it's all right anyway (i.e. if we should always treat such a tuple as good) but I don't recall exactly what's tested in those paths. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Wed, 2012-12-05 at 19:43 -0500, Tom Lane wrote: From memory, the tqual.c code assumes that any tuple with XMIN_COMMITTED couldn't possibly be from its own transaction, and thus it doesn't make the tests that would be appropriate for a tuple that is from the current transaction. Maybe it's all right anyway (i.e. if we should always treat such a tuple as good) but I don't recall exactly what's tested in those paths. Oh, I see, it probably warrants some more discussion. It seems like we could solve these problems if we narrow the conditions enough. Anyway, the partial revert looks good, and the commit message seems appropriate (essentially, the code got ahead of the discussion). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On 4 December 2012 01:34, Jeff Davis pg...@j-davis.com wrote: I assume that refers to the discussion here: http://archives.postgresql.org/pgsql-hackers/2012-02/msg01177.php But that was quite a while ago, so is there a more recent discussion that prompted this commit now? Yes, this was discussed within the last month, thread TRUNCATE SERIALIZABLE... The patch for that was already posted, so I committed it. I am a little confused about the case where setting HEAP_XMIN_COMMITTED when loading the table in the same transaction is wrong. There was some discussion about subtransactions, but those problems only seemed to appear when the CREATE and the INSERT/COPY happened in different subtransactions. So, I guess my question is, why the partial revert? Well, first, because I realised that it wasn't wanted. I was re-reading the threads trying to figure out a way to help the checksum patch further. Second, because I realised why it wasn't wanted. Setting HEAP_XMIN_COMMITTED causes MVCC violations within the transaction issuing the COPY. Accepting the MVCC violation requires explicit consent from user. If we have that, we may as well set everything we can. So there's no middle ground. Nothing, or all frozen. We could change that, but it would require some complex tweaks to tqual routines, which I tried and was not happy with, since they would need to get more complex. It's possible a route through that minefield exists. I'm inclined to believe other approaches are more likely to yield benefit. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
On Tue, 2012-12-04 at 10:15 +, Simon Riggs wrote: On 4 December 2012 01:34, Jeff Davis pg...@j-davis.com wrote: I assume that refers to the discussion here: http://archives.postgresql.org/pgsql-hackers/2012-02/msg01177.php But that was quite a while ago, so is there a more recent discussion that prompted this commit now? Yes, this was discussed within the last month, thread TRUNCATE SERIALIZABLE... The patch for that was already posted, so I committed it. Thank you for pointing me toward that thread. While experimenting with COPY FREEZE, I noticed something: The simple case of BEGIN; CREATE TABLE ...; COPY ... WITH (FREEZE); doesn't meet the pre-conditions. It only meets the conditions if preceded by a TRUNCATE, which all of the tests do. I looked into it, and I think the test: ... cstate-rel-rd_newRelfilenodeSubid == GetCurrentSubTransactionId() should be: ... (cstate-rel-rd_newRelfilenodeSubid == GetCurrentSubTransactionId() || cstate-rel-rd_createSubid == GetCurrentSubTransactionId()) (haven't tested). Another option to consider is that rd_newRelfilenodeSubid could be set on newly-created tables as well as newly-truncated tables. Also, I recommend a hint along with the NOTICE when the pre-conditions aren't met to tell the user what they need to do. Perhaps something like: Create or truncate the table in the same transaction as COPY FREEZE. The documentation could expand on that, clarifying that a TRUNCATE in the same transaction (perhaps even ALTER?) allows a COPY FREEZE. The code comment could be improved a little, while we're at it: Note that the stronger test of exactly which subtransaction created it is crucial for correctness of this optimisation. is a little vague, can you explain using the reasoning from the thread? I would say something like: The transaction and subtransaction creating or truncating the table must match that of the COPY FREEZE. Otherwise, it could mix tuples from different transactions together, and it would be impossible to distinguish them for the purposes of atomicity. I am a little confused about the case where setting HEAP_XMIN_COMMITTED when loading the table in the same transaction is wrong. There was some discussion about subtransactions, but those problems only seemed to appear when the CREATE and the INSERT/COPY happened in different subtransactions. So, I guess my question is, why the partial revert? Well, first, because I realised that it wasn't wanted. I was re-reading the threads trying to figure out a way to help the checksum patch further. Second, because I realised why it wasn't wanted. Setting HEAP_XMIN_COMMITTED causes MVCC violations within the transaction issuing the COPY. After reading that thread, I still don't understand why it's unsafe to set HEAP_XMIN_COMMITTED in those conditions. Even if it is, I would think that a sufficiently narrow case -- such as CTAS outside of a transaction block -- would be safe, along with some slightly broader cases (like BEGIN; CREATE TABLE; INSERT/COPY). But perhaps that requires more discussion. I was just curious if there was a concrete problem that I was missing. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8de72b66a2edcf12c812de0a73bd50b6b7d81d62 Part of that patch was reverted later: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=5457a130d3a66db807d1e0ee2b8e829321809b83 I assume that refers to the discussion here: http://archives.postgresql.org/pgsql-hackers/2012-02/msg01177.php But that was quite a while ago, so is there a more recent discussion that prompted this commit now? I am a little confused about the case where setting HEAP_XMIN_COMMITTED when loading the table in the same transaction is wrong. There was some discussion about subtransactions, but those problems only seemed to appear when the CREATE and the INSERT/COPY happened in different subtransactions. So, I guess my question is, why the partial revert? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers