Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
On lör, 2012-04-28 at 11:12 -0400, Tom Lane wrote:
> Peter Eisentraut pete...@gmx.net writes:
>> On fre, 2012-04-27 at 22:30 +0200, Andres Freund wrote:
>>> In the few cases where I investigated it, TMs don't use transactions themselves (which I think is correct; they don't need them), so terminating any idle session - which the TM would appear as, since it's not using txns - would leave prepared transactions in a limbo state till the database is up again, instead of waiting till all prepared transactions are either aborted or committed. It may also choose to coordinate aborting all transactions, but all that is hard if the database shuts you out.
>>
>> This would lead to another shutdown mode, one that terminates idle sessions unless they have prepared transactions. That could be useful.
>
> Huh?  Prepared transactions aren't associated with sessions.  At least not in a context using a TM --- the TM will be doing commits or rollbacks from a session different from the ones that ran the prepared transactions.

From what Andres wrote I gather that the TM would be using the same session for preparing and committing. In any case, if either the existing session of the TM is cut or it cannot create a new connection, it will, after some time, have to give up and roll back the prepared transactions on the other servers. So some kind of setting to not shut down if there are prepared transactions pending could be useful. But this could probably be a separate GUC setting or two instead of a shutdown mode (or two) of its own.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
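As a toy sketch of the shutdown mode being debated here (a Python model with invented names; this is not postmaster code), the proposal reduces to: during shutdown, an idle session may only be terminated once no prepared transactions remain pending, leaving a transaction manager a window to resolve them:

```python
# Toy model of the proposed shutdown mode: keep idle sessions alive while
# two-phase-commit state is still pending, so a TM can issue
# COMMIT PREPARED / ROLLBACK PREPARED. Names are illustrative only.

def may_terminate_session(session_idle: bool, prepared_xacts_pending: int) -> bool:
    if not session_idle:
        return False  # this mode never kills a session mid-transaction
    return prepared_xacts_pending == 0

# An idle session (possibly the TM's) survives while prepared xacts remain:
assert may_terminate_session(True, prepared_xacts_pending=2) is False
assert may_terminate_session(True, prepared_xacts_pending=0) is True
assert may_terminate_session(False, prepared_xacts_pending=0) is False
```

Note this models the later refinement in the thread: the pending count is global, not per-session, since prepared transactions are not tied to any particular session.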
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
On fre, 2012-04-27 at 14:57 -0400, Robert Haas wrote:
> I think there is no point at all in having a discussion about this unless we can first agree that the overwhelming majority of people who have commented on this issue on this list are unhappy with the current default behavior. If we are not going to change the default behavior, then there is zero point in talking about this.

Have you reviewed the previous discussions where changing the default behavior was discussed and rejected? I don't like the current default any more than you do, but without any new arguments, there is, as you say, zero point in talking about this.
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
On Sun, Apr 29, 2012 at 12:41 AM, Robert Haas robertmh...@gmail.com wrote:
> On Sat, Apr 28, 2012 at 7:04 AM, Simon Riggs si...@2ndquadrant.com wrote:
>> So let's implement the new shutdown mode and work out a transition path to a new default. Changing rapidly screws up the people we love the most.
>
> In some cases, there are ways to phase in a change over a series of releases, but I don't see how that would be possible here. If we intend ever to change the default mode, then we have to do it sometime, and that release is going to have a backward-incompatibility no matter which one it is. Personally, as backward incompatibilities go, I think this one is pretty minor. Most people are probably already using scripts that specify fast mode, and those scripts won't change. But even for people who actually are using smart mode, most people do not shut down the database all that often, and it's rather pessimistic to suppose that the proposed new mode will break anything for them. But even if it does, we can't make improvements to the system without sometimes changing things in a backward-incompatible way, and if we get into the mind-set that no amount of backward-incompatibility is ever acceptable, we're going to seriously limit our opportunities to revisit poor design decisions.
>
> I think there's a funny kind of thing that happens when we discuss a behavior that is sub-optimal: we start to look for ways to justify leaving it the way it is, because surely it couldn't be a terrible idea if it's been like that forever. I think there's some of that going on on the thread about stripping trailing null columns, too: if we've got a benchmark result showing that the patch saves CPU time on a 5-column table (!), then all the pontificating about 700-column tables being rare is irrelevant. Similarly here: it's true that someone might have to revisit their init scripts, but should they fail to do so, the consequences are really not that dire.
> On the other hand, in PostgreSQL 8.4, we changed TRUNCATE to wipe out the entire inheritance hierarchy instead of only the named table (unless the new ONLY keyword was added). This obviously has the potential to be completely disastrous for someone with a very particular usage pattern, but there was little discussion and everyone basically said "yeah, we should go ahead and change that", despite the small risk that someone will accidentally blow away a lot more data than they intended. Maybe there are more people using smart shutdown than there are people truncating only the root of an inheritance hierarchy, but nothing we're proposing to do here is going to permanently erase anyone's data, either.

I don't think you can use the TRUNCATE case as an example. For me, that was a prime case of insufficient discussion around the principle of backwards compatibility. It wasn't clear to me that was happening, and had I known, I would have objected. IIRC the first I knew of it was when the release notes came out, months after things were settled.

We go to great lengths to note initdb-forcing behaviour during beta, but very little towards behaviour changes that require downstream software changes. Maybe we don't need to do this over multiple releases, but we do need to give warning of possible incompatibilities. It would be good to see a specific post on hackers called "Planned Incompatibilities in 9.2", or collect such things on the open items wiki, so that people listening can see what might happen and get a chance to object. Or if changes do go ahead, at least we give them a few months' warning to change the downstream software. Otherwise all that happens is our new release comes out and fewer people use it, because it takes ages to actually realign the software stack enough for our software to be used. The better we succeed at persuading the world to use Postgres, the more important backwards compatibility becomes.
When fewer people used Postgres it was easy to charge forwards aggressively, but as we begin to lead we must be more careful.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
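For reference, the 8.4 behaviour change discussed above can be modelled as follows (a toy Python sketch, not catalog code; `children` is an invented parent-to-child mapping):

```python
# Toy model of the 8.4 TRUNCATE change: without ONLY, truncation recurses
# into the whole inheritance hierarchy; TRUNCATE ONLY keeps the pre-8.4
# behaviour of touching just the named table.
def tables_truncated(table, children, only=False):
    if only:
        return {table}
    result = {table}
    for child in children.get(table, ()):
        result |= tables_truncated(child, children)
    return result

hierarchy = {"measurement": ["measurement_2011", "measurement_2012"]}
assert tables_truncated("measurement", hierarchy, only=True) == {"measurement"}
assert tables_truncated("measurement", hierarchy) == {
    "measurement", "measurement_2011", "measurement_2012"}
```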
Re: [HACKERS] Future In-Core Replication
On Sat, Apr 28, 2012 at 8:40 PM, Hannu Krosing ha...@krosing.net wrote:
> As to what LCRs should contain, it will probably be logical equivalents of INSERT, UPDATE ... LIMIT 1, DELETE ... LIMIT 1, TRUNCATE and all DDL.

Yeh

> I would even go as far as propose a variant for DML-WITH-LIMIT-1 to be added to postgresql's SQL syntax, so that the LCRs could be converted to SQL text for some tasks and thus should be easy to process using generic text-based tools. The DML-WITH-LIMIT-1 is required to do single logical updates on tables with non-unique rows. And as for any logical updates we will have huge performance problems when doing UPDATE or DELETE on a large table with no indexes, but fortunately this problem is on the slave, not the master ;)

While that is possible, I would favour the do-nothing approach. By making the default replication mode = none, we then require a PK to be assigned before allowing replication mode = on for a table. Trying to replicate tables without PKs is a problem that can wait, basically.

> Generating and shipping the LCR-s at WAL-generation time, or perhaps even a bit earlier, will have a huge performance benefit of not doing double writing of captured events on the master, which currently is needed for several reasons, the main one being determining which transactions do commit and in what order. (This can't be solved on the master without a local event log table, as we don't have commit/rollback triggers.) If we delegate that part out of the master, then this alone enables us to be almost as fast as a WAL-based replica in most cases, even when we have different logical structure on the slaves.

Agreed

How the LCRs are produced and how they are applied is a subject for debate and measurement. We're lucky enough to have a variety of mechanisms to compare - Slony 1.0/2.0, Slony 2.2/Londiste/Bucardo - and it's worth adding WAL translation there also. My initial thought is that WAL translation has many positive aspects to it and we are investigating.
There are also some variants on those themes, such as the one you discussed above. You probably won't recognise this as such, but I hope that people might see that I'm hoping to build Slony 3.0, Londiste++ etc. At some point, we'll all say "that's not Slony", but we'll also say (Josh already did) "that's not binary replication". But it will be the descendant of all.

> If we get efficient and flexible logical change event generation on the master, then I'm sure the current trigger-based logical replication providers will switch (for full replication) or at least add an extra LCR-source. It may still make sense to leave some flexibility to the master side, so that some decisions - possibly even complex ones - could be made when generating the LCR-s.
>
> What I would like is to have some of it exposed to userspace via a function which could be used by developers to push their own LCRs.

Yes, I see that one coming. That use case is not something I'm focused on, but I do recognise others wish to pursue that. The bit I'm not sure about is whether we have custom handler code as well.

> As mentioned above, a significant part of this approach can be prototyped from user-level triggers as soon as we have triggers on commit and rollback, even though at a slightly reduced performance. That is, it will still have the trigger overhead, but we can omit all the extra writing and then re-reading and event-table management on the master.

Agreed.

> Wanting to play with Streaming Logical Replication (as opposed to the current Chunked Logical Replication) is also one of the reasons that I complained when the command triggers patch was kicked out from 9.2.

Yeh. It's clear that project needs to move forwards quickly in 9.3 if we are to make this Just Work in the way we hope.
--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
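To make the LCR discussion above concrete, here is one way the "require a PK before allowing replication mode = on" rule might interact with change-record generation (a Python sketch; the record layout and function names are invented for illustration, not a proposed format):

```python
# Illustrative sketch: a logical change record (LCR) for an UPDATE is
# keyed by the table's primary key, so a table without a PK cannot be
# put into replication mode. The dict layout is invented, not a wire format.
def make_update_lcr(table, pk_cols, old_row, new_row):
    if not pk_cols:
        raise ValueError(
            f"replication mode cannot be enabled for {table}: no primary key")
    key = {c: old_row[c] for c in pk_cols}
    changed = {c: v for c, v in new_row.items() if old_row.get(c) != v}
    return {"op": "UPDATE", "table": table, "key": key, "changes": changed}

lcr = make_update_lcr("accounts", ["id"],
                      {"id": 7, "balance": 100},
                      {"id": 7, "balance": 250})
assert lcr == {"op": "UPDATE", "table": "accounts",
               "key": {"id": 7}, "changes": {"balance": 250}}
```

The do-nothing approach above corresponds to the `ValueError` branch: PK-less tables are simply refused rather than supported via DML-WITH-LIMIT-1.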
Re: [HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
On Sun, Apr 29, 2012 at 12:24 AM, Robert Haas robertmh...@gmail.com wrote:
> Let's not go looking for reasons to reject the approach just because we didn't expect it to work as well as it does.

Who here, in your opinion, is looking for reasons to reject anything?

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
There might be a patch available for this already. In the worst case articulated above (less than 64 columns), if all the nulls are trailing nulls, the bitmap need not be saved. Actually the threshold is not 64 but 72 columns, as the Postgres HeapTupleHeader is only 23 bytes, so one padding byte is left over for the start of the bitmap. The same principle could be considered for index tuples as an extension.

Thanks,
Gokul
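The 64-versus-72 arithmetic above can be checked directly, assuming a 23-byte fixed HeapTupleHeader and 8-byte MAXALIGN (a back-of-the-envelope sketch, not actual backend code):

```python
# Sketch of heap tuple header sizing: one bit of null bitmap per
# attribute, with the total header rounded up to an 8-byte boundary.
def maxalign(n, align=8):
    return (n + align - 1) & ~(align - 1)

def tuple_header_size(ncols, has_nulls=True):
    base = 23  # fixed part of HeapTupleHeaderData, per the message above
    if not has_nulls:
        return maxalign(base)
    bitmap_bytes = (ncols + 7) // 8
    return maxalign(base + bitmap_bytes)

# The padding byte after the 23-byte fixed part covers the first 8
# attributes for free, and the header stays at 32 bytes up to 72
# attributes; only the 73rd attribute forces it to grow:
assert tuple_header_size(8) == 24
assert tuple_header_size(64) == tuple_header_size(72) == 32
assert tuple_header_size(73) == 40
```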
Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte
Hi Alexander,

Perhaps I'm too early with these tests, but FWIW I reran my earlier test program against three instances. (The patches compiled fine, and make check was without problem.)

-- 3 instances:
HEAD                 port 6542
trgm_regex           port 6547  HEAD + trgm-regexp patch (22 Nov 2011) [1]
trgm_regex_wchar2mb  port 6549  HEAD + trgm-regexp + wchar2mb patch (23 Apr 2012) [2]

[1] http://archives.postgresql.org/pgsql-hackers/2011-11/msg01297.php
[2] http://archives.postgresql.org/pgsql-hackers/2012-04/msg01095.php

-- table sizes:
azjunk4  10^4 rows     1 MB
azjunk5  10^5 rows    11 MB
azjunk6  10^6 rows   112 MB
azjunk7  10^7 rows  1116 MB

For table creation/structure, see:
[3] http://archives.postgresql.org/pgsql-hackers/2012-01/msg01094.php

Results for the three instances with 4 repetitions per instance are attached. Although the regexes I chose are somewhat arbitrary, it does show some of the good, the bad and the ugly of the patch(es). (Also: I've limited the tests to a range of 'workable' regexps, i.e. avoiding unbounded regexps.)

hth (and thanks, great work!),

Erik Rijkers

Attachment: trgm_regex_test.out.20120429_1300.txt.gz (gzip compressed data)
Re: [HACKERS] default_transaction_isolation = serializable causes crash under Hot Standby
On Sat, Apr 28, 2012 at 5:56 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
> But if you set it in the postgresql.conf file, it's not pretty:
>
>   kevin@kevin-desktop:~$ psql -p 5433 test
>   psql: FATAL:  can not create a serializable snapshot during recovery
>
> Ideas?

The patch as submitted doesn't do anything useful for the case where the setting in .conf is serializable. It just throws an ERROR for any action, which is not useful or friendly.

IMHO the desired behaviour would be:

* prevent default_transaction_isolation = serializable as a default setting when we enter Hot Standby, by throwing a FATAL error from the startup process. I can help implement that if we agree.

* throw a WARNING if serializable is stated in other cases, and downgrade the request to repeatable read.

Throwing a WARNING is better than an ERROR, since it reduces the level of application editing needed to make it work for HS. Or better still, just document that serializable currently means repeatable read when executed on an HS.

On another note, what happened to the plan to export regular serializable snapshots to the standby for use as deferred snapshots?

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] 9.2 release notes, beta time?
On Sat, Apr 28, 2012 at 12:15 AM, Devrim GÜNDÜZ dev...@gunduz.org wrote:
> (As a side note, RPMs *may not* be ready, because I (and Magnus) will be at PGDay Turkey on 12th, and will be busy over the whole weekend).

Is that a closed meeting? I hadn't seen any mention of that anywhere.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] default_transaction_isolation = serializable causes crash under Hot Standby
Simon Riggs wrote:
> Kevin Grittner wrote:
>> But if you set it in the postgresql.conf file, it's not pretty:
>>
>>   kevin@kevin-desktop:~$ psql -p 5433 test
>>   psql: FATAL:  can not create a serializable snapshot during recovery
>>
>> Ideas?
>
> The patch as submitted doesn't do anything useful for the case where the setting in .conf is serializable.

Precisely the point I was trying to make. We discussed an approach, I tried it, but that approach doesn't work. So we need new ideas.

> IMHO the desired behaviour would be
>
> * prevent default_transaction_isolation = serializable as a default setting when we enter Hot Standby by throwing a FATAL error from the startup process. I can help implement that if we agree.

I'm not sure how well that plays for all users of HS. It would work for us, because there are a number of settings we want to change on a HS; we use a different postgresql.conf from the master, so this works fine. I am concerned about people who want to promote the HS to a new master without swapping in a different conf file and doing a reload or restart -- if they want serializable transactions as the default on their master, how would this work?

> * throw a WARNING if serializable is stated in other cases, and downgrade the request to repeatable read

Yeah, that idea has been in the back of my mind all along, but I couldn't find a comfortable way to reconcile that with the point above and with the fact that we all seem to hope to some day have a way to run true serializable transactions under HS.

> Throwing a WARNING is better than an ERROR, since it reduces the level of application editing to make it work for HS. Or better still, just document that serializable currently means repeatable read when executed on an HS.

Somehow it just feels wrong, but if nobody can propose a less painful alternative, perhaps that is best. For now.

> On another note, what happened to the plan to export regular serializable snapshots to the standby for use as deferred snapshots?
For the record, I think the most promising variation on that was to include a new WAL record type, or a bit in the WAL record header, to flag points in the WAL stream at which a transaction on the HS could not see any anomalies. A serializable transaction on a HS would use a snapshot built at such a point, either by waiting for one or by having the WAL receiver keep the latest known safe one available. That is still on my list of things I would like to do, but round tuits are hard to come by these days, and there are a number of items higher on that list.

-Kevin
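A minimal sketch of that safe-snapshot-point idea (Python, all names hypothetical; real WAL records and snapshot plumbing are far more involved):

```python
# Toy model: WAL replay remembers the most recent position flagged as
# anomaly-free; a serializable transaction on the standby either uses a
# snapshot from that point or must wait for one to arrive.
class StandbySafeSnapshots:
    def __init__(self):
        self.latest_safe_lsn = None

    def apply_wal_record(self, lsn, safe_snapshot_point=False):
        if safe_snapshot_point:
            self.latest_safe_lsn = lsn

    def snapshot_for_serializable(self):
        if self.latest_safe_lsn is None:
            return ("wait", None)  # block until a safe point is replayed
        return ("run", self.latest_safe_lsn)

mgr = StandbySafeSnapshots()
assert mgr.snapshot_for_serializable() == ("wait", None)
mgr.apply_wal_record(lsn=1000)
mgr.apply_wal_record(lsn=1024, safe_snapshot_point=True)
mgr.apply_wal_record(lsn=1100)
assert mgr.snapshot_for_serializable() == ("run", 1024)
```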
Re: [HACKERS] Re: xReader, double-effort (was: Temporary tables under hot standby)
Tom Lane wrote:
> Kevin Grittner writes:
>> Well, the xReader approach (Aakash's GSoC project) is to serve as a proxy for a WAL stream going to a hot standby, to interpret each incoming WAL record by cheating and querying the HS before passing the WAL along, and then using sync rep to the HS regardless of whether the incoming stream is sync or async. What reliability problem do you see with this approach?
>
> Well, first off, it requires a replica that *is* a physical replica (else it won't have the same table OIDs, for instance). I'm not sure I get the point of logical replication that requires a physical replica as a prerequisite.

Maybe I can help with that by describing what the Wisconsin court system does for circuit court data.

Due to state law, state supreme court rules, and the general suckiness of the WAN available to us, each of Wisconsin's 72 counties has a database server in the county seat which holds the official data of record. We have two parallel forms of backup -- WAL-based and logical.

For WAL-based replication we maintain a copy of the last two weekly base backups and all WAL files needed to recover from either of them to any point in time. One copy of this is required to be in the county, on a separate system with separate drives. The other copy is maintained centrally. One rather large server has a hot standby of all county databases, currently fed through copying WAL files (although we plan to move to streaming this year). These physical backups are the primary source for recovering from catastrophic failure of a county server. We have software to top off the recovered database with transactions from the logical transaction stream as needed, since that is more up-to-date.

We have also found this combination of physical and logical transactions useful when someone trying to do a data fix with direct SQL statements against a production database mangled things badly.
We use PITR recovery up to just before the damage, then apply logical transactions from that point forward, skipping the problem transactions.

Until recently our logical transaction stream was generated from our Java middle tier running on the database servers. This has recently been converted to use a generalized PostgreSQL trigger, written in C, which is attached to all tables for which we want logical replication (about 90% of them). The trigger captures data into transaction tables in the database as part of the same transaction they are representing, so that the generation of the logical transaction data is atomic with the represented data. We didn't switch to one of the existing solutions because of all the code downstream which uses the logical transaction streams -- we didn't want to change the format or delivery method of the logical stream because it's simply not feasible to change everything at once.

The way the logical stream is currently delivered is still through the aforementioned Java middle tier. It reads the logical transaction log tables and sends the same data in non-persistent JMS messages through queues to the TRAN client at the central location, just as we did for years before we ever heard of PostgreSQL. Normally, the TRAN client reads each logical transaction once and feeds many replication targets at the central site. These include:

- four Central Information Repository (CIR) databases, each of which contains data from all 72 counties (two machines each at two different sites)

- four Database Transaction Repository (DTR) databases on the same servers as the CIR databases (but separate RAIDs); this is used as a supplementary source for recovery (as mentioned above), as well as for auditing what happened on a case when there is any question, and certain activity reports
- a filter to identify transactions of possible interest to the publishers we use for interfaces to other agencies (district attorneys, public defenders, department of corrections, police agencies, social service agencies, etc.); transactions of interest are queued for closer review by the publisher software, which actually posts messages if the transaction is found to actually be of interest

- other databases for the state court's central services, for such things as court interpreters used for non-English speakers in the various counties

The number and varied nature of the targets is significant, because the fan-out and timing differences need to be considered. The TRAN client normally receives one copy of the logical data for each transaction, and about 99% of the time the transaction's data is replicated to all databases (and queued for review by publishers) within a second of someone pressing Enter 300 miles away. It's that other 1% that gets really interesting. To deal with the possibility that one or more central targets were down or running very slowly, the TRAN client will give up on keeping them at the
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
Peter Eisentraut pete...@gmx.net writes:
> In any case, if either the existing session of the TM is cut or it cannot create a new connection, it will, after some time, have to give up and roll back the prepared transactions on the other servers. So some kind of setting to not shut down if there are prepared transactions pending could be useful. But this could probably be a separate GUC setting or two instead of a shutdown mode (or two) of its own.

This argument still seems pretty bogus. The *entire* point of a TM is to cope with crashes of individual databases under its management. The proposed setting seems to amount to a "please don't crash" GUC, which is silly on its face, and does not actually make the TM's life any easier anyway.

			regards, tom lane
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
Peter Eisentraut pete...@gmx.net writes:
> On fre, 2012-04-27 at 14:57 -0400, Robert Haas wrote:
>> I think there is no point at all in having a discussion about this unless we can first agree that the overwhelming majority of people who have commented on this issue on this list are unhappy with the current default behavior. If we are not going to change the default behavior, then there is zero point in talking about this.
>
> Have you reviewed the previous discussions where changing the default behavior was discussed and rejected? I don't like the current default any more than you do, but without any new arguments, there is, as you say, zero point in talking about this.

Perhaps I've forgotten something, but I only recall debates about switching the default to a different one of the existing shutdown modes. The new material here is the proposal for a new mode.

			regards, tom lane
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
On Sun, Apr 29, 2012 at 4:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Peter Eisentraut pete...@gmx.net writes:
>> In any case, if either the existing session of the TM is cut or it cannot create a new connection, it will, after some time, have to give up and roll back the prepared transactions on the other servers. So some kind of setting to not shut down if there are prepared transactions pending could be useful. But this could probably be a separate GUC setting or two instead of a shutdown mode (or two) of its own.
>
> This argument still seems pretty bogus. The *entire* point of a TM is to cope with crashes of individual databases under its management. The proposed setting seems to amount to a "please don't crash" GUC, which is silly on its face, and does not actually make the TM's life any easier anyway.

You are right that the TM can cope with aborted transactions, but that doesn't mean we should force it to have to do that. If we can wait for commit then we should do so.

I think we only need one new mode: "shutdown when transactions are finished" should only shut down when all types of transaction are complete. For people that don't use prepared transactions the difference is irrelevant. For people that do use prepared transactions, I can't imagine they would want a new setting that ends with aborted transactions, since that isn't any different to a fast shutdown. If that hangs waiting for a TM that has gone away, then you can issue shutdown fast.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] 9.2 release notes, beta time?
On Sun, Apr 29, 2012 at 14:23, Simon Riggs si...@2ndquadrant.com wrote:
> On Sat, Apr 28, 2012 at 12:15 AM, Devrim GÜNDÜZ dev...@gunduz.org wrote:
>> (As a side note, RPMs *may not* be ready, because I (and Magnus) will be at PGDay Turkey on 12th, and will be busy over the whole weekend).
>
> Is that a closed meeting? I hadn't seen any mention of that anywhere.

It's been announced on the Turkish site I believe, and I picked it up on their Twitter feed.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
Simon Riggs si...@2ndquadrant.com writes:
> I think we only need one new mode: "shutdown when transactions are finished" should only shut down when all types of transaction are complete. For people that don't use prepared transactions the difference is irrelevant. For people that do use prepared transactions, I can't imagine they would want a new setting that ends with aborted transactions, since that isn't any different to a fast shutdown.

That sounds reasonable at first blush. Implementing it might be trickier than you think, though, since (despite Peter's opinion) the prepared xacts are not associated with any particular session, and the postmaster itself doesn't know they are there. What's more, if individual sessions are told to commit hara-kiri as soon as they are not in a transaction, there soon won't be any surviving session in which the TM could issue a COMMIT PREPARED.

I think the only way this could be made to fly would be if the TM could set a session state that indicates "I'm a TM session, don't kill me until all prepared transactions are gone". Which might be problematic from a security standpoint, if random users could use it to proof themselves against getting kicked out. We could make it SUSET, but then TMs would have to run as superuser, which seems a bit less than desirable.

On the whole it is not apparent to me that we really need a mode in which shutdown waits for prepared transactions to flush out; and I would definitely not be in favor of it being the default. I think that would make prepared transactions an even bigger foot-gun than they are now. Just think: you say "pg_ctl stop", and the server promptly kicks off all your users and won't let any more in, but doesn't actually shut down. You may not know why, and even if you do, you can't connect to do something about it. Eventually you give up and issue shutdown fast, cursing whoever designed that misbegotten behavior.
			regards, tom lane
Re: [HACKERS] default_transaction_isolation = serializable causes crash under Hot Standby
On Sun, Apr 29, 2012 at 1:40 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
>> IMHO the desired behaviour would be
>>
>> * prevent default_transaction_isolation = serializable as a default setting when we enter Hot Standby by throwing a FATAL error from the startup process. I can help implement that if we agree.
>
> I'm not sure how well that plays for all users of HS. It would work for us because there are a number of settings we want to change on a HS; we use a different postgresql.conf from the master, so this works fine. I am concerned about people who want to promote the HS to a new master without swapping in a different conf file and doing a reload or restart -- if they want serializable transactions as the default on their master, how would this work?

I can see you might want to have a standby with hot_standby = off that was immediately ready to take the place of the master. In that case, the setting of default_transaction_isolation would have no effect on the standby, so you are ready and waiting.

If you request hot_standby = on, presumably you're not going to want a continuous stream of ERRORs. Blocking that completely is hard, because we would need to test the parameter file as well as testing user or database settings. The only way default_transaction_isolation = serializable would be acceptable when hot_standby = on is if we silently downgrade the isolation level to read committed. That way everything just works, albeit not quite as requested. So I think that's the best way forwards.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
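The downgrade rule being proposed reduces to something like this (a Python sketch, not backend code; note the thread names two candidate targets - "read committed" in this message, "repeatable read" earlier - so the target is left as a parameter):

```python
# Sketch of silently downgrading the requested isolation level while
# the server is in recovery (hot standby). The downgrade target is a
# policy choice debated in this thread, hence the parameter.
def effective_isolation(requested, in_recovery, downgrade_to="read committed"):
    if in_recovery and requested == "serializable":
        return downgrade_to  # optionally accompanied by a WARNING
    return requested

assert effective_isolation("serializable", in_recovery=True) == "read committed"
assert effective_isolation("serializable", in_recovery=False) == "serializable"
assert effective_isolation("repeatable read", in_recovery=True) == "repeatable read"
```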
Re: [HACKERS] default_transaction_isolation = serializable causes crash under Hot Standby
Simon Riggs wrote: The only way default_transaction_isolation = serializable would be acceptable when hot_standby = on is if we silently downgrade the isolation level to read committed. That way everything just works, albeit not quite as requested. So I think that's the best way forwards. I guess I don't see much alternative. Did you want to code that (I seem to remember you saying that in an earlier post) or would you like me to do it? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
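The silent-downgrade behaviour being agreed here can be sketched as follows (a hypothetical illustration of the semantics, not the actual backend patch; the function name and shape are invented):

```python
# Hypothetical sketch of the silent downgrade discussed above: under
# hot standby, a "serializable" default quietly becomes "read committed"
# so sessions start up without a continuous stream of ERRORs.

def effective_isolation(requested, in_hot_standby):
    """Return the isolation level a new transaction would actually get."""
    if in_hot_standby and requested == "serializable":
        # Serializable (SSI) cannot run during recovery, so downgrade
        # rather than reject the session.
        return "read committed"
    return requested
```

Everything "just works, albeit not quite as requested": sessions on the standby proceed at the downgraded level, and the same setting behaves normally once the server is promoted.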
Re: [HACKERS] Re: xReader, double-effort (was: Temporary tables under hot standby)
On Sun, Apr 29, 2012 at 3:27 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: It would be enormously more performant for the master to be emitting logical replication records to start with, since it already has all the right names etc at hand at basically no cost. Not when the consumers are across a WAN, and that WAN is the biggest performance bottleneck and the most expensive resource involved. I agree that the WAN is important, for both bandwidth and response time. Though it isn't a given that logical change records (LCRs) will require more bandwidth than physical WAL. WAL contains full page images, index changes and other information that would be absent from the LCR stream. It also depends upon the specification of the LCRs - what metadata is included and whether the LCRs use text or binary. Those choices have other impacts as well, so measurements and detailed analysis are required to justify how to proceed. Which is what is in progress now. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future In-Core Replication
On Sun, 2012-04-29 at 12:03 +0100, Simon Riggs wrote: On Sat, Apr 28, 2012 at 8:40 PM, Hannu Krosing ha...@krosing.net wrote: As to what LCRs should contain, it will probably be logical equivalents of INSERT, UPDATE ... LIMIT 1, DELETE ... LIMIT 1, TRUNCATE and all DDL. Yeh I would even go as far as propose a variant for DML-WITH-LIMIT-1 to be added to postgresql's SQL syntax so that the LCRs could be converted to SQL text for some tasks and thus should be easy to process using generic text-based tools. The DML-WITH-LIMIT-1 is required to do single logical updates on tables with non-unique rows. And as for any logical updates we will have a huge performance problem when doing UPDATE or DELETE on a large table with no indexes, but fortunately this problem is on the slave, not the master ;) While that is possible, I would favour the do-nothing approach. By making the default replication mode = none, we then require a PK to be assigned before allowing replication mode = on for a table. Trying to replicate tables without PKs is a problem that can wait, basically. While this is a good approach in most cases, there is a large use case for pk-less / indexless tables in large logfiles, where you may want to do INSERT-only replication, perhaps with some automatic partitioning on logdate. Allowing this is probably something to look at in the first release, even though I'm not sure what would happen on violation of this insert-only policy. 
Should it:
* refuse to continue and roll back the transaction (probably not)
* fail silently
* succeed but log the change locally
* succeed with some special flags so the other side can treat it specially without having to look up stuff in the system catalog
* (if we mark the unique / pk fields in some special way anyway, then the previous one is free :)
-- --- Hannu Krosing PostgreSQL Unlimited Scalability and Performance Consultant 2ndQuadrant Nordic PG Admin Book: http://www.2ndQuadrant.com/books/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
On Sun, Apr 29, 2012 at 5:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: I think we only need one new mode, shutdown when transactions are finished should only shutdown when all types of transaction are complete. For people that don't use prepared transactions the difference is irrelevant. For people that do use prepared transactions, I can't imagine they would want a new setting that ends with aborted transactions, since that isn't any different to a fast shutdown. That sounds reasonable at first blush. Implementing it might be trickier than you think though, since (despite Peter's opinion) the prepared xacts are not associated with any particular session, and the postmaster itself doesn't know they are there. What's more, if individual sessions are told to commit hara-kiri as soon as they are not in a transaction, there soon won't be any surviving session in which the TM could issue a COMMIT PREPARED. I think the only way this could be made to fly would be if the TM could set a session state that indicates I'm a TM session, don't kill me until all prepared transactions are gone. Which might be problematic from a security standpoint, if random users could use it to proof themselves against getting kicked out. We could make it SUSET but then TMs would have to run as superuser, which seems a bit less than desirable. I think an explicit state is overkill and has other problems as you say. On the whole it is not apparent to me that we really need a mode in which shutdown waits for prepared transactions to flush out; and I would definitely not be in favor of it being the default. I think that that would make prepared transactions an even bigger foot-gun than they are now. Just think: you say pg_ctl stop, and the server promptly kicks off all your users and won't let any more in, but doesn't actually shut down. You may not know why, and even if you do, you can't connect to do something about it. 
Eventually you give up and issue shutdown fast, cursing whoever designed that misbegotten behavior. Waiting too long is clearly a foot-gun, as you say. But if you just issued PREPARE on a session, it's more than likely that this will be followed almost immediately by a COMMIT. Simply waiting is a good indication, and some reasonable time like 10 seconds is fine in determining whether that COMMIT will arrive, or not. This only matters on a shutdown. If it's a restart, we can shut down after a PREPARE because as soon as we are back up again the TM can issue the COMMIT. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
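Simon's wait-briefly-then-give-up idea could look roughly like this (a hypothetical sketch only; the callable parameters are invented here so the logic is self-contained and testable, and bear no relation to actual postmaster code):

```python
import time

def wait_for_prepared_xacts(count_prepared, grace_seconds=10,
                            poll_interval=0.5,
                            now=time.monotonic, sleep=time.sleep):
    """Give pending COMMIT PREPARED a short grace period at shutdown.

    count_prepared: callable returning the number of prepared
    transactions still outstanding.  Returns True if they all resolved
    within the grace period, False if we gave up and should fall back
    to a fast shutdown."""
    deadline = now() + grace_seconds
    while count_prepared() > 0:
        if now() >= deadline:
            return False      # the TM never showed up; stop waiting
        sleep(poll_interval)
    return True
```

The point of the design is that the wait is bounded: a TM that is alive will normally commit within the grace period, and one that is dead cannot hold the shutdown hostage.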
Re: [HACKERS] Future In-Core Replication
On Sun, Apr 29, 2012 at 6:20 PM, Hannu Krosing ha...@2ndquadrant.com wrote: On Sun, 2012-04-29 at 12:03 +0100, Simon Riggs wrote: On Sat, Apr 28, 2012 at 8:40 PM, Hannu Krosing ha...@krosing.net wrote: As to what LCRs should contain, it will probably be logical equivalents of INSERT, UPDATE ... LIMIT 1, DELETE ... LIMIT 1, TRUNCATE and all DDL. Yeh I would even go as far as propose a variant for DML-WITH-LIMIT-1 to be added to postgresql's SQL syntax so that the LCRs could be converted to SQL text for some tasks and thus should be easy to process using generic text-based tools. The DML-WITH-LIMIT-1 is required to do single logical updates on tables with non-unique rows. And as for any logical updates we will have a huge performance problem when doing UPDATE or DELETE on a large table with no indexes, but fortunately this problem is on the slave, not the master ;) While that is possible, I would favour the do-nothing approach. By making the default replication mode = none, we then require a PK to be assigned before allowing replication mode = on for a table. Trying to replicate tables without PKs is a problem that can wait, basically. While this is a good approach in most cases, there is a large use case for pk-less / indexless tables in large logfiles, where you may want to do INSERT-only replication, perhaps with some automatic partitioning on logdate. Allowing this is probably something to look at in the first release, even though I'm not sure what would happen on violation of this insert-only policy. Should it * refuse to continue and roll back the transaction (probably not) * fail silently * succeed but log the change locally * succeed with some special flags so the other side can treat it specially without having to look up stuff in the system catalog * (if we mark the unique / pk fields in some special way anyway, then the previous one is free :) OK, I think an insert-only replication mode would allow that. 
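Hannu's list of possible behaviours for a non-INSERT change arriving at an insert-only table amounts to a configurable policy, which could be sketched as (hypothetical illustration; all names are invented, one branch per option from the list above):

```python
# Hypothetical sketch: decide what to do with one logical change record
# (LCR) against a table replicated in insert-only mode.

def handle_lcr(op, policy="fail_silently"):
    """op is 'insert', 'update', or 'delete'; returns the action taken."""
    if op == "insert":
        return "replicate"
    if policy == "rollback":
        # refuse to continue and roll back the transaction
        raise ValueError("%s not allowed on insert-only table" % op)
    if policy == "fail_silently":
        return "drop"                  # change succeeds but is not replicated
    if policy == "log_locally":
        return "log"                   # succeed and record the change locally
    if policy == "flag_for_receiver":
        return "replicate_flagged"     # other side treats it specially
    raise ValueError("unknown policy: %s" % policy)
```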
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] default_transaction_isolation = serializable causes crash under Hot Standby
On Sun, Apr 29, 2012 at 5:54 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Simon Riggs wrote: The only way default_transaction_isolation = serializable would be acceptable when hot_standby = on is if we silently downgrade the isolation level to read committed. That way everything just works, albeit not quite as requested. So I think that's the best way forwards. I guess I don't see much alternative. Did you want to code that (I seem to remember you saying that in an earlier post) or would you like me to do it? I'll happily do that, unless you wish to. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
On sön, 2012-04-29 at 10:19 +0100, Simon Riggs wrote: Maybe we don't need to do this over multiple releases, but we do need to give warning of possible incompatibilities. It would be good to see a specific post on hackers called Planned Incompatibilities in 9.2, or collect such things on the open items wiki, so that people listening can see what might happen and get a chance to object. Or if changes do go ahead, at least we give them a few months warning to change the downstream software. Otherwise all that happens is our new release comes out and fewer people use it because it takes ages to actually realign the software stack enough for our software to be used. Well, either there are possible incompatibilities, in which case users will be slow to adopt new releases, as is currently the case, or there strictly won't be any (unless hidden behind config settings or similar), but then introducing new features or bug fixes can take many years. So far we've erred on the side of progress. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] precision and scale functions for numeric
I didn't find a good way to find out how many digits a numeric value has or things like whether a numeric value is an integer. (I had to go through bc(1) for the latter.) Functions like precision() and scale() would have been quite handy. Are there other ways to do this, or would this make a good addition? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
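For illustration, the requested semantics are easy to sketch outside the backend with Python's Decimal (a hedged example of what precision() and scale() might return, not a proposed implementation; the rule that precision is at least the scale is an assumption here):

```python
from decimal import Decimal

def numeric_scale(x):
    """Count of digits to the right of the decimal point."""
    return max(0, -x.as_tuple().exponent)

def numeric_precision(x):
    """Total count of significant digits (assumed to be at least the
    scale, so 0.001 would have precision 3)."""
    return max(len(x.as_tuple().digits), numeric_scale(x))

def is_whole_number(x):
    """The 'is this numeric an integer?' test mentioned above."""
    return x == x.to_integral_value()
```

With these, 23.5141 has precision 6 and scale 4, and the integer test no longer needs a round-trip through bc(1).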
Re: [HACKERS] 9.2 release notes, beta time?
Magnus Hagander wrote: 2012/4/28 Josh Berkus j...@agliodbs.com: Ugh. Maybe the whole idea of getting a beta out before PGCon is doomed. Still, if we don't try for this schedule, we're looking at at least two more weeks' slip, because we're surely not going to wrap during PGCon. We could do it in person! We could have Berkus handing out CDs! Or, if you guys want to mail out CDs to the whole US, I know some people. Jay -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Request to add options to tools/git_changelog
Bruce Momjian wrote: I am again requesting the addition of options to tools/git_changelog so I can more easily produce the release notes. I asked for this during 9.1 development and it was rejected. I am currently using my own custom version of the tool, but have to merge community improvements into the tool every year before I use it. FYI in the general case of I have to maintain a patch set: Now that PG is on git, there's a tool called Stacked Git that lets you use git's excellent merge capabilities to maintain patches. http://www.procode.org/stgit/ Jay -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: xReader, double-effort (was: Temporary tables under hot standby)
Simon Riggs wrote: Though it isn't a given that logical change records (LCRs) will require more bandwidth than physical WAL. It would help if you could share a little more of what you have in mind. It's not yet clear to me whether you're talking about adding logical records to the WAL stream, replacing page-oriented records in the WAL stream with logical records, or leaving the WAL stream alone and generating a second, new stream of logical data. WAL contains full page images, index changes and other information that would be absent from the LCR stream. Other than full page images, which could be eliminated in other ways before streaming the WAL, what is in the WAL file that would not be needed to maintain an exact replica of the cluster, suitable for quick deployment in place of a lost cluster for disaster recovery? Or do you foresee eliminating some of the current page-image oriented WAL records in favor of logical WAL records, with an expectation that the logical records will be smaller? measurements and detailed analysis is required to justify how to proceed. Which is what is in progress now. Could you clarify? Are you referring to this discussion or some effort at 2Q that hasn't yet been shared with the community? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] default_transaction_isolation = serializable causes crash under Hot Standby
Simon Riggs wrote: On Sun, Apr 29, 2012 at 5:54 PM, Kevin Grittner wrote: Simon Riggs wrote: The only way default_transaction_isolation = serializable would be acceptable when hot_standby = on is if we silently downgrade the isolation level to read committed. That way everything just works, albeit not quite as requested. So I think that's the best way forwards. I guess I don't see much alternative. Did you want to code that (I seem to remember you saying that in an earlier post) or would you like me to do it? I'll happily do that, unless you wish to. I think I know where to do that. I'll give it a shot. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] default_transaction_isolation = serializable causes crash under Hot Standby
On Sun, Apr 29, 2012 at 8:20 AM, Simon Riggs si...@2ndquadrant.com wrote: * prevent default_transaction_isolation = serializable as a default setting when we enter Hot Standby by throwing a FATAL error from the startup process. I can help implement that if we agree. I am strongly disinclined to go that route, because (1) our customers, at least, really hate it when the standby won't start up, and I think we should be very careful about finding more reasons for that to happen; and (2) it's not bullet-proof anyway, because there is still ALTER USER .. SET and ALTER DATABASE .. SET and maybe a few other methods as well. Keep in mind, also, that this has to be back-patched to 9.1; so if we go this route then someone might shut down their server, upgrade the binaries, restart the server, and have it fail to start. I think that's not a nice thing to do in a minor release. * throw a WARNING if serializable is stated in other cases, and downgrade the request to repeatable read I think this would be reasonable, but it's still my second choice. The advantage of throwing an ERROR is that someone will presumably be forced to realize that a problem exists and fix it, whereas a WARNING may just generate a combination of log spam and unexpected behavior forever. Also, we currently block cases where you try to set transaction_isolation by throwing an ERROR, so it seems a bit more consistent to do that in other cases as well. Still, it's a reasonable choice, and certainly better than failing an assertion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Temporary tables under hot standby
On Tue, Apr 24, 2012 at 11:55:15PM -0400, Noah Misch wrote: Concerning everyone's favorite topic, how to name the new type of table, I liked Tom's proposal[1] to make CREATE TEMP TABLE retain current behavior and have CREATE GLOBAL TEMP TABLE and/or CREATE LOCAL TEMP TABLE request the new SQL-standard variety. (I'd vote for using CREATE GLOBAL and retaining CREATE LOCAL for future expansion.) As he mentions, to get there, we'd ideally start by producing a warning instead of silently accepting GLOBAL as a noise word. Should we put such a warning into 9.2? Here is the change I'd make. Thanks, nm [1] http://archives.postgresql.org/message-id/5422.1240936...@sss.pgh.pa.us

*** a/doc/src/sgml/ref/create_table.sgml
--- b/doc/src/sgml/ref/create_table.sgml
***************
*** 163,169 **** CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
   <para>
    Optionally, <literal>GLOBAL</literal> or <literal>LOCAL</literal>
    can be written before <literal>TEMPORARY</> or <literal>TEMP</>.
!   This makes no difference in <productname>PostgreSQL</>, but see
    <xref linkend="sql-createtable-compatibility"
    endterm="sql-createtable-compatibility-title">.
   </para>
--- 163,170 ----
   <para>
    Optionally, <literal>GLOBAL</literal> or <literal>LOCAL</literal>
    can be written before <literal>TEMPORARY</> or <literal>TEMP</>.
!   This presently makes no difference in <productname>PostgreSQL</>
!   and is deprecated; see
    <xref linkend="sql-createtable-compatibility"
    endterm="sql-createtable-compatibility-title">.
   </para>
***************
*** 1310,1316 **** CREATE TABLE employees OF employee_type (
   <productname>PostgreSQL</productname> does not have.
   For compatibility's sake, <productname>PostgreSQL</productname> will
   accept the <literal>GLOBAL</literal> and <literal>LOCAL</literal> keywords
!  in a temporary table declaration, but they have no effect.
  </para>
--- 1311,1318 ----
   <productname>PostgreSQL</productname> does not have.
   For compatibility's sake, <productname>PostgreSQL</productname> will
   accept the <literal>GLOBAL</literal> and <literal>LOCAL</literal> keywords
!  in a temporary table declaration, but they have no effect. This usage is
!  deprecated and may specify standard-compliant behavior in the future.
  </para>
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 2512,2521 **** CreateStmt: CREATE OptTemp TABLE qualified_name '(' OptTableElementList ')'
   */
  OptTemp:  TEMPORARY           { $$ = RELPERSISTENCE_TEMP; }
      | TEMP                    { $$ = RELPERSISTENCE_TEMP; }
!     | LOCAL TEMPORARY         { $$ = RELPERSISTENCE_TEMP; }
!     | LOCAL TEMP              { $$ = RELPERSISTENCE_TEMP; }
!     | GLOBAL TEMPORARY        { $$ = RELPERSISTENCE_TEMP; }
!     | GLOBAL TEMP             { $$ = RELPERSISTENCE_TEMP; }
      | UNLOGGED                { $$ = RELPERSISTENCE_UNLOGGED; }
      | /*EMPTY*/               { $$ = RELPERSISTENCE_PERMANENT; }
    ;
--- 2512,2549 ----
   */
  OptTemp:  TEMPORARY           { $$ = RELPERSISTENCE_TEMP; }
      | TEMP                    { $$ = RELPERSISTENCE_TEMP; }
!     | LOCAL TEMPORARY
!       {
!         ereport(WARNING,
!             (errmsg("LOCAL is deprecated in temporary table creation"),
!              errdetail("This may specify different semantics in future versions of PostgreSQL."),
!              parser_errposition(@1)));
!         $$ = RELPERSISTENCE_TEMP;
!       }
!     | LOCAL TEMP
!       {
!         ereport(WARNING,
!             (errmsg("LOCAL is deprecated in temporary table creation"),
!              errdetail("This may specify different semantics in future versions of PostgreSQL."),
!              parser_errposition(@1)));
!         $$ = RELPERSISTENCE_TEMP;
!       }
!     | GLOBAL TEMPORARY
!       {
!         ereport(WARNING,
!             (errmsg("GLOBAL is deprecated in temporary table creation"),
Re: [HACKERS] Re: xReader, double-effort (was: Temporary tables under hot standby)
On Sun, Apr 29, 2012 at 7:36 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Simon Riggs wrote: Though it isn't a given that logical change records (LCRs) will require more bandwidth than physical WAL. It would help if you could share a little more of what you have in mind. It's not yet clear to me whether you're talking about adding logical records to the WAL stream, replacing page-oriented records in the WAL stream with logical records, or leaving the WAL stream alone and generating a second, new stream of logical data. WAL contains full page images, index changes and other information that would be absent from the LCR stream. Other than full page images, which could be eliminated in other ways before streaming the WAL, what is in the WAL file that would not be needed to maintain an exact replica of the cluster, suitable for quick deployment in place of a lost cluster for disaster recovery? Or do you foresee eliminating some of the current page-image oriented WAL records in favor of logical WAL records, with an expectation that the logical records will be smaller? measurements and detailed analysis is required to justify how to proceed. Which is what is in progress now. Could you clarify? Are you referring to this discussion or some effort at 2Q that hasn't yet been shared with the community? These are early days yet and I've been deliberately vague on the format and content of the LCRs. That is to allow the idea that we have choices to bed in a little, so we can discuss them. The one thing that does stand clear in my mind at this stage is that the physical transport of LCRs should be the current streaming infrastructure. So the different aspects of the design are at differing stages of certainty (in my mind). Much of today has been spent on a presentation explaining the thought processes and options available to us. 
Deciding between those options needs to be fact-based rather than just a matter of opinion, and I want to ensure we make the right choices, whatever they are. I've mentioned a few times already that we're working on prototypes, all of which will be shared with the community. R&D, in that order. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: xReader, double-effort (was: Temporary tables under hot standby)
On Sat, Apr 28, 2012 at 11:06 AM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: Translating WAL is a very hard task. No kidding. I would think it's impossible on its face. Just for starters, where will you get table and column names from? (Looking at the system catalogs is cheating, and will not work reliably anyway.) IMO, if we want non-physical replication, we're going to need to build it in at a higher level than after-the-fact processing of WAL. I foresee wasting quite a lot of effort on the currently proposed approaches before we admit that they're unworkable. I think the question we should be asking ourselves is not whether WAL as it currently exists is adequate for logical replication, but rather whether it could be made adequate. For example, suppose that we were to arrange things so that the first insert, update, or delete record for a given relfilenode after each checkpoint emits a special WAL record that contains the relation name, schema OID, attribute names, and attribute type OIDs. Well, now we are much closer to being able to do some meaningful decoding of the tuple data, and it really doesn't cost us that much. Handling DDL (and manual system catalog modifications) seems pretty tricky, but I'd be very reluctant to give up on it without banging my head against the wall pretty hard. The trouble with giving up on WAL completely and moving to a separate replication log is that it means a whole lot of additional I/O, which is bound to have a negative effect on performance. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
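The arrange-things-so idea in the paragraph above can be mocked up as bookkeeping (a hypothetical sketch with invented names, showing only the once-per-checkpoint logic, not actual WAL-writing code):

```python
# Hypothetical sketch: emit a one-off relation-metadata record before the
# first insert/update/delete touching a relfilenode after each checkpoint,
# so a downstream decoder can interpret tuple data without catalog access.

class WalStreamSketch:
    def __init__(self):
        self.records = []
        self._described = set()  # relfilenodes described since last checkpoint

    def checkpoint(self):
        # After a checkpoint, every relation must be re-described on first use.
        self._described.clear()
        self.records.append(("checkpoint",))

    def log_dml(self, op, relfilenode, relname, attnames, atttypids, tup):
        if relfilenode not in self._described:
            self.records.append(
                ("relation_metadata", relfilenode, relname, attnames, atttypids))
            self._described.add(relfilenode)
        self.records.append((op, relfilenode, tup))
```

The metadata cost is paid once per relation per checkpoint cycle, which is why "it really doesn't cost us that much" for workloads that touch the same relations repeatedly.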
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
On Sun, Apr 29, 2012 at 1:48 PM, Peter Eisentraut pete...@gmx.net wrote: On sön, 2012-04-29 at 10:19 +0100, Simon Riggs wrote: Maybe we don't need to do this over multiple releases, but we do need to give warning of possible incompatibilities. It would be good to see a specific post on hackers called Planned Incompatibilities in 9.2, or collect such things on the open items wiki, so that people listening can see what might happen and get a chance to object. Or if changes do go ahead, at least we give them a few months warning to change the downstream software. Otherwise all that happens is our new release comes out and fewer people use it because it takes ages to actually realign the software stack enough for our software to be used. Well, either there are possible incompatibilities, in which case users will be slow to adopt new releases, as is currently the case, or there strictly won't be any (unless hidden behind config settings or similar), but then introducing new features or bug fixes can take many years. So far we've erred on the side of progress. Erred on the side of progress might even be a little strong, because I think for the most part we have been extremely judicious about backward incompatibilities in the last few releases (which is a good thing). Obviously, 8.3 was a flag day of the first magnitude, and one I hope we won't repeat any time soon, but if you look through the release notes for, say, 9.1, just about every incompatibility listed there amounts to fixing something that was either demonstrably broken or widely hated in prior releases. Turning on standard_conforming_strings by default was a big deal, but we've been phasing that change in for five years or so, so I think we really did about as much to ease that transition as is humanly possible. Moreover, you can always turn the GUC off again, if the new behaviour is a problem. 
The only way we're going to have fewer incompatibilities than we do now is to preserve existing behavior even when it's broken, widely-hated, and/or not standards-conformant. IMHO, that would be taking a sound principle to an illogical extreme. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
On Sun, Apr 29, 2012 at 7:19 AM, Simon Riggs si...@2ndquadrant.com wrote: On Sun, Apr 29, 2012 at 12:24 AM, Robert Haas robertmh...@gmail.com wrote: Let's not go looking for reasons to reject the approach just because we didn't expect it to work as well as it does. Who here, in your opinion, is looking for reasons to reject anything? I'm just saying that there seems to be a bit more skepticism here than can be justified considering that the test results are all on one side. It wouldn't take a lot of evidence to convince me that this is a bad idea, but it will take more than none, which is the amount we have now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
Robert Haas robertmh...@gmail.com writes: Erred on the side of progress might even be a little strong, because I think for the most part we have been extremely judicious about backward incompatibilities in the last few releases (which is a good thing). Obviously, 8.3 was a flag day of the first magnitude, and one I hope we won't repeat any time soon, but if you look through the release notes for, say, 9.1, just about every incompatibility listed there amounts to fixing something that was either demonstrably broken or widely hated in prior releases. Well, if you're ragging on the implicit coercions changes, let me point out that those were also fixing something that was demonstrably broken. So I'm afraid it's a tad pollyanna-ish to claim that there is never going to be push-back on changes we choose to make for one or another of these reasons. Having said that, though, I agree that we have to be willing to make incompatible changes from time to time, and I think our standards for when to do that are plenty high enough already. I'm not in favor of raising that bar still more. The reason we support back branches as long as we do is precisely to give people the option to not deal with incompatible changes until they are ready to. I don't think we need to do even more, and I don't want to add still more overhead to the development process to do it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: xReader, double-effort (was: Temporary tables under hot standby)
On Sun, 2012-04-29 at 16:33 -0400, Robert Haas wrote: On Sat, Apr 28, 2012 at 11:06 AM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: Translating WAL is a very hard task. No kidding. I would think it's impossible on its face. Just for starters, where will you get table and column names from? (Looking at the system catalogs is cheating, and will not work reliably anyway.) IMO, if we want non-physical replication, we're going to need to build it in at a higher level than after-the-fact processing of WAL. I foresee wasting quite a lot of effort on the currently proposed approaches before we admit that they're unworkable. I think the question we should be asking ourselves is not whether WAL as it currently exists is adequate for logical replication, but rather whether it could be made adequate. Agreed. For example, suppose that we were to arrange things so that the first insert, update, or delete record for a given relfilenode after each checkpoint emits a special WAL record that contains the relation name, schema OID, attribute names, and attribute type OIDs. Not just the first after a checkpoint, but also the first after a schema change; even though this will duplicate the WAL records for system catalog changes, it is likely much cheaper overall to always have fresh structure information in the WAL stream. And if we really want to do WAL-to-logical-to-SQL-text conversion on a host separate from the master, we also need to include the type definitions of user-defined types, together with at least the types' output functions in some form. So you basically need a large part of postgres to reliably make sense of WAL. Well, now we are much closer to being able to do some meaningful decoding of the tuple data, and it really doesn't cost us that much. Handling DDL (and manual system catalog modifications) seems pretty tricky, but I'd be very reluctant to give up on it without banging my head against the wall pretty hard. 
The most straightforward way is to have a more or less full copy of pg_catalog also on the WAL-filtering / WAL-conversion node, and to use it in 1:1 replicas of transactions recreated from the WAL. This way we can avoid recreating any alternate views of the master's schema. Then again, we could do it all on the master, inside the WAL-writing transaction, and thus avoid a large chunk of the problems. If the receiving side is also PostgreSQL with the same catalog structure (i.e. the same major version), then we don't actually need to handle DDL in any complicated way; it would be enough to just carry over the changes to the system tables. The main reason we don't do this currently for trigger-based logical replication is the restriction that we cannot have triggers on system tables. I hope it is much easier to have triggerless record generation also work on system tables. The trouble with giving up on WAL completely and moving to a separate replication log is that it means a whole lot of additional I/O, which is bound to have a negative effect on performance. Why would you give up WAL? Or do you mean that the new logical WAL needs to have the same commit-time behaviour as WAL to be reliable? I'd envision a scenario where the logical WAL is sent to the slave or distribution hub directly and not written on the local host at all. An optional sync mode similar to current sync WAL replication could be configured. I hope this would run mostly in parallel with local WAL generation, so not much extra wall-clock time would be wasted. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company --- Hannu Krosing PostgreSQL Unlimited Scalability and Performance Consultant 2ndQuadrant Nordic PG Admin Book: http://www.2ndQuadrant.com/books/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
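[Editor's illustration] Hannu's "ship the logical stream to a hub in parallel with local WAL, with an optional sync mode" idea can be sketched with an in-process stand-in for the hub. Everything here is invented for illustration; a real implementation would involve network transport and durability, not a queue.

```python
# Hypothetical sketch: logical change records are handed to a background
# sender (the "hub" link) so the local WAL path is not blocked; in
# synchronous mode, commit waits for the hub's acknowledgement, analogous
# to synchronous WAL replication.  All names invented.
import queue
import threading

class LogicalStream:
    def __init__(self, synchronous=False):
        self.synchronous = synchronous
        self.outbox = queue.Queue()
        self.acked = threading.Event()
        self.hub_log = []  # what the hub has received, in order
        threading.Thread(target=self._hub, daemon=True).start()

    def _hub(self):
        while True:
            rec = self.outbox.get()
            self.hub_log.append(rec)
            if rec == ("COMMIT",):
                self.acked.set()  # ack the commit back to the sender

    def send(self, rec):
        self.outbox.put(rec)  # non-blocking w.r.t. the local WAL path

    def commit(self):
        self.acked.clear()
        self.outbox.put(("COMMIT",))
        if self.synchronous:
            self.acked.wait(timeout=5)  # optional sync mode

s = LogicalStream(synchronous=True)
s.send(("INSERT", "accounts", (1, 100)))
s.commit()  # in sync mode, returns once the hub has the commit
```

In asynchronous mode the commit would return immediately, which is the "not much extra wall-clock time" case Hannu is hoping for.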
Re: [HACKERS] Future In-Core Replication
Excerpts from Simon Riggs's message of Thu Apr 26 11:10:09 -0300 2012: On Thu, Apr 26, 2012 at 1:41 PM, Simon Riggs si...@2ndquadrant.com wrote: I will also be organising a small-to-medium-sized Future of In-Core Replication meeting in Ottawa on Wed 16 May, 6-10pm. Thanks for such a rapid response. I've put up a wiki page and will be adding names as they come through: http://wiki.postgresql.org/wiki/PgCon2012CanadaInCoreReplicationMeeting How is this not redundant with the Cluster Summit? http://wiki.postgresql.org/wiki/PgCon2012CanadaClusterSummit ... oh, you're also already enlisted in that one. Sigh. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Patch: add timing of buffer I/O requests
Robert Haas robertmh...@gmail.com writes: I like the idea of including the word block in there. I don't think it was probably a terribly good idea to abbreviate that to blk everywhere, but at this point it's probably better to be consistent, sigh. As for track_iotiming -> track_io_timing, I'm fine with that as well. I made these changes, so I think we are done with the naming issues. However, I'd still like to propose that we think about adjusting the timing column datatypes, i.e. uniformly use float8 msec for values representing elapsed times. By my count there are six columns that would be affected: pg_stat_bgwriter.checkpoint_write_time pg_stat_bgwriter.checkpoint_sync_time pg_stat_database.blk_read_time pg_stat_database.blk_write_time pg_stat_user_functions.total_time pg_stat_user_functions.self_time The first four of these are new in 9.2, meaning that we would only be creating a compatibility issue for the last two. If we wait to do this in the future, we will have a significantly bigger problem than if we do it today. Advantages of the change are: * Better precision exposed to the user (pg_stat_user_functions has historically provided only millisecond precision). * Removal of the arbitrary limit of microsecond precision. Of course, the underlying data is still no better than microsecond, but if we are ever able to migrate to OS APIs that return better-than-microsecond data, we won't have to adjust the stats view APIs to expose that data. * A chance to make the functions underlying these stats view columns agree with the exposed column definitions. Any objections out there? regards, tom lane
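[Editor's illustration] The precision argument in Tom's first bullet is easy to see with a toy example. The internal microsecond accumulator and the function names here are invented for illustration; only the float8-milliseconds column convention comes from the proposal above.

```python
# Toy illustration: elapsed time is accumulated internally in integer
# microseconds; exposing it as float8 milliseconds preserves the
# sub-millisecond part, while an integer-millisecond column truncates it.
accumulated_usec = 12_345_678  # e.g. a hypothetical blk_read_time total

def as_integer_msec(usec):
    # historical-style integer column: 678 microseconds are lost
    return usec // 1000

def as_float8_msec(usec):
    # proposed float8 column: full microsecond precision survives
    return usec / 1000.0

print(as_integer_msec(accumulated_usec))  # -> 12345
print(as_float8_msec(accumulated_usec))   # -> 12345.678
```

A float8 column also leaves headroom if the underlying clock ever gets better than microsecond resolution, without changing the view definition again.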
Re: [HACKERS] 9.2 release notes, beta time?
On Sun, 2012-04-29 at 13:23 +0100, Simon Riggs wrote: (As a side note, RPMs *may not* be ready, because I (and Magnus) will be at PGDay Turkey on the 12th, and will be busy over the whole weekend.) Is that a closed meeting? I hadn't seen any mention of that anywhere. Not that much. I've been tweeting about it (in Turkish, sorry) and sending out announcements to local mailing lists for quite some time, but we were not able to finalize the venue until last week. This is the 2nd one, after last year's. I will do a wider announcement next time, I promise -- and the new event submission policy on postgresql.org raised a barrier against adding an item to the website, but that is more a -www topic. (Oh, and you need to have the secret key to enter the meeting.) -- Devrim GÜNDÜZ Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
Excerpts from Ryan Kelly's message of Sat Jan 14 16:22:21 -0300 2012: I have attached a new patch which handles the connect_timeout option by adding a PQconnectTimeout(conn) function to access the connect_timeout, which I then use to retrieve the existing value from the old connection. Was this patch dropped entirely? If not, and it hasn't been committed yet, I think it belongs in the open CF here: https://commitfest.postgresql.org/action/commitfest_view?id=14 -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)
On Sun, Apr 29, 2012 at 5:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: "Erred on the side of progress" might even be a little strong, because I think for the most part we have been extremely judicious about backward incompatibilities in the last few releases (which is a good thing). Obviously, 8.3 was a flag day of the first magnitude, and one I hope we won't repeat any time soon, but if you look through the release notes for, say, 9.1, just about every incompatibility listed there amounts to fixing something that was either demonstrably broken or widely hated in prior releases. Well, if you're ragging on the implicit coercions changes, let me point out that those were also fixing something that was demonstrably broken. True, but it was painful for a lot of people, and I continue to think that we broke too many reasonable cases. So I'm afraid it's a tad pollyanna-ish to claim that there is never going to be push-back on changes we choose to make for one or another of these reasons. Agreed, I expect some push-back. I'm just pointing out that we reject a very significant number of changes on backward-compatibility grounds. We don't reject too many entire patches on these grounds, but many are the patch authors who have been asked to change X, Y, or Z to avoid breaking backward compatibility, or who have had things ripped out by the committer for such reasons. Of course this is sometimes an occasion for complaint, and then the backward compatibility changes that do get through are also an occasion for complaint, so there's no perfect world, but we do try pretty hard, I think. Having said that, though, I agree that we have to be willing to make incompatible changes from time to time, and I think our standards for when to do that are plenty high enough already. I'm not in favor of raising that bar still more.
The reason we support back branches as long as we do is precisely to give people the option to not deal with incompatible changes until they are ready to. I don't think we need to do even more, and I don't want to add still more overhead to the development process to do it. +1, and well put. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Re: xReader, double-effort (was: Temporary tables under hot standby)
On Sun, Apr 29, 2012 at 6:00 PM, Hannu Krosing ha...@2ndquadrant.com wrote: I think the question we should be asking ourselves is not whether WAL as it currently exists is adequate for logical replication, but rather or not it could be made adequate. Agreed. And of course I meant "but rather whether or not it could be made adequate"; I dropped a word. For example, suppose that we were to arrange things so that, after each checkpoint, the first insert, update, or delete record for a given relfilenode emits a special WAL record that contains the relation name, schema OID, attribute names, and attribute type OIDs. Not just the first after a checkpoint, but also the first after a schema change; even though this will duplicate the WAL for changes to the system catalog, it is likely much cheaper overall to always have fresh structure information in the WAL stream. Yes. And if we really want to do WAL -> logical -> SQL text conversion on a host separate from the master, we also need to insert there the type definitions of user-defined types, together with at least the types' output functions in some form. Yes. So you basically need a large part of postgres for reliably making sense of WAL. Agreed, but I think that's a problem we need to fix, not a tolerable situation at all. If a user can create a type-output function that goes and looks at the state of the database to determine what to output, then we are completely screwed, because that basically means you would need to have a whole Hot Standby instance up and running just to make it possible to run type output functions. Now you might be able to build a mechanism around that that is useful to some people in some situations, but wow does that sound painful. What I want is for the master to be able to cheaply rattle off the tuples that got inserted, updated, or deleted as those things happen; needing a whole second copy of the database just to do that does not meet my definition of cheap.
Furthermore, it's not really clear that it's sufficient anyway, since there are problems with what happens before the HS instance reaches consistency, with what happens when it crashes and restarts, and with how we handle the case where the system catalog we need to examine to generate the logical replication records is access-exclusive-locked. Seems like a house of cards. Some of this might be possible to mitigate contractually, by putting limits on what type input/output functions are allowed to do. Or we could invent a new analog of type input/output functions that is explicitly limited in this way, and support only types that provide it. But I think the real key is that we can't rely on catalog access: the WAL stream has to have enough information to allow the reader to construct some set of in-memory hash tables with sufficient detail to reliably decode WAL. Or at least that's what I'm thinking. The most straightforward way is to have a more or less full copy of pg_catalog also on the WAL-filtering / WAL-conversion node, and to use it in 1:1 replicas of transactions recreated from the WAL. This way we can avoid recreating any alternate views of the master's schema. See above; I have serious doubts that this can ever be made to work robustly. Then again, we could do it all on the master, inside the WAL-writing transaction, and thus avoid a large chunk of the problems. If the receiving side is also PostgreSQL with the same catalog structure (i.e. the same major version), then we don't actually need to handle DDL in any complicated way; it would be enough to just carry over the changes to the system tables. I agree it'd be preferable to handle DDL in terms of system catalog updates, rather than saying, well, this is an ALTER TABLE .. RENAME. But you need to be able to decode tuples using the right tuple descriptor, even while that's changing under you. Why would you give up WAL? For lack of ability to make it work. Don't underestimate how hard it's going to be to nail this down.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
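[Editor's illustration] Robert's "in-memory hash tables" idea — the stream itself carries enough structure that a standalone reader never consults the catalog — can be sketched as a decoder keyed by relfilenode. The record formats are invented; the point is that each DML record is decoded against whichever descriptor was current when it was written, which also covers the "tuple descriptor changing under you" case.

```python
# Hypothetical sketch: a catalog-free WAL reader.  METADATA records (see
# the scheme discussed above) populate an in-memory descriptor table;
# INSERT records are decoded against it with no catalog lookups.
# Record formats are invented for illustration.

def decode(stream):
    descriptors = {}  # relfilenode -> (relname, attribute names)
    rows = []
    for rec in stream:
        kind = rec[0]
        if kind == "METADATA":
            _, relfilenode, relname, attnames = rec
            descriptors[relfilenode] = (relname, attnames)
        elif kind == "INSERT":
            _, relfilenode, values = rec
            relname, attnames = descriptors[relfilenode]  # no catalog access
            rows.append((relname, dict(zip(attnames, values))))
    return rows

stream = [
    ("METADATA", 1234, "accounts", ["id", "balance"]),
    ("INSERT",   1234, (1, 100)),
    # a schema change forces a fresh metadata record before the next DML:
    ("METADATA", 1234, "accounts", ["id", "balance", "owner"]),
    ("INSERT",   1234, (2, 200, "alice")),
]
rows = decode(stream)
```

Because descriptors are replaced in stream order, the second insert is decoded with the post-DDL column list even though both inserts target the same relfilenode.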
Re: [HACKERS] Re: xReader, double-effort (was: Temporary tables under hot standby)
Robert Haas robertmh...@gmail.com writes: On Sun, Apr 29, 2012 at 6:00 PM, Hannu Krosing ha...@2ndquadrant.com wrote: So you basically need a large part of postgres for reliably making sense of WAL. Agreed, but I think that's a problem we need to fix and not a tolerable situation at all. If a user can create a type-output function that goes and looks at the state of the database to determine what to output, then we are completely screwed, because that basically means you would need to have a whole Hot Standby instance up and running just to make it possible to run type output functions. You mean like enum_out? Or for that matter array_out, record_out, range_out? regards, tom lane
Re: [HACKERS] Re: xReader, double-effort (was: Temporary tables under hot standby)
On Sun, Apr 29, 2012 at 11:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Apr 29, 2012 at 6:00 PM, Hannu Krosing ha...@2ndquadrant.com wrote: So you basically need a large part of postgres for reliably making sense of WAL. Agreed, but I think that's a problem we need to fix and not a tolerable situation at all. If a user can create a type-output function that goes and looks at the state of the database to determine what to output, then we are completely screwed, because that basically means you would need to have a whole Hot Standby instance up and running just to make it possible to run type output functions. You mean like enum_out? Or for that matter array_out, record_out, range_out? Yeah, exactly. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
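[Editor's illustration] Tom's enum_out example makes the problem concrete: an enum value is stored as an OID whose human-readable label lives in pg_enum, so a catalog-free reader cannot render it unless the label mapping is shipped in the stream too. The following toy sketch is invented for illustration (the OIDs, labels, and the label-shipping fix are all hypothetical).

```python
# Toy illustration of Tom's point: rendering a stored enum OID requires
# knowledge that lives only in pg_enum on the master.  One conceivable fix
# is for the metadata records to also carry the OID -> label map.
# All data here is invented.

pg_enum_on_master = {16401: "happy", 16402: "sad"}  # catalog-only knowledge

def enum_out(oid, label_map):
    """Render a stored enum OID the way enum_out would, given a label map."""
    try:
        return label_map[oid]
    except KeyError:
        raise LookupError(
            "cannot decode enum OID %d without catalog access" % oid)

# A reader that received the map via a metadata record can decode:
decoded = enum_out(16401, pg_enum_on_master)  # -> "happy"

# A reader with no catalog knowledge is stuck:
try:
    enum_out(16401, {})
except LookupError as e:
    problem = str(e)
```

array_out, record_out, and range_out pose the same issue one level up: they recurse into element and field types, so the stream would need the full transitive closure of type information, not just the top-level column types.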