Re: [HACKERS] COUNT(*) and index-only scans
On Tue, 2011-10-11 at 13:22 -0400, Robert Haas wrote:
> The real issue is that the costing estimates need to be accurate, and
> that's where the rubber hits the road. Otherwise, even if we pick the
> right way to scan the table, we may do silly things up the line when
> we go to start constructing the join order. I think we need to beef
> up ANALYZE to gather statistics on the fraction of the pages that are
> marked all-visible, or maybe VACUUM should gather that information.
> The trouble is that if we VACUUM and then ANALYZE, we'll often get
> back a value very close to 100%, but then the real value may diminish
> quite a bit before the next auto-analyze fires. I think if we can
> figure out what to do about that problem we'll be well on our way...

Can you send stats messages to keep track of when you unset a bit in
the VM? That might allow it to be more up-to-date.

Regards,
	Jeff Davis

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
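Jeff's suggestion could be modelled roughly as follows. This is a hypothetical toy simulation, not PostgreSQL code: the class and method names are invented, and the real proposal would route the decrement through a stats message to the collector rather than a direct call.

```python
# Toy model: instead of relying only on the all-visible fraction captured at
# ANALYZE time, decrement a per-table counter whenever a backend clears an
# all-visible bit in the visibility map, keeping the planner's estimate
# current between analyzes. All names here are invented for illustration.

class TableVisibilityStats:
    def __init__(self, total_pages, all_visible_pages):
        self.total_pages = total_pages
        # snapshot as of the last VACUUM/ANALYZE
        self.all_visible_pages = all_visible_pages

    def on_vm_bit_cleared(self):
        # In the real proposal this would be driven by a stats message sent
        # when a page transitions from all-visible to not-all-visible.
        if self.all_visible_pages > 0:
            self.all_visible_pages -= 1

    def all_visible_fraction(self):
        if self.total_pages == 0:
            return 0.0
        return self.all_visible_pages / self.total_pages

stats = TableVisibilityStats(total_pages=1000, all_visible_pages=1000)
for _ in range(250):            # 250 pages dirtied since the last VACUUM
    stats.on_vm_bit_cleared()
print(stats.all_visible_fraction())   # 0.75, not the stale 1.0
```

The point of the sketch is only that the estimate decays as bits are cleared, instead of sitting at ~100% until the next auto-analyze.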
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On 11.10.2011 23:21, Simon Riggs wrote:
> If the normal default_transaction_isolation = read committed and all
> transactions that require serializable are explicitly marked in the
> application then there is no way to turn off SSI without altering the
> application.
>
> That is not acceptable, since it causes changes in application
> behaviour and possibly also performance issues.

I don't get that. If all the transactions that require serializability
are marked as such, why would you disable SSI for them? That would just
break the application, since the transactions would no longer be
serializable. If they don't actually need serializability, but
repeatable read is enough, then mark them that way.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Dumping roles improvements?
Josh Berkus writes:
> The reason I want to have the dependent roles created as part of a
> database dump is so that we can ship around dump files as a single
> file, and restore them with a single command. This is considerably
> simpler than the current requirements, which are:
> 1. pg_dumpall the roles
> 2. pg_dump the database
> 3. tar both files
> 4. ship file
> 5. untar both files
> 6. psql the role file
> 7. pg_restore the database file

I don't find this terribly convincing. I can see the rationales for two
endpoint cases: (1) restore these objects into exactly the same
ownership/permissions environment that existed before, and (2) restore
these objects with the absolute minimum of ownership/permissions
assumptions. The latter case seems to me to be covered already by
--no-owner --no-privileges. Cases in between those endpoints seem
pretty special-purpose, and I don't want to buy into the assumption
that we should fix them by creating a plethora of --do-it-joshs-way
switches. Can we invent something comparable to the --list/--use-list
mechanism, that can handle a range of use cases with a bit more manual
effort?

			regards, tom lane
Re: [HACKERS] Index only scan paving the way for "auto" clustered tables?
On Tue, Oct 11, 2011 at 4:21 PM, Kääriäinen Anssi wrote:
> This is probably a silly idea, but I have been wondering about the
> following idea: Instead of having visibility info in the row header,
> have a couple of row visibility slots in the page header. These slots
> could be shared between rows in the page, so that if you do a bulk
> insert/update/delete you would only use one slot. If the slots
> overflow, you would use external slots buffer.
>
> When the row is all visible, no slot would be used at all.

I've thought about stuff like this, too. One idea would be to make the
slots just be items on the page. Each tuple includes a 2-byte field
which points to the item number (on the same page) where the XMAX, CID,
and CTID information for the update/delete of that tuple is stored. If
the tuple isn't updated or deleted, it points back to the tuple's own
item ID.

When a tuple is updated or deleted, add an item to the page with the
necessary XMAX/CMAX/CTID and set the pointer to the new item ID. If the
page is full, allocate an "overflow block" and store the overflow block
number in the page header. Add the new item to the overflow block and
set the 2-byte field to point to it, setting the high bit or something
like that to indicate that the data is in the overflow block rather
than the data block.

The main objection to this idea I see is that if the overflow blocks
get much use, the overall performance might end up being poor.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
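A toy model of the scheme sketched above, purely for illustration (hypothetical, not PostgreSQL code): each tuple carries a 2-byte pointer to an item holding its deletion info, unmodified tuples point back at their own item ID, and when the page fills up the info spills to an overflow block, flagged via the high bit.

```python
# Toy simulation of per-page "visibility slot" items with an overflow block.
# Tuples and xmax-info entries share one item namespace, as in the idea that
# the slots "just be items on the page". All names here are invented.

OVERFLOW_FLAG = 0x8000   # high bit of the 2-byte pointer field

class Page:
    def __init__(self, max_items):
        self.max_items = max_items
        self.items = []          # tuples and xmax-info entries alike
        self.overflow = []       # stand-in for the overflow block

    def insert_tuple(self, data):
        item_id = len(self.items)
        # an unmodified tuple's pointer refers back to its own item ID
        self.items.append({"data": data, "ptr": item_id})
        return item_id

    def delete_tuple(self, item_id, xmax):
        info = {"xmax": xmax}
        if len(self.items) < self.max_items:
            # room on the page: add the info as a new item
            self.items.append(info)
            self.items[item_id]["ptr"] = len(self.items) - 1
        else:
            # page full: spill to the overflow block and set the high bit
            self.overflow.append(info)
            self.items[item_id]["ptr"] = OVERFLOW_FLAG | (len(self.overflow) - 1)

    def xmax_of(self, item_id):
        ptr = self.items[item_id]["ptr"]
        if ptr == item_id:
            return None          # never deleted
        if ptr & OVERFLOW_FLAG:
            return self.overflow[ptr & ~OVERFLOW_FLAG]["xmax"]
        return self.items[ptr]["xmax"]

page = Page(max_items=3)
a = page.insert_tuple("row a")
b = page.insert_tuple("row b")
page.delete_tuple(a, xmax=100)   # fits on the page as a third item
page.delete_tuple(b, xmax=101)   # page now full: goes to the overflow block
```

Note how the second deletion is the case the objection is about: once the page is full, every further update/delete pays an extra hop to the overflow block.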
Re: [HACKERS] SET variable - Permission issues
Joe Conway writes:
> On 10/11/2011 02:07 PM, Kevin Grittner wrote:
>> I would certainly vote for enforcing on the SET and not causing an
>> error on the attempt to change the limit. ...
>> What problems do you see with that?

> Yeah, I don't know why it need be handled any different than say
>   ALTER DATABASE foo SET config_param TO value
> or
>   ALTER ROLE foo SET config_param TO value
> These cases do not affect already existing processes either.

It's not the same thing. Those operations are documented as providing
the initial default value for subsequently-started sessions. The
proposed change in limit values is different because the GUC range
limits have always before been immutable and continuously enforced for
the life of a database instance.

It may be that Kevin's proposal is adequate. But I find that far from
obvious. The trend of everything we've done with GUC for the last ten
years is to cause settings changes to apply immediately on-demand and
without "oh, but that's obvious if you know the implementation" special
cases. I'm not real sure why this should get a free exemption from that
expectation ... or to put it more plainly, I *am* sure that we'll be
expected to fix it later, just like we had to fix the behavior around
removal of postgresql.conf entries, and some other things that people
didn't find as obvious as all that.

			regards, tom lane
Re: [HACKERS] Index only scan paving the way for "auto" clustered tables?
On Tue, Oct 11, 2011 at 4:00 PM, Florian Pflug wrote:
> On Oct11, 2011, at 21:27 , Robert Haas wrote:
>> Alternatively, we could try to graft the concept of a self-clustering
>> table on top of the existing heap implementation. But I'm having
>> trouble seeing how that would work. The TODO describes it as
>> something like "maintain CLUSTER ordering", but that's a gross
>> oversimplification, because we have no structure that would allow us
>> to sensibly do any such thing... the current heap implementation is
>> just that: a pile of stuff.
>
> We could still be smarter about where we insert new rows in a
> clustered table, though.
>
> Upon INSERT and UPDATE, we'd need to lookup the leaf page where the
> new tuple will eventually go in the index we're supposed to maintain
> CLUSTER for. Then we'd check if any of the pages referenced there
> contains enough space, and if so place the new tuple there. If not
> it'd go at the end.

That's an interesting idea, but my guess is that the "if not" clause
would trigger frequently enough to make this not work very well. Of
course, we'd need to test it to know for sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
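Florian's placement idea can be sketched as a small simulation. This is a hypothetical toy, not PostgreSQL code: the index is a sorted list of (key, page) pairs, the free-space map is a plain list, and the function name and parameters are invented.

```python
# Toy sketch of clustered insert placement: find where the key would land
# in the index, try the heap pages referenced near that spot first, and
# fall back to appending a fresh page when none has room ("if not" clause).
import bisect

def place_tuple(key, index, free_space, page_capacity=4):
    """index: sorted list of (key, page_no); free_space: free slots per page."""
    pos = bisect.bisect_left(index, (key,))
    # heap pages referenced by index entries adjacent to the insertion point
    candidates = {page for _, page in index[max(0, pos - 2):pos + 2]}
    for page in sorted(candidates):
        if free_space[page] > 0:          # nearby page has room: stay clustered
            free_space[page] -= 1
            bisect.insort(index, (key, page))
            return page
    # no nearby page has room: append a new page at the end of the heap
    free_space.append(page_capacity - 1)
    new_page = len(free_space) - 1
    bisect.insort(index, (key, new_page))
    return new_page

index = [(10, 0), (20, 0), (30, 1)]
free_space = [1, 0]                       # page 0 has one slot left
print(place_tuple(15, index, free_space)) # 0: lands next to keys 10 and 20
print(place_tuple(25, index, free_space)) # 2: nearby pages full, new page
```

The second call is exactly Robert's worry: as soon as the neighbourhood pages fill up, inserts degrade to appending at the end, and the clustering erodes.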
Re: [HACKERS] COUNT(*) and index-only scans
On Tue, Oct 11, 2011 at 3:18 PM, Josh Berkus wrote:
>> The trouble is that if we VACUUM and then ANALYZE, we'll often get
>> back a value very close to 100%, but then the real value may diminish
>> quite a bit before the next auto-analyze fires. I think if we can
>> figure out what to do about that problem we'll be well on our way...
>
> It's not so much an issue of when the last auto-analyze was as an
> issue of the number of rows in write transactions against that table
> in the last X minutes. This is where it really hurts us that
> pg_stat_user_tables is not time-based.

The number of write transactions in the last X minutes seems pretty
much irrelevant. What matters is the number of previously-all-visible
pages written since the last vacuum.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Online base backup from the hot-standby
> Some testing notes
> ------------------
> select pg_start_backup('x');
> ERROR:  full_page_writes on master is set invalid at least once since
> latest checkpoint
>
> I think this error should be rewritten as
> ERROR:  full_page_writes on master has been off at some point since
> latest checkpoint
>
> We should be using 'off' instead of 'invalid' since that is what the
> user sets it to.

Sure.

> I switched full_page_writes=on, on the master
> did a pg_start_backup() on the slave1.
> Then I switched full_page_writes=off on the master, did a reload +
> checkpoint.
> I was able to then do my backup of slave1, copy the control file, and
> pg_stop_backup().
>
> When I did the test slave2 started okay, but is this safe? Do we need
> a warning from pg_stop_backup() that is printed if it is detected that
> full_page_writes was turned off on the master during the backup
> period?

I also reproduced this. pg_stop_backup() fails in most cases. However,
it succeeds if both of the following are true:

 * the checkpoint is done before walwriter receives SIGHUP.
 * slave1 has not yet received the WAL for 'off' by SIGHUP.

> Minor typo above at 'CHECKPOTNT'

Yes.

> If my concern about full page writes being switched to off in the
> middle of a backup is unfounded then I think this patch is ready for a
> committer. They can clean the two editorial changes when they apply
> the patches.

Yes. I'll clean those up when I fix these comments.

> If do_pg_stop_backup is going to need some logic to recheck the full
> page write status then an updated patch is required.

It already contains that logic.

Regards.

----------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka@po.ntts.co.jp
Re: [HACKERS] Dumping roles improvements?
> There seems to be agreement on something for (2), and it won't be
> hard. (1) would probably be much more complex. Essentially we'd need
> to add a new object type for roles, I think. But won't (2) give you
> most of what you need for (1) anyway? AIUI, your problem is that the
> roles might not exist, and so some or all of the dump will fail. But
> if you have (2) and dump the roles without passwords and restore them
> before running pg_restore that wouldn't happen. It won't be one
> command but it will be two or three pretty easy commands.

These serve two different purposes.

The reason I want to have the dependent roles created as part of a
database dump is so that we can ship around dump files as a single
file, and restore them with a single command. This is considerably
simpler than the current requirements, which are:

1. pg_dumpall the roles
2. pg_dump the database
3. tar both files
4. ship file
5. untar both files
6. psql the role file
7. pg_restore the database file

Since the above is something I'm doing on around 11 different machines
between once a day and once a week, eliminating the 4 extra steps would
be really nice.

However, we'd also need CREATE OR REPLACE ROLE to really integrate
shipping database copies. Without that, including roles in the database
dump doesn't help that much.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] SET variable - Permission issues
On 10/11/2011 02:07 PM, Kevin Grittner wrote:
> Tom Lane wrote:
>
>> This isn't exactly a trivial matter. What happens for instance if
>> you try to change the limit, and there are already active values
>> outside the limit in some processes?
>
> I would certainly vote for enforcing on the SET and not causing an
> error on the attempt to change the limit. (Maybe a notice?) At the
> time they set the GUC, they were allowed to do so. It's a bit like
> revoking a user's right to create a table in a schema -- what if
> they've already done so? You leave the table and you don't let them
> create another.
>
> What problems do you see with that?

Yeah, I don't know why it need be handled any different than say

  ALTER DATABASE foo SET config_param TO value

or

  ALTER ROLE foo SET config_param TO value

These cases do not affect already existing processes either.

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Andrew Dunstan wrote:
> Attached are two patches, one to remove some infelicity in the entab
> makefile, and the other to allow skipping specifying the typedefs file

I have applied the 'entab' Makefile fix.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs wrote:
> On Tue, Oct 11, 2011 at 10:30 PM, Florian Pflug wrote:
>
>> That experience has taught me that backwards compatibility, while
>> very important in a lot of cases, has the potential to do just as
>> much harm if overdone.
>
> Agreed. Does my suggestion represent overdoing it? I ask for balance,
> not an extreme.

Well, balance is looking at what everyone else in the group is
suggesting, and realizing you might not have all the answers, and
listening. As far as I can see, you are the _only_ one who thinks it
needs an option. In that light, your suggestion seems extreme, not
balanced.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: [HACKERS] Online base backup from the hot-standby
On 11-10-11 11:17 AM, Jun Ishiduka wrote:
> Done.
>
> Updated patch attached.

I have taken Jun's latest patch and applied it on top of Fujii's most
recent patch. I did some testing with the result but nothing thorough
enough to stumble on any race conditions.

Some testing notes
------------------

select pg_start_backup('x');
ERROR:  full_page_writes on master is set invalid at least once since
latest checkpoint

I think this error should be rewritten as

ERROR:  full_page_writes on master has been off at some point since
latest checkpoint

We should be using 'off' instead of 'invalid' since that is what the
user sets it to.

I switched full_page_writes=on, on the master

did a pg_start_backup() on the slave1.

Then I switched full_page_writes=off on the master, did a reload +
checkpoint.

I was able to then do my backup of slave1, copy the control file, and
pg_stop_backup().

When I did the test slave2 started okay, but is this safe? Do we need a
warning from pg_stop_backup() that is printed if it is detected that
full_page_writes was turned off on the master during the backup period?

Code Notes
----------

*** 6865,6870 ****
--- 6871,6886 ----
      /* Pre-scan prepared transactions to find out the range of XIDs present */
      oldestActiveXID = PrescanPreparedTransactions(NULL, NULL);

+     /*
+      * The startup updates FPW in shaerd-memory after REDO. However, it must
+      * perform before writing the WAL of the CHECKPOINT. The reason is that
+      * it uses a value of fpw in shared-memory when it writes a WAL of its
+      * CHECKPOTNT.
+      */

Minor typo above at 'CHECKPOTNT'

If my concern about full page writes being switched to off in the
middle of a backup is unfounded then I think this patch is ready for a
committer. They can clean the two editorial changes when they apply the
patches.

If do_pg_stop_backup is going to need some logic to recheck the full
page write status then an updated patch is required.

> Regards.
>
> Jun Ishizuka
> NTT Software Corporation
> TEL:045-317-7018
> E-Mail: ishizuka@po.ntts.co.jp
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs wrote:
> On Tue, Oct 11, 2011 at 10:22 PM, Tom Lane wrote:
>> Simon Riggs writes:
>>> How could I change the viewpoint of the group without making
>>> rational arguments when it matters?
>>
>> Well, you make your arguments, and you see if you convince anybody.
>> On these specific points, you've failed to sway the consensus AFAICT,
>> and at some point you have to accept that you've lost the argument.
>
> I'm happy to wait more than 4 hours before trying to judge any
> meaningful consensus.
>
> Rushing judgement on such points is hardly likely to encourage people
> to speak up, even assuming they have the opportunity.

This is an issue you have been pushing for a very long time on many
fronts --- four hours is not going to change anything.

Have you considered developing a super-backward-compatible version of
Postgres, or a patch set which does this, and seeing if there is any
interest from users? If you could get major uptake, it would certainly
bolster your argument. But, until I see that, I am unlikely to listen
to further protestations. I am much more likely to just ignore your
suggestions as "Oh, it is just Simon on the backward-compatibility
bandwagon again", and it causes me to just mentally filter your ideas,
which isn't productive, and I am sure others will do the same.

Read the MS-SQL post about backward compatibility knobs as proof that
many of us know the risks of too much backward compatibility.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 10:30 PM, Florian Pflug wrote:
> That experience has taught me that backwards compatibility, while very
> important in a lot of cases, has the potential to do just as much harm
> if overdone.

Agreed. Does my suggestion represent overdoing it? I ask for balance,
not an extreme.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs wrote:
> They ask comparative questions like "What is the risk of
> upgrade?", "How much testing is required?"

> I never met a customer yet that has an automated test suite
> designed to stress the accuracy of results under concurrent
> workloads

I'll try to provide some information here to help you answer those
questions. I hope it is helpful.

The only behavioral difference an unchanged application could see
(short of some as-yet-undiscovered bug) is that they could get more
serialization failures when using serializable isolation level than
they previously got, and that there could be a performance impact. It
really does nothing except run exactly what serializable mode was
before, while monitoring for conditions which are present when a race
condition between transactions might cause odd results, and generate a
serialization failure as needed to prevent that. It kind of "hangs off
the side" of legacy behavior and watches things. No new blocking. No
new deadlocks. No chance of results you didn't get before.

It might also be worth reviewing this page:

http://www.postgresql.org/docs/9.1/interactive/transaction-iso.html

-Kevin
Re: [HACKERS] Dumping roles improvements?
On 10/11/2011 03:50 PM, Josh Berkus wrote:
>> Acording the docs, I assume -r is only roles, while -g includes
>> tablespace, so what you want is already available in pg_dumpall.
>
> No, it's not. You don't seem to be actually reading any of my
> proposals.
>
> (1) I cannot produce a single file in custom dump format which
> includes both a single database and all of the roles I need to build
> that database.
>
> (2) I cannot dump a set of roles without md5 passwords.
>
> Both of these are things I need to support dev/stage/testing
> integration at multiple sites.

There seems to be agreement on something for (2), and it won't be hard.
(1) would probably be much more complex. Essentially we'd need to add a
new object type for roles, I think. But won't (2) give you most of what
you need for (1) anyway? AIUI, your problem is that the roles might not
exist, and so some or all of the dump will fail. But if you have (2)
and dump the roles without passwords and restore them before running
pg_restore that wouldn't happen. It won't be one command but it will be
two or three pretty easy commands.

cheers

andrew
Re: [HACKERS] pg_ctl restart - behaviour based on wrong instance
Robert Haas wrote:
> On Wed, Mar 23, 2011 at 1:48 AM, Fujii Masao wrote:
>> On Sat, Mar 19, 2011 at 10:20 AM, Robert Haas wrote:
>>> On Fri, Mar 18, 2011 at 1:19 PM, Erik Rijkers wrote:
>>>> This is OK and expected.  But then it continues (in the logfile)
>>>> with:
>>>>
>>>> FATAL:  lock file "postmaster.pid" already exists
>>>> HINT:  Is another postmaster (PID 20519) running in data directory
>>>> "/var/data1/pg_stuff/pg_installations/pgsql.vanilla_1/data"?
>>>>
>>>> So, complaints about the *other* instance.  It doesn't happen once
>>>> a successful start (with pg_ctl start) has happened.
>>>
>>> I'm guessing that leftover postmaster.pid contents might be
>>> responsible for this?
>>
>> The cause is that "pg_ctl restart" uses the postmaster.opts which was
>> created in the primary. Since its content was something like
>> "pg_ctl -D vanilla_1/data", vanilla_1/data/postmaster.pid was checked
>> wrongly.
>>
>> The simple workaround is to exclude postmaster.opts from the backup
>> as well as postmaster.pid. But when postmaster.opts doesn't exist,
>> "pg_ctl restart" cannot start up the server. We might also need to
>> change the code of "pg_ctl restart" so that it does just "pg_ctl
>> start" when postmaster.opts doesn't exist.
>
> Sounds reasonable.

I looked over this issue and I don't think having pg_ctl restart fall
back to 'start' is a good solution. I am concerned about cases where we
start a different server without shutting down the old server, for some
reason. When they say 'restart', I think we have to assume they want a
restart.

What I did do was to document that not backing up postmaster.pid and
postmaster.opts might help prevent pg_ctl from getting confused.

Patch applied and backpatched to 9.1.X.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
new file mode 100644
index b8daedc..737355a
*** a/doc/src/sgml/backup.sgml
--- b/doc/src/sgml/backup.sgml
*** SELECT pg_stop_backup();
*** 869,875 ****
      of mistakes when restoring. This is easy to arrange if
      pg_xlog/ is a symbolic link pointing to someplace outside
      the cluster directory, which is a common setup anyway for performance
!     reasons.
--- 869,879 ----
      of mistakes when restoring. This is easy to arrange if
      pg_xlog/ is a symbolic link pointing to someplace outside
      the cluster directory, which is a common setup anyway for performance
!     reasons. You might also want to exclude postmaster.pid
!     and postmaster.opts, which record information
!     about the running postmaster, not about the
!     postmaster which will eventually use this backup.
!     (These files can confuse pg_ctl.)
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 10:22 PM, Tom Lane wrote:
> Simon Riggs writes:
>> How could I change the viewpoint of the group without making rational
>> arguments when it matters?
>
> Well, you make your arguments, and you see if you convince anybody.
> On these specific points, you've failed to sway the consensus AFAICT,
> and at some point you have to accept that you've lost the argument.

I'm happy to wait more than 4 hours before trying to judge any
meaningful consensus.

Rushing judgement on such points is hardly likely to encourage people
to speak up, even assuming they have the opportunity.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Oct11, 2011, at 22:55 , Simon Riggs wrote:
> Probably as a matter of policy all new features that affect semantics
> should have some kind of compatibility or off switch, if easily
> possible.

There's a huge downside to that, though. After a while, you end up with
a gazillion settings, each influencing behaviour in non-obvious, subtle
ways. Plus, every new piece of code we add would have to be tested
against *all* combinations of these switches. Or, maybe, we'd punt and
make some features work only with "reasonable" settings. And by doing
so cause much frustration of the kind "I need to set X to Y to use
feature Z, but I can't because our app requires X to be set to Y2".

I've recently had to use Microsoft SQL Server for a project, and they
fell into *precisely* this trap. Nearly *everything* is a setting
there, like whether various things follow the ANSI standard (NULLs,
CHAR types, one setting for each), whether identifiers are
double-quoted or put between square brackets, whether loss of precision
is an error, ...

And, some of their very own features depend on specific combinations of
these settings. Sometimes on the values in effect when the object was
created, sometimes when it's used. For example, their flavour of
materialized views (called "indexed views") requires a bunch of options
to be set correctly to be able to create such an object. Some of these
must even be in effect to update the view's base tables, once the view
is created...

That experience has taught me that backwards compatibility, while very
important in a lot of cases, has the potential to do just as much harm
if overdone.

best regards,
Florian Pflug
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 10:00 PM, Kevin Grittner wrote:
> Simon Riggs wrote:
>
>> "You'll have to retest your apps" just isn't a good answer
>
> For which major PostgreSQL releases have you recommended that people
> deploy their apps without retesting?

None. People don't always follow my advice, regrettably.

They ask comparative questions like "What is the risk of upgrade?",
"How much testing is required?" I never met a customer yet that has an
automated test suite designed to stress the accuracy of results under
concurrent workloads, so the inability to control the way a new feature
operates makes such questions more likely to be given an answer that
indicates greater effort and higher risk.

That is exactly what I personally would wish to avoid. An off switch
encourages people to use new features. It is not a punishment or an
admonition to the developer.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs writes:
> How could I change the viewpoint of the group without making rational
> arguments when it matters?

Well, you make your arguments, and you see if you convince anybody.
On these specific points, you've failed to sway the consensus AFAICT,
and at some point you have to accept that you've lost the argument.

			regards, tom lane
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 9:53 PM, Bruce Momjian wrote:
> Simon Riggs wrote:
>> > Simon seems to value backward-compatibility more than the average
>> > hackers poster.  The lack of complaints about 9.1 I think means
>> > that the hackers decision of _not_ providing a switch was the
>> > right one.
>>
>> So its been out 1 month and you think that is sufficient time for us
>> to decide that there are no user complaints about SSI? I doubt it.
>> Longer term I have every confidence that it will be appreciated.
>>
>> I'm keen to ensure people enjoy the possibility of upgrading to the
>> latest release. The continual need to retest applications means that
>> very few users upgrade quickly or with anywhere near the frequency
>> with which we put out new releases. What is the point of rushing out
>> software that nobody can use? pg_upgrade doesn't change your
>> applications, so there isn't a fast path to upgrade in the way you
>> seem to think.
>
> Simon, I basically think you are swimming up-stream on this issue, and
> on the recovery.conf thread as well. You can keep arguing that
> backward compatibility warrants more effort, but until there is more
> general agreement in the group, you are going to lose these arguments,
> and frankly, the arguments are getting tiring.

I speak when it is important that someone does so, and only on
specific, real issues. When I speak, I do so on behalf of my clients
and other Postgres users that suffer the problems created by those
issues. I've never given a viewpoint on list that I know to be the
opposite of the majority of people I represent.

How could I change the viewpoint of the group without making rational
arguments when it matters?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] SET variable - Permission issues
Tom Lane wrote: > This isn't exactly a trivial matter. What happens for instance if > you try to change the limit, and there are already active values > outside the limit in some processes? I would certainly vote for enforcing on the SET and not causing an error on the attempt to change the limit. (Maybe a notice?) At the time they set the GUC, they were allowed to do so. It's a bit like revoking a user's right to create a table in a schema -- what if they've already done so? You leave the table and you don't let them create another. What problems do you see with that? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
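[Editor's note: Kevin's proposed semantics — check the limit at SET time, and let a later tightening of the limit leave already-set values alone, like revoking CREATE after a table exists — can be sketched as follows. This is an illustrative toy model only; all names are invented and bear no relation to the actual guc.c code.]

```python
# Toy model of "enforce on SET, not on limit change" semantics.
# All names here are invented for illustration.

class GucLimitError(Exception):
    pass

class LimitedSetting:
    def __init__(self, limit=None):
        self.limit = limit        # maximum value an unprivileged SET may use
        self.sessions = {}        # session id -> value currently in effect

    def set_value(self, session, value):
        # Enforcement happens here, at SET time.
        if self.limit is not None and value > self.limit:
            raise GucLimitError("value exceeds the configured limit")
        self.sessions[session] = value

    def change_limit(self, new_limit):
        # Like revoking CREATE on a schema: values set while they were
        # allowed stay in effect; only future SETs are constrained.
        # (A NOTICE about out-of-limit active values could be raised here.)
        self.limit = new_limit
```

Under this model, lowering the limit never errors out, and sessions keep whatever value they legally set before the change.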
Re: [HACKERS] Index only scan paving the way for "auto" clustered tables?
Robert Haas writes: > Alternatively, we could try to graft the concept of a self-clustering > table on top of the existing heap implementation. But I'm having > trouble seeing how that would work. The TODO describes it as > something like "maintain CLUSTER ordering", but that's a gross > oversimplification, because we have no structure that would allow us > to sensibly do any such thing... the current heap implementation is > just that: a pile of stuff. I currently think that's the way to go, with some coarser granularity than tuple or page. Picture HOT inserts, if you will. That would be at the page level, but do we need that level of precision? I'm thinking that we need something more segment-based here, or maybe some intermediate value between a page of 8KB and a segment of 1GB would be good, but I'm not so sure. We would have to track the bounds of each segment for the indexed columns, and maintain them, and the planner would have to exercise pruning at the segment level. So going down too much in granularity would have negative impacts on planning performance (too much data to play with), and anyway a server that needs that kind of optimization can certainly handle a couple of GB in its file system cache. So, it's quite hand-wavy still, but Segment Exclusion has been discussed here already, and it seems to me that's the next thing we need. Call it partial Seq Scan and HOT inserts if new names are your thing :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
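[Editor's note: the segment-exclusion idea above — keep min/max bounds of an indexed column per coarse heap segment, and let the planner skip segments whose bounds cannot match the predicate — reduces to a few lines. The segment size, storage, and maintenance of the bounds are hand-waved here just as in the mail; this is a sketch, not proposed planner code.]

```python
# Sketch of segment exclusion: per-segment (min, max) bounds plus a
# range-overlap prune. Granularity and bound maintenance are assumed away.

def build_bounds(values, seg_size):
    """Group a column's values into fixed-size segments, recording (min, max)."""
    bounds = []
    for i in range(0, len(values), seg_size):
        seg = values[i:i + seg_size]
        bounds.append((min(seg), max(seg)))
    return bounds

def segments_to_scan(bounds, lo, hi):
    """Indexes of segments whose [min, max] overlaps the query range [lo, hi]."""
    return [i for i, (mn, mx) in enumerate(bounds) if mx >= lo and mn <= hi]
```

For naturally clustered data (e.g. an append-only timestamp column), most segments fall entirely outside a range predicate and a "partial Seq Scan" touches only the survivors.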
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 9:37 PM, Kevin Grittner wrote: > It would certainly be a trivial change to > implement; the problem is convincing others that it's a good idea. I don't want it; I just think we need it now. "You'll have to retest your apps" just isn't a good answer and we should respect the huge cost that causes our users. Probably as a matter of policy all new features that affect semantics should have some kind of compatibility or off switch, if easily possible. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] index-only scans
Alexander Korotkov writes: > On Wed, Oct 12, 2011 at 12:35 AM, Tom Lane wrote: >> Maybe, instead of a simple constant amcanreturn column, we need an AM >> API function that says whether the index can return data. > I like idea of such AM API function. Since single multicolumn index can use > multiple opclasses, AM API function should also say *what* data index can > return. I was thinking more like "amcanreturn(index, column_number) returns bool" which says if the index can return the data for that column. The AM would still have to return a full IndexTuple at runtime, but it'd be allowed to insert nulls or garbage for columns it hadn't promised to return. BTW, if we do this, I'm rather strongly tempted to get rid of the name-versus-cstring hack (see index_descriptor_hack() in HEAD) by defining btree name_ops as not capable of returning data. I don't trust that hack much at all. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
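[Editor's note: the shape of the per-column check Tom describes — "amcanreturn(index, column_number) returns bool", consulted by the planner before choosing an index-only scan — can be mocked up as below. In C this would be an AM API function; everything here, including the idea that name_ops would report false, is a simulation of the proposal, not real planner code.]

```python
# Mock-up of a per-column amcanreturn check. Structures are invented;
# the real API would live in the access method, not in Python dicts.

def btree_amcanreturn(index, column_number):
    # Hypothetical: btree can return any column except those whose opclass
    # is flagged as non-returnable (name_ops, under Tom's proposal).
    return index["opclasses"][column_number] != "name_ops"

def index_only_possible(index, needed_columns, amcanreturn):
    # The planner would only consider an index-only scan when every column
    # the query needs is one the index has promised to return.
    return all(amcanreturn(index, col) for col in needed_columns)
```

The AM would still hand back a full IndexTuple at runtime; columns it never promised to return may contain nulls or garbage, and the planner simply never asks for them.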
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs wrote: > "You'll have to retest your apps" just isn't a good answer For which major PostgreSQL releases have you recommended that people deploy their apps without retesting? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs wrote: > > Simon seems to value backward-compatibility more than the average > > hackers poster. The lack of complaints about 9.1 I think means that the > > hackers decision of _not_ providing a switch was the right one. > > So it's been out 1 month and you think that is sufficient time for us > to decide that there are no user complaints about SSI? I doubt it. > Longer term I have every confidence that it will be appreciated. > > I'm keen to ensure people enjoy the possibility of upgrading to the > latest release. The continual need to retest applications means that > very few users upgrade quickly or with anywhere near the frequency > with which we put out new releases. What is the point of rushing out > software that nobody can use? pg_upgrade doesn't change your > applications, so there isn't a fast path to upgrade in the way you > seem to think. Simon, I basically think you are swimming up-stream on this issue, and on the recovery.conf thread as well. You can keep arguing that backward compatibility warrants more effort, but until there is more general agreement in the group, you are going to lose these arguments, and frankly, the arguments are getting tiring. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] index-only scans
Tom Lane writes: > I haven't thought as far ahead as how we might get the information > needed for a per-opclass flag. A syntax addition to CREATE OPERATOR > CLASS might be the only way. It looks to me like it's related to the RECHECK property. Maybe it's just too late, though. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Greg Sabino Mullane wrote: > Kevin Grittner: > >> Did these transactions write anything? If not, were they >> declared to be READ ONLY? If they were, in fact, only reading, >> it would be interesting to see what the performance looks like if >> the recommendation to use the READ ONLY attribute is followed. > > Yes, I'll definitely look into that, but the great majority of > the things done in this case are read/write. But it is precisely *because* those were fully cached read-only transactions that the numbers came out so bad. As Robert pointed out, in other loads the difference in time per transaction could be lost in the noise. Now, I know SSI won't be a good fit for all applications, but you might not want to write it off on performance grounds for an application where "the great majority of the things done ... are read/write" based on a test which ran only read-only transactions without declaring them READ ONLY. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
Simon Riggs wrote: > On Tue, Oct 11, 2011 at 9:28 PM, Bruce Momjian wrote: > > > Standard conforming strings > > was tricky because it was more user-facing, or certainly SQL-facing. > > Why is SQL more important than backup? Because the percentage of database users it affects is different. Administrators know when they are installing a new version of Postgres and already are probably changing these configuration files. Application binaries and perhaps application developers are not as aware of a change, and there are a far higher percentage of them in an organization than administrators. > There is no good reason to do this so quickly. I just gave you a reason above, and as I said, doing backward compatibility can make the system more complex. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs writes: > There is no off switch and there should be. As Greg said, that ship has sailed. I believe that we specifically discussed the notion of an "off switch" via a GUC or similar during 9.1 development, and rejected it on the grounds that GUCs changing fundamental transactional behavior are dangerous. I don't believe that you've made a case for changing that decision, and even if you had, it's too late; 9.1 is what it is. Can we end this subthread please, and concentrate on something actually useful, like improving SSI's performance? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dumping roles improvements?
> According to the docs, I assume -r is only roles, while -g includes > tablespaces, so what you want is already available in pg_dumpall. No, it's not. You don't seem to be actually reading any of my proposals. (1) I cannot produce a single file in custom dump format which includes both a single database and all of the roles I need to build that database. (2) I cannot dump a set of roles without md5 passwords. Both of these are things I need to support dev/stage/testing integration at multiple sites. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 9:32 PM, Bruce Momjian wrote: > Greg Sabino Mullane wrote: >> > If the normal default_transaction_isolation = read committed and all >> > transactions that require serializable are explicitly marked in the >> > application then there is no way to turn off SSI without altering the >> > application. That is not acceptable, since it causes changes in >> > application behaviour and possibly also performance issues. >> >> Performance, perhaps. What application behavior changes? Fewer >> serialization conflicts? If you change default_transaction_isolation then the behaviour of the application will change. >> > We should provide a mechanism to allow people to upgrade to 9.1+ >> > without needing to change the meaning and/or performance of their >> > apps. >> >> That ship has sailed. > > Simon seems to value backward-compatibility more than the average > hackers poster. The lack of complaints about 9.1 I think means that the > hackers decision of _not_ providing a switch was the right one. So it's been out 1 month and you think that is sufficient time for us to decide that there are no user complaints about SSI? I doubt it. Longer term I have every confidence that it will be appreciated. I'm keen to ensure people enjoy the possibility of upgrading to the latest release. The continual need to retest applications means that very few users upgrade quickly or with anywhere near the frequency with which we put out new releases. What is the point of rushing out software that nobody can use? pg_upgrade doesn't change your applications, so there isn't a fast path to upgrade in the way you seem to think. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER EXTENSION .. ADD/DROP weirdness
Robert Haas writes: > On Mon, Oct 10, 2011 at 2:52 PM, Tom Lane wrote: >> The underlying issue here is whether objects dependent on an extension >> member should have direct dependencies on the extension too, and if not, >> how do we prevent that? The recordDependencyOnCurrentExtension calls >> don't have enough information to know what to do, I think. I think the original patch, that didn't have the DEPENDENCY_EXTENSION tracking but relied on the INTERNAL stuff, did only record first-level objects as a dependency. Given the way INTERNAL dependencies following are done, that kind of worked in a limited set of cases. > Well, I'm not an expert on this code, but from a user perspective, I > think it would be nicer if only the view ended up being a member of > the extension, and the generated types did not. Otherwise, writing an > extension upgrade script requires detailed knowledge of what other > objects are going to be generated internally. In fact, it doesn't > seem implausible that the set of internally generated objects from a > given DDL command could change between releases, which would really be > rather ugly here. The reason why the original patch got changed by Tom is, of course, that it failed to work properly in some interesting cases. Specifically, handling both your use case and extension dependencies (earthdistance depends on cube) is not so easy. Figuring out when you're crossing a dependency unit while recursing through pg_depend is a nice exercise if you want to become very familiar with WITH RECURSIVE catalog queries. Been there, done that :) The main test case is DROP EXTENSION earthdistance;, adding CASCADE is easier because you then don't care about stopping at the right place. 
Of course I'm just trying to help you figure out why the problem is not already solved, please feel free to come back with a design that makes it simple enough :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] index-only scans
On Wed, Oct 12, 2011 at 12:35 AM, Tom Lane wrote: > > Hm. I had been supposing that lossless compress functions would just be > no-ops. If that's not necessarily the case then we might need something > different from the opclass's decompress function to get back the > original data. However, that doesn't really solve the problem I'm > concerned about, because the existence and use of such a function would > be entirely internal to GiST. There still needs to be a way for the > planner to know which opclasses support data retrieval. And I do *not* > want to see us hard-wire "the presence of opclass function 8 means a > GiST opclass can return data" into the planner. > > Maybe, instead of a simple constant amcanreturn column, we need an AM > API function that says whether the index can return data. > I like idea of such AM API function. Since single multicolumn index can use multiple opclasses, AM API function should also say *what* data index can return. -- With best regards, Alexander Korotkov.
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 04:32:45PM -0400, Bruce Momjian wrote: ... > Simon seems to value backward-compatibility more than the average > hackers poster. The lack of complaints about 9.1 I think means that the > hackers decision of _not_ providing a switch was the right one. I wouldn't go that far: 9.1 is very new. Certainly the release notes do not explain the change enough: part of the reason I wrote: http://blog.endpoint.com/2011/09/postgresql-allows-for-different.html Simon has a point, but I think that having applications switch from serializable to repeatable read is a price people should pay when going to 9.1, rather than adding some switch now. -- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8
Re: [HACKERS] SET variable - Permission issues
"Kevin Grittner" writes: > Dimitri Fontaine wrote: >> Adding the information visible at the right places is a fun >> project in itself, too :) > I was thinking a couple new columns in pg_settings (and what backs > it) would be the main thing, but I haven't searched the source code > yet. Does something else leap to mind for you? This isn't exactly a trivial matter. What happens for instance if you try to change the limit, and there are already active values outside the limit in some processes? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs wrote: > Greg Sabino Mullane wrote: >> Eh? It has an off switch: repeatable read. > > You mean: if we recode the application and retest it, we can get > it to work same way as it used to. > > To most people that is the same thing as "it doesn't work with > this release", ask any application vendor. > > There is no off switch and there should be. This was discussed at some length, and nobody seemed to favor a behavior-changing GUC. One example of such a thread is here: http://archives.postgresql.org/pgsql-hackers/2009-05/msg01165.php It came up at least a couple other times, and the outcome was always the same -- after discussion, nobody was in favor of a GUC to make the semantics of these statements variable. I'm sorry if you missed those discussions. It would certainly be a trivial change to implement; the problem is convincing others that it's a good idea. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Tue, Oct 11, 2011 at 9:28 PM, Bruce Momjian wrote: > Standard conforming strings > was tricky because it was more user-facing, or certainly SQL-facing. Why is SQL more important than backup? There is no good reason to do this so quickly. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dumping roles improvements?
Josh Berkus wrote: > > > According to the docs, I assume -r is only roles, while -g includes > > tablespaces, so what you want is already available in pg_dumpall. > > No, it's not. You don't seem to be actually reading any of my proposals. > > (1) I cannot produce a single file in custom dump format which includes > both a single database and all of the roles I need to build that database. > > (2) I cannot dump a set of roles without md5 passwords. > > Both of these are things I need to support dev/stage/testing integration > at multiple sites. We are not writing this software for you. Please submit a clear proposal. I am sure you have 10k customers who want this. :-| -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] index-only scans
Alexander Korotkov writes: > On Tue, Oct 11, 2011 at 5:22 PM, Tom Lane wrote: >> I haven't thought as far ahead as how we might get the information >> needed for a per-opclass flag. A syntax addition to CREATE OPERATOR >> CLASS might be the only way. >> > Shouldn't it be implemented through additional interface function? There are > situations when restoring of original tuple requires some transformation. > For example, in point_ops we store box in the leaf index tuple, while point > can be easily restored from box. Hm. I had been supposing that lossless compress functions would just be no-ops. If that's not necessarily the case then we might need something different from the opclass's decompress function to get back the original data. However, that doesn't really solve the problem I'm concerned about, because the existence and use of such a function would be entirely internal to GiST. There still needs to be a way for the planner to know which opclasses support data retrieval. And I do *not* want to see us hard-wire "the presence of opclass function 8 means a GiST opclass can return data" into the planner. Maybe, instead of a simple constant amcanreturn column, we need an AM API function that says whether the index can return data. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs wrote: > On Tue, Oct 11, 2011 at 6:44 PM, Kevin Grittner > wrote: > > >> If you alter the default_transaction_isolation then you will break > >> applications like this, so it is not a valid way to turn off SSI. > > > > I don't follow you here. What would break? In what fashion? Since > > the standard allows any isolation level to provide more strict > > transaction isolation than required, it would be conforming to > > *only* support serializable transactions, regardless of the level > > requested. Not a good idea for some workloads from a performance > > perspective, but it would be conforming, and any application which > > doesn't work correctly with that is not written to the standard. > > If the normal default_transaction_isolation = read committed and all > transactions that require serializable are explicitly marked in the > application then there is no way to turn off SSI without altering the > application. That is not acceptable, since it causes changes in > application behaviour and possibly also performance issues. > > We should provide a mechanism to allow people to upgrade to 9.1+ > without needing to change the meaning and/or performance of their > apps. > > I strongly support the development of SSI, but I don't support > application breakage. We can have SSI without breaking anything for > people that can't or don't want to use it. The problem is that all the switches needed to allow for "no application breakage" makes configuration of the server and source code more complicated. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Greg Sabino Mullane wrote: > > If the normal default_transaction_isolation = read committed and all > > transactions that require serializable are explicitly marked in the > > application then there is no way to turn off SSI without altering the > > application. That is not acceptable, since it causes changes in > > application behaviour and possibly also performance issues. > > Performance, perhaps. What application behavior changes? Fewer > serialization conflicts? > > > We should provide a mechanism to allow people to upgrade to 9.1+ > > without needing to change the meaning and/or performance of their > > apps. > > That ship has sailed. Simon seems to value backward-compatibility more than the average hackers poster. The lack of complaints about 9.1 I think means that the hackers decision of _not_ providing a switch was the right one. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COUNT(*) and index-only scans
> The trouble is that if we VACUUM and then ANALYZE, we'll often get > back a value very close to 100%, but then the real value may diminish > quite a bit before the next auto-analyze fires. I think if we can > figure out what to do about that problem we'll be well on our way... It's not so much an issue of when the last auto-analyze was as an issue of the number of rows in write transactions against that table in the last X minutes. This is where it really hurts us that pg_stat_user_tables is not time-based. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
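[Editor's note: Josh's point — the post-VACUUM value goes stale in proportion to writes against the table, not time since the last analyze — suggests decaying the measured fraction by write activity. Below is a deliberately pessimistic toy model (each written row assumed to dirty a distinct page); it is illustrative only, not proposed planner math.]

```python
# Toy model: the all-visible fraction measured right after VACUUM decays
# as subsequent writes clear visibility-map bits. Worst-case assumption:
# every row written since VACUUM lands on a different page.

def estimated_allvisible_frac(frac_at_vacuum, pages, rows_written_since):
    pages_dirtied = min(pages, rows_written_since)
    return max(0.0, frac_at_vacuum - pages_dirtied / pages)
```

A planner using something like this would stop trusting a near-100% measurement once pg_stat-style write counters show heavy traffic since the last VACUUM.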
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 9:21 PM, Greg Sabino Mullane wrote: > Simon Riggs: >> Most apps use mixed mode serializable/repeatable read and therefore >> can't be changed by simple parameter. Rewriting the application isn't >> a sensible solution. >> >> I think it's clear that SSI should have had and still needs an "off >> switch" for cases that cause performance problems. > > Eh? It has an off switch: repeatable read. You mean: if we recode the application and retest it, we can get it to work same way as it used to. To most people that is the same thing as "it doesn't work with this release", ask any application vendor. There is no off switch and there should be. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dumping roles improvements?
On 10/11/2011 04:19 PM, Tom Lane wrote: > Andrew Dunstan writes: >> On 10/11/2011 12:40 PM, Bruce Momjian wrote: >>> Josh Berkus wrote: >>>> pg_dumpall -r --no-passwords which would dump the roles but without >>>> CREATE PASSWORD statements. This would be useful for cloning databases >>>> for use in Dev, Test and Staging, where you don't want to copy the md5s >>>> of passwords for possible cracking. >>> What would this do that pg_dumpall --globals-only doesn't? >> As stated, it would not export the passwords. > I can see some possible use in a --no-passwords option that's orthogonal > to all else. The rest of this seems rather confused... Yes, I also think that will meet the case. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
Simon Riggs wrote: > On Tue, Oct 11, 2011 at 3:29 PM, Bruce Momjian wrote: > > > As much as I appreciate Simon's work in this area, I think we are still > > unclear if keeping backward-compatibility is worth the complexity > > required for future users. Historically we have been bold in changing > > postgresql.conf settings to improve clarity, and that approach has > > served us well. > > You raise a good point. First, thank you for the respectful comment; > my viewpoint is not formed from resistance to change per se, even if > may appear to be so. Thank you for raising that possibility to allow > me to explain and refute that. > > I am genuinely concerned that we show respect to downstream software > that uses our APIs and have no personal or corporate ulterior motive. > > Most people are used to the 3 year cycle of development on which > SQLServer and Oracle have now standardised. Our 1 year cycle provides > a considerable benefit in agility, but it also provides for x3 > complexity in release management and a continual temptation to change > for no good reason. I want to encourage people to adopt our APIs, not > give them a headache for attempting to do so. We know that software > exists that follows the previous API and we should take steps to > deprecate that across multiple releases, with appropriate notice, just > as we do in other cases, such as standard conforming strings where our > lack of boldness is appropriate. Well, let me be specific. Around 2003 to 2006, we added many new configuration parameters for logging, which required renaming or removing older parameters. There really wasn't a smooth way to allow for this to be done without impacting users, and the current system we have enjoyed since 2006 is logical only because we made the changes necessary. We can look at trying to phase changes in, but often the phasing becomes more complicated than just doing the change. 
Logging parameter changes were easier because it was assumed logging was an admin-only task, as I assume pitr and replication are as well. Standard conforming strings was tricky because it was more user-facing, or certainly SQL-facing. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Tue, Oct 11, 2011 at 3:29 PM, Bruce Momjian wrote: > As much as I appreciate Simon's work in this area, I think we are still > unclear if keeping backward-compatibility is worth the complexity > required for future users. Historically we have been bold in changing > postgresql.conf settings to improve clarity, and that approach has > served us well. You raise a good point. First, thank you for the respectful comment; my viewpoint is not formed from resistance to change per se, even if it may appear to be so. Thank you for raising that possibility to allow me to explain and refute that. I am genuinely concerned that we show respect to downstream software that uses our APIs and have no personal or corporate ulterior motive. Most people are used to the 3 year cycle of development on which SQLServer and Oracle have now standardised. Our 1 year cycle provides a considerable benefit in agility, but it also provides for 3x complexity in release management and a continual temptation to change for no good reason. I want to encourage people to adopt our APIs, not give them a headache for attempting to do so. We know that software exists that follows the previous API and we should take steps to deprecate that across multiple releases, with appropriate notice, just as we do in other cases, such as standard conforming strings where our lack of boldness is appropriate. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COUNT(*) and index-only scans
On Tue, Oct 11, 2011 at 5:45 AM, Greg Stark wrote: > On Mon, Oct 10, 2011 at 9:17 PM, Tom Lane wrote: >> My intention was to allow it to consider any covering index. You're >> thinking about the cost estimate, which is really entirely different. >> > > Is there any reason to consider more than one? I would have expected > the narrowest one to be the best choice. There's something to be said > for using the same index consistently but we already have that problem > and make no attempt to do that. And partial indexes might be better > but then we would already be considering them if their constraints are > satisfied. You raise a fantastic idea: use an index's frequency of use as a factor in the cost of optimising a query. We have previously discussed the idea of using the RAM residency of an index to control the cost. That is difficult to judge. Using the long-term prevalence of usage as a weighting factor makes a great deal of sense for queries that could potentially utilise multiple indexes. That information is readily available and directly applicable. The prevalence of use directly drives RAM residency, so it makes sense to use the causal factor as input to the cost. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
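[Editor's note: one back-of-envelope form of Simon's suggestion is to use an index's share of long-term scans as a proxy for cache residency and interpolate its per-page cost between "cold" and "hot". The formula and constants below are invented for illustration; nothing like this exists in the planner.]

```python
# Illustrative only: discount an index's page-fetch cost by its usage share.

def weighted_index_cost(base_page_cost, pages, index_scans, total_scans,
                        cached_page_cost=0.01):
    # Share of all index scans going to this index, as a residency proxy.
    share = index_scans / total_scans if total_scans else 0.0
    # Interpolate per-page cost between cold (disk) and hot (cached).
    per_page = base_page_cost * (1 - share) + cached_page_cost * share
    return pages * per_page
```

A frequently used index would then look cheaper than an equally sized rarely used one, nudging the planner toward the index that is actually resident.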
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
> If the normal default_transaction_isolation = read committed and all > transactions that require serializable are explicitly marked in the > application then there is no way to turn off SSI without altering the > application. That is not acceptable, since it causes changes in > application behaviour and possibly also performance issues. Performance, perhaps. What application behavior changes? Fewer serialization conflicts? > We should provide a mechanism to allow people to upgrade to 9.1+ > without needing to change the meaning and/or performance of their > apps. That ship has sailed. -- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8
Re: [HACKERS] Index only scan paving the way for "auto" clustered tables?
Robert Haas wrote: """ And it seems to me that there could easily be format changes that would make sense for particular cases, but not across the board, like: - index-organized tables (heap is a btree, and secondary indexes reference the PK rather than the TID; this is how MySQL does it, and Oracle offers it as an option) - WORM tables (no updates or deletes, and no inserts after creating transaction commits, allowing a much smaller tuple header) - non-transactional tables (tuples visible as soon as they're written, again allowing for smaller tuple header; useful for internal stuff and perhaps for insert-only log tables) """ This is probably a silly idea, but I have been wondering about the following: Instead of having visibility info in the row header, have a couple of row visibility slots in the page header. These slots could be shared between rows in the page, so that if you do a bulk insert/update/delete you would only use one slot. If the slots overflow, you would use an external slots buffer. When the row is all visible, no slot would be used at all. The xmin, xmax and cid would be in the slots. ctid would have its current meaning, except when the external slots buffer is used, in which case ctid would point to the external slot, and it would have the real row header. I don't know if there would be any other row header parts which could be shared. The external slots buffer would then contain xmin, xmax, cid and the real ctid. Updates would write the new rows to another page in the heap, and old rows would stay in place, just as now. So there would not be any redo-log-like configuration. Also, the external slots buffer would be small (18 bytes per row), so it would not get out of cache too easily. The performance would suck if you had lots of small updates, or long-running transactions. On the other hand, in data warehousing, where bulk loads are normal and there are a lot of small rows, this could actually work. As said, this is probably a silly idea. 
But as pluggable heap types came up, I thought to ask if this could actually work. If this kind of wondering post is inappropriate for this list, please tell me so that I can avoid them in the future. - Anssi Kääriäinen
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Robert Haas: > Serializable mode is much slower on this test, though. On > REL9_1_STABLE, it's about 8% slower with a single client. At 8 > clients, the difference rises to 43%, and at 32 clients, it's 51% > slower. Bummer. Thanks for putting some numbers out there; glad I was able to jump start a deeper look at this. Based on this thread so far, I am probably going to avoid serializable in this particular case, and stick to repeatable read. Once things are in place, perhaps I'll be able to try switching to serializable and get some measurements, but I wanted to see if the impact was minor enough to safely start with serializable. Seems not. :) Keep in mind this is not even a formal proposal yet for our client, so any benchmarks from me may be quite a while away. Kevin Grittner: > Did these transactions write anything? If not, were they declared > to be READ ONLY? If they were, in fact, only reading, it would be > interesting to see what the performance looks like if the > recommendation to use the READ ONLY attribute is followed. Yes, I'll definitely look into that, but the great majority of the things done in this case are read/write. Simon Riggs: > Most apps use mixed mode serializable/repeatable read and therefore > can't be changed by simple parameter. Rewriting the application isn't > a sensible solution. > > I think it's clear that SSI should have had and still needs an "off > switch" for cases that cause performance problems. Eh? It has an off switch: repeatable read. Thanks to all replying to this thread, it's been very helpful. -- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 6:44 PM, Kevin Grittner wrote: >> If you alter the default_transaction_isolation then you will break >> applications like this, so it is not a valid way to turn off SSI. > > I don't follow you here. What would break? In what fashion? Since > the standard allows any isolation level to provide more strict > transaction isolation than required, it would be conforming to > *only* support serializable transactions, regardless of the level > requested. Not a good idea for some workloads from a performance > perspective, but it would be conforming, and any application which > doesn't work correctly with that is not written to the standard. If the normal default_transaction_isolation = read committed and all transactions that require serializable are explicitly marked in the application then there is no way to turn off SSI without altering the application. That is not acceptable, since it causes changes in application behaviour and possibly also performance issues. We should provide a mechanism to allow people to upgrade to 9.1+ without needing to change the meaning and/or performance of their apps. I strongly support the development of SSI, but I don't support application breakage. We can have SSI without breaking anything for people that can't or don't want to use it. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dumping roles improvements?
Andrew Dunstan writes: > On 10/11/2011 12:40 PM, Bruce Momjian wrote: >> Josh Berkus wrote: >>> pg_dumpall -r --no-passwords which would dump the roles but without >>> CREATE PASSWORD statements. This would be useful for cloning databases >>> for use in Dev, Test and Staging, where you don't what to copy the md5s >>> of passwords for possible cracking. >> What would this do that pg_dumpall --globals-only doesn't? > As stated, it would not export the passwords. I can see some possible use in a --no-passwords option that's orthogonal to all else. The rest of this seems rather confused... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET variable - Permission issues
"Kevin Grittner" writes: > Well, we've identified a few people who like the idea, but I'm not > sure we have the degree of consensus we normally look for before > putting something on the TODO list. After the discussion on this > thread, are there still any *objections* to allowing bounds or > subsets to be SUSET to limit GUC values more strictly than the > limits hard-coded in C? No objection here, I like this whole idea. Adding the information visible at the right places is a fun project in itself, too :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index only scan paving the way for "auto" clustered tables?
On Oct11, 2011, at 21:27 , Robert Haas wrote: > Alternatively, we could try to graft the concept of a self-clustering > table on top of the existing heap implementation. But I'm having > trouble seeing how that would work. The TODO describes it as > something like "maintain CLUSTER ordering", but that's a gross > oversimplification, because we have no structure that would allow us > to sensibly do any such thing... the current heap implementation is > just that: a pile of stuff. We could still be smarter about where we insert new rows in a clustered table, though. Upon INSERT and UPDATE, we'd need to lookup the leaf page where the new tuple will eventually go in the index we're supposed to maintain CLUSTER for. Then we'd check if any of the pages referenced there contains enough space, and if so place the new tuple there. If not it'd go at the end. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] B-tree parent pointer and checkpoints
Heikki Linnakangas wrote: > On 11.03.2011 19:41, Tom Lane wrote: > > Heikki Linnakangas writes: > >> On 11.03.2011 17:59, Tom Lane wrote: > >>> But that will be fixed during WAL replay. > > > >> Not under the circumstances that started the original thread: > > > >> 1. Backend splits a page > >> 2. Checkpoint starts > >> 3. Checkpoint runs to completion > >> 4. Crash > >> (5. Backend never got to insert the parent pointer) > > > >> WAL replay starts at the checkpoint redo pointer, which is after the > >> page split record, so WAL replay won't insert the parent pointer. That's > >> an incredibly tight window to hit in practice, but it's possible in theory. > > > > Hmm. It's not so improbable that checkpoint would start inside that > > window, but that the parent insertion is still pending by the time the > > checkpoint finishes is pretty improbable. > > > > How about just reducing the deletion-time ERROR for missing downlink to a > > LOG? > > Well, the code that follows expects to have a valid parent page locked, > so you can't literally do just that. But yeah, LOG and aborting the page > deletion seems fine to me. Added to TODO: Fix problem with btree page splits during checkpoints http://archives.postgresql.org/pgsql-hackers/2010-11/msg00052.php http://archives.postgresql.org/pgsql-hackers/2011-09/msg00184.php -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
Florian Pflug writes: > On Oct11, 2011, at 14:43 , David Fetter wrote: >> I'd recoil at not having ranges default to left-closed, right-open. >> The use case for that one is so compelling that I'm OK with making it >> the default from which deviations need to be specified. I agree with David on this. > The downside of that is that, as Tom pointed out upthread, we cannot > make [) the canonical representation of ranges. Yeah, we certainly *can* do that, we just have to allow ranges that include the last element of the domain to be corner cases that require special handling. If we don't want to just fail, we have to canonicalize them to closed instead of open ranges. It does not follow that the default on input has to be closed. Note that the INT_MAX case is probably not the worst issue in practice. What is going to be an issue is ranges over enum types, where having the last element being part of the range is a much more likely use-case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dumping roles improvements?
Josh Berkus wrote: > > > What is the logic for not dumping passwords but the CREATE ROLE > > statement? I don't see how anyone would recognize that behavior as > > logical. If you want to add a --no-passwords option to pg_dumpall, that > > seems more logical to me. > > That's what I'm suggesting. > > Incidentally, what's the difference between -g and -r in terms of actual > output, anyway? According to the docs, -r dumps only roles, while -g also includes tablespaces, so what you want is already available in pg_dumpall. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Intermittent regression test failure from index-only scans patch
Magnus Hagander writes: > On Sun, Oct 9, 2011 at 06:34, Tom Lane wrote: >> Robert Haas writes: >>> Should we have another counter for heap fetches avoided? Seems like that >>> could be useful to know. >> Hm. I'm hesitant to add another per-table (or per index?) statistics >> counter because of the resultant bloat in the stats file. > We certainly need *some* way to figure out if this has been used, > IMHO. So yeah, if the current way doesn't scale enough, we need to > think of some other way. But I'm not sure one more counter would > really bloat it that bad? OTOH, repeating that reasoning enough time > will eventually make it enough to care about... You can already tell whether it's happening by comparing idx_tup_read versus idx_tup_fetch. Now that measure does conflate some things, like whether the tuple was not read at all or was read and rejected as not visible, but I'm not at all convinced that another counter is worth its weight. If invisible tuples are a significant part of the table then index-only scanning isn't going to be very useful to you anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dumping roles improvements?
> What is the logic for not dumping passwords but the CREATE ROLE > statement? I don't see how anyone would recognize that behavior as > logical. If you want to add a --no-passwords option to pg_dumpall, that > seems more logical to me. That's what I'm suggesting. Incidentally, what's the difference between -g and -r in terms of actual output, anyway? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Intermittent regression test failure from index-only scans patch
On Sun, Oct 9, 2011 at 06:34, Tom Lane wrote: > Robert Haas writes: >> On Oct 8, 2011, at 11:04 AM, Tom Lane wrote: >>> I'm inclined to fix this by changing the test to examine idx_tup_read >>> not idx_tup_fetch. Alternatively, we could have the test force >>> enable_indexonlyscan off. Thoughts? > >> No preference. > > I ended up doing it the second way (ie enable_indexonlyscan = off) > because it turns out that pg_stat_user_tables doesn't have the > idx_tup_read column --- we track that count per index, not per table. > I could have complicated the test's stats queries some more, but it > seemed quite not relevant to the goals of the test. > >> Should we have another counter for heap fetches avoided? Seems like that >> could be useful to know. > > Hm. I'm hesitant to add another per-table (or per index?) statistics > counter because of the resultant bloat in the stats file. But it > wouldn't be a bad idea for somebody to take two steps back and rethink > what we're counting in this area. The current counter definitions are > mostly backwards-compatible with pre-8.1 behavior, and it seems like the > goalposts have moved enough that maybe it's time to break compatibility. We certainly need *some* way to figure out if this has been used, IMHO. So yeah, if the current way doesn't scale enough, we need to think of some other way. But I'm not sure one more counter would really bloat it that bad? OTOH, repeating that reasoning enough times will eventually make it enough to care about... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Index only scan paving the way for "auto" clustered tables?
On Tue, Oct 11, 2011 at 3:02 PM, Kevin Grittner wrote: > Robert Haas wrote: >>> [implement "clustered index" as a covering index with all columns >>> which are present in the heap] >> I guess we could do that, but I'm not convinced there would be >> much benefit. > > The "traditional" way to implement a clustered index is to have the > leaf level of the index contain the tuples rather than pointers to > the tuples. If we're going to do clustered tables, we might want to > jump all the way to that, rather than a half-way solution which > stores everything twice. Not a bad thought. Actually, I've been thinking for a while now that we might need a pluggable heapam, similar to the pluggable indexams we already have. Our current heap format has served us pretty well, but there are any number of things that we can't really do without changing it. Of course, if we came up with a new format that was better in every case, across the board, then perhaps we'd be willing to just replace the current format outright -- though even then, that would break pg_upgrade, which would be painful, to put it mildly. And it seems to me that there could easily be format changes that would make sense for particular cases, but not across the board, like: - index-organized tables (heap is a btree, and secondary indexes reference the PK rather than the TID; this is how MySQL does it, and Oracle offers it as an option) - WORM tables (no updates or deletes, and no inserts after creating transaction commits, allowing a much smaller tuple header) - non-transactional tables (tuples visible as soon as they're written, again allowing for smaller tuple header; useful for internal stuff and perhaps for insert-only log tables) Alternatively, we could try to graft the concept of a self-clustering table on top of the existing heap implementation. But I'm having trouble seeing how that would work. 
The TODO describes it as something like "maintain CLUSTER ordering", but that's a gross oversimplification, because we have no structure that would allow us to sensibly do any such thing... the current heap implementation is just that: a pile of stuff. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On 10/10/11 9:53 PM, Fujii Masao wrote: > Or you think that, to keep the backward compatibility completely, > recovery.conf should be used as not only a configuration file but also a > recovery trigger one and it should be renamed to recovery.done at > the end of recovery? That's precisely my point. The trigger file nature of recovery.conf is a problem in itself, and I don't see any way to support that and fix it at the same time. Maybe Simon can? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Index only scan paving the way for "auto" clustered tables?
Robert Haas wrote: >> [implement "clustered index" as a covering index with all columns >> which are present in the heap] > I guess we could do that, but I'm not convinced there would be > much benefit. The "traditional" way to implement a clustered index is to have the leaf level of the index contain the tuples rather than pointers to the tuples. If we're going to do clustered tables, we might want to jump all the way to that, rather than a half-way solution which stores everything twice. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET variable - Permission issues
On 10/11/2011 11:53 AM, Kevin Grittner wrote: > Bruce Momjian wrote: >> Is this a TODO? We might not want to make work_mem SUSET, but it >> would allow administrators to control this. > > Well, we've identified a few people who like the idea, but I'm not > sure we have the degree of consensus we normally look for before > putting something on the TODO list. That's pretty much what I was thinking. > After the discussion on this thread, are there still any *objections* > to allowing bounds or subsets to be SUSET to limit GUC values more > strictly than the limits hard-coded in C? No objections here. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index only scan paving the way for "auto" clustered tables?
On Tue, Oct 11, 2011 at 7:08 AM, Royce Ausburn wrote: > I wonder, could the recent work on index only scans pave the way for auto > clustered tables? Consider a wide, mostly insert table with some subset of > columns that I'd like to cluster on. I'm after locality of tuples that are > very frequently fetched together, but not keen on the downtime for a cluster, > nor the maintenance that it requires. Would it be a stretch to have an index > that branches on the subset of "cluster" columns, but still stores all the > columns, making it a covering index? Given that we can already index > concurrently, such an index would not require downtime, and would be self > maintaining. From my understanding of the index-only scan implementation, I > suspect that such an index would effectively give locality, with some caveats… I guess we could do that, but I'm not convinced there would be much benefit. The only thing you'd be saving would be the cost of keeping the tuples sorted by only the high-order columns rather than all of them, and I doubt that's significant. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET variable - Permission issues
Bruce Momjian wrote: > Kevin Grittner wrote: >> Joe Conway wrote: >>> On 10/10/2011 01:52 PM, Gurjeet Singh wrote: >> ALTER USER novice SET MIN_VAL OF statement_timeout TO '1'; -- So that the user cannot turn off the timeout ALTER DATABASE super_reliable SET ENUM_VALS OF synchronous_commit TO 'on'; -- So that the user cannot change the synchronicity of transactions against this database. >>> >>> I like this better than GRANT/REVOKE on SET. >> >> +1 >> >> I would really like a way to prevent normal users from switching >> from the default transaction isolation level I set. This seems >> like a good way to do that. Putting sane bounds on some other >> settings, more to protect against the accidental bad settings >> than malicious mischief, would be a good thing, too. > > Is this a TODO? We might not want to make work_mem SUSET, but it > would allow administrators to control this. Well, we've identified a few people who like the idea, but I'm not sure we have the degree of consensus we normally look for before putting something on the TODO list. After the discussion on this thread, are there still any *objections* to allowing bounds or subsets to be SUSET to limit GUC values more strictly than the limits hard-coded in C? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: casts row to array and array to row
On Tue, Oct 11, 2011 at 8:36 AM, Tom Lane wrote: > Pavel Stehule writes: >> 2011/10/11 Robert Haas : >>> On Tue, Oct 11, 2011 at 4:40 AM, Pavel Stehule >>> wrote: What do you think about this idea? > > It's a bad one. > >>> Well, a ROW can contain values of different types; an ARRAY can't. > >> yes, I know - but it should be problem only in few cases - when is not >> possible to cast a row field to array field. > > This idea is basically the same as "data types don't matter", which is > not SQL-ish and certainly not Postgres-ish. hm. I agree, but if it were possible to create sql/plpgsql functions accepting 'record', then you could at least rig the cast in userland around hstore without resorting to hacky text manipulation and/or flattening the record to text before doing the operation. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
On Tue, Oct 11, 2011 at 12:18:18PM -0400, Robert Haas wrote: > On Tue, Oct 11, 2011 at 12:12 PM, David Fetter wrote: > > Nothing's bad about '[]' per se. What's better, but possibly out > > of the reach of our current lexing and parsing system, would be > > things like: > > > > [1::int, 10) > > That's been discussed before. Aside from the parser issues (which > are formidable) it would break brace-matching in most if not all > commonly used editors. That being the situation, ubiquitous support for the natural syntax looks like it's a decade away, minimum. :( Trying to be cheery, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
Jeff Davis writes: > Tom made an observation about '[1,INT_MAX]' throwing an error because > canonicalization would try to increment INT_MAX. But I'm not > particularly disturbed by it. If you want a bigger range, use int8range > or numrange -- the same advice we give to people who want unsigned > types. Or, for people who really need the entire range of signed int4 > exactly, they can easily make their own range type that canonicalizes to > '[]'. I agree we shouldn't contort the entire design to avoid that corner case. We should, however, make sure that the increment *does* throw an error, and not just silently overflow. regards, tom lane
Re: [HACKERS] Range Types - typo + NULL string constructor
On Tue, 2011-10-11 at 12:40 -0400, Robert Haas wrote: > I think using '[)' is fine. At some level, this is just a question of > expectations. If you expect that int4range(1,4) will create a range > that includes 4, well, you're wrong. Once you get used to it, it will > seem normal, and you'll know that you need to write > int4range(1,4,'[]') if that's what you want. As long as the system is > designed around a set of consistent and well-thought-out principles, > people will get used to the details. I don't see that the idea of a > half-open range over a discrete-valued type is particularly confusing > - we use them all the time in programming, when we make the end > pointer point to the byte following the end of the array, rather than > the last element - but even if it is, overall design consistency > trumps what someone may find to be the absolutely perfect behavior in > some particular case. And saving typing is nearly always good - > unless it creates a LOT more confusion than I think this will. That sounds very reasonable to me. Tom made an observation about '[1,INT_MAX]' throwing an error because canonicalization would try to increment INT_MAX. But I'm not particularly disturbed by it. If you want a bigger range, use int8range or numrange -- the same advice we give to people who want unsigned types. Or, for people who really need the entire range of signed int4 exactly, they can easily make their own range type that canonicalizes to '[]'. Regards, Jeff Davis
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Simon Riggs wrote: > It's common to find applications that have some transactions > explicitly coded to use SERIALIZABLE mode, while the rest are in > the default mode READ COMMITTED. So common that TPC-E benchmark > has been written as a representation of such workloads. I would be willing to bet that any such implementations assume S2PL, and would not prevent anomalies as expected unless all transactions are serializable. > The reason this is common is that some transactions require > SERIALIZABLE as a "fix" for transaction problems. That is a mode of thinking which doesn't work if you only assume serializable provides the guarantees required by the standard. Many people assume otherwise. It does *not* guarantee blocking on conflicts, and it does not require that transactions appear to have executed in the order of successful commit. It requires only that the result of concurrently running any mix of serializable transactions produce a result consistent with some one-at-a-time execution of those transactions. Rollback of transactions to prevent violations of that guarantee is allowed. I don't see any guarantees about how serializable transactions interact with non-serializable transactions beyond each transaction not seeing any of the phenomena prohibited for its isolation level. > If you alter the default_transaction_isolation then you will break > applications like this, so it is not a valid way to turn off SSI. I don't follow you here. What would break? In what fashion? Since the standard allows any isolation level to provide more strict transaction isolation than required, it would be conforming to *only* support serializable transactions, regardless of the level requested. Not a good idea for some workloads from a performance perspective, but it would be conforming, and any application which doesn't work correctly with that is not written to the standard. 
-Kevin
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Robert Haas wrote: > Kevin Grittner wrote: >> Did these transactions write anything? If not, were they >> declared to be READ ONLY? If they were, in fact, only reading, >> it would be interesting to see what the performance looks like if >> the recommendation to use the READ ONLY attribute is followed. > > pgbench -S doesn't do any writes, or issue any transaction control > statements. It just fires off SELECT statements against a single > table as fast as it can, retrieving values from rows chosen at > random. Each SELECT implicitly begins and ends a transaction. So that test could be accomplished by setting default_transaction_read_only to on. That's actually what we're doing, because we have a lot more of them than of read-write transactions. But, with the scripts I can confirm the performance of that on this end. It should be indistinguishable from the repeatable read line; if not, there's something to look at there. > Possibly the system could gaze upon the SELECT statement and infer > that the one-statement transaction induced thereby can't possibly > write any tuples, and mark it read-only automatically, but I'm > actually not that excited about that approach I wasn't intending to suggest that. In fact I hadn't really thought of it. It might be a fun optimization, although it would be well down my list, and it wouldn't be trivial because you couldn't use it for any statements with volatile functions -- so the statement would need to be planned far enough to know whether that was the case before making this decision. In fact, I'm not sure the community would want to generate an error if a user marked a function other than volatile and ran it in this way. Definitely not something to even look at any time soon. > trying to fix the lwlock contention that's causing the headache in > the first place seems like a better use of time, assuming it's > possible to make some headway there. Absolutely. 
I just thought the timings with READ ONLY would make for an interesting data point. For one thing, it might reassure people that even this artificial use case doesn't perform that badly if the advice in the documentation is heeded. For another, a result slower than repeatable read would be a surprise that might point more directly to the problem. > My general observation is that, on this machine, a lightweight > lock that is taken in exclusive mode by a series of lockers in > quick succession seems to max out around 16-20 clients, and the > curve starts to bend well before that. OK, I will keep that in mind. Thanks, -Kevin
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 6:14 PM, Robert Haas wrote: > On Tue, Oct 11, 2011 at 1:11 PM, Simon Riggs wrote: >> On Mon, Oct 10, 2011 at 11:31 PM, Kevin Grittner >> wrote: >>> Simon Riggs wrote: >>> How do we turn it on/off to allow the overhead to be measured? >>> >>> Use REPEATABLE READ transactions or SERIALIZABLE transactions. The >>> easiest way, if you're doing it for all transactions (which I >>> recommend) is to set default_transaction_isolation. >> >> Most apps use mixed mode serializable/repeatable read and therefore >> can't be changed by simple parameter. Rewriting the application isn't >> a sensible solution. >> >> I think it's clear that SSI should have had and still needs an "off >> switch" for cases that cause performance problems. > > Is it possible that you are confusing the default level, which is READ > COMMITTED, with REPEATABLE READ? I can't see why anyone would code up > their application to use REPEATABLE READ for some things and > SERIALIZABLE for other things unless they were explicitly trying to > turn SSI off for a subset of their transactions. In all releases > prior to 9.0, REPEATABLE READ and SERIALIZABLE behaved identically, so > there wouldn't be any reason for a legacy app to mix-and-match between > the two. Yes, I mistyped "read" when I meant "committed". You are right to point out there is no problem if people were using repeatable read and serializable. Let me retype, so there is no confusion: It's common to find applications that have some transactions explicitly coded to use SERIALIZABLE mode, while the rest are in the default mode READ COMMITTED. So common that the TPC-E benchmark has been written as a representation of such workloads. The reason this is common is that some transactions require SERIALIZABLE as a "fix" for transaction problems. If you alter the default_transaction_isolation then you will break applications like this, so it is not a valid way to turn off SSI. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
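Simon's mixed-mode pattern can be illustrated with a short sketch (table and column names here are invented for illustration):

```sql
-- Most of the application runs at the READ COMMITTED default:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

-- A few transactions are explicitly marked SERIALIZABLE as a
-- correctness "fix"; on 9.1+ these now go through SSI:
BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- Changing the default never reaches the explicitly marked
-- transactions, which is why it cannot serve as an SSI off switch:
SET default_transaction_isolation = 'repeatable read';
```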
Re: [HACKERS] COUNT(*) and index-only scans
On Tue, Oct 11, 2011 at 12:43 PM, Bruce Momjian wrote: > Greg Stark wrote: >> On Mon, Oct 10, 2011 at 9:17 PM, Tom Lane wrote: >> > My intention was to allow it to consider any covering index. You're >> > thinking about the cost estimate, which is really entirely different. >> > >> >> Is there any reason to consider more than one? I would have expected >> the narrowest one to be the best choice. There's something to be said >> for using the same index consistently but we already have that problem >> and make no attempt to do that. And partial indexes might be better >> but then we would already be considering them if their constraints are >> satisfied. > > Actually, I think the smallest non-partial one on disk might be the best > --- that is very easy to find out. I doubt there is any need to write special-purpose code to decide which index ought to be used for a full table scan. We can just throw all of the otherwise-useless indexes into the costing machinery with empty pathkeys, and let them duke it out. All but the best one will be instantly discarded, and the best one will either beat or lose to a sequential scan. All of this will happen before we start trying to build join paths, so there's no combinatorial explosion in planning time - it'll just be a straightforward cost comparison between plans with identical pathkeys, and should be quite fast. The real issue is that the costing estimates need to be accurate, and that's where the rubber hits the road. Otherwise, even if we pick the right way to scan the table, we may do silly things up the line when we go to start constructing the join order. I think we need to beef up ANALYZE to gather statistics on the fraction of the pages that are marked all-visible, or maybe VACUUM should gather that information. The trouble is that if we VACUUM and then ANALYZE, we'll often get back a value very close to 100%, but then the real value may diminish quite a bit before the next auto-analyze fires. 
I think if we can figure out what to do about that problem we'll be well on our way... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
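As a sketch of the kind of statistic being proposed here (the table name is invented; this discussion is what ultimately led to the pg_class.relallvisible counter added in 9.2):

```sql
-- Hypothetical check of what fraction of a table's pages are marked
-- all-visible, using the relallvisible counter (9.2 and later):
SELECT relname, relpages, relallvisible,
       round(100.0 * relallvisible / greatest(relpages, 1), 1)
           AS pct_all_visible
FROM pg_class
WHERE relname = 'my_table';
```

The costing problem described above is exactly that this percentage can be near 100 right after VACUUM and then decay before the next auto-analyze refreshes it.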
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 1:11 PM, Simon Riggs wrote: > On Mon, Oct 10, 2011 at 11:31 PM, Kevin Grittner > wrote: >> Simon Riggs wrote: >> >>> How do we turn it on/off to allow the overhead to be measured? >> >> User REPEATABLE READ transactions or SERIALIZABLE transactions. The >> easiest way, if you're doing it for all transactions (which I >> recommend) is to set default_transaction_isolation. > > Most apps use mixed mode serializable/repeatable read and therefore > can't be changed by simple parameter. Rewriting the application isn't > a sensible solution. > > I think it's clear that SSI should have had and still needs an "off > switch" for cases that cause performance problems. Is it possible that you are confusing the default level, which is READ COMMITTED, with REPEATABLE READ? I can't see why anyone would code up their application to use REPEATABLE READ for some things and SERIALIZABLE for other things unless they were explicitly trying to turn SSI off for a subset of their transactions. In all releases prior to 9.0, REPEATABLE READ and SERIALIZABLE behaved identically, so there wouldn't be any reason for a legacy app to mix-and-match between the two. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Tue, Oct 11, 2011 at 12:46 PM, Kevin Grittner wrote: > Robert Haas wrote: >> I ran my good old pgbench -S, scale factor 100, shared_buffers = >> 8GB test on Nate Boley's box. I ran it on both 9.1 and 9.2dev, >> and at all three isolation levels. As usual, I took the median of >> three 5-minute runs, which I've generally found adequate to >> eliminate the noise. On both 9.1 and 9.2dev, read committed and >> repeatable read have basically identical performance; if anything, >> repeatable read may be slightly better - which would make sense, >> if it cuts down the number of snapshots taken. > > Right. Thanks for running this. Could you give enough details to > allow reproducing on this end (or point to a previous post with the > details)? Sure, it's pretty much just a vanilla pgbench -S run, but the scripts I used are attached here. I build the head of each branch using the "test-build" script and then used the "runtestiso" script to drive the test runs. These scripts are throwaway so they're not really documented, but hopefully it's clear enough what it's doing. The server itself is a 32-core AMD 6128. >> Data are attached, in text form and as a plot. I excluded the >> repeatable read results from the plot as they just clutter it up - >> they're basically on top of the read committed results. > > That was kind, but really the REPEATABLE READ results are probably > the more meaningful comparison, even if they are more embarrassing. > :-( They're neither more nor less embarrassing - they're pretty much not different at all. I just didn't see any point in making a graph with 6 lines on it when you could only actually see 4 of them. > Did these transactions write anything? If not, were they declared > to be READ ONLY? If they were, in fact, only reading, it would be > interesting to see what the performance looks like if the > recommendation to use the READ ONLY attribute is followed. pgbench -S doesn't do any writes, or issue any transaction control statements. 
It just fires off SELECT statements against a single table as fast as it can, retrieving values from rows chosen at random. Each SELECT implicitly begins and ends a transaction. Possibly the system could gaze upon the SELECT statement and infer that the one-statement transaction induced thereby can't possibly write any tuples, and mark it read-only automatically, but I'm actually not that excited about that approach - trying to fix the lwlock contention that's causing the headache in the first place seems like a better use of time, assuming it's possible to make some headway there. My general observation is that, on this machine, a lightweight lock that is taken in exclusive mode by a series of lockers in quick succession seems to max out around 16-20 clients, and the curve starts to bend well before that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company test-build Description: Binary data runtestiso Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
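For reference, the whole "transaction" in a pgbench -S run is a single statement of this shape (this is pgbench's built-in select-only script, with :aid chosen at random for each execution):

```sql
-- Each pgbench -S iteration is one implicit single-statement
-- transaction against pgbench_accounts:
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
```

Because each of these transactions does so little work, any fixed per-transaction cost (snapshot acquisition, SSI bookkeeping, lwlock acquisition) dominates the profile, which is the point Robert makes about this workload magnifying such costs.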
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Mon, Oct 10, 2011 at 11:31 PM, Kevin Grittner wrote: > Simon Riggs wrote: > >> How do we turn it on/off to allow the overhead to be measured? > > Use REPEATABLE READ transactions or SERIALIZABLE transactions. The > easiest way, if you're doing it for all transactions (which I > recommend) is to set default_transaction_isolation. Most apps use mixed mode serializable/repeatable read and therefore can't be changed by simple parameter. Rewriting the application isn't a sensible solution. I think it's clear that SSI should have had and still needs an "off switch" for cases that cause performance problems. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Dumping roles improvements?
Andrew Dunstan wrote: > > > On 10/11/2011 12:40 PM, Bruce Momjian wrote: > > Josh Berkus wrote: > >> It occurs to me that we could really use two things to make it easier to > >> move copies of database stuff around: > >> > >> pg_dump -r, which would include a CREATE ROLE for all roles needed to > >> restore the database (probably without passwords), and > >> > >> pg_dumpall -r --no-passwords which would dump the roles but without > >> CREATE PASSWORD statements. This would be useful for cloning databases > >> for use in Dev, Test and Staging, where you don't want to copy the md5s > >> of passwords for possible cracking. > > What would this do that pg_dumpall --globals-only doesn't? > > > > As stated, it would not export the passwords. What is the logic for not dumping passwords but the CREATE ROLE statement? I don't see how anyone would recognize that behavior as logical. If you want to add a --no-passwords option to pg_dumpall, that seems more logical to me. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
Robert Haas wrote: > I ran my good old pgbench -S, scale factor 100, shared_buffers = > 8GB test on Nate Boley's box. I ran it on both 9.1 and 9.2dev, > and at all three isolation levels. As usual, I took the median of > three 5-minute runs, which I've generally found adequate to > eliminate the noise. On both 9.1 and 9.2dev, read committed and > repeatable read have basically identical performance; if anything, > repeatable read may be slightly better - which would make sense, > if it cuts down the number of snapshots taken. Right. Thanks for running this. Could you give enough details to allow reproducing on this end (or point to a previous post with the details)? > Serializable mode is much slower on this test, though. On > REL9_1_STABLE, it's about 8% slower with a single client. At 8 > clients, the difference rises to 43%, and at 32 clients, it's 51% > slower. On 9.2devel, raw performance is somewhat higher (e.g. > +51% at 8 clients) but the performance when not using SSI has > improved so much that the performance gap between serializable and > the other two isolation levels is now huge: with 32 clients, in > serializable mode, the median result was 21114.577645 tps; in read > committed, 218748.929692 tps - that is, read committed is running > more than ten times faster than serializable. Yeah. I was very excited to see your numbers as you worked on that, but I've been concerned that with the "Performance Whack A Mole" nature of things (to borrow a term from Josh Berkus), SSI lightweight locks might be popping their heads up. > Data are attached, in text form and as a plot. I excluded the > repeatable read results from the plot as they just clutter it up - > they're basically on top of the read committed results. That was kind, but really the REPEATABLE READ results are probably the more meaningful comparison, even if they are more embarrassing. 
:-( > I haven't run this with LWLOCK_STATS, but my seat-of-the-pants > guess is that there's a single lightweight lock that everything is > bottlenecking on. The lock in question is SerializableXactHashLock. A secondary problem is SerializableFinishedListLock, which is used for protecting cleanup of old transactions. This is per Dan's reports, who had a better look at it on a 16-core machine, but is consistent with what I saw on fewer cores. Early in development we had a bigger problem with SerializablePredicateLockListLock, but Dan added a local map to eliminate contention during the lock promotion decision, and I reworked that lock from the SHARED read and EXCLUSIVE write approach to the SHARED for accessing your own data and EXCLUSIVE for accessing data for another process technique. Combined, that made the problems with that negligible. > One possible difference between this test case and the ones you > may have used is that this case involves lots and lots of really > short transactions that don't do much. I did some tests like that, but not on a box with that many processors, and I probably didn't try using a thread count more than double the core count, so I probably never ran into the level of contention you're seeing. The differences at the low connection counts are surprising to me. Maybe it will make more sense when I see the test case. There's also some chance that late elimination of some race conditions found in testing affected this, and I didn't re-run those tests late enough to see that. Not sure. > The effect of anything that only happens once or a few times per > transaction is really magnified in this type of workload (which is > why the locking changes make so much of a difference here - in a > longer or heavier-weight transaction that stuff would be lost in > the noise). Did these transactions write anything? If not, were they declared to be READ ONLY? 
If they were, in fact, only reading, it would be interesting to see what the performance looks like if the recommendation to use the READ ONLY attribute is followed. That's at the top of the list of performance tips for SSI at: http://www.postgresql.org/docs/9.1/interactive/transaction-iso.html#XACT-SERIALIZABLE Anyway, this isolates a real issue, even if the tests exaggerate it beyond what anyone is likely to see in production. Once this CF is over, I'll put a review of this at the top of my PG list. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
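The recommendation Kevin points to can be followed either per session or per transaction; a sketch (the query is illustrative only):

```sql
-- Session-wide, matching the default_transaction_read_only approach
-- mentioned upthread:
SET default_transaction_read_only = on;

-- Or per transaction, letting SSI know reads cannot create
-- rw-conflicts out of this transaction:
BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY;
SELECT count(*) FROM pgbench_accounts;
COMMIT;
```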
Re: [HACKERS] Dumping roles improvements?
On 10/11/2011 12:40 PM, Bruce Momjian wrote: Josh Berkus wrote: It occurs to me that we could really use two things to make it easier to move copies of database stuff around: pg_dump -r, which would include a CREATE ROLE for all roles needed to restore the database (probably without passwords), and pg_dumpall -r --no-passwords which would dump the roles but without CREATE PASSWORD statements. This would be useful for cloning databases for use in Dev, Test and Staging, where you don't want to copy the md5s of passwords for possible cracking. What would this do that pg_dumpall --globals-only doesn't? As stated, it would not export the passwords. cheers andrew
Re: [HACKERS] COUNT(*) and index-only scans
Greg Stark wrote: > On Mon, Oct 10, 2011 at 9:17 PM, Tom Lane wrote: > > My intention was to allow it to consider any covering index. You're > > thinking about the cost estimate, which is really entirely different. > > > > Is there any reason to consider more than one? I would have expected > the narrowest one to be the best choice. There's something to be said > for using the same index consistently but we already have that problem > and make no attempt to do that. And partial indexes might be better > but then we would already be considering them if their constraints are > satisfied. Actually, I think the smallest non-partial one on disk might be the best --- that is very easy to find out. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Dumping roles improvements?
Josh Berkus wrote: > It occurs to me that we could really use two things to make it easier to > move copies of database stuff around: > > pg_dump -r, which would include a CREATE ROLE for all roles needed to > restore the database (probably without passwords), and > > pg_dumpall -r --no-passwords which would dump the roles but without > CREATE PASSWORD statements. This would be useful for cloning databases > for use in Dev, Test and Staging, where you don't want to copy the md5s > of passwords for possible cracking. What would this do that pg_dumpall --globals-only doesn't? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Range Types - typo + NULL string constructor
On Tue, Oct 11, 2011 at 12:30 PM, Jeff Davis wrote: > On Tue, 2011-10-11 at 12:09 -0400, Robert Haas wrote: >> The cure seems worse than the disease. What is so bad about '[]'? > > OK, so we stick with the 3-argument form. Do we have a default for the > third argument, or do we scrap it to avoid confusion? > > There were some fairly strong objections to using '[]' as the default or > having the default vary between types. So, the only real option > remaining, if we do have a default, is '[)'. I think using '[)' is fine. At some level, this is just a question of expectations. If you expect that int4range(1,4) will create a range that includes 4, well, you're wrong. Once you get used to it, it will seem normal, and you'll know that you need to write int4range(1,4,'[]') if that's what you want. As long as the system is designed around a set of consistent and well-thought-out principles, people will get used to the details. I don't see that the idea of a half-open range over a discrete-valued type is particularly confusing - we use them all the time in programming, when we make the end pointer point to the byte following the end of the array, rather than the last element - but even if it is, overall design consistency trumps what someone may find to be the absolutely perfect behavior in some particular case. And saving typing is nearly always good - unless it creates a LOT more confusion than I think this will. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
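The expectations Robert describes look like this in practice (a sketch, assuming the three-argument constructor semantics under discussion; discrete ranges canonicalize to '[)' form):

```sql
SELECT int4range(1, 4);            -- '[1,4)': contains 1, 2, 3 but not 4
SELECT int4range(1, 4) @> 4;       -- false
SELECT int4range(1, 4, '[]');      -- canonicalized to '[1,5)', so 4 is included
SELECT int4range(1, 4, '[]') @> 4; -- true
```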
Re: [HACKERS] Range Types - typo + NULL string constructor
On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote: > > Certainly not the end of the world, but is the convenience of being > > able to write somerange(a, b) instead of somerange(a, b, '[)') > > really worth it? I kind of doubt that... > > You're making a persuasive argument for the latter based solely on the > clarity. If people see that 3rd element in the DDL, or need to > provide it, it's *very* obvious what's going on. That was how I originally thought, but we're also providing built-in range types like tsrange and daterange. I could see how if the former excluded the endpoint and the latter included it, it could be confusing. We could go back to having different constructor names for different inclusivity; e.g. int4range_cc(1,10). That at least removes the awkwardness of typing (and seeing) '[]'. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
On Tue, 2011-10-11 at 12:09 -0400, Robert Haas wrote: > The cure seems worse than the disease. What is so bad about '[]'? OK, so we stick with the 3-argument form. Do we have a default for the third argument, or do we scrap it to avoid confusion? There were some fairly strong objections to using '[]' as the default or having the default vary between types. So, the only real option remaining, if we do have a default, is '[)'. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
On Tue, Oct 11, 2011 at 12:12 PM, David Fetter wrote: > Nothing's bad about '[]' per se. What's better, but possibly out of > the reach of our current lexing and parsing system, would be things > like: > > [1::int, 10) That's been discussed before. Aside from the parser issues (which are formidable) it would break brace-matching in most if not all commonly used editors. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
On Tue, Oct 11, 2011 at 12:09:01PM -0400, Robert Haas wrote: > On Tue, Oct 11, 2011 at 12:03 PM, Jeff Davis wrote: > > On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote: > >> > Certainly not the end of the world, but is the convenience of being > >> > able to write somerange(a, b) instead of somerange(a, b, '[)') > >> > really worth it? I kind of doubt that... > >> > >> You're making a persuasive argument for the latter based solely on the > >> clarity. If people see that 3rd element in the DDL, or need to > >> provide it, it's *very* obvious what's going on. > > > > That was how I originally thought, but we're also providing built-in > > range types like tsrange and daterange. I could see how if the former > > excluded the endpoint and the latter included it, it could be confusing. > > > > We could go back to having different constructor names for different > > inclusivity; e.g. int4range_cc(1,10). That at least removes the > > awkwardness of typing (and seeing) '[]'. > > The cure seems worse than the disease. What is so bad about '[]'? Nothing's bad about '[]' per se. What's better, but possibly out of the reach of our current lexing and parsing system, would be things like: [1::int, 10) Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
On Tue, Oct 11, 2011 at 12:03 PM, Jeff Davis wrote: > On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote: >> > Certainly not the end of the world, but is the convenience of being >> > able to write somerange(a, b) instead of somerange(a, b, '[)') >> > really worth it? I kind of doubt that... >> >> You're making a persuasive argument for the latter based solely on the >> clarity. If people see that 3rd element in the DDL, or need to >> provide it, it's *very* obvious what's going on. > > That was how I originally thought, but we're also providing built-in > range types like tsrange and daterange. I could see how if the former > excluded the endpoint and the latter included it, it could be confusing. > > We could go back to having different constructor names for different > inclusivity; e.g. int4range_cc(1,10). That at least removes the > awkwardness of typing (and seeing) '[]'. The cure seems worse than the disease. What is so bad about '[]'? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation
On Mon, Oct 10, 2011 at 3:59 PM, Kevin Grittner wrote: > I do have some concern about whether the performance improvements > from reduced LW locking contention elsewhere in the code may (in > whack-a-mole fashion) cause the percentages to go higher in SSI. > The biggest performance issues in some of the SSI benchmarks were on > LW lock contention, so those may become more noticeable as other > contention is reduced. I've been trying to follow along on the > threads regarding Robert's work in that area, with hopes of applying > some of the same techniques to SSI, but it's not clear whether I'll > have time to work on that for the 9.2 release. (It's actually > looking improbable at this point.) I ran my good old pgbench -S, scale factor 100, shared_buffers = 8GB test on Nate Boley's box. I ran it on both 9.1 and 9.2dev, and at all three isolation levels. As usual, I took the median of three 5-minute runs, which I've generally found adequate to eliminate the noise. On both 9.1 and 9.2dev, read committed and repeatable read have basically identical performance; if anything, repeatable read may be slightly better - which would make sense, if it cuts down the number of snapshots taken. Serializable mode is much slower on this test, though. On REL9_1_STABLE, it's about 8% slower with a single client. At 8 clients, the difference rises to 43%, and at 32 clients, it's 51% slower. On 9.2devel, raw performance is somewhat higher (e.g. +51% at 8 clients) but the performance when not using SSI has improved so much that the performance gap between serializable and the other two isolation levels is now huge: with 32 clients, in serializable mode, the median result was 21114.577645 tps; in read committed, 218748.929692 tps - that is, read committed is running more than ten times faster than serializable. Data are attached, in text form and as a plot. 
I excluded the repeatable read results from the plot as they just clutter it up - they're basically on top of the read committed results. I haven't run this with LWLOCK_STATS, but my seat-of-the-pants guess is that there's a single lightweight lock that everything is bottlenecking on. One possible difference between this test case and the ones you may have used is that this case involves lots and lots of really short transactions that don't do much. The effect of anything that only happens once or a few times per transaction is really magnified in this type of workload (which is why the locking changes make so much of a difference here - in a longer or heavier-weight transaction that stuff would be lost in the noise). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

# clients  PG91.read-committed  PG91.repeatable-read  PG91.serializable  PG92dev.read-committed  PG92dev.repeatable-read  PG92dev.serializable
        1          4309.029470           4309.786279        3997.033815             4420.435075              4432.991835           4158.783631
        8         26881.573137          27105.962897       15604.313328            33369.189146             33744.015354          23607.824697
       16         44578.177731          44347.193447       21687.135630            62690.803360             63391.583338          23308.801395
       32         38725.876808          38773.523347       17995.772382           218748.929692            218928.089856          21114.577645
Re: [HACKERS] Online base backup from the hot-standby
> > I can't see a reason why we would use a new WAL record for this,
> > rather than modify the XLOG_PARAMETER_CHANGE record type which was
> > created for a very similar reason.
> > The code would be much simpler if we just extend
> > XLOG_PARAMETER_CHANGE, so please can we do that?
>
> Sure.
>
> > The log message "full_page_writes on master is set invalid more than
> > once during online backup" should read "at least once" rather than
> > "more than once".
>
> Yes.
>
> > lastFpwDisabledLSN needs to be initialized.
>
> I think it doesn't need to be, because all the values in XLogCtl are
> initialized to 0.
>
> > Is there a reason to add lastFpwDisabledLSN onto the Control file? If
> > we log parameters after every checkpoint then we'll know the values
> > when we startup. If we keep logging parameters this way we'll end up
> > with a very awkward and large control file. I would personally prefer
> > to avoid that, but that thought could go either way. Let's see if
> > anyone else thinks that also.
>
> Yes. I add it to CreateCheckPoint().
>
> Image:
> CreateCheckPoint()
> {
>     if (!shutdown && XLogStandbyInfoActive())
>     {
>         LogStandbySnapshot()
>         XLogReportParameters()
>     }
> }
>
> XLogReportParameters()
> {
>     if (fpw == 'off' || ... )
>         XLOGINSERT()
> }
>
> However, it'll write XLOG_PARAMETER_CHANGE at every checkpoint when FPW
> is 'off'. (It will increase the amount of WAL.) Is it OK?

Done. Updated patch attached. Regards. Jun Ishizuka NTT Software Corporation TEL:045-317-7018 E-Mail: ishizuka@po.ntts.co.jp standby_online_backup_09base-02fpw.patch Description: Binary data
Re: [HACKERS] SET variable - Permission issues
Kevin Grittner wrote: > Joe Conway wrote: > > On 10/10/2011 01:52 PM, Gurjeet Singh wrote: > > >> ALTER USER novice SET MIN_VAL OF statement_timeout TO '1'; > >> -- So that the user cannot turn off the timeout > >> > >> ALTER DATABASE super_reliable SET ENUM_VALS OF synchronous_commit > >> TO 'on'; > >> -- So that the user cannot change the synchronicity of > >> transactions against this database. > > > > I like this better than GRANT/REVOKE on SET. > > +1 > > I would really like a way to prevent normal users from switching > from the default transaction isolation level I set. This seems like > a good way to do that. Putting sane bounds on some other settings, > more to protect against the accidental bad settings than malicious > mischief, would be a good thing, too. Is this a TODO? We might not want to make work_mem SUSET, but it would allow administrators to control this. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
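Today's per-role settings only change the default, which the user can still override in-session; that gap is what the proposal targets. A sketch (the MIN_VAL syntax quoted above is proposed, not implemented):

```sql
-- What an administrator can do now: set a per-role default.
ALTER ROLE novice SET statement_timeout = '1min';

-- But the user can still defeat it within a session:
-- SET statement_timeout = 0;   -- turns the timeout off entirely

-- The quoted proposal would instead pin a bound the user cannot
-- escape (hypothetical syntax):
-- ALTER USER novice SET MIN_VAL OF statement_timeout TO '1';
```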
Re: [HACKERS] unite recovery.conf and postgresql.conf
Fujii Masao wrote:
> On Tue, Oct 11, 2011 at 6:37 AM, Simon Riggs wrote:
> > On Mon, Oct 10, 2011 at 6:52 PM, Josh Berkus wrote:
> >
> >>> Tatsuo/Josh/Robert also discussed how recovery.conf can be used to
> >>> provide parameters solely for recovery. That is difficult to do
> >>> without causing all downstream tools to make major changes in the ways
> >>> they supply parameters.
> >>
> >> Actually, this case is easily solved by an "include recovery.conf"
> >> parameter. So it's a non-issue.
> >
> > That is what I've suggested and yes, doing that is straightforward.
>
> Even if we do that, you still need to modify the tool so that it can handle
> the recovery trigger file. recovery.conf is used as just a configuration
> file (not a recovery trigger file at all). It's not renamed to recovery.done
> at the end of recovery. If the tool depends on the renaming from
> recovery.conf to recovery.done, it also would need to be modified. If the
> tool needs to be changed anyway, why do you hesitate to change it so that it
> adds "include recovery.conf" into postgresql.conf automatically?
>
> Or do you think that, to keep backward compatibility completely,
> recovery.conf should be used not only as a configuration file but also as a
> recovery trigger, and that it should be renamed to recovery.done at the end
> of recovery?

As much as I appreciate Simon's work in this area, I think we are still
unclear on whether keeping backward compatibility is worth the complexity
it imposes on future users. Historically we have been bold in changing
postgresql.conf settings to improve clarity, and that approach has served
us well.

--
  Bruce Momjian          http://momjian.us
  EnterpriseDB           http://enterprisedb.com

  + It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
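[Editor's note] The `include` directive Josh refers to is real postgresql.conf syntax. A sketch of what a tool-generated configuration along those lines might look like (the comment and placement are illustrative, not from the thread):

```
# postgresql.conf

shared_buffers = 512MB
wal_level = hot_standby
# ... normal settings ...

# Appended automatically by the backup/failover tool, per the proposal:
include 'recovery.conf'
```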
Re: [HACKERS] Proposal: casts row to array and array to row
Pavel Stehule writes:
> 2011/10/11 Robert Haas:
>> On Tue, Oct 11, 2011 at 4:40 AM, Pavel Stehule wrote:
>>> What do you think about this idea?

It's a bad one.

>> Well, a ROW can contain values of different types; an ARRAY can't.

> yes, I know - but it should be a problem only in a few cases - when it is
> not possible to cast a row field to the array field type.

This idea is basically the same as "data types don't matter", which is not
SQL-ish and certainly not Postgres-ish.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
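[Editor's note] Tom's objection can be made concrete: a row with heterogeneous field types has no lossless array equivalent. An illustrative example (type name made up):

```sql
CREATE TYPE pair AS (a int, b text);

-- Under the proposal this would have to fail or silently coerce,
-- since 'abc' has no integer representation:
--   SELECT '(1,abc)'::pair::int[];

-- Today the field types are made explicit by hand instead:
SELECT ARRAY[(r).a, length((r).b)]
FROM (SELECT '(1,abc)'::pair AS r) s;
```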
Re: [HACKERS] index-only scans
On Oct 7, 2011, at 8:47 PM, Joshua D. Drake wrote: > > On 10/07/2011 11:40 AM, Tom Lane wrote: >> Robert Haas writes: >>> Please find attached a patch implementing a basic version of >>> index-only scans. >> >> I'm making some progress with this, but I notice what seems like a >> missing feature: there needs to be a way to turn it off. Otherwise >> performance comparisons will be difficult to impossible. >> >> The most obvious solution is a planner control GUC, perhaps >> "enable_indexonlyscan". Anyone object, or want to bikeshed the name? > > enable_onlyindexscan > > I'm kidding. > > +1 on Tom's proposed name. +1 ... definitely an important thing to do. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
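[Editor's note] The GUC being named here was eventually added under Tom's proposed name, enable_indexonlyscan, in PostgreSQL 9.2. A sketch of the comparison workflow it enables (table name made up):

```sql
CREATE TABLE t (id int PRIMARY KEY, payload text);

EXPLAIN SELECT count(*) FROM t;    -- planner may choose an Index Only Scan

SET enable_indexonlyscan = off;    -- disable it for a performance comparison
EXPLAIN SELECT count(*) FROM t;    -- planner now picks a different scan type
```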
Re: [HACKERS] [v9.2] DROP statement reworks
On Mon, Oct 10, 2011 at 1:38 PM, Kohei KaiGai wrote:
> I'm sorry again. I thought it was obvious from the filenames.

I guess I got confused because you re-posted part 2 without the other parts, and I got mixed up and thought you were reposting part one.

I've committed a stripped-down version of the part one patch, which had several mistakes even in just the part I committed - e.g., you forgot TABLESPACEOID. I also did some renaming for clarity.

I'm going to throw this back to you for rebasing at this point, which I realize is going to be somewhat of an unenjoyable task given the way I cut up your changes to objectaddress.c, but I wasn't very confident that all of the entries were correct (the one for attributes seemed clearly wrong to me, for example), and I didn't want to commit a bunch of stuff that wasn't going to be exercised. I suggest that you merge the remainder of the part-one changes into part-two.

On the flip side, I think you should take the stuff that deals with dropping relations OUT of part two. I don't see what good it does us to try to centralize the drop logic if we still have to have special cases for relations, so let's just leave that separate for now until we figure out a better approach, or at least split it off as a separate patch so that it doesn't hold up all the other changes.

I think get_object_namespace() needs substantial revision. Instead of passing the object type and the object address, why not just pass the object address? You should be able to use the classId in the address to figure out everything you need to know. Since this function is private to objectaddress.c, there's no reason for it to use those accessor functions - it can just iterate through the array just as object_exists() does. That way you also avoid iterating through the array multiple times.
I also think that we probably ought to revise AlterObjectNamespace() to make use of this new machinery, instead of making the caller pass in all the same information that objectaddress.c is now learning how to provide. That would possibly open the way to a bunch more consolidation of the SET SCHEMA code; in fact, we might want to clean that up first, before dealing with the DROP stuff. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
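[Editor's note] The revision Robert suggests - deriving everything from the address's classId by scanning a static property table, the way object_exists() does - can be sketched in miniature. The struct layout, catalog attribute numbers, and function name below are illustrative, not the actual objectaddress.c contents; only the catalog OIDs (pg_proc = 1255, pg_class = 1259, pg_type = 1247) are real:

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Illustrative stand-in for a per-catalog property table. */
typedef struct
{
    Oid class_oid;          /* system catalog the object lives in */
    int namespace_attnum;   /* column holding the namespace OID, or -1 */
} ObjectPropertyType;

static const ObjectPropertyType ObjectProperty[] =
{
    {1255, 2},              /* pg_proc: pronamespace (attnum illustrative) */
    {1259, 2},              /* pg_class: relnamespace */
    {1247, 2},              /* pg_type: typnamespace */
};

/*
 * One linear scan keyed on the address's classId, instead of passing the
 * object type alongside the address and calling accessors repeatedly.
 */
static int
get_namespace_attnum(Oid class_oid)
{
    size_t i;

    for (i = 0; i < sizeof(ObjectProperty) / sizeof(ObjectProperty[0]); i++)
    {
        if (ObjectProperty[i].class_oid == class_oid)
            return ObjectProperty[i].namespace_attnum;
    }
    return -1;              /* catalog not in the table */
}
```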
Re: [HACKERS] Proposal: casts row to array and array to row
2011/10/11 Merlin Moncure:
> On Tue, Oct 11, 2011 at 8:18 AM, Pavel Stehule wrote:
>> 2011/10/11 Merlin Moncure:
>>> On Tue, Oct 11, 2011 at 3:40 AM, Pavel Stehule wrote:
>>>> Hello
>>>>
>>>> The current limits on dynamic work with row types in PL/pgSQL could be
>>>> eased with possible casts between rows and arrays. We now have a lot
>>>> of tools for arrays, and these tools should be usable for rows too.
>>>>
>>>> postgres=# \d mypoint
>>>> Composite type "public.mypoint"
>>>>  Column │  Type   │ Modifiers
>>>> ────────┼─────────┼───────────
>>>>  a      │ integer │
>>>>  b      │ integer │
>>>>
>>>> postgres=# select cast(mypoint '(10,20)' as int[]);
>>>>   array
>>>> ─────────
>>>>  {10,20}
>>>> (1 row)
>>>>
>>>> postgres=# select cast(ARRAY[10,20] AS mypoint);
>>>>  mypoint
>>>> ─────────
>>>>  (10,20)
>>>> (1 row)
>>>>
>>>> What do you think about this idea?
>>>
>>> Not sure what it buys you over the syntax we already have:
>>>
>>> select row(foo[1], bar[2]);
>>> select array[(bar).a, (bar).b];
>>
>> You can do it manually for known combinations of row type and array
>> type. But the proposed casts do it generally - which is useful mainly
>> for PL/pgSQL functions or some SQL functions.
>>
>>> Also, in my coding of composite types, homogeneously typed rows don't
>>> really come up that often...
>>
>> You can use the text type everywhere.
>>
>> When I wrote
>> http://stackoverflow.com/questions/7711432/how-to-set-value-of-composite-variable-field-using-dynamic-sql/7722575#7722575
>> I had to do a lot of string operations. The proposed casts make this
>> significantly simpler - and they are general enough for general usage.
>
> How does your approach compare to hstore? hstore to me is just an
> enhanced generic container type which supports the operations you are
> trying to do. It can be trivially (as of 9.0) moved in and out of both
> arrays and record types:

hstore is OK for replacing a single value, but a cast to arrays is more
general - you can do tricks like table transposition, and you can use all
the tricks that we have for arrays.
> postgres=# create type foo_t as (a int, b text, c float);
> CREATE TYPE
>
> postgres=# select row(1, 'abc', 1.0)::foo_t #= '"b"=>"def"';
>  ?column?
> -----------
>  (1,def,1)
> (1 row)

In some future version I would like to have general indexable types - and
then we could have a hash (hstore) in code, but casts to arrays or to
hashes can be useful - in higher-level languages like PL/pgSQL or SQL.

Pavel

> merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
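[Editor's note] Merlin's "trivially moved in and out" claim rests on functions that really are in the hstore contrib module as of 9.0: hstore(record), the %% operator (hstore to a flat key/value text array), and populate_record(). An illustrative session, not taken from the thread:

```sql
CREATE EXTENSION hstore;   -- 9.1+ syntax; 9.0 used the contrib SQL scripts

CREATE TYPE foo_t AS (a int, b text, c float);

SELECT hstore(ROW(1, 'abc', 1.0)::foo_t);                  -- record -> hstore

SELECT %% hstore(ROW(1, 'abc', 1.0)::foo_t);               -- hstore -> text[]

SELECT populate_record(NULL::foo_t,
                       '"a"=>"2","b"=>"def","c"=>"3.5"');  -- hstore -> record
```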