Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Oct23, 2011, at 22:48 , Daniel Farina wrote: > It doesn't seem meaningful for StartupCLOG (or, indeed, any of the > hot-standby path functionality) to be called before that code is > executed, but it is anyway right now. I think the idea is to check that the CLOG part which recovery *won't* overwrite is consistent (or rather, given the simplicity of the check, at least accessible) Heikki said the following somewhere else in this thread when I suggested something similar to your proposal: >> There are pretty clear rules on what state clog can be in. When you launch >> postmaster in a standby: >> >> * Any clog preceding the nextXid from the checkpoint record we start >> recovery from, must either be valid, or the clog file must be missing >> altogether (which can happen when it was vacuumed away while the backup in >> progress - if the clog is still needed at the end of backup it must not be >> missing, of course). >> * Any clog following nextXid can be garbled or missing. >> >> Recovery will overwrite any clog after nextXid from the WAL, but not the >> clog before it. I think Simon's theory that we're starting recovery from the wrong place, i.e. should start with an earlier WAL location, is probably correct. The question is, why? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Oct24, 2011, at 01:27 , Simon Riggs wrote: > FATAL: could not access status of transaction 21110784 > which, in pg_subtrans, is the first xid on a new subtrans page. So we > have missed zeroing a page. > > pg_control shows ... Latest checkpoint's oldestActiveXID: 2111 > which shows quite clearly that the pg_control file is later than it should be. But shouldn't pg_control be largely irrelevant in a hot backup scenario? Most (all?) of the information contained therein should be overwritten with the contents of the checkpoint referenced by the backup label, shouldn't it? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] autovacuum and orphaned large objects
Hi, The main point of autovacuum is maintenance tasks. Currently, it executes VACUUM and ANALYZE commands. I want to propose that we incorporate vacuumlo functionality into it. While dealing with large objects (LO), we have lo contrib module that helps with LO maintenance but has some limitations (does not handle DROP TABLE and TRUNCATE cases) and vacuumlo that does an excellent job but have to be executed outside DBMS. The proposal is to clean up LO when autovacuum triggers VACUUM; cleanup LO routine will starts after(?) VACUUM command. In a near future I want to propose that orphaned LO be cleaned up by VACUUM but that a history for another thread... Comments? -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sun, Oct 23, 2011 at 9:48 PM, Daniel Farina wrote: > Having digged at this a little -- but not too much -- the problem > seems to be that postgres is reading the commit logs way, way too > early, that is to say, before it has played enough WAL to be > 'consistent' (the WAL between pg_start and pg_stop backup). I have > not been able to reproduce this problem (I think) after the message > from postgres suggesting it has reached a consistent state; at that > time I am able to go into hot-standby mode. The WAL appears too early because the other control info is later than it should be. So this is approx backwards and nothing related to consistent state, but thanks for drawing my attention to this. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sun, Oct 16, 2011 at 2:33 AM, Chris Redekop wrote: > pg_subtrans: http://pastebin.com/qAXEHAQt I confirm this as a HS issue and will investigate from here. FATAL: could not access status of transaction 21110784 which, in pg_subtrans, is the first xid on a new subtrans page. So we have missed zeroing a page. pg_control shows ... Latest checkpoint's oldestActiveXID: 2111 which shows quite clearly that the pg_control file is later than it should be. Chris, can you rearrange the backup so you copy the pg_control file as the first act after the pg_start_backup? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EXECUTE tab completion
Andreas Karlsson writes: > Thanks for cleaning up the code to some sanity, I should have done so > myself when I noticed the problem. > A new version is attached. Committed with minor adjustments --- I didn't see any need to make this wait for the next commitfest. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] So, is COUNT(*) fast now?
On Fri, Oct 21, 2011 at 12:52 PM, Robert Haas wrote: > > Also, this line is kind of expensive: > > if (!visibilitymap_test(scandesc->heapRelation, > ItemPointerGetBlockNumber(tid), > &node->ioss_VMBuffer)) > > Around 2%. But I don't see any way to avoid that, or even make it cheaper. Could we cache by ItemPointerGetBlockNumber(tid) the results of those tests, for groups of tids on the same index page? How useful this would be would depend on how well-clustered the table and index are. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Index only scans and visibilitymap.c
With index only scans, the comments in src/backend/access/heap/visibilitymap.c are probably out of date, starting with: "Currently, the visibility map is only used as a hint" Also, is there a discussion of how and why index-only scans is safe? i.e. what lock, if any, has to be held while nodeIndexonlyscan.c calls visibilitymap_test? Do index-only scans need a README file? Thanks, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sun, Oct 23, 2011 at 9:48 PM, Daniel Farina wrote: > On Mon, Oct 17, 2011 at 11:30 PM, Chris Redekop wrote: >> Well, on the other hand maybe there is something wrong with the data. >> Here's the test/steps I just did - >> 1. I do the pg_basebackup when the master is under load, hot slave now will >> not start up but warm slave will. >> 2. I start a warm slave and let it catch up to current >> 3. On the slave I change 'hot_standby=on' and do a 'service postgresql >> restart' >> 4. The postgres fails to restart with the same error. >> 5. I turn hot_standby back off and postgres starts back up fine as a warm >> slave >> 6. I then turn off the load, the slave is all caught up, master and slave >> are both sitting idle >> 7. I, again, change 'hot_standby=on' and do a service restart >> 8. Again it fails, with the same error, even though there is no longer any >> load. >> 9. I repeat this warmstart/hotstart cycle a couple more times until to my >> surprise, instead of failing, it successfully starts up as a hot standby >> (this is after maybe 5 minutes or so of sitting idle) >> So...given that it continued to fail even after the load had been turned of, >> that makes me believe that the data which was copied over was invalid in >> some way. And when a checkpoint/logrotation/somethingelse occurred when not >> under load it cleared itself upI'm shooting in the dark here >> Anyone have any suggestions/ideas/things to try? > > Having digged at this a little -- but not too much -- the problem > seems to be that postgres is reading the commit logs way, way too > early, that is to say, before it has played enough WAL to be > 'consistent' (the WAL between pg_start and pg_stop backup). I have > not been able to reproduce this problem (I think) after the message > from postgres suggesting it has reached a consistent state; at that > time I am able to go into hot-standby mode. > > The message is like: "consistent recovery state reached at %X/%X". > (this is the errmsg) > > It doesn't seem meaningful for StartupCLOG (or, indeed, any of the > hot-standby path functionality) to be called before that code is > executed, but it is anyway right now. I'm not sure if this oversight > is simply an oversight, or indicative of a misplaced assumption > somewhere. Basically, my thoughts for a fix are to suppress > hot_standby = on (in spirit) before the consistent recovery state is > reached. Not sure about that, but I'll look at where this comes from. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] termination of backend waiting for sync rep generates a junk log message
Robert Haas writes: > Well, there is a general problem that anything which throws an ERROR > too late in the commit path is Evil; and sync rep makes that worse to > the extent that it adds more stuff late in the commit path, but it > didn't invent the problem. BTW, it strikes me that if we want to do something about that, it ought to be possible; but it has to be built into error handling, not a localized hack for sync rep. Consider a design along these lines: we invent a global flag that gets set at some appropriate point in RecordTransactionCommit (probably right where we exit the commit critical section) and is not cleared until we send a suitable message to the client --- I think either command-complete or an error message would qualify, but that would have to be analyzed more carefully than I've done so far. If elog.c is told to send an error message while this flag is set, then it does something special to inform the client that this was a post-commit error and the xact is in fact committed. My inclination for the "something special" would be to add a new error message field, but that could be difficult for clients to examine depending on what sort of driver infrastructure they're dealing with. You could also imagine emitting a separate NOTICE or WARNING message, which is analogous to the current hack in SyncRepWaitForLSN, but seems pretty ugly because it requires clients to re-associate that event with the later error message. (But it might be worth doing anyway for human users, even if we provide a different flag mechanism that is intended for program consumption.) Or maybe we could override the SQLSTATE with some special value. Or something else. Given infrastructure like this, it would be reasonable for SyncRepWaitForLSN to just throw an ERROR if it gets an interrupt, instead of trying to kluge its own solution. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] So, is COUNT(*) fast now?
Jeff Janes writes: > count(*) and sum(1) do different things internally, and in my hands > sum(1) is ~10% slower. > I don't know how to dump the output of ExecBuildProjectionInfo into a > human readable form, so I don't know the basis of the difference. But > I wonder if using count(*) would lower the weight of the ExecProject > function. Probably. count() doesn't actually have any arguments, so there's nothing for ExecProject to do. sum(1) invokes the generic case there (ExecTargetList). I suppose we could add another special-case path for constant tlist elements, but I suspect that would mostly be optimizing for benchmarks rather than helping real-world cases. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Mon, Oct 17, 2011 at 11:30 PM, Chris Redekop wrote: > Well, on the other hand maybe there is something wrong with the data. > Here's the test/steps I just did - > 1. I do the pg_basebackup when the master is under load, hot slave now will > not start up but warm slave will. > 2. I start a warm slave and let it catch up to current > 3. On the slave I change 'hot_standby=on' and do a 'service postgresql > restart' > 4. The postgres fails to restart with the same error. > 5. I turn hot_standby back off and postgres starts back up fine as a warm > slave > 6. I then turn off the load, the slave is all caught up, master and slave > are both sitting idle > 7. I, again, change 'hot_standby=on' and do a service restart > 8. Again it fails, with the same error, even though there is no longer any > load. > 9. I repeat this warmstart/hotstart cycle a couple more times until to my > surprise, instead of failing, it successfully starts up as a hot standby > (this is after maybe 5 minutes or so of sitting idle) > So...given that it continued to fail even after the load had been turned of, > that makes me believe that the data which was copied over was invalid in > some way. And when a checkpoint/logrotation/somethingelse occurred when not > under load it cleared itself upI'm shooting in the dark here > Anyone have any suggestions/ideas/things to try? Having digged at this a little -- but not too much -- the problem seems to be that postgres is reading the commit logs way, way too early, that is to say, before it has played enough WAL to be 'consistent' (the WAL between pg_start and pg_stop backup). I have not been able to reproduce this problem (I think) after the message from postgres suggesting it has reached a consistent state; at that time I am able to go into hot-standby mode. The message is like: "consistent recovery state reached at %X/%X". (this is the errmsg) It doesn't seem meaningful for StartupCLOG (or, indeed, any of the hot-standby path functionality) to be called before that code is executed, but it is anyway right now. I'm not sure if this oversight is simply an oversight, or indicative of a misplaced assumption somewhere. Basically, my thoughts for a fix are to suppress hot_standby = on (in spirit) before the consistent recovery state is reached. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] So, is COUNT(*) fast now?
On Fri, Oct 21, 2011 at 12:07 PM, Robert Haas wrote: > On Fri, Oct 21, 2011 at 2:33 PM, Tom Lane wrote: >>> I think HeapTupleSatisfiesMVCC is probably being skipped anyway in >>> this case, since all the heap pages should be PD_ALL_VISIBLE. >> >> Proves my point ;-) ... you're comparing a code path that's been beat on >> for *years* with one that just got written. > > I know. I wrote a chunk of it. :-) My point is just that it'd be > nice to make it better. > > Anyhow, here's the scoop. On my desktop machine running F14, running > SELECT sum(1) FROM pgbench_accounts in a tight loop, 60 s worth of > oprofile data: > > 176830 13.0801 postgres postgres > ExecProject Hi Robert, count(*) and sum(1) do different things internally, and in my hands sum(1) is ~10% slower. I don't know how to dump the output of ExecBuildProjectionInfo into a human readable form, so I don't know the basis of the difference. But I wonder if using count(*) would lower the weight of the ExecProject function. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] termination of backend waiting for sync rep generates a junk log message
Robert Haas writes: > On Tue, Oct 18, 2011 at 11:27 PM, Tom Lane wrote: >> One thing worth asking is why we're willing to violate half a dozen >> different coding rules if we see ProcDiePending, yet we're perfectly >> happy to rely on the client understanding a WARNING for the >> QueryCancelPending case. Another is whether this whole function isn't >> complete BS in the first place, since it appears to be coded on the >> obviously-false assumption that nothing it calls can throw elog(ERROR) >> --- and of course, if any of those functions do throw ERROR, all the >> argumentation here goes out the window. > Well, there is a general problem that anything which throws an ERROR > too late in the commit path is Evil; and sync rep makes that worse to > the extent that it adds more stuff late in the commit path, but it > didn't invent the problem. What it did do is add stuff late in the > commit path that can block for a potentially unbounded period of time, > and I don't see that there are any solutions to that problem that > aren't somewhat grotty. After further reflection, you're right that all sync rep is really doing is extending the time duration of the interval wherein clients will have a hard time telling whether the commit occurred or not. It's always been the case that if a cancel/die interrupt occurs during CommitTransaction, that will get serviced at the RESUME_INTERRUPTS call at the end, and the client will see an apparent failure even though the transaction was committed. Even without that, an interrupt occurring just after this code sequence, but before we reach the point of sending a command-complete response message, is going to result in client confusion, and there's very little we can do about that. I think what we should do in SyncRepWaitForLSN is just send a warning and abandon waiting. Trying to fool with the interrupt response behavior beyond that is simply broken, and it doesn't help any that we chose to break it in two different, but equally indefensible, ways for cancel versus die interrupts. It would help BTW for the warning to have its own SQLSTATE, if we're imagining that "some clients may be able to interpret" it. Also, this code is supposing that it must be called within a HOLD_INTERRUPTS context, but it doesn't look to me like that is being done for the various calls from twophase.c. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Visual Studio 2010/Windows SDK 7.1 support
thready wrote: [...] I don't know how to apply this patch. What's the exe that runs this patch? Will it ask me where the source folder root is when I run it? As the archives seem to be unreachable at the moment I'm copying text from the following message which I would otherwise ask you to read: http://archives.postgresql.org/message-id/4e2813ec.2050...@gmx.de In short (for the records): download and install msysgit from http://code.google.com/p/msysgit/downloads/list Open git bash and cd into an empty directory like: cd c: mkdir pgdev cd pgdev Clone the postgresql repository: git clone git://git.postgresql.org/git/postgresql.git cd postgresql Copy the patch (the part with the strange perl-like text of the message) from http://archives.postgresql.org/message-id/4e14fd1a.8080...@gmx.de and save it to a file (like VS2010v9.patch) which you put into the directory which contains your newly created repository (c:\pgdev). Apply the Patch: patch -Ec -p 1 -i ../VS2010v9.patch (which doesn't seem to apply cleanly anymore which will break things for you - but I currently have no time to fix it) Put bison and flex into your path like: echo "\$ENV{PATH}=\$ENV{PATH} . ';C:\Program Files (x86)\Git\bin';" > src/tools/msvc/buildenv.pl (you might have to adopt the path) Open a Visual Studio 2010 command prompt and cd into the src/tools/msvc directory of your repository like: cd C:\pgdev\postgresql\src\tools\msvc Start the build: build.bat Please use the commitfest app (https://commitfest.postgresql.org/action/patch_view?id=523) to find out about the current state of the patch and to get the latest version. If you have problems applying the patch when copied from the archives (v10 - http://archives.postgresql.org/message-id/4e837b20.4020...@gmx.de), please drop me a line and I'll send you the latest version off list. Regards, Brar -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Deferrable unique constraints vs join removal -- bug?
Marti Raudsepp writes: > On Sun, Oct 23, 2011 at 06:44, Tom Lane wrote: >> Yeah, this seems like the right fix. > Oh, that sounds pretty obvious now that you mention it. :) > I will try to come up with a new patch in a few days (haven't had too > much time lately). Oh, I did it already. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Deferrable unique constraints vs join removal -- bug?
On Sun, Oct 23, 2011 at 06:44, Tom Lane wrote: > Robert Haas writes: >> I think maybe what we should do is add >> an "immediate" field to IndexOptInfo, mirroring the existing unique >> flag, and have get_relation_info() populate it from indimmediate, and >> then make relation_has_unique_index() disqualify any non-immediate >> index. > > Yeah, this seems like the right fix. Oh, that sounds pretty obvious now that you mention it. :) I will try to come up with a new patch in a few days (haven't had too much time lately). Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Visual Studio 2010/Windows SDK 7.1 support
Hi Brar, Thanks for sharing your work compiling Postgres with VS2010. I see that there's a patch file you created to fix the files that need fixing to do so, but I don't know how to apply this patch. What's the exe that runs this patch? Will it ask me where the source folder root is when I run it? Thanks for your help! Mike -- View this message in context: http://postgresql.1045698.n5.nabble.com/Visual-Studio-2010-Windows-SDK-7-1-support-tp3325421p4929680.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronized snapshots
On 23 October 2011 03:15, Tom Lane wrote: > Thom Brown writes: >> Can I ask why it doesn't return the same snapshot ID each time? >> Surely it can't change since you can only export the snapshot of a >> serializable or repeatable read transaction? > > No, that's incorrect. You can export from a READ COMMITTED transaction; > indeed, you'd more or less have to, if you want the control transaction > to be able to see what the slaves do. My bad. I didn't read the documentation carefully enough. I can make sense of it now. Thanks -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers