Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]
On 01/21/2013 02:17 AM, Magnus Hagander wrote:
> On Jan 21, 2013 3:06 AM, Craig Ringer cr...@2ndquadrant.com wrote:
>> On 01/21/2013 10:03 AM, Craig Ringer wrote:
>>> On 01/19/2013 04:08 AM, Boszormenyi Zoltan wrote:
>>>> However, I am not sure whether Cygwin provides the mkstemp() call or not.
>>> Searching... Found bugzilla reports against mkstemp on Cygwin.
>>
>> Is Cygwin a platform that should be targeted for the server backend these days? I can understand making sure that libpq works on Cygwin, but is there any reason at all to run a Pg server backend on Cygwin rather than as native Windows binaries? I'm not suggesting immediately dropping working support, since this is so trivially worked around. I'm just wondering why anybody cares about the platform.
>
> I have suggested similar before, and been voted down :) IIRC Andrew uses it, no?
>
> Either way, the consensus earlier had been that as long as it doesn't require major surgery or block something else, we should try to keep it working. And as you say this sounds like something that can be handled trivially, I think now is not the time.

No, I only use the client. But then I support plenty of things I don't use.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] standby, pg_basebackup and last xlog file
On 21.01.2013 09:14, Миша Тюрин wrote:
> Is there any reason why pg_basebackup has this limitation for an online backup from the standby: "The backup history file is not created in the database cluster backed up."?

WAL archiving isn't active in a standby, so even if it created a backup history file, it wouldn't go anywhere. Also, because of the way the backup history files are named, if you took a backup on the master at the same time (or another backup at the same time on the standby), you would end up with multiple backup history files with the same name.

> So I can't get the last xlog file needed to restore :(

Good point. That was an oversight in the patch that allowed base backups from a standby.

> Also maybe I can use something like ( pg_last_xlog_replay_location() + 1 ) after pg_basebackup has finished.

Yeah, that should work.

> Does anybody know the right way to get the last xlog file when applying pg_basebackup to a standby? How does pg_basebackup get the last xlog file in the standby case with the -x option?

The server returns the begin and end WAL locations to pg_basebackup; pg_basebackup just doesn't normally print them out to the user. In verbose mode it does print them, but only with -x, so that doesn't help you either. If you can compile from source, you could modify pg_basebackup.c to print the locations without -x and --verbose; search for the lines that print out "transaction log start point" / "end position".

As a workaround, without modifying the source, you can do this after pg_basebackup has finished, to get the first WAL file you need:

$ pg_controldata backupdir | grep "REDO WAL file"
Latest checkpoint's REDO WAL file:    00030009

And as you suggested, pg_last_xlog_replay_location() for the last WAL file you need.

- Heikki
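For reference, the mapping Heikki describes — from an LSN such as the one returned by pg_last_xlog_replay_location() to a WAL segment file name — can be sketched as follows. This is an illustrative reimplementation of what the server's pg_xlogfile_name() function computes, assuming the default 16 MB segment size; it is not PostgreSQL's actual code.

```python
def wal_file_name(timeline: int, lsn: str,
                  segment_size: int = 16 * 1024 * 1024) -> str:
    """Map a timeline ID and an LSN string like '0/9000000' to the
    24-hex-digit WAL segment file name that contains that location:
    8 digits of timeline, 8 of the high LSN word, 8 of segment number."""
    hi, lo = (int(part, 16) for part in lsn.split('/'))
    seg = lo // segment_size
    return "%08X%08X%08X" % (timeline, hi, seg)

print(wal_file_name(3, '0/9000000'))  # 000000030000000000000009
```

So with this sketch, replaying up to LSN 0/9000000 on timeline 3 means the last WAL file needed is 000000030000000000000009.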
Re: [HACKERS] [PATCH] Fix NULL checking in check_TSCurrentConfig()
On 1/20/13 10:35 PM, Tom Lane wrote:
> Great catch, will commit. (But first I'm looking through commit 2594cf0e to see if I made the same mistake anywhere else :-(.)
>
> How did you find that, coverity or some such tool?

Thanks for reviewing the patch. It was found using a homemade static undefined behavior checker.

- xi
Re: [HACKERS] Removing PD_ALL_VISIBLE
On Mon, 2013-01-21 at 12:49 +0530, Pavan Deolasee wrote:
> At the minimum your patch will need to have one additional buffer pinned for every K (8192 * 8) heap pages.

I assume it's the same K I referred to when responding to Robert: the max number of heap buffers we read before we unpin and repin the VM buffer. Right now it's unlimited, because there is currently no need to have it at all -- I was just offering the solution in case we did want to do VM page cleanup in the future or something.

> For most cases, the value of K will be large enough to ignore the overhead, but for small values of K, I'm not sure if we can say that with confidence.

It's a constant, and we can easily set it high enough that it wouldn't matter. There's no reason at all to choose a small value for K. Consider that an index root page pin is held for the entire scan, and we don't have a problem with that.

> Of course, for very small tables the real contention might be somewhere else, and so this change may not show up anything on the benchmarks. Doesn't your own test with 33-page tables confirm that there is indeed contention for this case? I agree it's not huge, but I don't know if there are workloads where it will matter.

That appears to happen because of the fraction of pinned pages being too high (aside: it was fairly challenging to construct a test where that happened). I believe it was mostly solved by adding a threshold to use the VM, and I can probably solve it completely by doing some more experiments and finding a better threshold value.

>> I understand that my patch is probably not going in,
>
> Sorry about that. I know how that feels. But we need some more reasons to justify this change, especially because the performance numbers themselves are not showing any signs either way.

That confuses me. The testing was to show it didn't hurt other workloads (like scans or inserts/updates/deletes), so the best possible result is that they don't show signs either way.

But yes, I see that others are not interested in the benefits offered by the patch, which is why I'm giving up on it. If people are concerned about the costs, then I can fix those; but there's nothing I can do to increase the benefits.

Regards,
Jeff Davis
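For context on Pavan's "8192 * 8 heap pages" figure: with 8 kB pages and one all-visible bit per heap page, a single visibility map page tracks 65536 heap pages, i.e. 512 MB of heap. A rough sketch of the mapping (illustrative only, ignoring the VM page header, so the numbers are slightly idealized; it mirrors the block-addressing arithmetic in the server's visibilitymap.c but is not its actual code):

```python
BLCKSZ = 8192                      # assumed default page size
HEAPBLOCKS_PER_BYTE = 8            # one all-visible bit per heap page
HEAPBLOCKS_PER_PAGE = BLCKSZ * HEAPBLOCKS_PER_BYTE  # 65536 heap pages

def vm_location(heap_blk: int):
    """Return (vm_page, byte_in_page, bit_in_byte) tracking a given
    heap block number, header overhead ignored."""
    map_block = heap_blk // HEAPBLOCKS_PER_PAGE
    map_byte = (heap_blk % HEAPBLOCKS_PER_PAGE) // HEAPBLOCKS_PER_BYTE
    map_bit = heap_blk % HEAPBLOCKS_PER_BYTE
    return map_block, map_byte, map_bit
```

One VM page thus covers 65536 * 8 kB = 512 MB of heap, which is why keeping a single VM page pinned costs at most one extra pin per 512 MB of heap scanned.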
Re: [HACKERS] Visual Studio 2012 RC
On 01/01/2013 10:54 AM, Noah Misch wrote:
> On Mon, Oct 15, 2012 at 07:53:51AM -0400, Noah Misch wrote:
>> The only matter still requiring attention is a fix for IsoLocaleName().
>
> Following off-list coordination with Brar, I went about finishing up this patch. The above problem proved deeper than expected. For Windows Vista, Microsoft made RFC 4646 tags the preferred way to denote a locale in Windows; Microsoft calls them "locale names". Starting with Visual Studio 2012, setlocale() accepts locale names in addition to all the things it previously accepted. One can now use things like "initdb --locale=zh-CN" and "CREATE DATABASE foo LC_CTYPE = 'pl'". This meant updating win32_langinfo() and find_matching_ts_config() to handle the new formats. In passing, I fixed an unchecked malloc() in win32_langinfo().
>
> In addition to expanding the set of valid locale inputs, VS2012 changes the (undocumented) content of _locale_t to hold locale names where it previously held locale identifiers. I taught IsoLocaleName() to handle the new material. I also sought to improve the comments on IsoLocaleName(); its significance was not previously clear to me. This thread has some background: http://archives.postgresql.org/message-id/4964b45e.5080...@hagander.net
>
> Though I'm not entirely sanguine about digging into the officially-opaque _locale_t, we have been doing it that way for several years. None of the alternatives are clearly satisfying. In particular, I found no good way to look up the code page corresponding to a locale name on pre-Vista systems. The CRT itself handles this by translating locale names to locale identifiers using a lookup table. The Gnulib localename and setlocale modules are also interesting studies on the topic.
>
> In previous reviews, I missed the need to update pgwin32_putenv(). The addition of VS2010 support had also missed it, so this catches up. That function has other problems, but I will hold them for another patch.
>
> Tester warning: if you currently have some form of VS2010 installed, including the compilers of Windows SDK 7.1, beware of this problem: http://stackoverflow.com/questions/10888391/link-fatal-error-lnk1123-failure-during-conversion-to-coff-file-invalid-or-c

FYI, you can properly fix this without uninstalling anything, giving you a system with a working VS 2012 as well as working SDK 7.1 / VS 2010 SP1 32-bit and 64-bit compilers. Install the tools in the following order:

* VS Express 2010: http://www.microsoft.com/visualstudio/eng/products/visual-studio-2010-express
* Windows SDK 7.1
* VS 2010 SP1: http://www.microsoft.com/en-au/download/details.aspx?id=23691
* VS 2010 SP1 Compiler Update: http://www.microsoft.com/en-au/download/details.aspx?id=4422
* VS Express 2012

Note that SDK 7.1 and VS 2010 will fail to install if you have a newer version of the Visual C++ 2010 runtime installed. Newer programs often install this for you. As a workaround, you must uninstall the newer runtime, install Visual Studio, the SDK, and service pack, then reinstall the newer runtime to get the programs that require it to work again.

I've written more about both of these in the TROUBLESHOOTING section of https://github.com/2ndQuadrant/pg_build_win/blob/master/README.md

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Removing PD_ALL_VISIBLE
On 21.01.2013 11:10, Jeff Davis wrote:
> That confuses me. The testing was to show it didn't hurt other workloads (like scans or inserts/updates/deletes); so the best possible result is that they don't show signs either way.

I went back to look at the initial test results that demonstrated that keeping the pin on the VM buffer mitigated the overhead of pinning the VM page. The obvious next question is, what is the impact when that's inefficient, ie. when you update pages across different 512 MB sections, so that the VM pin has to be dropped and reacquired repeatedly. I tried to construct a worst case scenario for that:

create unlogged table testtable (i int4);
insert into testtable select generate_series(1, 1500);
insert into testtable select generate_series(1, 1500);
create index testtable_index on testtable (i);

When you traverse tuples using that index, the tuples will come alternating from low-numbered pages and high-numbered pages, which defeats keeping the last VM page pinned. To test, I ran this:

set enable_bitmapscan=off;
set enable_seqscan=off;
begin;
delete from testtable where i = 0;
rollback;

I repeated it a few times with and without the patch (rm-pd-all-visible-0118.patch). According to \timing, the delete takes about 12.6 seconds without the patch, and 15.3 seconds with it. This is a worst-case scenario, and the slowdown isn't huge, so maybe it's a worthwhile tradeoff. But it shows that removing PD_ALL_VISIBLE is not completely free.

- Heikki
Re: [HACKERS] parallel pg_dump
On 12/09/2012 04:05 AM, Bruce Momjian wrote:
> FYI, I will be posting pg_upgrade performance numbers using Unix processes. I will try to get the Windows code working but will also need help.

I'm interested ... or at least willing to help ... re the Windows side. Let me know if I can be of any assistance, as I have build environments set up for a variety of Windows compiler variants.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] More subtle issues with cascading replication over timeline switches
On 19.01.2013 14:26, Amit kapila wrote:
> On Friday, January 18, 2013 5:27 PM Heikki Linnakangas wrote:
>> Indeed, looking at the pg_xlog, it's not there (I did a couple of extra timeline switches):
>>
>> ~/pgsql.master$ ls -l data-master/pg_xlog/
>> total 131084
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00010001
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00010002
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00010003
>> -rw------- 1 heikki heikki       41 Jan 18 13:38 0002.history
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00020003
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00020004
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00020005
>> -rw------- 1 heikki heikki       83 Jan 18 13:38 0003.history
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00030005
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00030006
>> drwx------ 2 heikki heikki     4096 Jan 18 13:38 archive_status
>>
>> ~/pgsql.master$ ls -l data-standbyB/pg_xlog/
>> total 81928
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00010001
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00010002
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00020003
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00020004
>> -rw------- 1 heikki heikki       83 Jan 18 13:38 0003.history
>> -rw------- 1 heikki heikki 16777216 Jan 18 13:38 00030005
>> drwx------ 2 heikki heikki     4096 Jan 18 13:38 archive_status
>>
>> This can be thought of as another variant of the same issue that was fixed by commit 60df192aea0e6458f20301546e11f7673c102101. When standby B scans for the latest timeline, it finds it to be 3, and it reads the timeline history file for 3. After that patch, it also saves it in pg_xlog. It doesn't save the timeline history file for timeline 2, because that's included in the history of timeline 3. However, when standby C connects, it will try to fetch all the history files that it doesn't have, including 0002.history, which throws the error.
>
> Is the file 0002.history really required by standby C for any useful purpose?

No, not really.

> Can we change the current design such that when standby C connects, even if some old history file (like 0002.history) is not present, it ignores it and continues?

That would be possible too, with some rejiggering of the code. At the moment, however, the code to find the latest timeline works by checking the existence of timeline history files in order. So it first checks for 0002.history, then 0003.history, then 0004.history and so on, until it gets a file-not-found. That logic doesn't work if there are gaps in the sequence. So I'm inclined to just make sure the history files are always copied. I think it's good to have them around anyway, for debugging purposes.

- Heikki
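The probing loop Heikki describes can be sketched like this (an illustrative model, not the actual server code; the `find_newest_timeline` name is invented for the sketch). It also shows why a gap in the history-file sequence hides later timelines:

```python
import os

def find_newest_timeline(pg_xlog_dir: str, start_tli: int = 1) -> int:
    """Probe NNNNNNNN.history files in ascending order, as the
    timeline-discovery logic does, and stop at the first missing one."""
    tli = start_tli
    while os.path.exists(os.path.join(pg_xlog_dir,
                                      "%08X.history" % (tli + 1))):
        tli += 1
    return tli
```

If 00000002.history is missing but 00000003.history exists, this scan stops at timeline 1 and never discovers timeline 3 — which is why the conclusion above is to always copy the history files rather than tolerate gaps.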
Re: [HACKERS] Visual Studio 2012 RC
On Mon, Jan 21, 2013 at 05:32:37PM +0800, Craig Ringer wrote:
> FYI, you can properly fix this without uninstalling anything, giving you a system with a working VS 2012 as well as working SDK 7.1 / VS 2010 SP1 32-bit and 64-bit compilers. [detailed install instructions snipped]
>
> I've written more about both of these in the TROUBLESHOOTING section of https://github.com/2ndQuadrant/pg_build_win/blob/master/README.md

What fun. Thanks for working that out.
Re: [HACKERS] Visual Studio 2012 RC
On 01/21/2013 07:23 PM, Noah Misch wrote:
> What fun. Thanks for working that out.

It's made even more fun by Microsoft's answer to "how do I silent-install the VS 2012 SP1 compiler update": you don't. Yeah, it's great.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]
On Mon, Jan 21, 2013 at 9:01 AM, Andrew Dunstan and...@dunslane.net wrote:
> [Cygwin discussion snipped]
>
> No, I only use the client. But then I support plenty of things I don't use.

Oh, I somehow thought you were. And yes, we all support things we don't use - but it certainly helps if there is *someone* out there who uses it. Having a buildfarm animal (which we do) only goes so far...

-- Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Visual Studio 2012 RC
On 01/21/2013 04:32 AM, Craig Ringer wrote:
> [Noah's VS2012 patch description and the SDK 7.1 / VS 2010 install instructions snipped]
>
> I've written more about both of these in the TROUBLESHOOTING section of https://github.com/2ndQuadrant/pg_build_win/blob/master/README.md

That's useful information, which should perhaps be put somewhere more obvious for people to find, like the wiki.

I realise you're doing a lot of stuff, but you seem to be ahead of just about everyone (including me and I suspect Magnus) on this, so maybe you could take a peek at this patch?

cheers

andrew
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Fri, Jan 18, 2013 at 7:50 AM, Amit Kapila amit.kap...@huawei.com wrote:
> On Wednesday, January 16, 2013 4:02 PM Heikki Linnakangas wrote:
>> On 07.01.2013 16:23, Boszormenyi Zoltan wrote:
>>> Since my other patch against pg_basebackup is now committed, this patch doesn't apply cleanly; patch rejects 2 hunks. The fixed-up patch is attached.
>>
>> Now that I look at this from a high-level perspective, why are we only worried about timeouts in Copy mode and when connecting? The initial checkpoint could take a long time too, and if the server turns into a black hole while the checkpoint is running, pg_basebackup will still hang. Then again, a short timeout on that phase would be a bad idea, because the checkpoint can indeed take a long time.
>
> True, but IMO, if somebody wants to take a base backup, he should do that when the server is not loaded.

A lot of installations don't have such an option, because there is no time when the server is not loaded.

>> In streaming replication, the keep-alive messages carry additional information, the timestamps and WAL locations, so a keepalive makes sense at that level. But otherwise, aren't we just trying to reimplement TCP keepalives? TCP keepalives are not perfect, but if we want to have an application-level timeout, it should be implemented in the FE/BE protocol. I don't think we need to do anything specific to pg_basebackup. The user can simply specify TCP keepalive settings in the connection string, like with any libpq program.
>
> I think currently the user has no way to specify TCP keepalive settings from pg_basebackup; please let me know if there is any such existing way.

You can set it through environment variables. As was discussed elsewhere, it would be good to have the ability to do it natively in pg_basebackup as well.

> I think specifying TCP settings is very cumbersome for most users; that's the reason most standard interfaces (ODBC/JDBC) have such an application-level timeout mechanism. By implementing it in the FE/BE protocol (do you mean making such non-blocking behavior part of libpq, or something else?), it might be generic and usable by others as well, but it might need a few interface changes.

If it's specifying them that is cumbersome, then that's the part we should fix, rather than modifying the protocol, no?

-- Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
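For context, the "TCP keepalive settings" under discussion map to ordinary socket options, which libpq exposes as the keepalives, keepalives_idle, keepalives_interval, and keepalives_count connection parameters. A minimal sketch of what enabling them means at the OS level (illustrative Python, not libpq code; the TCP_KEEP* constants are platform-specific, hence the hasattr guards, and the default values here are arbitrary):

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for a socket, roughly what libpq
    requests when keepalives=1 appears in the conninfo string."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Probe timing knobs exist only on some platforms (e.g. Linux).
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock
```

With these options set, a connection to a server that has become a black hole is torn down by the kernel after roughly idle + interval * count seconds, which is the behavior the thread is weighing against an application-level timeout.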
Re: [HACKERS] parallel pg_dump
On 01/21/2013 06:02 PM, Craig Ringer wrote:
> On 12/09/2012 04:05 AM, Bruce Momjian wrote:
>> FYI, I will be posting pg_upgrade performance numbers using Unix processes. I will try to get the Windows code working but will also need help.
>
> I'm interested ... or at least willing to help ... re the Windows side. Let me know if I can be of any assistance, as I have build environments set up for a variety of Windows compiler variants.

Andrew's git branch has a squashed copy of HEAD on top of it, so I've tidied it up and pushed it to git://github.com/ringerc/postgres.git in the branch parallel_pg_dump (https://github.com/ringerc/postgres/tree/parallel_pg_dump). It builds and passes vcregress check on VS 2010 / WinSDK 7.1 on Win7. I haven't had a chance to test the actual parallel dump feature yet; pending.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]
On 01/21/2013 07:05 AM, Magnus Hagander wrote:
>> No, I only use the client. But then I support plenty of things I don't use.
>
> Oh, I somehow thought you were. And yes, we all support things we don't use - but it certainly helps if there is *someone* out there who uses it. Having a buildfarm animal (which we do) only goes so far...

It is being used. I get emails from time to time from people asking about it.

cheers

andrew
Re: [HACKERS] Making testing on Windows easier
On Mon, Jan 21, 2013 at 10:01:29AM +0800, Craig Ringer wrote:
> I've found EC2 to be unusably slow for Windows builds, with a medium instance taking an hour and a half to do a simple build and vcregress check. They're also restrictive in disk space terms, so you end up needing to add a second EBS volume.

Yikes. The "build DEBUG" step takes 5-7 minutes for me; I use EBS-optimized m1.xlarge spot instances in US East (lately about US$0.19/hr). I'm fairly sure I once used a c1.medium, though, and it still took 10 minutes. I don't know why your experience has been so different.
Re: [HACKERS] pg_dump transaction's read-only mode
Pavan Deolasee pavan.deola...@gmail.com writes:
> On Sun, Jan 20, 2013 at 4:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> As submitted, this broke pg_dump for dumping from pre-8.0 servers. (7.4 didn't accept commas in SET TRANSACTION syntax, and versions before that didn't have the READ ONLY option at all.)
>
> My bad. I did not realize that pg_dump is still supported for pre-8.0 releases.

It supports servers back to 7.0. At some point we ought to consciously desupport old servers and pull out the corresponding code, but that needs to be an act of intention, not omission ;-)

(It's entirely likely that the 7.0 server I keep around for testing this is the last one in captivity anywhere. But IIRC, we've heard fairly recent reports of people still using 7.2. We'd have to desupport before 7.3 to save any meaningful amount of pg_dump code, so I'm not convinced it's time to pull the trigger on this quite yet.)

regards, tom lane
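The version gating Tom describes could look roughly like this — an illustrative sketch keyed to the versions mentioned above (comma-separated SET TRANSACTION accepted from 8.0, READ ONLY available from 7.4, neither before that). This is not pg_dump's actual code, and the function name is invented for the sketch:

```python
def read_only_txn_sql(server_version: int) -> str:
    """Return statement(s) to start a serializable, read-only
    transaction, given a numeric server version (80000 = 8.0)."""
    if server_version >= 80000:
        # 8.0 and later accept the comma-separated form
        return "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY"
    if server_version >= 70400:
        # 7.4 has READ ONLY, but not the comma syntax
        return ("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; "
                "SET TRANSACTION READ ONLY")
    # pre-7.4 servers have no READ ONLY option at all
    return "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE"
```

The submitted patch effectively used only the first branch, which is why it broke dumps from pre-8.0 servers.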
Re: [HACKERS] Making testing on Windows easier
On 01/21/2013 08:55 PM, Noah Misch wrote:
> Yikes. The "build DEBUG" step takes 5-7 minutes for me; I use EBS-optimized m1.xlarge spot instances in US East (lately about US$0.19/hr). Fairly sure I once used a c1.medium, though, and it still took 10 minutes. I don't know why your experience has been so different.

I was using m1.medium instances, but the same instance type gets Linux builds done in 15-20 minutes. Slow, but not that slow. Performance was consistently miserable across several instances, including one full clean rebuild from scratch. Weird.

I should perhaps give the bigger instances a go. Unfortunately, Jenkins can't auto-start and auto-stop Windows instances yet, and I don't have time to improve the Jenkins EC2 plugin right now, so using any instance bigger than an m1.medium would pretty much require manually stopping and starting it for builds. Yick. Instead I'm using my home media PC, a Pentium G630 (like a cut-down Core i3) with a laptop hard drive. It completes builds in 20 minutes. My more power-hungry laptop does it in 7.

I was never able to determine why the Windows instances were so much slower than the corresponding Linux instances of the same type.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] gistchoose vs. bloat
Jeff Davis pg...@j-davis.com writes:
> On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote:
>> I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible benefit. Even if there are some cases where it miraculously fails to lose, how many users are going to realize that applies to their case and make use of the option?
>
> Sounds good to me. If I remember correctly, there was also an argument that it may be useful for repeatable test results. That's a little questionable for performance (except in those cases where few penalties are identical anyway), but could plausibly be useful for a crash report or something.

Meh. There's already a random decision, in the equivalent place and for a comparable reason, in btree (cf _bt_findinsertloc). Nobody's ever complained about that being indeterminate, so I'm unconvinced that there's a market for it with gist.

regards, tom lane
Re: [HACKERS] Visual Studio 2012 RC
On 01/21/2013 08:44 PM, Andrew Dunstan wrote:
> [full quote of Noah's VS2012 patch description and the install instructions snipped]
>
> That's useful information, which should perhaps be put somewhere more obvious for people to find, like the wiki.
>
> I realise you're doing a lot of stuff, but you seem to be ahead of just about everyone (including me and I suspect Magnus) on this, so maybe you could take a peek at this patch?

It's building on my Jenkins instance at the moment, to make sure it doesn't break VS 2010 / Windows SDK 7.1. I've merged it into HEAD and pushed to https://github.com/ringerc/postgres/tree/vs2012; the only conflict was a trivial one in the docs.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay
On Mon, Jan 21, 2013 at 12:23:21AM +, Peter Geoghegan wrote: On 19 January 2013 20:38, Noah Misch n...@leadboat.com wrote: staticloud.com seems to be gone. Would you repost these? I've pushed these to a git repo, hosted on github. https://github.com/petergeoghegan/commit_delay_benchmarks I'm sorry that I didn't take the time to make the html benchmarks easily viewable within a browser on this occasion. That's plenty convenient; thanks. What filesystem did you use for testing? Would you also provide /proc/cpuinfo or a rough description of the system's CPUs? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making testing on Windows easier
On Mon, Jan 21, 2013 at 1:00 PM, Craig Ringer cr...@2ndquadrant.com wrote: On 01/21/2013 08:55 PM, Noah Misch wrote: On Mon, Jan 21, 2013 at 10:01:29AM +0800, Craig Ringer wrote: I've found EC2 to be unusably slow for Windows builds, with a medium instance taking an hour and a half to do a simple build and vcregress check. They're also restrictive in disk space terms, so you land up needing to add a second EBS volume. Yikes. The build DEBUG step takes 5-7 minutes for me; I use EBS-optimized m1.xlarge spot instances in US East (lately about US$0.19/hr). Fairly sure I once used a c1.medium, though, and it still took 10 minutes. I don't know why your experience has been so different. I was using m1.medium instances, but the same instance type gets Linux builds done in 15-20 mins. Slow, but not that slow. Performance was consistently miserable across several instances, including one full clean rebuild from scratch. Weird. FYI, in my experience VC++ is typically much faster than GCC, on comparable hardware - particularly with C++. I should perhaps give the bigger instances a go. Unfortunately Jenkins can't auto-start and auto-stop Windows instances yet and I don't have time to improve the Jenkins ec2 plugin right now, so using any instance bigger than an m1.medium would pretty much require manually stopping and starting it for builds. Yick. It should be trivial to script I would think - it's a one-liner to create an instance with ec2 tools. Instead I'm using my home media PC, a Pentium G630 (like a cut-down Core 2 i3) with a laptop hard drive. It completes builds in 20 minutes. My more power-hungry laptop does it in 7. I was never able to determine why the Windows instances were so much slower than the corresponding Linux instance of the same type. Full vs. para-virtualisation perhaps? 
-- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On 21.01.2013 15:06, Tom Lane wrote: Jeff Davispg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible benefit. Even if there are some cases where it miraculously fails to lose, how many users are going to realize that applies to their case and make use of the option? Sounds good to me. If I remember correctly, there was also an argument that it may be useful for repeatable test results. That's a little questionable for performance (except in those cases where few penalties are identical anyway), but could plausibly be useful for a crash report or something. Meh. There's already a random decision, in the equivalent place and for a comparable reason, in btree (cf _bt_findinsertloc). Nobody's ever complained about that being indeterminate, so I'm unconvinced that there's a market for it with gist. I wonder if it would work for gist to do something similar to _bt_findinsertloc, and have a bias towards the left page, but sometimes descend to one of the pages to the right. You would get the cache locality of usually descending down the same subtree, but also fill the pages to the right. Jeff / Alexander, want to give that a shot? When building an index from scratch, using the new buffered index build, you could do a lot better than fill each page like with regular inserts and split when one fills up. You could e.g buffer 10 pages worth of tuples, and perform a 10-way split of all the buffered tuples together, distributing them equally to 10 pages (or 10+something, to leave some room for updates). But that's clearly a separate and much larger patch. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making testing on Windows easier
On 01/21/2013 08:11 AM, Dave Page wrote: I was never able to determine why the Windows instances were so much slower than the corresponding Linux instance of the same type. Full vs. para-virtualisation perhaps? No, Windows builds just are slower. For some time the buildfarm has been reporting run times for various members, so there's plenty of data on this. For example, nightjar and currawong are respectively FreeBSD/gcc and WindowsXP/VC2008 members running in VMs on the same host. currawong takes three times as long for a buildfarm run even though it's doing less work. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making testing on Windows easier
On Mon, Jan 21, 2013 at 1:59 PM, Andrew Dunstan and...@dunslane.net wrote: On 01/21/2013 08:11 AM, Dave Page wrote: I was never able to determine why the Windows instances were so much slower than the corresponding Linux instance of the same type. Full vs. para-virtualisation perhaps? No, Windows builds just are slower. For some time the buildfarm has been reporting run times for various members, so there's plenty of data on this. For example, nightjar and currawong are respectively FreeBSD/gcc and WindowsXP/VC2008 members running in VMs on the same host. currawong takes three times as long for a buildfarm run even though it's doing less work. Hmm, OK. I don't build PostgreSQL interactively enough to notice I guess. For C++ it's definitely the other way round - I can build pgAdmin in a fraction of the time in a Windows VM than I can on the host Mac it runs on. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making testing on Windows easier
On Mon, Jan 21, 2013 at 3:03 PM, Dave Page dp...@pgadmin.org wrote: On Mon, Jan 21, 2013 at 1:59 PM, Andrew Dunstan and...@dunslane.net wrote: On 01/21/2013 08:11 AM, Dave Page wrote: I was never able to determine why the Windows instances were so much slower than the corresponding Linux instance of the same type. Full vs. para-virtualisation perhaps? No, Windows builds just are slower. For some time the buildfarm has been reporting run times for various members, so there's plenty of data on this. For example, nightjar and currawong are respectively FreeBSD/gcc and WindowsXP/VC2008 members running in VMs on the same host. currawong takes three times as long for a buildfarm run even though it's doing less work. Hmm, OK. I don't build PostgreSQL interactively enough to notice I guess. For C++ it's definitely the other way round - I can build pgAdmin in a fraction of the time in a Windows VM than I can on the host Mac it runs on. Yes, for C++ it's a difference in at least one order of magnitude. For C it definitely isn't. MSVC tends to be faster than gcc (both on windows), which I think is mostly because it builds multiple files in one run. However, actually starting each build step takes significantly longer. We've also added some things like the DEF file magic that can definitely take quite some time, particularly when building the backend. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] More subtle issues with cascading replication over timeline switches
On 18.01.2013 13:57, Heikki Linnakangas wrote: When a standby starts up, and catches up with the master through the archive, it copies the target timeline's history file from the archive to pg_xlog. That's enough for that standby's purposes, but if there is a cascading standby or pg_receivexlog connected to it, it will request the timeline history files of *all* timelines between the starting point and current target. For example, create a master, and take a base backup from it. Use the base backup to initialize two standby servers. Now perform failover first to the first standby, and once the second standby has caught up, fail over again, to the second standby. (Or as a shortcut, forget about the standbys, and just create a recovery.conf file in the master with restore_command='/bin/false' and restart it. That causes a timeline switch. Repeat twice) Now use the base backup to initialize a new standby server (you can kill and delete the old ones), using the WAL archive. Set up a second, cascading, standby server that connects to the first standby using streaming replication, not WAL archive. 
This cascading standby will fail to cross the timeline switch, because it doesn't find all the history files in the standby: C 2013-01-18 13:38:46.047 EET 7695 FATAL: could not receive timeline history file from the primary server: ERROR: could not open file "pg_xlog/00000002.history": No such file or directory Indeed, looking at the pg_xlog, it's not there (I did a couple of extra timeline switches):

~/pgsql.master$ ls -l data-master/pg_xlog/
total 131084
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000010000000000000001
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000010000000000000002
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000010000000000000003
-rw------- 1 heikki heikki       41 Jan 18 13:38 00000002.history
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000020000000000000003
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000020000000000000004
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000020000000000000005
-rw------- 1 heikki heikki       83 Jan 18 13:38 00000003.history
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000030000000000000005
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000030000000000000006
drwx------ 2 heikki heikki     4096 Jan 18 13:38 archive_status

~/pgsql.master$ ls -l data-standbyB/pg_xlog/
total 81928
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000010000000000000001
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000010000000000000002
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000020000000000000003
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000020000000000000004
-rw------- 1 heikki heikki       83 Jan 18 13:38 00000003.history
-rw------- 1 heikki heikki 16777216 Jan 18 13:38 000000030000000000000005
drwx------ 2 heikki heikki     4096 Jan 18 13:38 archive_status

This can be thought of as another variant of the same issue that was fixed by commit 60df192aea0e6458f20301546e11f7673c102101. When standby B scans for the latest timeline, it finds it to be 3, and it reads the timeline history file for 3. After that patch, it also saves it in pg_xlog. It doesn't save the timeline history file for timeline 2, because that's included in the history of timeline 3.
However, when standby C connects, it will try to fetch all the history files that it doesn't have, including 00000002.history, which throws the error. A related problem is that at the segment containing the timeline switch, the standby has only restored from archive the WAL file of the new timeline, not the old one. For example above, the timeline switch 1 -> 2 happened while inserting to segment 000000010000000000000003, and a copy of that partial segment was created with the new timeline's ID as 000000020000000000000003. The standby only has the segment from the new timeline in pg_xlog, which is enough for that standby's purposes, but will cause an error when the cascading standby tries to stream it: C 2013-01-18 13:46:12.334 EET 8579 FATAL: error reading result of streaming command: ERROR: requested WAL segment 000000010000000000000003 has already been removed A straightforward fix would be for the standby to restore those files that the cascading standby needs from the WAL archive, even if they're not strictly required for that standby itself. But actually, isn't it a bad idea that we store the partial segment, 000000010000000000000003 in this case, in the WAL archive? There's no way to tell that it's partial, and it can clash with a complete segment if more WAL is generated on that timeline. I just argued that pg_receivexlog should not do that, and hence keep the .partial suffix in the same situation, in http://www.postgresql.org/message-id/50f56245.8050...@vmware.com. I came up with the attached. I decided to use different approaches for the timeline history files and the WAL segments. For timeline history files, try to copy any missing timeline history files from
Re: [HACKERS] pg_upgrade and system() return value
On Sun, Jan 20, 2013 at 11:14:47AM -0500, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Can someone comment on the attached patch? pg_upgrade was testing if system() returned a non-zero value, while I am thinking I should be adjusting system()'s return value with WEXITSTATUS(). AFAIK it's not very good style to test the result as an integer, and testing for -1 is not an improvement on that. Actually it's a disimprovement, because the only case where the standard guarantees anything about the integer representation is that zero corresponds to "exited with status 0". See the Single Unix Spec, wherein system's result code is defined in terms of wait's, and the wait man page saith "If and only if the status returned is from a terminated child process that returned 0 from main() or passed 0 as the status argument to _exit() or exit(), the value stored at the location pointed to by stat_loc will be 0. Regardless of its value, this information may be interpreted using the following macros ..." If you want to do something different, then you could test for WIFEXITED && WEXITSTATUS == 0. (Testing the latter alone is flat wrong.) But I'm not particularly convinced that that's an improvement on what's there now. I note that your proposed patch would prevent any possibility of printing debug information about failure cases, since it loses the original result value. In short: it's not broken now, but this patch would break it. I thought checking for non-zero was sufficient too, but my Debian Squeeze system() manual page says: The value returned is -1 on error (e.g. fork(2) failed), and the return status of the command otherwise. I am good with the above sentence, but the next sentences have me confused: This latter return status is in the format specified in wait(2). Thus, the exit code of the command will be WEXITSTATUS(status). In case /bin/sh could not be executed, the exit status will be that of a command that does exit(127).
I assume my pg_upgrade waitpid() code is OK:

ret = waitpid(-1, &work_status, wait_for_child ? 0 : WNOHANG);
/* no children or, for WNOHANG, no dead children */
if (ret <= 0 || !WIFEXITED(work_status))
    return false;
if (WEXITSTATUS(work_status) != 0)
    pg_log(PG_FATAL, "child worker exited abnormally: %s\n", strerror(errno));

Can that be simplified too? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doc patch making firm recommendation for setting the value of commit_delay
On 21 January 2013 13:10, Noah Misch n...@leadboat.com wrote: What filesystem did you use for testing? Would you also provide /proc/cpuinfo or a rough description of the system's CPUs? Unfortunately, I don't have access to that server at the moment. It's under Greg Smith's control. I believe you yourself had an account on this server at one point. I do know that the CPU is an Intel Core i7-870: http://ark.intel.com/products/41315/Intel-Core-i7-870-Processor-8M-Cache-2_93-GHz I am pretty sure that the filesystem that the various block devices were mounted with was ext4 (without LVM), but it might have been XFS. I don't recall. The operating system was Debian Lenny. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump transaction's read-only mode
On Mon, Jan 21, 2013 at 8:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: Pavan Deolasee pavan.deola...@gmail.com writes: On Sun, Jan 20, 2013 at 4:29 AM, Tom Lane t...@sss.pgh.pa.us wrote: As submitted, this broke pg_dump for dumping from pre-8.0 servers. (7.4 didn't accept commas in SET TRANSACTION syntax, and versions before that didn't have the READ ONLY option at all.) My bad. I did not realize that pg_dump is still supported for pre-8.0 releases. It supports servers back to 7.0. At some point we ought to consciously desupport old servers and pull out the corresponding code, but that needs to be an act of intention not omission ;-) (It's entirely likely that the 7.0 server I keep around for testing this is the last one in captivity anywhere. But IIRC, we've heard fairly recent reports of people still using 7.2. We'd have to desupport before 7.3 to save any meaningful amount of pg_dump code, so I'm not convinced it's time to pull the trigger on this quite yet.) The final release of the PostgreSQL 7.3 series was stamped in January, 2008, at which point the most current release was PostgreSQL 8.2, which is now itself EOL. Also, the original release of PostgreSQL 7.3 was in November of 2002, so we've also passed the ten-year mark for that code. Both of those seem like reasonable yardsticks for saying that we don't need to support that any more. On a more practical level, if someone is trying to upgrade from PostgreSQL 7.3 or older to a modern version, it's probably going to be pretty hard for other reasons anyway. We change a few things in a backward-incompatible fashion in every release, and over time those add up. So I think anyone doing an upgrade from 7.3 probably has their work cut out for them. I would likely advise someone in that situation to consider an intermediate upgrade to 8.2 first, and then go on to 9.2 from there. All that having been said, I'm in no rush to drop the older code this release. We have enough other things to do right now.
But, it might be something to consider for 9.4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER command reworks
Tom Lane wrote: Kohei KaiGai kai...@kaigai.gr.jp writes: About ALTER FUNCTION towards aggregate function, why we should raise an error strictly? I agree we probably shouldn't --- traditionally we have allowed that, AFAIR, so changing it would break existing applications for little benefit. Okay, I have pushed the version I posted last week. Similarly, you should not be throwing error when ALTER TABLE is applied to a view, sequence, etc, and the command would otherwise be sensible. As far as ALTER some-obj RENAME goes, this is already working, so I haven't changed anything. Thanks, -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl idempotent option
On Fri, Jan 18, 2013 at 8:48 PM, Phil Sorber p...@omniti.com wrote: +1 Is there more work being done on this, or is the current patch ready to review? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Teaching pg_receivexlog to follow timeline switches
On Fri, Jan 18, 2013 at 7:55 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 18.01.2013 06:38, Phil Sorber wrote: Is it possible to re-use walreceiver code from the backend? I was thinking that it would actually be very useful to have the whole replication functionality modularized and in a standalone binary that could act as a replication proxy and WAL archiver that could run without all the overhead of an entire PG instance. There's much sense in trying to extract that into a stand-alone module. src/bin/pg_basebackup/receivelog.c is about 1000 lines of code at the moment, and it looks quite different from the corresponding code in the backend, because it doesn't have all the backend infrastructure available. - Heikki That's fair. What do you think about the idea of a full WAL proxy? Probably not for 9.3 at this point though. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
On Sunday, January 20, 2013 10:50 AM Amit kapila wrote: On Sunday, January 20, 2013 4:04 AM Dickson S. Guedes wrote: 2013/1/18 Amit kapila amit.kap...@huawei.com: Please find the rebased Patch for Compute MAX LSN. The function 'remove_parent_refernces' couldn't be called 'remove_parent_references' ? Shall fix this. Fixed in attached version. With Regards, Amit Kapila. pg_computemaxlsn_v6.patch Description: pg_computemaxlsn_v6.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Request for vote to move forward with recovery.conf overhaul
On Fri, Dec 21, 2012 at 2:46 PM, Bruce Momjian br...@momjian.us wrote: There has been discussion in the past of removing or significantly changing the way streaming replication/point-in-time-recovery (PITR) is setup in Postgres. Currently the file recovery.conf is used, but that was designed for PITR and does not serve streaming replication well. This all should have been overhauled when streaming replication was added in 2010 in Postgres 9.0. However, time constraints and concern about backward compatibility has hampered this overhaul. At this point, backward compatibility seems to be hampering our ability to move forward. I would like a vote that supports creation of a new method for setting up streaming replication/point-in-time-recovery, where backward compatibility is considered only where it is minimally invasive. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers +1 I seemed to have lost track of the thread that this spawned out of. Is there a coherent plan for a way forward at this point? Last I recall it was Simon's plan vs Bruce's plan. I also seem to recall there was a patch out there too. I think it's even in the commitfest waiting on author. /me searches Here: https://commitfest.postgresql.org/action/patch_view?id=1026 Perhaps we can get the two plans enumerated, vote, and then get a patch out? I'd really like to see this in 9.3, but not sure if that ship has sailed for this feature or not. Thanks. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree emulation (was xlogreader-v4)
On 13-01-21 02:28 AM, Andres Freund wrote: I haven't removed it from the patch afaik, so it would be great to get a profile here! Its only for xlogdump, but that tool helped me immensely and I don't want to maintain it independently...

Here is the output from tprof. Here is the baseline:

Average time 37862
Total Ticks For All Processes (./postgres_perftest) = 19089

Subroutine                  Ticks  %     Source           Address  Bytes
==========================  =====  ====  ===============  =======  =====
.AllocSetAlloc               2270  0.99  aset.c           3bfb84b0
.base_yyparse                1217  0.53  gram.c           fe644    168ec
.text                        1015  0.44  copyfuncs.c      11bfb4   4430
.MemoryContextAllocZero       535  0.23  mcxt.c           3b990    f0
.MemoryContextAlloc           533  0.23  mcxt.c           3b780    e0
.check_stack_depth            392  0.17  postgres.c       568ac100
.core_yylex                   385  0.17  gram.c           fb5b4    1c90
.expression_tree_walker       347  0.15  nodeFuncs.c      50da4750
.AllocSetFree                 308  0.13  aset.c           3c9981b0
.grouping_planner             242  0.11  planner.c        2399f0   1d10
.SearchCatCache               198  0.09  catcache.c       7ec20550
._SPI_execute_plan            195  0.09  spi.c            2f0c187c0
.pfree                        195  0.09  mcxt.c           3b3b0    70
.query_dependencies_walker    185  0.08  setrefs.c        2559441b0
.GetSnapshotData              183  0.08  procarray.c      69efc460
.query_tree_walker            176  0.08  nodeFuncs.c      50ae4210
.strncpy                      168  0.07  strncpy.s        ba080130
.fmgr_info_cxt_security       166  0.07  fmgr.c           3f7b0850
.transformStmt                159  0.07  analyze.c        29091c   12d0
.text                         141  0.06  parse_collate.c  28ddf8   1
.ExecInitExpr                 137  0.06  execQual.c       17f18c   15f0
.fix_expr_common              132  0.06  setrefs.c        2557e4160
.standard_ExecutorStart       127  0.06  execMain.c       1d9a00940
.GetCachedPlan                125  0.05  plancache.c      ce664310
.strcpy                       121  0.05  noname           3bd401a8

With your changes (same test as I described before):

Average: 37938
Total Ticks For All Processes (./postgres_perftest) = 19311

Subroutine                  Ticks  %     Source           Address  Bytes
==========================  =====  ====  ===============  =======  =====
.AllocSetAlloc               2162  2.17  aset.c           3bfb84b0
.base_yyparse                1242  1.25  gram.c           fdc7c    167f0
.text                        1028  1.03  copyfuncs.c      11b4d0   4210
.palloc                       553  0.56  mcxt.c           3b4c8    d0
.MemoryContextAllocZero       509  0.51  mcxt.c           3b9e8    f0
.core_yylex                   413  0.41  gram.c           fac2c    1c60
.check_stack_depth            404  0.41  postgres.c       56730100
.expression_tree_walker       320  0.32  nodeFuncs.c      50d28750
.AllocSetFree                 261  0.26  aset.c           3c9981b0
._SPI_execute_plan            232  0.23  spi.c            2ee6247c0
.GetSnapshotData              221  0.22  procarray.c      69d54460
.grouping_planner             211  0.21  planner.c        237b60   1cf0
.MemoryContextAlloc           190  0.19  mcxt.c           3b738    e0
.query_tree_walker            184  0.18  nodeFuncs.c      50a68210
.SearchCatCache               182  0.18  catcache.c       7ea08550
.transformStmt                181  0.18  analyze.c        28e774   12d0
.query_dependencies_walker    180  0.18  setrefs.c        25397c1b0
.strncpy                      175  0.18  strncpy.s        b9a60130
.MemoryContextCreate          167  0.17  mcxt.c           3bad8160
.pfree                        151  0.15  mcxt.c           3b208    70
.strcpy                       150  0.15  noname           3bd401a8
.fmgr_info_cxt_security       146  0.15  fmgr.c           3f790850
.text                         132  0.13  parse_collate.c  28bc50   1
.ExecInitExpr                 125  0.13  execQual.c       17de28   15e0
.expression_tree_mutator      124  0.12  nodeFuncs.c      53268    1080
.strcmp                       122  0.12  noname           1e44158
.fix_expr_common              117  0.12  setrefs.c        25381c160

Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree emulation (was xlogreader-v4)
On 2013-01-21 11:59:18 -0500, Steve Singer wrote: On 13-01-21 02:28 AM, Andres Freund wrote: I haven't removed it from the patch afaik, so it would be great to get a profile here! Its only for xlogdump, but that tool helped me immensely and I don't want to maintain it independently... Here is the output from tprof Here is the baseline: Any chance to do the test on top of HEAD? The elog stuff has gone in only afterwards and might actually affect this enough to be relevant. Otherwise I have to say I am a bit confused by the mighty difference in costs attributed to AllocSetAlloc given the amount of calls should be exactly the same and the number of trampoline function calls should also stay the same. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] pg_isready (was: [WIP] pg_ping utility)
Changing up the subject line because this is no longer a work in progress nor is it pg_ping anymore. On Sun, Jan 20, 2013 at 10:36 PM, Craig Ringer cr...@2ndquadrant.com wrote: On 01/21/2013 11:26 AM, Robert Haas wrote: On Sun, Jan 20, 2013 at 2:59 PM, Phil Sorber p...@omniti.com wrote: Ok. I can add something to the notes section of the docs. I can also add some code comments for this and for grabbing the default params. Sounds good. I added the code comments, but realized it was already in the docs. I will provide a separate patch for the PQping docs. I also added the handling of extra params as Robert suggested. Oh, I see. Is it really important to have the host and port in the output, or should we trim that down to just e.g. accepting connections? It seems useful to have that if a human is looking at the output, but maybe not if a machine is looking at the output. And if somebody doesn't want it, getting rid of it with sed or awk is nontrivial - imagine: pg_isready -d /tmp:5432 - accepting connections If you are scripting I'd assume you would use the return code value. It might be reasonable to make adding the host and port the verbose method and have just accepting connections as the default output, but my concern there is a misdiagnosis because someone doesn't realize what server they are connecting to. This way they can't miss it and they don't have to add another command line option to get that output. It's a fair concern. Does anyone else have an opinion on this? I have a strong view that the host and port *should* be printed in output. One of the most common issues on Stack Overflow questions from new users is with I can't connect problems that turn out to be them connecting to the wrong host, port, or socket path. 
It is absolutely vital that the unix socket path being used be printed if unix socket connections are tested, as Mac OS X's insane setup means that PostgreSQL tool builds on that platform regularly use the system libpq not the user-installed PostgreSQL's libpq, and it defaults to a different socket path. If users use pg_ping to verify that their PostgreSQL instance is running it'll use the user-installed libpq - fine, that's what we want. But the user will then wonder why the heck Ruby on Rails's `pg' gem reports it can't connect to the unix socket. They can't compare the unix socket paths printed in the error message if pg_ping doesn't show it. I would like to see pg_ping produce a similar error to psql on connection failure. As stated above this is no longer called pg_ping. Probably should have switched the subject line a while ago. I've left it outputting the host and port as default. Also, we went over a lot of iterations on how the errors should look. I am hesitant to change that now. I think the current errors are good because not being able to connect isn't necessarily an unexpected outcome like it is for psql. $ psql -p psql: could not connect to server: No such file or directory Is the server running locally and accepting connections on Unix domain socket /tmp/.s.PGSQL.? $ psql -h localhost -p psql: could not connect to server: Connection refused Is the server running on host localhost (::1) and accepting TCP/IP connections on port ? could not connect to server: Connection refused Is the server running on host localhost (127.0.0.1) and accepting TCP/IP connections on port ? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services pg_isready_bin_v11.diff Description: Binary data pg_isready_docs_v11.diff Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
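Since the consensus in the thread is that scripts should key off pg_isready's return code rather than parse its text output, here is a small sketch of that pattern. The exit-status meanings follow the pg_isready documentation (0 = accepting, 1 = rejecting, 2 = no response, 3 = no attempt made); the helper names below are my own, not part of any PostgreSQL API.

```python
import subprocess

# Exit statuses per the pg_isready documentation (assumption stated above).
STATUS = {
    0: "accepting connections",
    1: "rejecting connections",          # e.g. during startup or shutdown
    2: "no response",
    3: "no attempt made (invalid parameters)",
}

def check(host="localhost", port=5432):
    """Run pg_isready (must be on PATH) and map its exit status to text."""
    # -q suppresses the "host:port - status" line; we use only the exit code.
    rc = subprocess.run(
        ["pg_isready", "-h", host, "-p", str(port), "-q"]
    ).returncode
    return STATUS.get(rc, "unknown status")
```

A shell script would branch on `$?` right after invoking pg_isready to the same effect.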
Re: [HACKERS] Event Triggers: adding information
Robert Haas robertmh...@gmail.com writes: Ok. Will prepare a non-controversial patch for ddl_command_end. Thanks. I will make a forceful effort to review that in a timely fashion when it's posted. Please find it attached to this email.

COLUMNS=72 git diff --stat
 doc/src/sgml/event-trigger.sgml             |  93 ++-
 src/backend/commands/event_trigger.c        | 122 +++--
 src/backend/tcop/utility.c                  | 273 ++-
 src/backend/utils/cache/evtcache.c          |   2 +
 src/include/commands/event_trigger.h        |   1 +
 src/include/utils/evtcache.h                |   3 +-
 src/test/regress/expected/event_trigger.out |   6 +-
 src/test/regress/sql/event_trigger.sql      |   4 +
 8 files changed, 359 insertions(+), 145 deletions(-)

What is the next part you want to see separated away? We have a choice of:

- Context exposing and filtering

My preference for that point is to include it and limit it to event triggers written in C, leaving the door open to add new events in the future, and allowing existing consumers of that framework to see through implementation details should they need it. It's better than just a ProcessUtility_hook because you can see it in \dy, decide to disable it with ALTER EVENT TRIGGER (e.g. in a transaction to avoid it recursively calling itself), and it won't run in single-user mode even when in shared_preload_libraries. Another idea is to protect it with a GUC. Yet another idea is to add a new property in the CREATE EVENT TRIGGER command so that the user can request explicitly to have that and know the feature will break at the next release and all the ones after that:

CREATE EVENT TRIGGER foo ON ddl_command_start
  WHEN show_internals in ('Yes I know what I''m doing', 'Thanks')
  EXECUTE PROCEDURE …();

That particular phrasing and option choice might be revisited before commit, though, I'm not wedded to it.
- Additional Information to PL/pgSQL

We're talking about:
- operation (CREATE, ALTER, DROP)
- object type (TABLE, FUNCTION, TEXT SEARCH DICTIONARY, …)
- object OID (when there's a single object target)
- object schema name (when there's a single object target)
- object name (when there's a single object target)

- Additional Information to PL/*

It's separated away because you did opt out from reviewing that part, so I guess we will continue to make that a separate patch, and it will still need to happen. I've been told that I shouldn't be on the hook to provide that, and I wonder about it, but anyway I did already implement it in a previous version so I can do it again as soon as we know what we expose.

- Command String Normalisation

I removed it already from the main patch because we needed to talk about it more, and it's kind of a big part in itself. Meanwhile it seems to me that an agreement has been reached and that I will be able to follow the consensus and resume working on that part.

- ddl_event_trace

Now is the time to kill it (for 9.3) or allow it in. It means another cache lookup at DDL Event time (so 2 cache lookups per DDL in 9.3); and it allows users who just want to have the information not to have to care about when it is available.

- Non DDL related Events

Extra bonus items, a table_rewrite event. Unlikely for 9.3 at best, unless someone else wants to have at it, maybe?

Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support ddl_command_end.v0.patch.gz Description: Binary data
Re: [HACKERS] count(*) of zero rows returns 1
On Tue, Jan 15, 2013 at 5:47 AM, Gurjeet Singh singh.gurj...@gmail.com wrote:

postgres=# select * from test_0_col_table ;
--
(20 rows)

Interestingly, PostgreSQL 9.2 has regressed here. Not sure if we care, but worth mentioning:

psql (9.2.2)
test=# select count(*) from foo1;
 count
--
  1000
(1 row)
Time: 632.907 ms

test=# select * from foo1;
(No rows)
Time: 1012.567 ms

Regards, Marti
Re: [HACKERS] proposal: fix corner use case of variadic fuctions usage
Hello 2013/1/19 Tom Lane t...@sss.pgh.pa.us: Pavel Stehule pavel.steh...@gmail.com writes: 2013/1/18 Tom Lane t...@sss.pgh.pa.us: The approach is also inherently seriously inefficient. ... What is important - for this use case - is that there is a simple and effective optimization: in this case (a non-variadic-style call of a variadic "any" function) all variadic parameters will share the same type. Inside the variadic function I don't have that information at this moment, but I can just remember the last used type - and reuse it when a parameter's type is the same as the previous parameter's. So there is no performance problem. Well, if we have to hack each variadic function to make it work well in this scenario, that greatly weakens the major argument for the proposed patch, namely that it provides a single-point fix for VARIADIC behavior. BTW, I experimented with lobotomizing array_in's caching of I/O function lookup behavior, by deleting the if-test at arrayfuncs.c line 184. That seemed to make it about 30% slower for a simple test involving converting two-element float8 arrays. So while failing to cache this stuff isn't the end of the world, arguing that it's not worth worrying about is simply wrong. But large arrays have a worse problem: the approach flat out does not work for arrays of more than FUNC_MAX_ARGS elements, because there's no place to put their values in the FunctionCallInfo struct. This last problem is, so far as I can see, unfixable within this approach; surely we are not going to accept a feature that only works for small arrays. So I am going to mark the CF item rejected, not just RWF (Returned With Feedback). I disagree - a non-variadic-style call should not be used to work around the FUNC_MAX_ARGS limit, so a big array should not be passed that way. That's utter nonsense. Why wouldn't people expect concat(), for example, to work for large (or even just moderate-sized) arrays? This problem *is* a show stopper for this patch. I suggested a way you can fix it without having such a limitation.
If you don't want to go that way, well, it's not going to happen. I agree the prospect that each variadic-ANY function would have to deal with this case for itself is a tad annoying. But there are only two of them in the existing system, and it's not like a variadic-ANY function isn't a pretty complicated beast anyway. You now propose something that you rejected four months ago. Well, at the time it wasn't apparent that this approach wouldn't work. It is now, though. So here is a rewritten patch:

* there is no limit on the size of the array that holds the variadic arguments
* iteration over the mentioned array is moved into the variadic function's implementation
* there is a new function, get_fn_expr_arg_variadic, that returns true when the last argument has the VARIADIC label - via the FuncExpr node
* all our variadic-any functions - concat(), concat_ws() and format() - are fixed
* a small optimization - remember the last used typOutput function and reuse it when possible
* it doesn't change any previous test or documented behavior

I hope almost all issues and questions are solved and that there is agreement on the implemented behavior. Regards Pavel regards, tom lane variadic_any_fix_20130121.patch Description: Binary data
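For readers following the VARIADIC discussion, Python's argument unpacking is a close analogy to what a VARIADIC-labeled array call does, and it makes the FUNC_MAX_ARGS hazard of the rejected approach easy to see. This is purely an illustration; nothing below is a PostgreSQL API.

```python
# Analogy only: SQL's concat(VARIADIC arr) behaves like Python's concat(*arr) -
# the array elements become the variadic arguments.
def concat(*args):
    """Concatenate the text form of every argument (like SQL concat())."""
    return "".join(str(a) for a in args)

# Direct variadic call: each element is a separate argument.
direct = concat("a", "b", "c")

# Array call: one array, expanded into the variadic argument list.
arr = ["a", "b", "c"]
via_array = concat(*arr)

# The rejected approach expanded the array into the fixed-size argument list
# (the FunctionCallInfo struct), which is why it could not handle arrays
# longer than FUNC_MAX_ARGS; the committed approach instead hands the whole
# array to the function, which iterates over it itself.
```

The two call forms produce the same result, which is exactly the equivalence the patch has to preserve without the fixed-size expansion.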
Re: [HACKERS] pg_dump transaction's read-only mode
On 01/21/2013 02:00 PM, Tom Lane wrote: Pavan Deolasee pavan.deola...@gmail.com writes: On Sun, Jan 20, 2013 at 4:29 AM, Tom Lane t...@sss.pgh.pa.us wrote: As submitted, this broke pg_dump for dumping from pre-8.0 servers. (7.4 didn't accept commas in SET TRANSACTION syntax, and versions before that didn't have the READ ONLY option at all.) My bad. I did not realize that pg_dump is still supported for pre-8.0 releases. It supports servers back to 7.0. At some point we ought to consciously desupport old servers and pull out the corresponding code, but that needs to be an act of intention not omission ;-) (It's entirely likely that the 7.0 server I keep around for testing this is the last one in captivity anywhere. But IIRC, we've heard fairly recent reports of people still using 7.2. We'd have to desupport before 7.3 to save any meaningful amount of pg_dump code, so I'm not convinced it's time to pull the trigger on this quite yet.) Old versions are still alive - just yesterday we had someone on IRC trying to build 7.1.3 on a modern Ubuntu installation because he has an old app depending on it, and 7.2 especially shows up regularly as well. If the effort to keep pg_dump support for those alive is not too bad, we should try to support them as long as we can. Stefan
[HACKERS] [PATCH] PQping Docs
Attached is a patch that adds a note about the FATAL messages that appear in the logs if you don't pass a valid user or dbname to PQping or PQpingParams. This was requested in the pg_isready thread. libpq_pqping_doc.diff Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
IMHO that's the single most important task of a review. Really? I'd say the most important task for a review is "does the patch do what it says it does?". That is, if the patch is supposed to implement feature X, does it actually? If it's a performance patch, does performance actually improve? If the patch doesn't implement what it's supposed to, who cares what the code looks like? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
Josh Berkus j...@agliodbs.com writes: IMHO that's the single most important task of a review. Really? I'd say the most important task for a review is "does the patch do what it says it does?". That is, if the patch is supposed to implement feature X, does it actually? If it's a performance patch, does performance actually improve? If the patch doesn't implement what it's supposed to, who cares what the code looks like? But even before that, you have to ask whether what it's supposed to do is something we want. regards, tom lane
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
But even before that, you have to ask whether what it's supposed to do is something we want. The reviewer can't usually answer that though. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] count(*) of zero rows returns 1
Marti Raudsepp ma...@juffo.org writes: Interestingly, PostgreSQL 9.2 has regressed here. Not sure if we care, but worth mentioning: Regressed? The output looks the same to me as it has for some time. test=# select * from foo1; (No rows) Time: 1012.567 ms How did you get that? I don't believe it's possible in the default output format. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
On Mon, Jan 21, 2013 at 2:09 PM, Josh Berkus j...@agliodbs.com wrote: But even before that, you have to ask whether what it's supposed to do is something we want. The reviewer can't usually answer that though. They can answer whether THEY want it, though. And Tom, Andrew, and I all just got through arguing that that is one of the most, if not the most, important parts of a review. Seriously. Opinions are good. Lack of opinions leads to ivory tower syndrome, which of course we've got, but I think most of us are sufficiently self-aware to at least know that it isn't a good thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump transaction's read-only mode
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes: On 01/21/2013 02:00 PM, Tom Lane wrote: (It's entirely likely that the 7.0 server I keep around for testing this is the last one in captivity anywhere. But IIRC, we've heard fairly recent reports of people still using 7.2. We'd have to desupport before 7.3 to save any meaningful amount of pg_dump code, so I'm not convinced it's time to pull the trigger on this quite yet.) old versions are still alive - just yesterday we had someone on IRC trying to build 7.1.3 on a modern ubuntu installation because he has an old app depending on it., and especially 7.2 shows up regulary as well. If the effort to keep pg_dump support for those alive is not too bad, we should try to support them as long as we can. Of course, the counter-argument is that people who are still using those versions are probably not going to be interested in upgrading to a modern version anyway. Or if they are, dumping with their existing version of pg_dump is likely to not be much worse than dumping with the target version's pg_dump --- as Robert mentioned upthread, if you're migrating from such an old version you're in for a lot of compatibility issues anyhow, most of which pg_dump can't save you from. Having said that, I'm not in a hurry to pull pg_dump support for old versions. But we can't keep it up forever. In particular, once that old HPUX box dies, I'm not likely to want to try to get 7.0 to compile on a newer box just so I can keep checking pg_dump compatibility. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
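Tom's note about 7.4 and pre-7.4 SET TRANSACTION syntax is an instance of pg_dump's general pattern of gating the SQL it emits on the server version. A hedged sketch of that shape - the function and the exact fallback statements below are illustrative, not pg_dump's actual code:

```python
def begin_read_only_serializable(server_version_num):
    """Return version-appropriate statements to open a read-only dump transaction.

    Gating per the thread: 8.0+ accepts the comma-separated form, 7.4 lacks
    the comma syntax, and servers before 7.4 have no READ ONLY option at all.
    This is an illustrative sketch of the pattern, not pg_dump's real logic.
    """
    if server_version_num >= 80000:
        return ["BEGIN",
                "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY"]
    elif server_version_num >= 70400:
        # no comma syntax: issue the two settings separately
        return ["BEGIN",
                "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE",
                "SET TRANSACTION READ ONLY"]
    else:
        # pre-7.4: READ ONLY does not exist; settle for the isolation level
        return ["BEGIN",
                "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE"]
```

The bug being discussed is precisely what happens when the first branch is emitted unconditionally.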
Re: [HACKERS] count(*) of zero rows returns 1
On Mon, Jan 21, 2013 at 9:33 PM, Tom Lane t...@sss.pgh.pa.us wrote: test=# select * from foo1; (No rows) Time: 1012.567 ms How did you get that? I don't believe it's possible in the default output format. Oh I see, it's because I have \x auto in my .psqlrc. If I set \x auto or \x on then it says (No rows) Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing PD_ALL_VISIBLE
On Mon, Jan 21, 2013 at 1:52 AM, Jeff Davis pg...@j-davis.com wrote: Of course, there is an argument that this patch will simplify the code, but I'm not sure if its enough to justify the additional contention which may or may not show up in the benchmarks we are running, but we know its there. What additional contention? How do you know it's there? We know it's there because every additional page that you access has to be looked up and locked and the memory that contains it brought into the CPU cache. If you look up and lock more pages, you WILL have more contention - both for memory, and for locks - and more runtime cost. Whether or not you can currently detect that contention and/or runtime cost experimentally is another matter. Failure to do so could indicate either that you haven't got the right test case - Heikki seems to have found one that works - or that it's being masked by other things, but might not be always. To raise an example which I believe is similar to this one, consider Kevin Grittner's work on SSI. He set himself a goal that SSI should impose a performance regression of no more than 2% - and he met that goal, at the time the code was committed. But then, my read scalability work during the 9.2 cycle blew the lid off of read performance for certain workloads, and now SSI is FAR slower on those workloads. His code always had a scalability problem, but you couldn't measure it experimentally before I did that work, because there were equally bad scalability problems elsewhere. Now there aren't, and you can, and I have. We might well have refused to commit that patch with the locking scheme it uses today if my scalability work had been done first - or perhaps we would have decided that the case where the difference is large is too narrow to be worth worrying about, but I think it would have at least provoked discussion. My biggest concern about the visibility map is that it will act as a focal point for contention. 
Map half a gigabyte of heap down to 1 page of visibility map and it's easy to think that you could have some situation in which that page gets very, very hot. We know that cache line stealing is a significant cost of ProcArray manipulation, which is why the PGXACT patch makes a significant difference in write throughput. Now your argument seems to be that we can't measure any such effect here, but maybe we're just not measuring the right thing. Heikki's example reminded me that I was concerned about the cost of repeatedly pinning different VM buffers during an index-only scan, if the relation were very large. That seems easy to justify on the grounds that you're saving so much I/O and memory traffic that the pins will seem cheap by comparison ... but they don't, always. IIRC, an index-only scan on a fully-cached relation is not necessarily faster than a regular index scan, even if the heap is entirely all-visible. I realize these examples aren't directly applicable to this case, so I'm not sure if they help to advance the discussion or not. I advance them only as evidence that simple tests don't always manage to capture the real costs and benefits of a feature or optimization, and that predicting the system-wide effects of changes in this area is hard. For a large benefit I think we would perhaps bite the bullet and take our chances, but in my (human and imperfect) judgement the benefit here is not large enough to justify it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
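For reference, Robert's "half a gigabyte of heap down to 1 page of visibility map" follows directly from the on-disk constants. The arithmetic below assumes the default 8 kB block size and the one all-visible bit per heap page used at the time, and ignores the visibility-map page's own header (which makes the real figure slightly smaller).

```python
BLCKSZ = 8192              # default PostgreSQL block size in bytes
BITS_PER_HEAP_PAGE = 1     # one all-visible bit per heap page (9.2-era layout)

# One VM page holds roughly BLCKSZ * 8 bits (ignoring the page header),
# so it covers this many heap pages:
heap_pages_per_vm_page = (BLCKSZ * 8) // BITS_PER_HEAP_PAGE   # 65536

# ... which corresponds to this much heap:
heap_bytes_per_vm_page = heap_pages_per_vm_page * BLCKSZ      # 512 MiB
```

So every all-visible check within a half-gigabyte stretch of heap lands on the same VM page, which is why that page is a plausible contention focal point.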
Re: [HACKERS] Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree emulation (was xlogreader-v4)
On 13-01-21 12:15 PM, Andres Freund wrote: On 2013-01-21 11:59:18 -0500, Steve Singer wrote: On 13-01-21 02:28 AM, Andres Freund wrote: I haven't removed it from the patch afaik, so it would be great to get a profile here! It's only for xlogdump, but that tool helped me immensely and I don't want to maintain it independently... Here is the output from tprof Here is the baseline: Any chance to do the test on top of HEAD? The elog stuff has gone in only afterwards and might actually affect this enough to be relevant.

HEAD (fb197290), Average: 28877

 .AllocSetAlloc             1442  1.90  aset.c           3a9384b0
 .base_yyparse              1220  1.61  gram.c           fdbc0  168ec
 .MemoryContextAlloc         485  0.64  mcxt.c           3a154  e0
 .core_yylex                 407  0.54  gram.c           fab30  1c90
 .AllocSetFree               320  0.42  aset.c           3b3181b0
 .MemoryContextAllocZero     316  0.42  mcxt.c           3a364  f0
 .grouping_planner           271  0.36  planner.c        2b0ce8 1d10
 .strncpy                    256  0.34  strncpy.s        b8ca0130
 .expression_tree_walker     222  0.29  nodeFuncs.c      4f734750
 ._SPI_execute_plan          221  0.29  spi.c            2fb230840
 .SearchCatCache             182  0.24  catcache.c       7d870550
 .GetSnapshotData            161  0.21  procarray.c      68acc460
 .fmgr_info_cxt_security     155  0.20  fmgr.c           3e130850
 .pfree                      152  0.20  mcxt.c           39d84  70
 .expression_tree_mutator    151  0.20  nodeFuncs.c      51c84  1170
 .check_stack_depth          142  0.19  postgres.c       5523c100
 .text                       126  0.17  parse_collate.c  233d90 1
 .transformStmt              125  0.16  analyze.c        289e88 12c0
 .ScanKeywordLookup          123  0.16  kwlookup.c       f7ab4154
 .new_list                   120  0.16  list.c           40f74  b0
 .subquery_planner           115  0.15  planner.c        2b29f8dc0
 .GetCachedPlan              115  0.15  plancache.c      cdab0310
 .ExecInitExpr               114  0.15  execQual.c       17e690 15f0
 .set_plan_refs              113  0.15  setrefs.c        252630cb0
 .standard_ExecutorStart     110  0.14  execMain.c       1d9264940
 .heap_form_tuple            107  0.14  heaptuple.c      177c842f0
 .query_tree_walker          105  0.14  nodeFuncs.c      4f474210

HEAD + patch: Average: 29035
Total Ticks For All Processes (./postgres_perftest) = 15044

 Subroutine                Ticks     %  Source           Address Bytes
 ========================= =====  ====  ===============  =============
 .AllocSetAlloc             1422  1.64  aset.c           3a9384b0
 .base_yyparse              1201  1.38  gram.c           fd1e8  167f0
 .palloc                     470  0.54  mcxt.c           39e04  d0
 .core_yylex                 364  0.42  gram.c           fa198  1c60
 .MemoryContextAllocZero     282  0.33  mcxt.c           3a324  f0
 ._SPI_execute_plan          250  0.29  spi.c            2f8c18840
 .AllocSetFree               244  0.28  aset.c           3b3181b0
 .strncpy                    234  0.27  strncpy.s        b86a0130
 .expression_tree_walker     229  0.26  nodeFuncs.c      4f698750
 .grouping_planner           176  0.20  planner.c        2aea84 1d30
 .SearchCatCache             170  0.20  catcache.c       7d638550
 .GetSnapshotData            168  0.19  procarray.c      68904460
 .assign_collations_walker   155  0.18  parse_collate.c  231f4c7e0
 .subquery_planner           141  0.16  planner.c        2b07b4dc0
 .fmgr_info_cxt_security     141  0.16  fmgr.c           3e110850
 .check_stack_depth          140  0.16  postgres.c       550a0100
 .ExecInitExpr               138  0.16  execQual.c       17d2f8 15e0
 .pfree                      137  0.16  mcxt.c           39b44  70
 .transformStmt              132  0.15  analyze.c        287dec 12c0
 .new_list                   128  0.15  list.c           40f44  90
 .expression_tree_mutator    125  0.14  nodeFuncs.c      51bd8  1080
 .preprocess_expression      118  0.14  planner.c        2adf541a0
 .strcmp                     118  0.14  noname           1e44158
 .standard_ExecutorStart     115  0.13  execMain.c       1d77c0940
 .MemoryContextAlloc         111  0.13  mcxt.c           3a074  e0
 .set_plan_refs              109  0.13  setrefs.c        2506c4ca0

Otherwise I have to say I am a bit confused by the mighty difference in costs attributed to AllocSetAlloc given that the number of calls should be exactly the same and the number of trampoline function calls should also stay the same. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
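When comparing two tprof runs like these, a per-symbol tick delta makes Andres's question easier to answer at a glance. A throwaway sketch - the helper is mine, and the sample tick counts are taken from the profiles above:

```python
def diff_ticks(base, patched):
    """Per-symbol tick delta between two profiles (negative = fewer ticks after patch)."""
    symbols = set(base) | set(patched)
    return {s: patched.get(s, 0) - base.get(s, 0) for s in sorted(symbols)}

head = {".AllocSetAlloc": 1442, ".MemoryContextAlloc": 485, ".core_yylex": 407}
head_plus_patch = {".AllocSetAlloc": 1422, ".MemoryContextAlloc": 111,
                   ".core_yylex": 364, ".palloc": 470}

delta = diff_ticks(head, head_plus_patch)
# Note how .MemoryContextAlloc drops sharply while .palloc appears from
# nowhere: the allocation work moved between symbols, it didn't disappear.
```

A delta view like this makes it obvious when a "regression" in one symbol is really just cost migrating to a differently-named function.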
Re: [HACKERS] Teaching pg_receivexlog to follow timeline switches
This patch was in Needs Review status, but you committed it on 2013-01-17. I have marked it as such in the CF app.
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
Robert Haas robertmh...@gmail.com writes: On Sun, Jan 20, 2013 at 7:07 PM, Jeff Janes jeff.ja...@gmail.com wrote: As a junior reviewer, I'd like to know if my main task should be to decide between 1) writing a review convincing you or Tom that your judgement is hasty, or 2) to convince the author that your judgement is correct. Even if you as a reviewer don't know the answer to those questions, clearly delineating the key issues may enable others to comment without having to read and understand the whole patch, which can expedite things considerably. We wouldn't have to convince anyone about a judgement being hasty if only those making the judgement took some time and read the patch before making those judgements on list (even if doing so does not change the judgement, it changes the hasty part already). Offering advice and making judgments are two very different things. And I'm lucky enough to have received plenty of very good advice from senior developers here who seldom read my patches before offering that advice. I'm yet to see good use of anyone's time once a hasty judgement has been made about a patch. And I think there's no value judgement in calling a judgement hasty here: it's a decision you took and published *about a patch* before reading the patch. That's about it. If you don't have the time to read the patch, offer advice. Anything more, and you can set a watch to count the cumulative time we are all losing on trying to convince each other about something else. Ok, now you will tell me there's no difference and it's just playing with words. It's not. Judgement is about how the patch does things; advice is about how to do things in general.
Ok, here's a hypothetical example, using some words we tend never to use here because we are all very polite and concerned about each other's happiness when talking about carefully crafted code: Advice: You don't do things that way; this way is the only one we will ever accept, because we've been sweating blood over the years to get in a position where it now works. Hint: it's not talking about the patch content, but about what is supposedly a problem with the patch. It's easy to answer I'm so happy I didn't actually do it that way. Judgement: You need to think about the problem you want to solve before sending a patch, because there's a hole in it too big for me to be able to count how many bugs are going to dance in there. It's not a patch, it's a quagmire. BTW, I didn't read it, it stinks too much. Hint: imagine it was your patch and now you have to try and convince any other committer to have a look at it. Now, I've been reviewing patches in all commit fests but one, and began reviewing well before the idea of a commit fest even existed. My idea of a good review is being able to come up with one of only 3 possible summaries: - it looks good, please consider a commit; any remaining work is going to have to be done by a committer anyway - it's not ready, please fix this and that, think about this use case, review docs, whatever - it's ahead of me, this patch now needs a committer (or just someone else) to read it and then weigh in. At least it compiles, follows the usual code and comment style, the documentation is complete and free of errors, and I tried it and it does what it says (perfs, features, etc etc, bonus points for creative usage and trying to make it crash). Of course, reviewers don't get much feedback in general, so I can only hope that guideline is good enough for most committers.
Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Event Triggers: adding information
On Mon, Jan 21, 2013 at 12:27 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Please find it attached to this email. Nice clean patch, thanks! Committed, after tinkering with the CommandCounterIncrement() stuff a bit. I will respond to the rest of your email later. Reading through this patch left me with a slight concern regarding both ddl_command_start and ddl_command_end: what happens if there's more than one event trigger scheduled to fire, and one of them does something like drop (with cascade) the function that a later one uses? Admittedly, that seems like an unlikely case, but we probably want to check that nothing too awful happens (e.g. crashing the server) and maybe add a regression test to cover this scenario. Another thing is that we might want to document that if a command errors out, ddl_command_end will never be reached; and perhaps also that if ddl_command_start errors out, the command itself will never be reached. Perhaps this is so obvious as to not bear mentioning, I don't know, but the thought crossed my mind that someone might fail to realize it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
On Wed, Jan 16, 2013 at 8:18 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: Here's a breakdown based purely on the names from the CF page (i.e. I didn't check archives to see who actually posted reviews, and didn't take into account reviews posted without updating the CF page). FWIW, I reviewed at least one at the point you did this survey, and I did update the CF page, but I didn't put my name into that box marked reviewers because it is an extra step (Edit Patch) that isn't immediately obvious. Isn't there a way to automatically populate that based on people linking in their reviews? I guess it might be difficult when a CF manager comes along to add them and they aren't their own reviews.
Re: [HACKERS] Request for vote to move forward with recovery.conf overhaul
On Tue, Jan 22, 2013 at 1:11 AM, Phil Sorber p...@omniti.com wrote: On Fri, Dec 21, 2012 at 2:46 PM, Bruce Momjian br...@momjian.us wrote: There has been discussion in the past of removing or significantly changing the way streaming replication/point-in-time-recovery (PITR) is set up in Postgres. Currently the file recovery.conf is used, but that was designed for PITR and does not serve streaming replication well. This all should have been overhauled when streaming replication was added in 2010 in Postgres 9.0. However, time constraints and concern about backward compatibility have hampered this overhaul. At this point, backward compatibility seems to be hampering our ability to move forward. I would like a vote that supports creation of a new method for setting up streaming replication/point-in-time-recovery, where backward compatibility is considered only where it is minimally invasive. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + +1 I seem to have lost track of the thread that this spawned out of. Is there a coherent plan for a way forward at this point? Last I recall it was Simon's plan vs Bruce's plan. I also seem to recall there was a patch out there too. I think it's even in the commitfest waiting on author. /me searches Here: https://commitfest.postgresql.org/action/patch_view?id=1026 Perhaps we can get the two plans enumerated, vote, and then get a patch out? I'd really like to see this in 9.3, but not sure if that ship has sailed for this feature or not. Yes, that is one of the most important patches in the list, and I could put some effort into it for either review or coding. It is a 17-month-old patch, so of course it does not apply on master.
However, before taking any action, I would like to know the following:
- Simon, are you planning to update this patch?
- As we are rushing to finish wrapping up 9.3, do you consider it too late to begin that?
Thanks, -- Michael Paquier http://michael.otacoo.com
[HACKERS] Prepared statements fail after schema changes with surprising error
A user reported an interesting issue today. After restoring a dump created with --clean on a running application in his development environment, his application started complaining of missing tables despite those tables very clearly existing. After a little thinking, we determined that this was due to the now-default behaviour of Rails to create prepared statements for most queries. The prepared statements error out because the old relation they point to is missing, but this gives a misleading report thus: PG::Error: ERROR: relation xxx does not exist I'm not sure what the best outcome here would be. A very simple solution might be to expand the error message or add a hint to make it descriptive enough that a user might be able to figure out the cause on their own without happening to have the unusual intersection of Rails and Postgres internals knowledge I (unfortunately) possess. A better solution might be to attempt to re-prepare the statement before throwing an error. -pvh -- Peter van Hardenberg San Francisco, California Everything was beautiful, and nothing hurt. -- Kurt Vonnegut
Re: [HACKERS] Event Triggers: adding information
2013/1/21 Robert Haas robertmh...@gmail.com: Another thing is that we might want to document that if a command errors out, ddl_command_end will never be reached; and perhaps also that if ddl_command_start errors out, the command itself will never be reached. Perhaps this is so obvious as to not bear mentioning, I don't know, but the thought crossed my mind that someone might fail to realize it. I think there should be a mention of that in the docs; someone could expect ddl_command_end to be reached even when ddl_command_start errors out, and try to use it in some way. Regards, -- Dickson S. Guedes mail/xmpp: gue...@guedesoft.net - skype: guediz http://github.com/guedes - http://guedesoft.net http://www.postgresql.org.br
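For readers following along, here is a minimal sketch of the firing behavior under discussion, using the 9.3-era event trigger syntax; the function and trigger names are made up for illustration:

```sql
-- Sketch: a ddl_command_end trigger fires only if the command itself
-- (and any ddl_command_start trigger) completed without error.
CREATE FUNCTION note_ddl_end() RETURNS event_trigger AS $$
BEGIN
    RAISE NOTICE 'ddl_command_end reached for %', tg_tag;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER note_ddl ON ddl_command_end
    EXECUTE PROCEDURE note_ddl_end();

-- If this CREATE TABLE errors out (say, the name already exists),
-- note_ddl_end() is never called for it.
CREATE TABLE t (id int);
```

The point being documented: an error at any earlier stage means the ddl_command_end trigger is simply never reached, so it cannot be used as a cleanup hook for failed commands.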
Re: [HACKERS] Prepared statements fail after schema changes with surprising error
Peter van Hardenberg p...@pvh.ca writes: A user reported an interesting issue today. After restoring a dump created with --clean on a running application in his development environment, his application started complaining of missing tables despite those tables very clearly existing. After a little thinking, we determined that this was due to the now-default behaviour of Rails to create prepared statements for most queries. The prepared statements error out because the old relation they point to is missing, but this gives a misleading report thus: PG::Error: ERROR: relation xxx does not exist I'm not sure what the best outcome here would be. A very simple solution might be to expand the error message or add a hint to make it descriptive enough that a user might be able to figure out the cause on their own without happening to have the unusual intersection of Rails and Postgres internals knowledge I (unfortunately) possess. A better solution might be to attempt to re-prepare the statement before throwing an error. Works for me ...

regression=# create table z1 (f1 int , f2 int);
CREATE TABLE
regression=# prepare sz1 as select * from z1;
PREPARE
regression=# insert into z1 values(1,2);
INSERT 0 1
regression=# execute sz1;
 f1 | f2
----+----
  1 |  2
(1 row)

regression=# drop table z1;
DROP TABLE
regression=# create table z1 (f1 int , f2 int);
CREATE TABLE
regression=# insert into z1 values(3,4);
INSERT 0 1
regression=# execute sz1;
 f1 | f2
----+----
  3 |  4
(1 row)

regards, tom lane
Re: [HACKERS] Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
On Fri, Jan 18, 2013 at 6:20 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Hmm, so it's the same issue I thought I fixed yesterday. My patch only fixed it for the case that the timeline switch is in the first page of the segment. When it's not, you still get two calls for a WAL record, first one for the first page in the segment, to verify that, and then the page that actually contains the record. The first call leads XLogPageRead to think it needs to read from the old timeline. We didn't have this problem before the xlogreader refactoring because XLogPageRead() was always called with the RecPtr of the record, even when we actually read the segment header from the file first. We'll have to somehow get that same information, the RecPtr of the record we're actually interested in, to XLogPageRead(). We could add a new argument to the callback for that, or we could keep xlogreader.c as it is and pass it through from ReadRecord to XLogPageRead() in the private struct. An explicit argument to the callback is probably best. That's straightforward, and it might be useful for the callback to know the actual WAL position that xlogreader.c is interested in anyway. See attached. Just to let you know that I am still getting the error even after commit 2ff6555. With the same scenario: 1) Start a master with 2 slaves 2) Kill/stop slave 2 3) Promote slave 1, it switches to timeline 2 Log on slave 1: LOG: selected new timeline ID: 2 4) Reconnect slave 2 to slave 1; slave 2 remains stuck on timeline 1 even if recovery_target_timeline is set to latest Log on slave 1 at this moment: DEBUG: received replication command: IDENTIFY_SYSTEM DEBUG: received replication command: TIMELINE_HISTORY 2 DEBUG: received replication command: START_REPLICATION 0/500 TIMELINE 1 Slave 1 receives a command to start replication with timeline 1, while it is in sync with timeline 2.
Log on slave 2 at this moment: LOG: restarted WAL streaming at 0/500 on timeline 1 LOG: replication terminated by primary server DETAIL: End of WAL reached on timeline 1 at 0/5014200 DEBUG: walreceiver ended streaming and awaits new instructions The timeline history file is the same for both nodes:
$ cat 0002.history
1    0/5014200    no recovery target specified
I might be wrong, but shouldn't there also be an entry for timeline 2 in this file? Am I missing something? -- Michael Paquier http://michael.otacoo.com
Re: [HACKERS] Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
On Tue, Jan 22, 2013 at 9:06 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Jan 18, 2013 at 6:20 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Hmm, so it's the same issue I thought I fixed yesterday. My patch only fixed it for the case that the timeline switch is in the first page of the segment. When it's not, you still get two calls for a WAL record, first one for the first page in the segment, to verify that, and then the page that actually contains the record. The first call leads XLogPageRead to think it needs to read from the old timeline. We didn't have this problem before the xlogreader refactoring because XLogPageRead() was always called with the RecPtr of the record, even when we actually read the segment header from the file first. We'll have to somehow get that same information, the RecPtr of the record we're actually interested in, to XLogPageRead(). We could add a new argument to the callback for that, or we could keep xlogreader.c as it is and pass it through from ReadRecord to XLogPageRead() in the private struct. An explicit argument to the callback is probably best. That's straightforward, and it might be useful for the callback to know the actual WAL position that xlogreader.c is interested in anyway. See attached. Just to let you know that I am still getting the error even after commit 2ff6555. With the same scenario: 1) Start a master with 2 slaves 2) Kill/stop slave 2 3) Promote slave 1, it switches to timeline 2 Log on slave 1: LOG: selected new timeline ID: 2 4) Reconnect slave 2 to slave 1; slave 2 remains stuck on timeline 1 even if recovery_target_timeline is set to latest Log on slave 1 at this moment: DEBUG: received replication command: IDENTIFY_SYSTEM DEBUG: received replication command: TIMELINE_HISTORY 2 DEBUG: received replication command: START_REPLICATION 0/500 TIMELINE 1 Slave 1 receives a command to start replication with timeline 1, while it is in sync with timeline 2.
Log on slave 2 at this moment: LOG: restarted WAL streaming at 0/500 on timeline 1 LOG: replication terminated by primary server DETAIL: End of WAL reached on timeline 1 at 0/5014200 DEBUG: walreceiver ended streaming and awaits new instructions The timeline history file is the same for both nodes:
$ cat 0002.history
1    0/5014200    no recovery target specified
I might be wrong, but shouldn't there also be an entry for timeline 2 in this file? Am I missing something? Sorry, there are no problems... I simply forgot to set up recovery_target_timeline to 'latest' in recovery.conf... -- Michael Paquier http://michael.otacoo.com
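For anyone else hitting the same symptom, the relevant fragment of recovery.conf on the reconnecting standby would look something like this (host and user values are placeholders):

```
# recovery.conf on slave 2
standby_mode = 'on'
primary_conninfo = 'host=slave1 port=5432 user=repl'
# Without this line the standby stays on the timeline it started on,
# instead of following slave 1 onto timeline 2:
recovery_target_timeline = 'latest'
```

The default is to recover along the timeline that was current when the standby's base backup was taken, which is exactly the stuck-on-timeline-1 behavior seen above.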
Re: [HACKERS] Request for vote to move forward with recovery.conf overhaul
On Mon, Jan 21, 2013 at 6:23 PM, Michael Paquier michael.paqu...@gmail.com wrote: Yes, that is one of the most important patches in the list, and I could put some effort in it for either review or coding. I think it would be great if you could elaborate on your reasons for feeling that this patch is particularly important. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
On Mon, Jan 21, 2013 at 6:23 PM, Phil Sorber p...@omniti.com wrote: On Wed, Jan 16, 2013 at 8:18 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: Here's a breakdown based purely on the names from the CF page (i.e. I didn't check archives to see who actually posted reviews, and didn't take into account reviews posted without updating the CF page). FWIW, I reviewed at least one at the point you did this survey, and I did update the CF page, but I didn't put my name into that box marked reviewers because it is an extra step (Edit Patch) that isn't immediately obvious. Isn't there a way to automatically populate that based on people linking in their reviews? Sadly, the CF application doesn't have access to the user name - real name mappings. And while it's only mildly annoying that the updates are displayed under user name rather than real name, showing the user name for the author and reviewers fields would be really annoying, especially because not all patch authors necessarily even have accounts. I guess it might be difficult when a CF manager comes along to add them and they aren't their reviews. That's also an issue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Prepared statements fail after schema changes with surprising error
On 22 January 2013 00:00, Tom Lane t...@sss.pgh.pa.us wrote: Works for me ... That's what I thought. But looking at RangeVarGetRelidExtended() and recomputeNamespacePath(), do you suppose that the problem could be that access privileges used by the app differed for a schema (or, more accurately, two physically distinct namespaces with the same nspname) between executions of the prepared query? -- Regards, Peter Geoghegan
Re: [HACKERS] Request for vote to move forward with recovery.conf overhaul
On Tue, Jan 22, 2013 at 9:27 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Jan 21, 2013 at 6:23 PM, Michael Paquier michael.paqu...@gmail.com wrote: Yes, that is one of the most important patches in the list, and I could put some effort in it for either review or coding. I think it would be great if you could elaborate on your reasons for feeling that this patch is particularly important. Sure. recovery.conf was initially created for PITR management, but since 9.0 and the introduction of streaming replication it has been used for many more things than it originally targeted: for example, it now defines where a slave connects to its root node, how it fetches archives, etc. I have seen it argued on -hackers for a long time (since 2010?) that Postgres should make the move of giving up recovery.conf and merging it into postgresql.conf. I didn't know about the existence of a patch aimed at merging the parameters of postgresql.conf and recovery.conf, and, just by looking at the patch, half of the work looks to be already done. I thought it might be worth at least updating the patch or providing some feedback. I agree that this might break backward compatibility and that it would be more suited to a 10.0(?), but as 9.3 development is already close to its end, progressing on this discussion and deciding whether this could be shipped in 9.3 or a later release is important. If it is decided to give up on this feature, well, let's do that later. If it is worth the shot, let's put some effort into it. -- Michael Paquier http://michael.otacoo.com
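To make the proposal concrete, here is a purely hypothetical sketch of what the merged settings might look like once moved into postgresql.conf; the parameter placement is an assumption for illustration, not the design under review:

```
# Hypothetical: standby/recovery settings as plain postgresql.conf GUCs,
# instead of a separate recovery.conf trigger file
primary_conninfo = 'host=master port=5432 user=repl'
restore_command = 'cp /archive/%f %p'
recovery_target_timeline = 'latest'
```

The open design questions in the thread are then about what replaces recovery.conf's role as a "start in recovery" trigger and how much of the old file format, if any, remains readable for backward compatibility.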
Re: [HACKERS] Prepared statements fail after schema changes with surprising error
2013/1/21 Peter van Hardenberg p...@pvh.ca: A user reported an interesting issue today. After restoring a dump created with --clean on a running application in his development environment, his application started complaining of missing tables despite those tables very clearly existing. After a little thinking, we determined that this was due to the now-default behaviour of Rails to create prepared statements for most queries. The prepared statements error out because the old relation they point to is missing, but this gives a misleading report thus: PG::Error: ERROR: relation xxx does not exist Isn't that something with search_path? -- Dickson S. Guedes mail/xmpp: gue...@guedesoft.net - skype: guediz http://github.com/guedes - http://guedesoft.net http://www.postgresql.org.br
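A sketch of the kind of search_path interaction being asked about; whether the final EXECUTE actually fails depends on the server version and on how the cached plan is revalidated, so treat this as a hypothesis to test rather than a known reproduction:

```sql
CREATE SCHEMA s1;
CREATE TABLE s1.t (id int);
SET search_path = s1;
PREPARE p AS SELECT * FROM t;
-- Simulate a pg_restore --clean: drop and recreate the schema, so "t"
-- is now a physically distinct relation with the same qualified name.
DROP SCHEMA s1 CASCADE;
CREATE SCHEMA s1;
CREATE TABLE s1.t (id int);
EXECUTE p;
```

If plan revalidation resolves the name through the current search_path, this succeeds; the reported "relation xxx does not exist" would require some additional ingredient, which is what the thread is trying to pin down.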
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
On Mon, Jan 21, 2013 at 5:48 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Advice: You don't do things that way, this way is the only one we will ever accept, because we've been sweating blood over the years to get in a position where it now works. Hint: it's not talking about the patch content, but what is supposedly a problem with the patch. It's easy to answer "I'm so happy I didn't actually do it that way." Judgement: You need to think about the problem you want to solve before sending a patch, because there's a hole in it too big for me to be able to count how many bugs are going to dance in there. It's not a patch, it's a quagmire. BTW, I didn't read it, it stinks too much. Hint: imagine it was your patch and now you have to try and convince any other committer to have a look at it. I'm not going to pretend that all review comments are constructive, but I also think that to some degree the difference between these two things depends on your perspective. I recall, in particular, the email that prompted the famous "in short: -1 from me regards tom lane" T-shirt, which I believe to be this one: http://www.postgresql.org/message-id/28927.1236820...@sss.pgh.pa.us That's not a positive review, but when it comes down to it, it's a pretty factual email. IMHO, anyway, and YMMV. My own experience is different from yours, I guess. I actually like it when I post a patch, or suggest a concept, and Tom fires back with a laundry list of reasons it won't work. It often induces me to step back and approach the same problem from a different and better angle, and the result is often better for it. What I don't like is when I (or anyone) posts a patch and somebody says something that boils down to "no one wants that". *That* ticks me off. Because you know what? At a minimum, *I* want that. If I didn't, I wouldn't have written a patch. And usually, the customers I support want that, too. Now, somebody else may not want it, and that is fine.
But IMHO, posting a patch should be considered prima facie evidence of non-zero demand for the associated feature. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Prepared statements fail after schema changes with surprising error
Peter Geoghegan peter.geoghega...@gmail.com writes: On 22 January 2013 00:00, Tom Lane t...@sss.pgh.pa.us wrote: Works for me ... That's what I thought. But looking at RangeVarGetRelidExtended() and recomputeNamespacePath(), do you suppose that the problem could be that access privileges used by the app differed for a schema (or, more accurately, two physically distinct namespaces with the same nspname) between executions of the prepared query? What I'm suspicious of is that Peter is complaining about an old version, or that there's some other critical piece of information he left out. I don't plan to speculate about causes without a concrete test case. regards, tom lane
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
Robert, http://www.postgresql.org/message-id/28927.1236820...@sss.pgh.pa.us That's not a positive review, but when it comes down to it, it's a pretty factual email. IMHO, anyway, and YMMV. Really? I've always thought that was a pretty constructive review. It certainly gave me the laundry list of things I'd have to fix to ever get that change in, and showed how what looked like a simple patch is actually fairly complicated. My own experience is different from yours, I guess. I actually like it when I post a patch, or suggest a concept, and Tom fires back with a laundry list of reasons it won't work. This can be a problem with new submitters, though. If you're not used to the current community dialog, that email can be taken as "your idea is stupid" rather than what it actually means, which is "fix these issues and resubmit, please." That's often not clearly communicated, and is important with new submitters. It often induces me to step back and approach the same problem from a different and better angle, and the result is often better for it. What I don't like is when I (or anyone) posts a patch and somebody says something that boils down to "no one wants that". *That* ticks me off. Because you know what? At a minimum, *I* want that. If I didn't, I wouldn't have written a patch. And usually, the customers I support want that, too. Now, somebody else may not want it, and that is fine. But IMHO, posting a patch should be considered prima facie evidence of non-zero demand for the associated feature. On the other hand, saying "this patch has potential side effects / performance penalty / makes admin more complex / has a lot of code to maintain. How common is the use-case you're talking about?" is legit. Features which impose a cost on people who don't use them need to be justified as useful enough to be worth the cost. -- Josh Berkus PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] dividing privileges for replication role.
Hi, Magnus, Josh, Michael, Craig Thank you for the comments and for registering this in the CommitFest. I made a patch to divide privileges for the replication role. Currently (9.2), the privilege for the replication role is true / false, which determines whether a standby server is able to connect to another server with the replication role. Why is it better to do this with a privilege, rather than just using pg_hba.conf? It doesn't represent an actual permission level after all - it's just an administrative flag to say you can't connect. Which AFAICS can just as easily be handled in pg_hba.conf? I guess I'm missing something? You are right. Handling this with pg_hba.conf is an easy way. But I think many users think about switchover, so the pg_hba.conf is the same on master and standby. It's not convenient that we have to rewrite pg_hba.conf whenever a switchover occurs. On the other hand, using a privilege, although we have to prepare the roles beforehand, we don't need to rewrite pg_hba.conf. So I think it's better with a privilege than using only pg_hba.conf. In this patch, I did the following. a) adding new privileges for replication: MASTER REPLICATION and CASCADE REPLICATION MASTER REPLICATION: only a replication connection to the master server is allowed CASCADE REPLICATION: only a replication connection to a cascading server is allowed (REPLICATION, already implemented, means replication connections to both servers are allowed) This seems to me a case of making things more complicated for everyone in order to satisfy a very narrow use-case. It certainly doesn't seem to me to do anything about the accidental cycle issue. Am I missing something? I agree that it is a very narrow use-case and an accidental thing. But I think we should provide some kind of method to avoid it, because the behavior is different from previous releases. And I don't think it's complicated, because REPLICATION and NOREPLICATION behave the same as in previous releases.
a) adding new privileges for replication: MASTER REPLICATION and CASCADE REPLICATION MASTER REPLICATION: only a replication connection to the master server is allowed CASCADE REPLICATION: only a replication connection to a cascading server is allowed (REPLICATION, already implemented, means replication connections to both servers are allowed) This does not really solve the case you reported because, as described in your bug, you could still have each slave connecting to the other using the CASCADE REPLICATION privilege. It even makes the privilege levels more complicated. Yes, the patch cannot solve the case at all. I just added a parameter for users. It is the responsibility of users to connect with the right role. What would be necessary to solve your problem would be to have each standby be aware that it is connected to a unique master. This is not really an issue with privileges but more something like having a standby scan its upper cluster node tree and check whether there is a master connected. While scanning the cluster node tree, you would also need to notice if a node has already been visited, which would mean that you are in a cycle. I think this is very complicated. At least, for now I can't solve it... If someday we can detect it, this kind of switch will be needed, because some users may need the cyclic situation. I'm not insisting on using a replication role, but I want something to control this behavior. regards, NTT Software Corporation Tomonari Katsumata
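For comparison, the pg_hba.conf approach being weighed against the privilege looks like this; as noted above, entries like these live per-node and may need editing after a switchover (the addresses and role name are placeholders):

```
# pg_hba.conf: allow the replication role to connect for streaming.
# The special "replication" keyword in the database column matches
# walsender connections, not a database named replication.
host  replication  repl_user  192.168.0.10/32  md5
host  replication  repl_user  192.168.0.11/32  md5
```

The trade-off under discussion: these rules can differ per node and can be edited with the server down, while a privilege stored in the catalogs replicates identically to every standby.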
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
On Mon, Jan 21, 2013 at 8:23 PM, Robert Haas robertmh...@gmail.com wrote: What I don't like is when I (or anyone) posts a patch and somebody says something that boils down to "no one wants that". *That* ticks me off. Because you know what? At a minimum, *I* want that. If I didn't, I wouldn't have written a patch. And usually, the customers I support want that, too. Now, somebody else may not want it, and that is fine. But IMHO, posting a patch should be considered prima facie evidence of non-zero demand for the associated feature. +1 I'd much rather hear "what you are trying to accomplish is much better done this other way" than "why would you want to do this?" or, as you said, "no one wants that." -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
On Mon, Jan 21, 2013 at 8:47 PM, Josh Berkus j...@agliodbs.com wrote: My own experience is different from yours, I guess. I actually like it when I post a patch, or suggest a concept, and Tom fires back with a laundry list of reasons it won't work. This can be a problem with new submitters, though. If you're not used to the current community dialog, that email can be taken as "your idea is stupid" rather than what it actually means, which is "fix these issues and resubmit, please." That's often not clearly communicated, and is important with new submitters. +1 to this as well. I definitely felt that way when submitting my first patch. Robert might even recall a convo at a conference that he and I had about it. If not for that I might have given up long ago. Now I can say I am used to it though, and I appreciate the honest and constructive criticism I receive, because over time I have seen that Tom and others are even critical of themselves and it's really for the betterment of the final product.
Re: [HACKERS] proposal: fix corner use case of variadic functions usage
Pavel Stehule pavel.steh...@gmail.com writes: so here is rewritten patch I've applied the infrastructure parts of this, but not the changes to format() and concat(). Why are the format and concat patches so randomly different? Not only is the internal implementation completely different for no obvious reason, but the user-visible behavior is inconsistent too. You've got one of them iterating over elements and one over slices. That seems pretty bogus. Personally I'd vote for the element-level behavior in both cases, because that's generally what happens in other PG functions when there's no particular reason to pay attention to the structure of a multidimensional array. I certainly don't see a reason why they should be making opposite choices. Also, I'm not sure that it's appropriate to throw an error if the given argument is null or not an array. Previous versions did not throw an error in such cases. Perhaps just fall back to behaving as if it weren't marked VARIADIC? You could possibly make an argument for not-an-array-type being an error, since that's a statically inconsistent type situation, but I really don't like a null value being an error. A function could easily receive a null value for an array parameter that it wants to pass on to format() or concat(). BTW, using array_iterate as a substitute for deconstruct_array is neither efficient nor good style. array_iterate is for processing the values as you scan the array. regards, tom lane
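For context, the user-visible behavior at stake is passing an array as the VARIADIC argument of format() or concat(); with the element-level semantics Tom argues for, each array element becomes a separate argument. A sketch of the intended behavior, not necessarily the committed semantics:

```sql
-- With element-level iteration, these two calls are equivalent:
SELECT format('%s, %s!', VARIADIC ARRAY['Hello', 'world']);
SELECT format('%s, %s!', 'Hello', 'world');
-- The open questions in the review: element vs. slice iteration for
-- multidimensional arrays, and whether a NULL or non-array VARIADIC
-- argument should error out or fall back to non-VARIADIC behavior.
```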
Re: [HACKERS] dividing privileges for replication role.
Tomonari Katsumata t.katsumata1...@gmail.com writes: Why is it better to do this with a privilege, rather than just using pg_hba.conf? You are right. Handling this with pg_hba.conf is an easy way. But I think many users think about switchover, so the pg_hba.conf is the same on master and standby. It's not convenient that we have to rewrite pg_hba.conf whenever a switchover occurs. On the other hand, using a privilege, although we have to prepare the roles beforehand, we don't need to rewrite pg_hba.conf. That sounds good, but if the behavior is controlled by a privilege (ie, it's stored in system catalogs) then it's impossible to have different settings on different slave servers --- or indeed to change the settings locally on a slave at all. You can only change settings on the master and let the change replicate to all the slaves. Quite aside from whether you want to manage things like that, what happens if your master has crashed and you find you need to change the settings on the way to getting a slave to take over? The crash-recovery worry is one of the main reasons that things like pg_hba.conf aren't stored in system catalogs already. It's not always convenient to need a running server before you can change the settings. regards, tom lane
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
On Mon, Jan 21, 2013 at 8:47 PM, Josh Berkus j...@agliodbs.com wrote: http://www.postgresql.org/message-id/28927.1236820...@sss.pgh.pa.us That's not a positive review, but when it comes down to it, it's a pretty factual email. IMHO, anyway, and YMMV. Really? I've always thought that was a pretty constructive review. It certainly gave me the laundry list of things I'd have to fix to ever get that change in, and how what looked like a simple patch is actually fairly complicated. I agree that it was constructive, but it wasn't exactly endorsing the concept you put forward, so it was not a positive, aka favorable, review. Also, you did put it on a T-shirt. :-) My own experience is different from yours, I guess. I actually like it when I post a patch, or suggest a concept, and Tom fires back with a laundry list of reasons it won't work. This can be a problem with new submitters, though. If you're not used to the current community dialog, that email can be taken as "your idea is stupid" rather than what it actually means, which is "fix these issues and resubmit, please." That's often not clearly communicated, and is important with new submitters. Yep. Even long-time participants in the process sometimes get demoralized. It's always fun to be the smartest guy in the room, and one rarely gets that luxury with any regularity on pgsql-hackers. It often induces me to step back and approach the same problem from a different and better angle, and the result is often better for it. What I don't like is when I (or anyone) posts a patch and somebody says something that boils down to "no one wants that". *That* ticks me off. Because you know what? At a minimum, *I* want that. If I didn't, I wouldn't have written a patch. And usually, the customers I support want that, too. Now, somebody else may not want it, and that is fine. But IMHO, posting a patch should be considered prima facie evidence of non-zero demand for the associated feature.
On the other hand, saying "this patch has potential side effects / performance penalty / makes admin more complex / has a lot of code to maintain. How common is the use-case you're talking about?" is legit. Features which impose a cost on people who don't use them need to be justified as useful enough to be worth the cost. Totally agreed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Patch for removing unused targets
I'd like to rework this optimization and submit a patch at the next CF. Is that okay? Thanks, Best regards, Etsuro Fujita From: Craig Ringer [mailto:cr...@2ndquadrant.com] Sent: Friday, January 18, 2013 8:30 PM To: Alexander Korotkov Cc: Tom Lane; Etsuro Fujita; pgsql-hackers Subject: Re: [HACKERS] Patch for removing unused targets On 12/05/2012 04:15 AM, Alexander Korotkov wrote: On Tue, Dec 4, 2012 at 11:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alexander Korotkov aekorot...@gmail.com writes: On Mon, Dec 3, 2012 at 8:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: But having said that, I'm wondering (without having read the patch) why you need anything more than the existing resjunk field. Actually, I don't know all the cases when the resjunk flag is set. Is it reliable to decide a target to be used only for ORDER BY if it's resjunk and neither system nor used in grouping? If it's so, or there are some other cases which are easy to determine, then I'll remove the resorderbyonly flag. resjunk means that the target is not supposed to be output by the query. Since it's there at all, it's presumably referenced by ORDER BY or GROUP BY or DISTINCT ON, but the meaning of the flag doesn't depend on that. What you would need to do is verify that the target is resjunk and not used in any clause besides ORDER BY. I have not read your patch, but I rather imagine that what you've got now is that the parser checks this and sets the new flag for consumption far downstream. Why not just make the same check in the planner? A more invasive, but possibly cleaner in the long run, approach is to strip all resjunk targets from the query's tlist at the start of planning and only put them back if needed. BTW, when I looked at this a couple years ago, it seemed like the major problem was that the planner assumes that all plans for the query should emit the same tlist, and thus that tlist eval cost isn't a distinguishing factor. Breaking that assumption seemed to require rather significant refactoring.
I never found the time to try to actually do it. Maybe there is some way to not remove items from the tlist, but avoid the actual computation? Did you make any headway on this? Is there work in a state that's likely to be committable for 9.3, or is it perhaps best to defer this to post-9.3 pending further work and review? https://commitfest.postgresql.org/action/patch_view?id=980 -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
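For readers following the thread, a minimal sketch of the kind of resjunk target under discussion (table and column names here are made up for illustration): a column referenced only by ORDER BY is added to the plan's target list even though it is not part of the query's output.

```sql
-- Hypothetical sketch of a resjunk target: f2 appears only in ORDER BY,
-- so the parser adds it to the target list as a "junk" (resjunk) entry
-- that the sort step needs but the final output discards.
CREATE TABLE t (f1 int, f2 int);

-- EXPLAIN (VERBOSE) shows f2 carried in the Sort node's output list
-- even though only f1 is returned to the client.
EXPLAIN (VERBOSE) SELECT f1 FROM t ORDER BY f2;
```

The optimization being proposed would notice cases where the junk entry never needs to be computed at all, e.g. when an index scan already delivers rows in the required order.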
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
On Tue, Jan 22, 2013 at 7:57 AM, Phil Sorber p...@omniti.com wrote: On Mon, Jan 21, 2013 at 8:47 PM, Josh Berkus j...@agliodbs.com wrote: My own experience is different from yours, I guess. I actually like it when I post a patch, or suggest a concept, and Tom fires back with a laundry list of reasons it won't work. This can be a problem with new submitters, though. If you're not used to the current community dialog, that email can be taken as your idea is stupid rather than what it actually means, which is fix these issues and resubmit, please. That's often not clearly communicated, and is important with new submitters. +1 to this as well. I definitely felt that way when submitting my first patch. Robert might even recall a convo at a conference that he and I had about it. If not for that I might have given up long ago. Now I can say I am used to it though, and I appreciate the honest and constructive criticism I receive, because over time I have seen that Tom and others are even critical of themselves and it's really for the betterment of the final product. For me, our reluctance for any kind of change is a major demoralizing factor. For example, the recent discussion on Jeff Davis's patch on the PD_ALL_VISIBLE flag. Our committers are quite candid in accepting that they may not have put in so much thought or done any real testing while adding the original code, but when it comes to making a change we show reluctance, ask for several test results, and the patch author really needs to show extra worth for the change. That looks a bit unfair, because the original code may not have received the same attention. Though I completely understand that the committers usually have a better sense and gut feel. 
If Heikki had not put that page-level flag in the first place, and if someone suggested adding that flag today, my sense is that we would have pushed him/her back and asked for many explanations, like why waste a bit in the page, or the performance increase is just 10% and that too in the worst case, so not worth it. Another example is SKIP_PAGES_THRESHOLD in vacuumlazy.c. I'm quite convinced that the current default of 32 is quite arbitrary, and the entire logic to not skip heap pages unless there are 32 in sequence is hard to justify, but if that needs to be changed, we would need hours of testing to show its worth. (BTW, I know both these changes were probably committed by Heikki. But that's just a coincidence :-) I have great respect for Heikki's talent and he is often right in his judgement.) Having said that, I quite understand the reasoning behind the reluctance for change - we are a small community and can't afford to spend cycles on unnecessary regressions. But is that changing in recent years? I mean, aren't there a lot more developers and a lot more companies using/testing the new features/releases and reporting issues? I see a marked increase in the number of bugs reported (haven't really counted, but just observation, and it could be wrong). Not sure if it's a result of increased testing/adoption, or a slight drop in quality, or just the sheer increase in the number of features/changes we are doing. Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
Pavan Deolasee pavan.deola...@gmail.com writes: For me our reluctance for any kind of change is a major demoralizing factor. I hardly think we're reluctant for any kind of change --- the rate of commits belies that. What we want is a convincing case that a proposed change is an improvement over what we have. (You're entirely right that there are plenty of dubious places in the existing code, but most of them replaced nothing at all, and were thus improvements. To improve further requires, well, clear improvement.) In the case of PD_ALL_VISIBLE, I believe there is very serious reason to think removing it would be a performance loss at least some of the time. We need proof it's an overall improvement, not lack of proof it isn't. Having said that, I quite understand the reasoning behind the reluctance for change - we are a small community and can't afford to spend cycles spending on unnecessary regressions. But is that changing in the recent years ? I mean, aren't there a lot more developers and a lot more companies using/testing the new features/releases and reporting issues ? Yeah, and a lot more fairly-new developers who don't understand all the connections in the existing system. Let me just push back a bit here: based on the amount of time I've had to spend fixing bugs over the past five months, 9.2 was our worst release ever. I don't like that trend, and I don't want to see it continued because we get laxer about accepting patches. IMO we are probably too lax already. In this connection I refer you to Sturgeon's Law(*): 90% of everything is crud. Applied to our problem, it says that 90% of all patch ideas are bad. Therefore, we should be expecting to reject a large fraction of submitted patches. It distresses me that people seem to think the default result for a submitted patch is that it gets committed. That's backwards. For a very long time we've tried to encourage people to submit rough ideas to pgsql-hackers for discussion *before* they start coding. 
The point of that is to weed out, or fix if possible, (some of) the bad ideas before a lot of effort has gotten expended on them. It seems though that less and less of that is actually happening, so I wonder whether the commitfest process is encouraging inefficiency by making people think they should start a discussion with a mostly-complete patch. Or maybe CF review is just crowding out any serious discussion of rough ideas. There was some discussion at the last dev meeting of creating a design review process within commitfests, but nothing got done about it ... regards, tom lane (*) The wikipedia entry is good enough. Whether Ted had the percentage spot-on is not the key issue here. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Monday, January 21, 2013 6:22 PM Magnus Hagander wrote: On Fri, Jan 18, 2013 at 7:50 AM, Amit Kapila amit.kap...@huawei.com wrote: On Wednesday, January 16, 2013 4:02 PM Heikki Linnakangas wrote: On 07.01.2013 16:23, Boszormenyi Zoltan wrote: Since my other patch against pg_basebackup is now committed, this patch doesn't apply cleanly, patch rejects 2 hunks. The fixed up patch is attached. Now that I look at this from a high-level perspective, why are we only worried about timeouts in the Copy-mode and when connecting? The initial checkpoint could take a long time too, and if the server turns into a black hole while the checkpoint is running, pg_basebackup will still hang. Then again, a short timeout on that phase would be a bad idea, because the checkpoint can indeed take a long time. True, but IMO, if somebody wants to take a base backup, he should do that when the server is not loaded. A lot of installations don't have such an option, because there is no time when the server is not loaded. Good to know about it. I have always heard that customers will run background maintenance activities (Reindex, Vacuum Full, etc.) when the server is less loaded. For example: a. Billing applications in telecom, which can be relatively less loaded at night. b. Databases used for Sensex transactions, which will be relatively free once the market is closed. c. Banking solutions, because transactions are done mostly in the daytime. There will be many cases where the database server is loaded at all times; if you can give some examples, it will be good learning for me. In streaming replication, the keep-alive messages carry additional information, the timestamps and WAL locations, so a keepalive makes sense at that level. But otherwise, aren't we just trying to reimplement TCP keepalives? TCP keepalives are not perfect, but if we want to have an application level timeout, it should be implemented in the FE/BE protocol. I don't think we need to do anything specific to pg_basebackup. 
The user can simply specify TCP keepalive settings in the connection string, like with any libpq program. I think the user currently has no way to specify TCP keepalive settings from pg_basebackup; please let me know if there is an existing way. You can set them through environment variables. As was discussed elsewhere, it would be good to have the ability to do it natively in pg_basebackup as well. Sure, already modifying the existing patch to support a connection string in pg_basebackup and pg_receivexlog. I think specifying TCP settings is very cumbersome for most users; that's the reason most standard interfaces (ODBC/JDBC) have such an application-level timeout mechanism. By implementing it in the FE/BE protocol (do you mean that we make such non-blocking behavior part of libpq, or something else?), it might be generic and usable by other applications as well, but it might need a few interface changes. If it's specifying them that is cumbersome, then that's the part we should fix, rather than modifying the protocol, no? That can be done as part of point 2 of the initial proposal (2. Support recv_timeout separately to provide a way for users who are not comfortable with TCP keepalives). To achieve this there can be 2 ways: 1. Change the FE/BE protocol - I am not sure exactly how this can be done, but as per Heikki this is the better way of implementing it. 2. Make the socket non-blocking in pg_basebackup. The advantage of Approach-1 is that if it is addressed in the lower layers (libpq), then all other apps (pg_basebackup, etc.) can use it, with no need to handle it separately in each application. So, as the changes for Approach-1 seem to be invasive, we decided to do it later. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
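As a concrete (hedged) illustration of the options already discussed above: libpq accepts client-side keepalive parameters in the connection string or via environment variables, and the server exposes matching GUCs. The parameter and GUC names below are from the PostgreSQL documentation; the host name is made up.

```sql
-- Client side (a libpq conninfo string, not SQL), usable from any tool that
-- accepts a connection string, or equivalently via the environment variables
-- PGKEEPALIVESIDLE, PGKEEPALIVESINTERVAL and PGKEEPALIVESCOUNT:
--
--   host=primary.example.com keepalives=1 keepalives_idle=60
--   keepalives_interval=10 keepalives_count=3
--
-- Server side: the corresponding GUCs can be inspected in any session
-- (a value of 0 means "use the operating system's default"):
SHOW tcp_keepalives_idle;
SHOW tcp_keepalives_interval;
SHOW tcp_keepalives_count;
```

This is what the thread means by keepalives being configurable today even though pg_basebackup itself (at the time of writing) exposes no dedicated switch for them.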
Re: [HACKERS] Patch for removing unused targets
On 01/22/2013 01:24 PM, Etsuro Fujita wrote: I'd like to rework this optimization and submit a patch at the next CF. Is that okay? That sounds very sensible to me, given how busy CF2013-01 is and the remaining time before 9.3. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Prepared statements fail after schema changes with surprising error
Hm - I'm still able to recreate the test the user's running using pg_dump/pg_restore. I'm still working to see if I can minimize the test-case, but this is against 9.2.2. Would you prefer I test against HEAD?

regression=# create table z1 (f1 int);
CREATE TABLE
regression=# prepare sz1 as select * from z1;
PREPARE
regression=# insert into z1 values (1);
INSERT 0 1
regression=# execute sz1;
 f1
----
  1
(1 row)

# In another terminal window
$ pg_dump -F c regression > test.dump
$ pg_restore --clean --no-acl --no-owner -d regression test.dump
ERROR:  cannot drop the currently open database
STATEMENT:  DROP DATABASE regression;
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2185; 1262 16384 DATABASE regression pvh
pg_restore: [archiver (db)] could not execute query: ERROR:  cannot drop the currently open database
    Command was: DROP DATABASE regression;
WARNING: errors ignored on restore: 1
$

# back in the same backend
regression=# execute sz1;
ERROR:  relation "z1" does not exist
regression=# select * from z1;
 f1
----
  1
(1 row)
regression=#

On Mon, Jan 21, 2013 at 5:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Geoghegan peter.geoghega...@gmail.com writes: On 22 January 2013 00:00, Tom Lane t...@sss.pgh.pa.us wrote: Works for me ... That's what I thought. But looking at RangeVarGetRelidExtended() and recomputeNamespacePath(), do you suppose that the problem could be that access privileges used by the app differed for a schema (or, more accurately, two physically distinct namespaces with the same nspname) between executions of the prepared query? What I'm suspicious of is that Peter is complaining about an old version, or that there's some other critical piece of information he left out. I don't plan to speculate about causes without a concrete test case. regards, tom lane -- Peter van Hardenberg San Francisco, California Everything was beautiful, and nothing hurt. -- Kurt Vonnegut
Re: [HACKERS] Prepared statements fail after schema changes with surprising error
Okay - I've narrowed it down to an interaction with schema recreation. Here's a minimal test-case I created by paring back the restore from the pg_restore output until I only had the essence remaining:

-- setup
drop table z1;
create table z1 (f1 int);
insert into z1 values (1);
prepare sz1 as select * from z1;
select 'executing first prepared statement';
execute sz1;

-- remainder of minimized pg_restore SQL output
DROP TABLE public.z1;
DROP SCHEMA public;
CREATE SCHEMA public;
CREATE TABLE z1 (
    f1 integer
);

-- proof of regression
select 'executing second prepared statement';
execute sz1;
select 'selecting from z1 to prove it exists';
select * from z1;

On Mon, Jan 21, 2013 at 10:45 PM, Peter van Hardenberg p...@pvh.ca wrote: Hm - I'm still able to recreate the test the user's running using pg_dump/pg_restore. I'm still working to see if I can minimize the test-case, but this is against 9.2.2. Would you prefer I test against HEAD? regression=# create table z1 (f1 int); CREATE TABLE regression=# prepare sz1 as select * from z1; PREPARE regression=# insert into z1 values (1); INSERT 0 1 regression=# execute sz1; f1 1 (1 row) # In another terminal window $ pg_dump -F c regression test.dump $ pg_restore --clean --no-acl --no-owner -d regression test.dump ERROR: cannot drop the currently open database STATEMENT: DROP DATABASE regression; pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 2185; 1262 16384 DATABASE regression pvh pg_restore: [archiver (db)] could not execute query: ERROR: cannot drop the currently open database Command was: DROP DATABASE regression; WARNING: errors ignored on restore: 1 $ # back in the same backend regression=# execute sz1; ERROR: relation z1 does not exist regression=# select * from z1; f1 1 (1 row) regression=# On Mon, Jan 21, 2013 at 5:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Geoghegan peter.geoghega...@gmail.com writes: On 22 January 2013 00:00, Tom Lane t...@sss.pgh.pa.us 
wrote: Works for me ... That's what I thought. But looking at RangeVarGetRelidExtended() and recomputeNamespacePath(), do you suppose that the problem could be that access privileges used by the app differed for a schema (or, more accurately, two physically distinct namespaces with the same nspname) between executions of the prepared query? What I'm suspicious of is that Peter is complaining about an old version, or that there's some other critical piece of information he left out. I don't plan to speculate about causes without a concrete test case. regards, tom lane -- Peter van Hardenberg San Francisco, California Everything was beautiful, and nothing hurt. -- Kurt Vonnegut -- Peter van Hardenberg San Francisco, California Everything was beautiful, and nothing hurt. -- Kurt Vonnegut
Re: [HACKERS] Prepared statements fail after schema changes with surprising error
Peter van Hardenberg p...@pvh.ca writes: Okay - I've narrowed it down to an interaction with schema recreation. Here's a minimal test-case I created by paring back the restore from the pg_restore output until I only had the essence remaining: Hm ... I'm too tired to trace through the code to prove this theory, but I think what's happening is that this bit: DROP SCHEMA public; CREATE SCHEMA public; changes the OID of schema public, whereas the search_path that's cached for the cached plan is cached in terms of OIDs. So while there is a table named public.z1 at the end of the sequence, it's not in any schema found in the cached search path. We could possibly fix that by making the path be cached as textual names not OIDs, but then people would complain (rightly, I think) that renaming a schema caused unexpected behavior. There's also the other issues that have been discussed again recently about whether we should be attempting to reinstall an old search path in the first place. We could easily dodge this issue if we redefined what the desired behavior is ... but I'm not sure if we want to risk the fallout of that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
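Tom's theory can be checked directly from SQL. The sketch below (session output elided, and assuming the same test setup as above) shows the schema's identity changing across the drop/recreate, which is exactly what the cached plan's search path still points at.

```sql
-- The cached search_path is remembered as schema OIDs, not names,
-- so recreating a schema changes its identity even though the name stays.
SELECT oid FROM pg_namespace WHERE nspname = 'public';  -- note the OID

DROP TABLE public.z1;
DROP SCHEMA public;
CREATE SCHEMA public;
CREATE TABLE z1 (f1 integer);

SELECT oid FROM pg_namespace WHERE nspname = 'public';  -- a different OID now

-- A plan prepared before the DROP still resolves names against the old OID,
-- hence "relation z1 does not exist". Re-preparing is the workaround:
DEALLOCATE sz1;
PREPARE sz1 AS SELECT * FROM z1;
```

This also makes clear why caching textual schema names instead would trade this problem for surprising behavior when a schema is renamed, as Tom notes.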
Re: [HACKERS] CF3+4 (was Re: Parallel query execution)
On Jan 22, 2013 1:31 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Jan 21, 2013 at 6:23 PM, Phil Sorber p...@omniti.com wrote: On Wed, Jan 16, 2013 at 8:18 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: Here's a breakdown based purely on the names from the CF page (i.e. I didn't check archives to see who actually posted reviews, and didn't take into account reviews posted without updating the CF page). FWIW, I reviewed at least one at the point you did this survey, and I did update the CF page, but I didn't put my name into that box marked reviewers because it is an extra step (Edit Patch) that isn't immediately obvious. Isn't there a way to automatically populate that based on people linking in their reviews? Sadly, the CF application doesn't have access to the user name - real name mappings. And while it's only ... Sure it does. It only has it for users that have logged in at least once. But that wouldn't be a problem for this scenario, as they would have already logged in to post said link. /Magnus