Re: [HACKERS] 9.4 broken on alpha
Re: Michael Cree 2015-08-26 20150826052530.GA4256@tower

I reported the failure to build on Alpha, with an explanation and a patch to fix it, to the Debian package maintainers over a year ago, and within about a month of version 9.4 being uploaded to Debian. My recollection is that prior versions (9.2 and 9.3) compiled on Alpha, so the use of the wrong barrier, and the fix, was in fact reported in a timely fashion following the first reasonable chance to observe the problem. It has been built and running at Debian-Ports for over a year now, as I uploaded the fixed version to the Alpha unreleased distribution.

Hi Michael,

(I've discovered this branch of this thread only now, I got removed from CC.)

Sorry for letting that rot for so long - I'd blame the Debian infrastructure for not showing ports information in the usual places. I've really only discovered the problem because buildd.debian.org is now showing the non-main architectures as well. (Of course we could just have looked at the bug report...) I guess we should look into making that even more visible. Is there a list of packages that have ports-only patches applied which we could use to make maintainers aware via ddpo/pts/tracker? Having a porter box available would help as well.

It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth the trouble, since we're desupporting Alpha as of 9.5 anyway.

That is disappointing to hear. Why is that? It is still in use on Alpha. What is the maintenance load for keeping the Alpha arch specific code?

Fwiw I'd be curious to see if 9.5 still works using the generic primitives, but atm it's blocking on perl5.20:

Dependency installability problem for postgresql-9.5 on alpha:

postgresql-9.5 build-depends on:
- alpha:libipc-run-perl
alpha:libipc-run-perl depends on:
- alpha:libio-pty-perl
alpha:libio-pty-perl depends on missing:
- alpha:perlapi-5.20.0

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
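For readers without the Alpha background: the s/rmb/mb/ fix under discussion swaps a read barrier for a full barrier at the affected call site. Below is a minimal sketch of the kind of pattern involved, using the real macro names from src/include/storage/barrier.h; the actual 9.4 call site is not reproduced here, and how much ordering a read barrier buys at a given site depends on the architecture's memory model, Alpha's being unusually weak.

    #include "storage/barrier.h"

    /*
     * Sketch: a reader that must observe *payload only after seeing
     * *flag set by another process.  On most platforms pg_read_barrier()
     * would order the two loads; the reported Alpha fix was of the shape
     * "replace the read barrier with a full barrier".
     */
    static int
    read_flagged_payload(volatile int *flag, volatile int *payload)
    {
        if (*flag)
        {
            pg_memory_barrier();    /* the s/rmb/mb/ replacement */
            return *payload;
        }
        return -1;
    }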
Re: [HACKERS] On-demand running query plans using auto_explain and signals
On Aug 29, 2015 7:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote:

2015-08-29 18:36 GMT+02:00 Andres Freund and...@anarazel.de:

On 2015-08-29 18:27:59 +0200, Pavel Stehule wrote:

2015-08-29 18:25 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de

Good point. There's still hope to set a flag and process it later on. Will have to check if it's possible to stay in the scope of a loaded module though.

I had a workable prototype - and it was implemented very similarly to the handling of CANCEL.

Where did you put the handling of that kind of interrupt? Directly into ProcessInterrupts()?

Probably. I don't remember it well, but it needed hacking core code - it cannot be done from an extension.

Do you still have the code somewhere around? Did it see production use?

Thanks!
--
Alex
[HACKERS] datestyle=postgres broken with timezone=UTC+N
Discovered when debugging libpqtypes test failures: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=795729

postgres=# set timezone = 'Etc/UTC+1';
SET
postgres=# set datestyle = 'postgres';
SET
postgres=# select '2015-01-01 01:00:00 +0100'::timestamptz;
Wed 31 Dec 23:00:00 2014 ETC/UTC
postgres=# select 'Wed 31 Dec 23:00:00 2014 ETC/UTC'::timestamptz;
Wed 31 Dec 22:00:00 2014 ETC/UTC

That is, feeding an output value back in applies the UTC offset shift again. (9.6 head, also seen on 9.4.)

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On-demand running query plans using auto_explain and signals
2015-08-30 10:30 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de:

On Aug 29, 2015 7:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote:

2015-08-29 18:36 GMT+02:00 Andres Freund and...@anarazel.de:

On 2015-08-29 18:27:59 +0200, Pavel Stehule wrote:

2015-08-29 18:25 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de

Good point. There's still hope to set a flag and process it later on. Will have to check if it's possible to stay in the scope of a loaded module though.

I had a workable prototype - and it was implemented very similarly to the handling of CANCEL.

Where did you put the handling of that kind of interrupt? Directly into ProcessInterrupts()?

Probably. I don't remember it well, but it needed hacking core code - it cannot be done from an extension.

Do you still have the code somewhere around? Did it see production use?

I am not sure I am able to find it - I'll try. We didn't use it in production.

Thanks!
--
Alex
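To make the mechanism under discussion concrete, here is a rough sketch of how a plan-dump request could be wired into the existing interrupt machinery - which is core code, hence out of reach of an extension. ExplainRequestPending and ProcessExplainRequest are invented names; the real prototype may have looked quite different.

    #include "postgres.h"
    #include "miscadmin.h"
    #include "storage/latch.h"

    extern void ProcessExplainRequest(void);    /* invented: emit the running plan */

    /* flag set from the signal handler, consumed at a safe point */
    static volatile sig_atomic_t ExplainRequestPending = 0;

    /* signal handler side: defer all real work */
    static void
    handle_explain_request(SIGNAL_ARGS)
    {
        ExplainRequestPending = 1;
        SetLatch(MyLatch);          /* &MyProc->procLatch before 9.5 */
    }

    /* hypothetically called from ProcessInterrupts(), next to the
     * existing QueryCancelPending handling */
    static void
    HandleExplainInterrupt(void)
    {
        if (ExplainRequestPending)
        {
            ExplainRequestPending = 0;
            ProcessExplainRequest();
        }
    }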
Re: [HACKERS] Potential GIN vacuum bug
Jeff Janes jeff.ja...@gmail.com writes:

Attached is a patch to deal with this without the heavyweight locks. I realized that using the clean up lock on the meta page, rather than the pending head page, would be easier to implement as a pin was already held on the meta page throughout, and the pending head page can change during the cleanup if we need multiple passes. ... I still prefer the heavy-weight approach. The buffer clean up lock for vacuuming seems fragile to start with, and abusing it for other purposes doesn't improve on that.

FWIW, I would go with the heavyweight lock approach as well. The extra cycles needed for a heavyweight lock don't seem significant in this context, and you have far more control over which other operations conflict or don't conflict with the lock. Taking a buffer cleanup lock on the metapage sounds really scary from that viewpoint; it's basically going to conflict with everything else, even if the other operations only take it for short intervals, and you have no freedom to adjust that.

Your earlier point about how the current design throttles insertions to keep the pending list from growing without bound seems like a bigger deal to worry about. I think we'd like to have some substitute for that. Perhaps we could make the logic in insertion be something like

    if (pending-list-size > threshold)
    {
        if (conditional-lock-acquire(...))
        {
            do-pending-list-cleanup;
            lock-release;
        }
        else if (pending-list-size > threshold * 2)
        {
            unconditional-lock-acquire(...);
            if (pending-list-size > threshold)
                do-pending-list-cleanup;
            lock-release;
        }
    }

so that once the pending list got too big, incoming insertions would wait for it to be cleared. Whether to use a 2x safety margin or something else could be a subject for debate, of course.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
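For concreteness, a minimal sketch of Tom's pseudocode using the existing page-level heavyweight lock API on the GIN metapage. ConditionalLockPage, LockPage and UnlockPage are real lmgr calls and GIN_METAPAGE_BLKNO is the real metapage block number; the helper functions and the exact placement in the GIN insert path are assumptions.

    #include "postgres.h"
    #include "access/gin_private.h"     /* GIN_METAPAGE_BLKNO */
    #include "storage/lmgr.h"

    /* assumed helpers standing in for the real pending-list code */
    extern Size pending_list_size(Relation index);
    extern void do_pending_list_cleanup(Relation index);

    static void
    maybe_clean_pending_list(Relation index, Size threshold)
    {
        if (pending_list_size(index) <= threshold)
            return;

        if (ConditionalLockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock))
        {
            /* nobody else is cleaning: do it ourselves */
            do_pending_list_cleanup(index);
            UnlockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock);
        }
        else if (pending_list_size(index) > threshold * 2)
        {
            /* list is badly oversized: wait for the current cleaner */
            LockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock);
            if (pending_list_size(index) > threshold)   /* recheck */
                do_pending_list_cleanup(index);
            UnlockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock);
        }
    }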
Re: [HACKERS] Removing dead support for pre-POSIX signals
Hi,

On 2015-08-30 14:59:41 -0400, Tom Lane wrote:

1. No buildfarm member in the available history (going back to 2012-01-01) has ever reported not having the POSIX signal interface, nor sig_atomic_t. (We don't run the POSIX-signals check on Windows systems, though.)

We, afaik, don't use any signals on windows anyway...

I think we should rip out the configure checks for HAVE_POSIX_SIGNALS, HAVE_SIGPROCMASK, HAVE_SIGSETJMP, and probably HAVE_SIG_ATOMIC_T, as well as the C code that tries to make up for not having these things (on non-Windows systems that is). It's not being exercised and it's fair to doubt that those code paths would even work reliably anymore. For instance, it seems likely that our current latch code has never been run on a system without these APIs, and even more likely that Andres' recent fooling around with signal handling (eg commit 675f5) has never been checked on such a system.

Sounds good to me.

HAVE_SIG_ATOMIC_T is a debatable case, in that the only thing we're doing with it is c.h's

    /* sig_atomic_t is required by ANSI C, but may be missing on old platforms */
    #ifndef HAVE_SIG_ATOMIC_T
    typedef int sig_atomic_t;
    #endif

which should be safe enough (if int isn't atomically stored/fetched we already have big problems elsewhere). Still, the configure test for it appears to be a complete waste of cycles.

What are you proposing to do instead? Replace sig_atomic_t by int everywhere? Or unconditionally do the typedef? Because the latter won't work well if it's already typedef'ed...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Horizontal scalability/sharding
On Sun, Aug 30, 2015 at 5:31 PM, Simon Riggs si...@2ndquadrant.com wrote:

On 30 August 2015 at 03:17, Bruce Momjian br...@momjian.us wrote:

I have recently increased my public statements about the idea of adding horizontal scaling/sharding to Postgres.

Glad to see it. Many people have been pushing such things for years, so it is good to finally see some debate about this on Hackers.

I wanted to share with hackers a timeline of how we got here, and where I think we are going in the short term:

2012-2013: As part of writing my scaling talk (http://momjian.us/main/presentations/overview.html#scaling), studying Oracle RAC, and talking to users, it became clear that an XC-like architecture (sharding) was the only architecture that was going to allow for write scaling.

What other architectures were discussed? Where was that discussion?

Users and conference attendees I talked to were increasingly concerned about the ability of Postgres to scale for high write volumes. They didn't necessarily need that scale now, but they needed to know they could get it if they wanted it, and wouldn't need to switch to a new database in the future. This is similar to wanting a car that can get you on a highway on-ramp fast --- even if you don't need it, you want to know it is there.

+1

2014: I started to shop around the idea that we could use FDWs, parallelism, and a transaction/snapshot manager to get XC features as built-in to Postgres. (I don't remember where the original idea came from.) It was clear that having separate forks of the source code in XC and XL was never going to achieve critical mass --- there just aren't enough people who need high write scale right now, and the fork maintenance overhead is a huge burden.

I personally support the view that we should put scalability features into Postgres core, rather than run separate forks.

I realized that we would never get community acceptance to dump the XC (or XL) code needed for sharding into community Postgres

How or why did you realize that? There has never been any such discussion, AFAIK. Surely it can be possible to move required subsystems across?

, but with FDWs, we could add the features as _part_ of improving FDWs, which would benefit FDWs _and_ would be useful for sharding. (We already see some of those FDW features in 9.5.)

That is a huge presumption. Not discussed or technically analyzed in any way with the community.

October, 2014: EDB and NTT started working together in the community to start improving FDWs as a basis for an FDW-based sharding solution. Many of the 9.5 FDW improvements that also benefit sharding were developed by a combined EDB/NTT team. The features improved FDWs independent of sharding, so they didn't need community buy-in on sharding to get them accepted.

June, 2015: I attended the PGCon sharding unconference session and there was a huge discussion about where we should go with sharding. I think the big take-away was that most people liked the FDW approach, but had business/customer reasons for wanting to work on XC or XL because those would be production-ready faster.

Cough, cough. You must surely be joking that most people liked the FDW approach? How did we measure the acceptance of this approach? What actually is the FDW approach? Since it's not been written down anywhere, or even explained verbally, how can anyone actually agree to it?

July, 2015: Oleg Bartunov and his new company Postgres Professional (PP) started to think about joining the FDW approach, rather than working on XL, as they had stated at PGCon in June. A joint NTT/EDB/PP phone-in meeting is scheduled for September 1.

A little correction about Postgres Professional. We are concentrated on the idea of having one distributed transaction manager, originally DTM, now with the better name XTM, which is neutral to the actual cluster implementation. For example, we are testing it with XL, ported to 9.4, but we were planning to extend tests to pg_shard and postgres_fdw. My idea was to have at least XTM committed to 9.6, so all parties could work on their implementations much more easily.

August, 2015: While speaking at SFPUG, Citus Data approached me about joining the FDW sharding team. They have been invited to the September 1 meeting, as have the XC and XL people.

2ndQuadrant is working in this area, specifically bringing XL 9.5 forwards. Please can invites be posted to myself, Pavan Deolasee and Petr Jelinek also? I'll pass on to others also.

Koichi Suzuki is arranging a meeting in Hong Kong for XC/XL discussions. Presumably EDB is invited also? If Koichi is a leading organizer of this, why are there two meetings?

October, 2015: EDB is sponsoring a free 3-hour summit about FDW sharding at the PG-EU conference in Vienna. Everyone is invited, but it is hoped most of the September 1 folks can attend.

February, 2016: Oleg is planning a similar meeting at their February Moscow conference.
Re: [HACKERS] One question about security label command
On 08/28/2015 07:21 PM, Adam Brightwell wrote:

On 08/28/2015 08:37 AM, Joe Conway wrote:

So given all that, here is what I propose we do:

1.) Commit Kouhei's patch against HEAD and 9.5 (Joe)
2.) Commit my modified patch against 9.4 and 9.3 (Joe)
3.) Rework patch for 9.2 (Kouhei)
4.) Finish standing up the RHEL/CentOS 7.x buildfarm member to test sepgsql on 9.2 and up. The animal (rhinoceros) is running already, but still needs some custom scripting. (Joe, Andrew)
5.) Additionally stand up a RHEL/CentOS 6.x buildfarm member to test sepgsql on 9.1 (no changes) (Joe).

Sound like a plan?

I think this makes sense. Getting buildfarm coverage on any level is better than nothing, IMHO. Kind of a bummer that 9.1 is the only version that will work as-is on EL6 but it is what it is for now, I suppose.

#1 and 2 above are done -- committed and pushed to HEAD and 9.3-9.5. Additionally I have #5 (the 6.x buildfarm member) created/working and I sent in my registration form. It has not been accepted yet.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, Open Source Development

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Removing dead support for pre-POSIX signals
A question in another project's mailing list prompted me to investigate whether our support for non-POSIX signals is actually still of any value. I grepped through the buildfarm server's configure-stage reports, and found that:

1. No buildfarm member in the available history (going back to 2012-01-01) has ever reported not having the POSIX signal interface, nor sig_atomic_t. (We don't run the POSIX-signals check on Windows systems, though.)

2. No non-Windows member has ever reported not having sigprocmask nor sigsetjmp, save for two one-time failures that look to be due to configure failing for being out of disk space.

3. The check for sys_siglist is still of use, which is unsurprising because it is not required by POSIX.

The POSIX signal APIs, including sigprocmask and sigsetjmp, are required by Single Unix Spec v2, which is what we've usually considered to be our minimum baseline for supported Unix-ish systems.

I think we should rip out the configure checks for HAVE_POSIX_SIGNALS, HAVE_SIGPROCMASK, HAVE_SIGSETJMP, and probably HAVE_SIG_ATOMIC_T, as well as the C code that tries to make up for not having these things (on non-Windows systems that is). It's not being exercised and it's fair to doubt that those code paths would even work reliably anymore. For instance, it seems likely that our current latch code has never been run on a system without these APIs, and even more likely that Andres' recent fooling around with signal handling (eg commit 675f5) has never been checked on such a system.

HAVE_SIG_ATOMIC_T is a debatable case, in that the only thing we're doing with it is c.h's

    /* sig_atomic_t is required by ANSI C, but may be missing on old platforms */
    #ifndef HAVE_SIG_ATOMIC_T
    typedef int sig_atomic_t;
    #endif

which should be safe enough (if int isn't atomically stored/fetched we already have big problems elsewhere). Still, the configure test for it appears to be a complete waste of cycles.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
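For readers unfamiliar with why sig_atomic_t matters at all, this is the usual pattern it enables - a generic illustration, not code from the tree:

    #include <signal.h>

    /*
     * A flag written from a signal handler must be of a type the handler
     * can store atomically; C89 guarantees that only for (volatile)
     * sig_atomic_t, which is why c.h supplies a fallback typedef.
     */
    static volatile sig_atomic_t got_signal = 0;

    static void
    handler(int signum)
    {
        got_signal = 1;     /* async-signal-safe: a single atomic store */
    }

    /* the main loop polls got_signal and does the real work there,
     * never inside the handler itself */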
Re: [HACKERS] Removing dead support for pre-POSIX signals
Andres Freund and...@anarazel.de writes:

On 2015-08-30 14:59:41 -0400, Tom Lane wrote:

HAVE_SIG_ATOMIC_T is a debatable case, in that the only thing we're doing with it is c.h's

    /* sig_atomic_t is required by ANSI C, but may be missing on old platforms */
    #ifndef HAVE_SIG_ATOMIC_T
    typedef int sig_atomic_t;
    #endif

which should be safe enough (if int isn't atomically stored/fetched we already have big problems elsewhere). Still, the configure test for it appears to be a complete waste of cycles.

What are you proposing to do instead? Replace sig_atomic_t by int everywhere? Or unconditionally do the typedef? Because the latter won't work well if it's already typedef'ed...

No no no, I'm proposing to remove the above-quoted lines and the configure test. sig_atomic_t is required by C89; there is no reason anymore to cope with it not being provided by signal.h.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing dead support for pre-POSIX signals
On 2015-08-30 15:28:42 -0400, Tom Lane wrote:

No no no, I'm proposing to remove the above-quoted lines and the configure test. sig_atomic_t is required by C89; there is no reason anymore to cope with it not being provided by signal.h.

Ok, that works for me. You seemed to be a bit more doubtful about the sig_atomic_t support, that's why I thought you might want to do something other than rip it out. Seems like a pretty low risk thing to try.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential GIN vacuum bug
On Sat, Aug 22, 2015 at 11:25 AM, Jeff Janes jeff.ja...@gmail.com wrote:

On Tue, Aug 18, 2015 at 8:59 AM, Robert Haas robertmh...@gmail.com wrote:

On Mon, Aug 17, 2015 at 5:41 PM, Jeff Janes jeff.ja...@gmail.com wrote:

User backends attempt to take the lock conditionally, because otherwise they would cause an autovacuum already holding the lock to cancel itself, which seems quite bad. Not that this is a substantial behavior change, in that with this code the user backends which find the list already being cleaned will just add to the end of the pending list and go about their business. So if things are added to the list faster than they can be cleaned up, the size of the pending list can increase without bound. Under the existing code each concurrent user backend will try to clean the pending list at the same time. The work doesn't parallelize, so doing this just burns CPU (possibly consuming up to maintenance_work_mem for *each* backend), but it does serve to throttle the insertion rate and so keep the list from growing without bound. This is just a proof-of-concept patch, because I don't know if this approach is the right approach.

I'm not sure if this is the right approach, but I'm a little wary of involving the heavyweight lock manager in this. If pending list cleanups are frequent, this could involve a lot of additional lock manager traffic, which could be bad for performance. Even if they are infrequent, it seems like it would be more natural to handle this without some regime of locks and pins and buffer cleanup locks on the buffers that are storing the pending list, rather than a heavyweight lock on the whole relation. But I am just waving my hands wildly here.

I also thought of a buffer clean up lock on the pending list head buffer to represent the right to do a clean up. But with the proviso that once you have obtained the clean up lock, you can then drop the exclusive buffer content lock and continue to hold the conceptual lock just by maintaining the pin. I think that this would be semantically correct, but backends doing a cleanup would have to get the lock conditionally, and I think you would have too many chances for false failures where it bails out when the other party simply holds a pin. I guess I could implement it and see how it fares in my test case.

Attached is a patch to deal with this without the heavyweight locks. I realized that using the clean up lock on the meta page, rather than the pending head page, would be easier to implement as a pin was already held on the meta page throughout, and the pending head page can change during the cleanup if we need multiple passes. Also, I think the clean up lock on the metapage should actually be easier. All queries need to visit the pending head, and they hold it long enough to check all the keys (possibly hundreds, checked with arbitrarily slow SQL functions) on that page. The metapage is only checked for a few variables which are C types. I thought of checking for metadata->head == InvalidBlockNumber with a sharelock before getting the clean-up lock and then again after, but decided against it as the user backends already check that immediately before calling this function, and wouldn't call it if there was no pending list as-of that check. I exchange the exclusive content lock given to us by the LockBufferForCleanup for a share content lock, so as to not hold the exclusive lock over the IO possibly needed to read the pending head page into buffers. I don't know that this is actually a win.

This fixed the same problem I was seeing that was fixed by the previous heavy-weight lock patch. I still prefer the heavy-weight approach. The buffer clean up lock for vacuuming seems fragile to start with, and abusing it for other purposes doesn't improve on that.

Whichever approach is taken, more work is needed on the comments. And the code that currently checks for concurrent cleanup and bails out needs to be changed to throw errors or something instead. But I don't want to make too many changes until I know which approach to take, and whether it will be back-patched.

Cheers, Jeff

gin_pending_lwlock.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
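A condensed sketch of the lock dance Jeff describes: the buffer-manager calls are real, but the control flow is paraphrased from the description above rather than taken from the attached patch.

    #include "postgres.h"
    #include "storage/bufmgr.h"

    /* metabuffer is assumed to be already pinned, per the description */
    static void
    try_pending_list_cleanup(Buffer metabuffer)
    {
        if (!ConditionalLockBufferForCleanup(metabuffer))
            return;     /* someone else is cleaning (or merely pinning):
                         * just add to the pending list and move on */

        /*
         * We hold exclusive content lock plus the only pin.  Trade the
         * exclusive lock for a share lock so we don't sit on exclusive
         * across possible I/O; the retained pin marks "ownership" of
         * the conceptual cleanup lock.
         */
        LockBuffer(metabuffer, BUFFER_LOCK_UNLOCK);
        LockBuffer(metabuffer, BUFFER_LOCK_SHARE);

        /* ... read metadata and clean the pending list here ... */

        LockBuffer(metabuffer, BUFFER_LOCK_UNLOCK);
    }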
Re: [HACKERS] datestyle=postgres broken with timezone=UTC+N
Re: Tom Lane 2015-08-30 20976.1440946...@sss.pgh.pa.us

Christoph Berg c...@df7cb.de writes:

postgres=# set timezone = 'Etc/UTC+1';
SET
postgres=# set datestyle = 'postgres';
SET
postgres=# select '2015-01-01 01:00:00 +0100'::timestamptz;
Wed 31 Dec 23:00:00 2014 ETC/UTC
postgres=# select 'Wed 31 Dec 23:00:00 2014 ETC/UTC'::timestamptz;
Wed 31 Dec 22:00:00 2014 ETC/UTC

Not sure I see your point? Lacking any entry for ETC/UTC in the timezone_abbreviations file, we'll look it up in the zoneinfo database, and that will tell us it means UTC+0. So the input means 2300 UTC, and that gets displayed as 2200 in your UTC-1 display timezone. No, this isn't entirely consistent with the way the timezone name is being used in output. But that's because you've only half-configured the system for a weird timezone name. You would need an abbreviation entry as well if you want ETC/UTC to be read as UTC-1. For more info read http://www.postgresql.org/docs/9.4/static/datatype-datetime.html#DATATYPE-TIMEZONES --- see particularly the FOOBAR0 example, which is exactly what you've got here.

Hi Tom,

thanks for the heads-up. I have to confess that I was chasing a ghost here; the original report for the libpqtypes test failure I was investigating was mentioning Etc/GMT+12 (instead of UTC), which is a legal zone, and with which the timestamptz output is correct even with datestyle=postgres. I'll report back on whether the problem is on the PG or libpqtypes side; sorry for the premature noise.

By and large, I don't recommend use of POSIX notation in timezone settings; the IANA geographical zone names are much less likely to bite you on the rear.

The report is from Debian's reproducible builds project; they try to make the package builds independent from as many environment factors as possible, and one variation they try is timezones on the far plus or minus side. Luckily, PostgreSQL itself is already reproducible (but not cross-build friendly, more on that later), but libpqtypes got caught by a TZ problem in the test suite.

https://wiki.debian.org/ReproducibleBuilds

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing dead support for pre-POSIX signals
Andres Freund and...@anarazel.de writes: On 2015-08-30 15:28:42 -0400, Tom Lane wrote: No no no, I'm proposing to remove the above-quoted lines and the configure test. sig_atomic_t is required by C89; there is no reason anymore to cope with it not being provided by signal.h. Ok, that works for me. You seemed to be a bit more doubtful about the sig_atomic_t support, that's why I thought you might want to do something but rip it out. Seems like a pretty low risk thing to try. The distinction I was making was that it's fair to suspect that the pre-POSIX-signal code is actively broken. (For instance, as far back as commit 8408f6525 we were aware that libpq's thread-safe signal logic would not work with pre-POSIX signals. How many other cases do you think have snuck in unnoticed since then?) On the other hand, there's no reason to think that the substitute sig_atomic_t typedef wouldn't work fine if it were used. The argument for ripping that out is merely that it's silly to continue expending configure cycles on something that's required by C89. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.4 broken on alpha
David Fetter da...@fetter.org writes:

At a minimum, we should de-support every platform on which literally no new deployments will ever happen. I'm looking specifically at you, HPUX, and I could make a pretty good case for the idea that we can relegate 32-bit platforms to the ash heap of history, at least on the server side.

This wasn't responded to very much, but I wanted to put on record that I don't agree with the concept. There are several reasons:

1. "Run on every platform you can" is in the DNA of this and just about every other successful open-source project. You don't want to drive away potential users by not supporting their platform. If they're still getting good use out of an old OS or non-mainstream architecture, who are we to tell them not to?

2. Even if a particular platform is no longer a credible target for production deployments, it can be a useful test case to ensure that we don't get frozen into a narrow "FooOS on x86_64 is the only case worth considering" straitjacket. Software monocultures are bad news; they tend not to adapt very well when the world changes. So for instance I'm reluctant to shut down pademelon, even though its compiler is old enough to vote, because it's one of not too darn many buildfarm animals whose compilers are not gcc or derivatives. We need cases like that to keep us from building in gcc-isms. In short, supporting old platforms is one of the ways that we stay flexible enough to be able to support new platforms in the future.

3. I see no reason to desupport platforms when we don't gain anything by it. In the case of Alpha, it's pretty clear what we gain: we don't have to worry about its unlike-anything-else memory coherency model. (I'm not very worried that future platforms will adopt that idea, either.) And the lack of any support from its remaining user community tilts the scales pretty heavily against it. I'll be happy to drop testing on HPUX 10.20, or the ancient OS X versions my other buildfarm critters run, the minute there is some feature we have a clear need for that one of them doesn't have. But I don't think it's desirable to cut anything off as long as it's still able to run a buildfarm member. I think those critters are still capable of catching unexpected portability issues that might affect more-viable platforms too.

A useful comparison point is the testing Greg Stark did recently for VAX. Certainly no-one's ever again going to try to get useful work done with Postgres on a VAX, but that still taught us some good things about unnecessary IEEE-floating-point dependencies that had snuck into the code. Someday, that might be important; IEEE 754 won't be the last word on float arithmetic forever.

As an example of a desupport proposal that I think *is* well-founded, see my nearby message 27975.1440961...@sss.pgh.pa.us.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \rotate in psql
On Sat, Aug 29, 2015 at 12:48:23AM +0200, Daniel Verite wrote: Hi, This is a reboot of my previous proposal for pivoting results in psql, with a new patch that generalizes the idea further through a command now named \rotate, and some examples. Neat! Thanks for putting this together :) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest remaining Needs Review items
On Wed, Aug 26, 2015 at 4:59 PM, Michael Paquier michael.paqu...@gmail.com wrote:

On Tue, Aug 25, 2015 at 4:39 PM, Michael Paquier michael.paqu...@gmail.com wrote:

On Mon, Aug 10, 2015 at 4:34 PM, Heikki Linnakangas hlinn...@iki.fi wrote:

Hello Hackers, There are a few Needs Review items remaining in the July commitfest. Reviewers, please take action - you are holding up the commitfest. In addition to these items, there are a bunch of items in Ready for Committer state. Committers: please help with those.

At this stage, there are currently 26 patches in need of action for the current CF:

And now we are down to 2, the following ones:

- Backpatch "Resource Owner reassign locks" cache. Still not sure if Andres is planning a backpatch to 9.2. This has been closed by Tom.
- self-defined policy for sepgsql regression test. Joe is wrapping that up.

The rest has been updated in the CF app. If you have any complaints or remarks, feel free to raise them on the thread of the related patch or here, that's fine.

This has been moved to the next CF with the same status. And the commitfest of 2015-07 is now closed with the following score:

Committed: 58. Moved to next CF: 25. Rejected: 9. Returned with Feedback: 25. Total: 117.

Thanks!
-- 
Michael
Re: [HACKERS] Horizontal scalability/sharding
On 30 August 2015 at 03:17, Bruce Momjian br...@momjian.us wrote:

I have recently increased my public statements about the idea of adding horizontal scaling/sharding to Postgres.

Glad to see it. Many people have been pushing such things for years, so it is good to finally see some debate about this on Hackers.

I wanted to share with hackers a timeline of how we got here, and where I think we are going in the short term:

2012-2013: As part of writing my scaling talk (http://momjian.us/main/presentations/overview.html#scaling), studying Oracle RAC, and talking to users, it became clear that an XC-like architecture (sharding) was the only architecture that was going to allow for write scaling.

What other architectures were discussed? Where was that discussion?

Users and conference attendees I talked to were increasingly concerned about the ability of Postgres to scale for high write volumes. They didn't necessarily need that scale now, but they needed to know they could get it if they wanted it, and wouldn't need to switch to a new database in the future. This is similar to wanting a car that can get you on a highway on-ramp fast --- even if you don't need it, you want to know it is there.

+1

2014: I started to shop around the idea that we could use FDWs, parallelism, and a transaction/snapshot manager to get XC features as built-in to Postgres. (I don't remember where the original idea came from.) It was clear that having separate forks of the source code in XC and XL was never going to achieve critical mass --- there just aren't enough people who need high write scale right now, and the fork maintenance overhead is a huge burden.

I personally support the view that we should put scalability features into Postgres core, rather than run separate forks.

I realized that we would never get community acceptance to dump the XC (or XL) code needed for sharding into community Postgres

How or why did you realize that? There has never been any such discussion, AFAIK. Surely it can be possible to move required subsystems across?

, but with FDWs, we could add the features as _part_ of improving FDWs, which would benefit FDWs _and_ would be useful for sharding. (We already see some of those FDW features in 9.5.)

That is a huge presumption. Not discussed or technically analyzed in any way with the community.

October, 2014: EDB and NTT started working together in the community to start improving FDWs as a basis for an FDW-based sharding solution. Many of the 9.5 FDW improvements that also benefit sharding were developed by a combined EDB/NTT team. The features improved FDWs independent of sharding, so they didn't need community buy-in on sharding to get them accepted.

June, 2015: I attended the PGCon sharding unconference session and there was a huge discussion about where we should go with sharding. I think the big take-away was that most people liked the FDW approach, but had business/customer reasons for wanting to work on XC or XL because those would be production-ready faster.

Cough, cough. You must surely be joking that most people liked the FDW approach? How did we measure the acceptance of this approach? What actually is the FDW approach? Since it's not been written down anywhere, or even explained verbally, how can anyone actually agree to it?

July, 2015: Oleg Bartunov and his new company Postgres Professional (PP) started to think about joining the FDW approach, rather than working on XL, as they had stated at PGCon in June. A joint NTT/EDB/PP phone-in meeting is scheduled for September 1.

August, 2015: While speaking at SFPUG, Citus Data approached me about joining the FDW sharding team. They have been invited to the September 1 meeting, as have the XC and XL people.

2ndQuadrant is working in this area, specifically bringing XL 9.5 forwards. Please can invites be posted to myself, Pavan Deolasee and Petr Jelinek also? I'll pass on to others also.

Koichi Suzuki is arranging a meeting in Hong Kong for XC/XL discussions. Presumably EDB is invited also? If Koichi is a leading organizer of this, why are there two meetings?

October, 2015: EDB is sponsoring a free 3-hour summit about FDW sharding at the PG-EU conference in Vienna. Everyone is invited, but it is hoped most of the September 1 folks can attend.

February, 2016: Oleg is planning a similar meeting at their February Moscow conference. Anyway, I wanted to explain the work that has been happening around sharding.

Thanks.

As things move forward, I am increasingly convinced that write scaling will be needed soon,

+1

that the XC approach is the only reasonable way to do it, and that FDWs are the cleanest way to get it into community Postgres.

Those two things aren't at all obvious to me. Please don't presume my opposition. If the technical information were made public, I might understand and agree with the FDW approach, perhaps
[HACKERS] SimpleTee flush
I find the TestLib.pm framework to be more friendly with the attached tweak to SimpleTee.pm. It's not a big deal, so if anyone objects I'll let it drop, and I don't think it merits anything fancier. Objections? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company SimpleTee-flush.diff Description: invalid/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SimpleTee flush
Tom Lane t...@sss.pgh.pa.us wrote: Just out of curiosity --- what's the rationale for flushing after every line, rather than once at the end of the loop? Is there actual input-reading going on underneath the @$self notation? If it's just dumping data from process memory, I should think one flush would be sufficient. Isn't the loop over file handles written to? What would you flush outside the loop? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SimpleTee flush
Kevin Grittner kgri...@ymail.com writes: Tom Lane t...@sss.pgh.pa.us wrote: Just out of curiosity --- what's the rationale for flushing after every line, rather than once at the end of the loop? Isn't the loop over file handles written to? Oh... I just automatically read it as looping over some data, not printing the same data to multiple filehandles. That's kind of odd ... but yeah, you would have to do it like that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
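The Perl under discussion isn't shown in the digest, so here is a C analog of the pattern Kevin describes: the print routine loops over the tee'd handles, so the flush has to happen inside that per-handle loop rather than once after it.

    #include <stdio.h>
    #include <stdarg.h>

    /*
     * Write one formatted line to every tee'd handle, flushing each so
     * interleaved output (e.g. test logs) appears promptly everywhere.
     */
    static void
    tee_print(FILE **handles, int nhandles, const char *fmt, ...)
    {
        for (int i = 0; i < nhandles; i++)
        {
            va_list ap;

            va_start(ap, fmt);
            vfprintf(handles[i], fmt, ap);
            va_end(ap);
            fflush(handles[i]);     /* per-handle flush, inside the loop */
        }
    }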
Re: [HACKERS] datestyle=postgres broken with timezone=UTC+N
Christoph Berg c...@df7cb.de writes:

postgres=# set timezone = 'Etc/UTC+1';
SET
postgres=# set datestyle = 'postgres';
SET
postgres=# select '2015-01-01 01:00:00 +0100'::timestamptz;
Wed 31 Dec 23:00:00 2014 ETC/UTC
postgres=# select 'Wed 31 Dec 23:00:00 2014 ETC/UTC'::timestamptz;
Wed 31 Dec 22:00:00 2014 ETC/UTC

Not sure I see your point? Lacking any entry for ETC/UTC in the timezone_abbreviations file, we'll look it up in the zoneinfo database, and that will tell us it means UTC+0. So the input means 2300 UTC, and that gets displayed as 2200 in your UTC-1 display timezone.

No, this isn't entirely consistent with the way the timezone name is being used in output. But that's because you've only half-configured the system for a weird timezone name. You would need an abbreviation entry as well if you want ETC/UTC to be read as UTC-1. For more info read http://www.postgresql.org/docs/9.4/static/datatype-datetime.html#DATATYPE-TIMEZONES --- see particularly the FOOBAR0 example, which is exactly what you've got here.

By and large, I don't recommend use of POSIX notation in timezone settings; the IANA geographical zone names are much less likely to bite you on the rear.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SimpleTee flush
Kevin Grittner kgri...@ymail.com writes: I find the TestLib.pm framework to be more friendly with the attached tweak to SimpleTee.pm. It's not a big deal, so if anyone objects I'll let it drop, and I don't think it merits anything fancier. Just out of curiosity --- what's the rationale for flushing after every line, rather than once at the end of the loop? Is there actual input-reading going on underneath the @$self notation? If it's just dumping data from process memory, I should think one flush would be sufficient. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Information of pg_stat_ssl visible to all users
On 2015-08-30 11:33:28 -0400, Stephen Frost wrote: Yeah, I'm not really thrilled with all of this information being available to everyone on the system. We already get ding'd by people for not limiting who can see what connections there are to the database and this is doubling-down on that. I don't buy that the relevant piece of information is the CN when the connection itself is visible. Neither do I buy the argument that later hiding this for ssl once we have more granular permissions is going to be relevantly painful in comparison to changing the contents of pg_stat_activity itself. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: About CMake v2
Alvaro Herrera alvhe...@2ndquadrant.com writes: I imagine that esoteric platforms are not going to have cmake at all and are going to need their own installation anyway. Not sure if that's going to be more onerous than the requirement to install GNU make. If we get to the point where this is starting to look like a real proposal, I'll try to install cmake from source on my buildfarm critters, none of which have it ATM. And we'll need to get other buildfarm owners to do likewise. So we'll be able to acquire data points on whether that's a big problem or not. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Information of pg_stat_ssl visible to all users
* Michael Paquier (michael.paqu...@gmail.com) wrote: On Sun, Aug 30, 2015 at 5:27 AM, Bruce Momjian wrote: I know I am coming in late here, but I know Heroku uses random user names to allow a cluster to have per-user databases without showing external user name details: [...] I can see them having problems with a user being able to see the SSL remote user names of all connected users. Yep, and I can imagine that this is the case of any company managing cloud nodes with Postgres embedded, and at least to me that's a real concern. Yeah, I'm not really thrilled with all of this information being available to everyone on the system. We already get ding'd by people for not limiting who can see what connections there are to the database and this is doubling-down on that. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Potential GIN vacuum bug
On Sun, Aug 30, 2015 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Jeff Janes jeff.ja...@gmail.com writes:

Your earlier point about how the current design throttles insertions to keep the pending list from growing without bound seems like a bigger deal to worry about. I think we'd like to have some substitute for that. Perhaps we could make the logic in insertion be something like

    if (pending-list-size > threshold)
    {
        if (conditional-lock-acquire(...))
        {
            do-pending-list-cleanup;
            lock-release;
        }
        else if (pending-list-size > threshold * 2)
        {
            unconditional-lock-acquire(...);
            if (pending-list-size > threshold)
                do-pending-list-cleanup;
            lock-release;
        }
    }

so that once the pending list got too big, incoming insertions would wait for it to be cleared. Whether to use a 2x safety margin or something else could be a subject for debate, of course.

If the goal is to not change existing behavior (like for back patching) the margin should be 1, always wait. But we would still have to deal with the fact that unconditional acquire attempt by the backends will cause a vacuum to cancel itself, which is undesirable. If we define a new namespace for this lock (like the relation extension lock has its own namespace) then perhaps the cancellation code could be made to not cancel on that condition. But that too seems like a lot of work to backpatch.

Would we bother to back-patch a theoretical bug which there is no evidence is triggering in the field? Of course, if people are getting bit by this, they probably wouldn't know. You search for malevolent unicorns, get no hits, and just assume there are no hits, without scouring the table and seeing it is an index problem. Or if you do realize it is an index problem, you would probably never trace it back to the cause of the problem. There are quite a few reports of mysterious index corruptions which never get resolved.

If we want to improve the current behavior rather than fix a bug, then I think that if the list is greater than threshold*2 and the cleaning lock is unavailable, what it should do is proceed to insert the tuple's keys into the index itself, as if fastupdate = off. That would require some major surgery to the existing code, as by the time it invokes the clean up, it is too late to not insert into the pending list.

Cheers, Jeff
Re: [HACKERS] 9.4 broken on alpha
On Mon, Aug 31, 2015 at 8:42 AM, Tom Lane t...@sss.pgh.pa.us wrote: A useful comparison point is the testing Greg Stark did recently for VAX. Certainly no-one's ever again going to try to get useful work done with Postgres on a VAX, but that still taught us some good things about unnecessary IEEE-floating-point dependencies that had snuck into the code. Someday, that might be important; IEEE 754 won't be the last word on float arithmetic forever. Just by the way, there is at least one example of a non-IEEE floating point format supported by a current production compiler and hardware: IBM XL C on z/OS (and possibly other platforms) can use either IEEE or IBM's hex float format, depending on a compiler option. https://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Horizontal scalability/sharding
On Mon, Aug 31, 2015 at 11:08 AM, Bruce Momjian br...@momjian.us wrote: On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote: Well, I have had many such discussions with XC/XL folks, and that was my opinion. I have seen almost no public discussion about this because the idea had almost no chance of success. If it was possible, someone would have already suggested it on this list. Or perhaps people invested in this area had other obligations or lacked motivation and/or time to work to push up for things in core. That's not possible to know, and what is done is done. Well, I have talked to everyone privately about this, and concluded that while horizontal scalability/sharding is useful, it is unlikely that the code volume of something like XC or XL would be accepted into the community, and frankly, now that we have FDWs, it is hard to imagine why we would _not_ go in the FDW direction. If I recall correctly in terms of numbers, that's indeed 40k of code, the main areas of XC code being the GTM, the planner changes for expression and join push down, and the connection pooler for parallel query execution. ISTM that FDW is a portion of the puzzle, there are other pieces that could be used toward an in-core integration, like the parallel stuff Amit Kapila is working on to allow remote query execution in parallel of local scans. Also, XC/XL were performing well on OLTP thanks to the connection pooler: this should indeed be part of the FDW portion managing the foreign scans. This may sound like a minor issue compared to the others, but already established connections help a lot when scaling out with foreign servers. Of course, people have concerns, and FDWs might need to be improved, but it is something worth researching. We might find out FDWs can't be used at all, and that we have to either add much more code to Postgres to do sharding, do something like pg_shard, or not implement built-in sharding at all, but at least it is time to research this. I am really looking forward to hearing the arguments of the authors of pg_shard on the matter. OK, I will send you a separate email and you can then supply their email addresses. FWIW, I would be interested in that as well. I worked in this area of things for a couple of years as well FWIW. OK, I will send you an email. Thanks. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Horizontal scalability/sharding
On Sun, Aug 30, 2015 at 10:36:23PM +0300, Oleg Bartunov wrote:

Honestly, I still don't know which approach is better, we already played with XL (ported to 9.4) and identified some very strong issues with inconsistency, which scared us, especially taking into account how easily we found them. XC people have fixed them, but I'm not sure if they were fundamental and if we could construct more sophisticated tests and find more issues in XC/XL. We are also a bit disappointed by Huawei's position about the CSN patch, which we hoped to use for our XTM. The FDW approach has been actively criticized by the pg_shard people, and that also made me a bit suspicious.

Yep, that has me concerned too. The pg_shard people will be on the September 1 call and are working on a Google document to explain their concerns about FDWs for sharding.

It looks like we are doomed to continue several development forks, so we decided to work on a very important common project, XTM, which we hoped could be accepted by all parties and eventually committed to 9.6. Now I see we were right, unfortunately.

Yes, the ability to add independent parts that can eventually be used for sharding is a strong indication that doing this incrementally is a good approach.

Again, could we organize a meeting somewhere in September? US is not good for us, but other places should be ok. I want to have an agreement at least on XTM. We are still testing various approaches, though. We could present the results of our experiments and are open to discussion. It's not an easy project, but it's something we could do for 9.6.

Good. XTM is a must-have for several use-cases, including sharding.

I'm very glad Bruce started this discussion in -hackers, since it's silly to me to participate in both threads :) Let's meet in September!

In summary, I think we need to start working on built-in sharding, and FDWs are the only way I can see to do it with minimal code changes, which I think might be a community requirement. It might not work, but right now, it is the only possible approach I can see.

-- 
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Horizontal scalability/sharding
On Mon, Aug 31, 2015 at 7:29 AM, Bruce Momjian br...@momjian.us wrote:

On Sun, Aug 30, 2015 at 03:31:10PM +0100, Simon Riggs wrote:

I realized that we would never get community acceptance to dump the XC (or XL) code needed for sharding into community Postgres

How or why did you realize that? There has never been any such discussion, AFAIK. Surely it can be possible to move required subsystems across?

Well, I have had many such discussions with XC/XL folks, and that was my opinion. I have seen almost no public discussion about this because the idea had almost no chance of success. If it was possible, someone would have already suggested it on this list.

Or perhaps people invested in this area had other obligations or lacked the motivation and/or time to push things up for core. That's not possible to know, and what is done is done.

July, 2015: Oleg Bartunov and his new company Postgres Professional (PP) started to think about joining the FDW approach, rather than working on XL, as they had stated at PGCon in June. A joint NTT/EDB/PP phone-in meeting is scheduled for September 1. August, 2015: While speaking at SFPUG, Citus Data approached me about joining the FDW sharding team. They have been invited to the September 1 meeting, as have the XC and XL people.

2ndQuadrant is working in this area, specifically bringing XL 9.5 forwards.

Yes, I saw the blog post about that: http://blog.2ndquadrant.com/working-towards-postgres-xl-9-5/

Please can invites be posted to myself, Pavan Deolasee and Petr Jelinek also? I'll pass on to others also.

OK, I will send you a separate email and you can then supply their email addresses.

FWIW, I would be interested in that as well. I worked in this area of things for a couple of years.

Koichi Suzuki is arranging a meeting in Hong Kong for XC/XL discussions. Presumably EDB is invited also? If Koichi is a leading organizer of this, why are there two meetings?

I certainly have heard nothing about it, except third-hand people telling me a meeting is happening. I assumed those meetings were XC/XL-specific.

Yep, that's my understanding as well, and as far as I know things have been carried this way until now, aka XC/XL and Postgres core are aimed to live as separate communities.

-- 
Michael
Re: [HACKERS] Potential GIN vacuum bug
Jeff Janes jeff.ja...@gmail.com writes:

On Sun, Aug 30, 2015 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Your earlier point about how the current design throttles insertions to keep the pending list from growing without bound seems like a bigger deal to worry about. I think we'd like to have some substitute for that. ...

If the goal is to not change existing behavior (like for back patching) the margin should be 1, always wait.

The current behavior is buggy, both as to performance and correctness, so I'm not sure how come "don't change the behavior" should be a requirement.

But we would still have to deal with the fact that unconditional acquire attempt by the backends will cause a vacuum to cancel itself, which is undesirable.

Good point.

If we define a new namespace for this lock (like the relation extension lock has its own namespace) then perhaps the cancellation code could be made to not cancel on that condition. But that too seems like a lot of work to backpatch.

We could possibly teach the autocancel logic to distinguish this lock type from others without using a new namespace. That seems a bit klugy, but maybe better than adding a new namespace. (For example, there are probably only a couple of modes in which we take page-level locks at present. Choosing a currently unused, but self-exclusive, mode for taking such a lock might serve.)

Would we bother to back-patch a theoretical bug which there is no evidence is triggering in the field?

What's theoretical about it? You seemed pretty sure that the issue in http://www.postgresql.org/message-id/flat/CA+bfosGVGVQhMAa=0-mue6coo7dbsgayxb-xsnr5vm-s39h...@mail.gmail.com was exactly this.

If we want to improve the current behavior rather than fix a bug, then I think that if the list is greater than threshold*2 and the cleaning lock is unavailable, what it should do is proceed to insert the tuple's keys into the index itself, as if fastupdate = off. That would require some major surgery to the existing code, as by the time it invokes the clean up, it is too late to not insert into the pending list.

Meh. That's introducing a whole new behavioral regime, which quite aside from correctness bugs might introduce new performance bugs of its own. It certainly doesn't sound like a better back-patch candidate than the other thing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Horizontal scalability/sharding
On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote: Well, I have had many such discussions with XC/XL folks, and that was my opinion. I have seen almost no public discussion about this because the idea had almost no chance of success. If it was possible, someone would have already suggested it on this list. Or perhaps people invested in this area had other obligations or lacked the motivation and/or time to push for things in core. That's not possible to know, and what is done is done. Well, I have talked to everyone privately about this, and concluded that while horizontal scalability/sharding is useful, it is unlikely that the code volume of something like XC or XL would be accepted into the community, and frankly, now that we have FDWs, it is hard to imagine why we would _not_ go in the FDW direction. Of course, people have concerns, and FDWs might need to be improved, but it is something worth researching. We might find out FDWs can't be used at all, and that we have to either add much more code to Postgres to do sharding, do something like pg_shard, or not implement built-in sharding at all, but at least it is time to research this. OK, I will send you a separate email and you can then supply their email addresses. FWIW, I would be interested in that as well. I worked in this area of things for a couple of years too. OK, I will send you an email. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] icc vs. gcc-style asm blocks ... maybe the twain can meet?
I wrote: I came across some websites suggesting that icc will handle gcc-style asm blocks as long as you give it the -fasm-blocks command line option. It would be awfully nice to get rid of the __INTEL_COMPILER special cases in s_lock.h and the atomics headers --- would someone who has icc at hand check into this theory? Hmm ... wait a second. The main collection of asm blocks in s_lock.h believes that Intel's compiler will take gcc-style asm without any help: #if defined(__GNUC__) || defined(__INTEL_COMPILER) It has believed that since 2003. There are just two stanzas in s_lock.h that think icc needs its own implementation; one was introduced in 2005 and the other in 2014, and I'm betting both of them are confused about it. The other places where __INTEL_COMPILER is used to exclude an asm block are also of relatively recent vintage. I'm suspecting that they are cargo-cult programming rather than actually necessary special cases. It's possible that these extra implementations are worth the trouble to carry because icc is smarter about those special intrinsics than it is about asm blocks. However, unless someone can point to some solid evidence of that, I think we should get rid of 'em. That code is quite enough of an #ifdef snake's nest without carrying versions we don't demonstrably need. regards, tom lane
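[Editor's note: for reference, the stanza in question is shaped roughly like the following, an abridged sketch of s_lock.h's x86 test-and-set; treat the details as approximate rather than a verbatim quote of the tree. Under this guard, icc has been expected to digest gcc-style asm since 2003.]

#if defined(__GNUC__) || defined(__INTEL_COMPILER)

typedef unsigned char slock_t;

/*
 * x86 test-and-set: atomically exchange 1 into the lock byte and
 * return the previous value, so a nonzero result means someone else
 * already holds the spinlock.
 */
static __inline__ int
tas(volatile slock_t *lock)
{
    slock_t _res = 1;

    __asm__ __volatile__(
        "   lock            \n"
        "   xchgb   %0,%1   \n"
        : "+q"(_res), "+m"(*lock)
        : /* no inputs */
        : "memory", "cc");
    return (int) _res;
}

#endif   /* __GNUC__ || __INTEL_COMPILER */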
[HACKERS] [PROPOSAL] Table Partition
Hi, everyone! I'd like to propose a Postgres partition implementation. First, I will show the design to everyone and talk about it. If we think the design is not too bad and can be committed to the PostgreSQL baseline, then I will post the code to the community. (Note: my English is not very good.)

Table Partition Design
======================

In this design, partitions are normal tables in inheritance hierarchies, with the same table structure as the partitioned table. In pg_class we have an additional relpartition field which has the following values:

's'  /* single regular table */
'r'  /* partitioned table by range */
'l'  /* partitioned table by list */
'h'  /* partitioned table by hash */
'c'  /* child partition table */

We add a new system schema named 'pg_partition', just like 'pg_toast', in which we create the partition catalog tables that store the partition entries. Let's assume the partition catalog's name is pg_partition_2586 (2586 is the partitioned table's OID in pg_class). A range or interval partition catalog's structure is as follows:

column    | data type                    | comment
----------+------------------------------+-------------------------------------------------------
partname  | name                         | a partition's name; this is the primary key
partid    | oid                          | a partition's OID in pg_class
interval  | text                         | an interval partition's interval (maybe an expression)
partkey1  | depends on partitioned table |
...       |                              |
partkeyN  | depends on partitioned table |

partkey1, ..., partkeyN form a partition's upper bound. Finally, we make a unique constraint on partkey1, ..., partkeyN. Every time we create a new partition, we insert a new tuple into this partition catalog. Every time we drop an old partition, we delete the related tuple from this partition catalog. For a partitioned table's CREATE action, we transform the action into the CREATE action of the partitioned table and partitions, plus the INSERT action into the partition catalog. For the INSERT action, we implement a RelationGetTuplePartid method, which finds the partition a tuple belongs to by doing an index scan on the partition catalog table (assume it is pg_partition_2586), and an ExecGetPartitionResultRel method, which returns the partition's ResultRelInfo to execute the INSERT action. For a partitioned table's scan and JOIN actions, we implemented a plan node named 'PartitionExpand'. The plan node can expand the partitioned table scan node into a list of partition scan nodes according to the filter conditions, and it can expand a partitioned table JOIN node into a list of partition JOIN nodes wisely. We implemented a DynamicPrunePartition method, which expands the partitioned table's scan node into a list of partition scan nodes, and a DynamicPrunePartitionJoin method, which expands the partitioned table's JOIN node into a list of partition JOIN nodes. These expand actions happen in the ExecInitPartitionExpand function, when the executor is initialized, and all these actions are implemented based on the partition catalog. For the UPDATE and DELETE actions, we just set the real partition as the ResultRelInfo when ExecPartitionExpand is running. For the pg_dump backup action, we should dump the partition catalog and the relpartition field in pg_class. These are the main points of the design; I can show any detail you wonder about later.
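[Editor's note: to make the proposal easier to picture, here is a hedged SQL sketch of one such per-table partition catalog, for a table range-partitioned on a single date column. Everything is illustrative and follows the example above; none of this is an existing PostgreSQL catalog.]

CREATE SCHEMA pg_partition;

-- Hypothetical catalog for a partitioned table whose pg_class OID is 2586.
CREATE TABLE pg_partition.pg_partition_2586 (
    partname  name PRIMARY KEY,  -- partition's name
    partid    oid  NOT NULL,     -- partition's OID in pg_class
    interval  text,              -- only used for interval partitioning
    partkey1  date NOT NULL,     -- upper bound on the partition key
    UNIQUE (partkey1)
);

-- RelationGetTuplePartid would conceptually reduce to an index scan like:
SELECT partid
  FROM pg_partition.pg_partition_2586
 WHERE partkey1 > date '2015-08-31'
 ORDER BY partkey1
 LIMIT 1;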
Re: [HACKERS] Horizontal scalability/sharding
On Sun, Aug 30, 2015 at 03:31:10PM +0100, Simon Riggs wrote: On 30 August 2015 at 03:17, Bruce Momjian br...@momjian.us wrote: I have recently increased my public statements about the idea of adding horizontal scaling/sharding to Postgres. Glad to see it. Many people have been pushing such things for years, so it is good to finally see some debate about this on Hackers. Agreed. Right now, in our community, we are only seeing users who are happy with what Postgres offers but think they might need massive horizontal scalability in the future. I think there is a larger group that cares about massive horizontal scalability, but those people are using other software right now, so we don't see them yet. Without a roadmap for built-in massive horizontal scalability, I think Postgres adoption will eventually suffer. I wanted to share with hackers a timeline of how we got here, and where I think we are going in the short term: 2012-2013: As part of writing my scaling talk (http://momjian.us/main/presentations/overview.html#scaling), studying Oracle RAC, and talking to users, it became clear that an XC-like architecture (sharding) was the only architecture that was going to allow for write scaling. What other architectures were discussed? Where was that discussion? That was mostly my conclusion. I explained it to small groups at conferences and Postgres user groups. No one said I was wrong, but that is about the level of debate I had. 2014: I started to shop around the idea that we could use FDWs, parallelism, and a transaction/snapshot manager to get XC features as built-in to Postgres. (I don't remember where the original idea came from.) It was clear that having separate forks of the source code in XC and XL was never going to achieve critical mass --- there just aren't enough people who need high write scale right now, and the fork maintenance overhead is a huge burden. I personally support the view that we should put scalability features into Postgres core, rather than run separate forks. Good, I do think it is time, but as I stated above, there is limited interest in our current community, so the tolerance for additional community code to accomplish this is also limited. This is the big thing that had me excited about using FDWs --- FDW improvements can get us closer to sharding without requiring community acceptance of sharding-only features. I realized that we would never get community acceptance to dump the XC (or XL) code needed for sharding into community Postgres How or why did you realize that? There has never been any such discussion, AFAIK. Surely it can be possible to move required subsystems across? Well, I have had many such discussions with XC/XL folks, and that was my opinion. I have seen almost no public discussion about this because the idea had almost no chance of success. If it was possible, someone would have already suggested it on this list. But with FDWs, we could add the features as _part_ of improving FDWs, which would benefit FDWs _and_ would be useful for sharding. (We already see some of those FDW features in 9.5.) That is a huge presumption. Not discussed or technically analyzed in any way with the community. True. It seemed pretty obvious to me. October, 2014: EDB and NTT started working together in the community to start improving FDWs as a basis for an FDW-based sharding solution. Many of the 9.5 FDW improvements that also benefit sharding were developed by a combined EDB/NTT team. The features improved FDWs independent of sharding, so they didn't need community buy-in on sharding to get them accepted. June, 2015: I attended the PGCon sharding unconference session and there was a huge discussion about where we should go with sharding. I think the big take-away was that most people liked the FDW approach, but had business/customer reasons for wanting to work on XC or XL because those would be production-ready faster. Cough, cough. You must surely be joking that most people liked the FDW approach? How did we measure the acceptance of this approach? Well, I didn't have my audience-meter with me at the time. ;-) The discussion was mostly in the hallway after the unconference session, Future of PostgreSQL shared-nothing cluster by Konstantin Knizhnik, Alexander Korotkov, and Oleg Bartunov. Again, when I explained the ability to use FDWs to get sharding into Postgres with minimal additional code, no one said the idea was crazy, which I took as a big thumbs-up! When I asked why to continue with XC/XL, I was told those were more mature and more customer-ready, which is true. I will not quote people from the hallway discussion for privacy reasons. What actually is the FDW approach? Since it's not been written down anywhere, or even
Re: [HACKERS] Horizontal scalability/sharding
On Sun, Aug 30, 2015 at 10:08:06PM -0400, Bruce Momjian wrote: On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote: Well, I have had many such discussions with XC/XL folks, and that was my opinion. I have seen almost no public discussion about this because the idea had almost no chance of success. If it was possible, someone would have already suggested it on this list. Or perhaps people invested in this area had other obligations or lacked the motivation and/or time to push for things in core. That's not possible to know, and what is done is done. Well, I have talked to everyone privately about this, and concluded that while horizontal scalability/sharding is useful, it is unlikely that the code volume of something like XC or XL would be accepted into the community, and frankly, now that we have FDWs, it is hard to imagine why we would _not_ go in the FDW direction. Actually, there was hope that XC or XL would get popular enough that it would justify adding their code into community Postgres, but that never happened. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Horizontal scalability/sharding
On Mon, Aug 31, 2015 at 11:48 AM, Bruce Momjian br...@momjian.us wrote: On Sun, Aug 30, 2015 at 10:08:06PM -0400, Bruce Momjian wrote: On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote: Well, I have had many such discussions with XC/XL folks, and that was my opinion. I have seen almost no public discussion about this because the idea had almost no chance of success. If it was possible, someone would have already suggested it on this list. Or perhaps people invested in this area had other obligations or lacked the motivation and/or time to push for things in core. That's not possible to know, and what is done is done. Well, I have talked to everyone privately about this, and concluded that while horizontal scalability/sharding is useful, it is unlikely that the code volume of something like XC or XL would be accepted into the community, and frankly, now that we have FDWs, it is hard to imagine why we would _not_ go in the FDW direction. Actually, there was hope that XC or XL would get popular enough that it would justify adding their code into community Postgres, but that never happened. Forks are bound to die without proper maintenance resources. Still, for XC/XL, what does not help is the complication of the architecture and of SPOF management, particularly the GTM, which was something completely new and not well understood (there is a GTM standby, but this model is weak IMO, does not scale the way ordinary standbys do, and impacts the overall performance of the cluster). -- Michael
[HACKERS] icc vs. gcc-style asm blocks ... maybe the twain can meet?
I came across some websites suggesting that icc will handle gcc-style asm blocks as long as you give it the -fasm-blocks command line option. It would be awfully nice to get rid of the __INTEL_COMPILER special cases in s_lock.h and the atomics headers --- would someone who has icc at hand check into this theory? regards, tom lane
[HACKERS] Extended query protocol violation?
While playing with a Java application using the JDBC driver, I noticed an interesting fact: when autocommit is off, the JDBC driver issues the following messages:

20:10:54.731 (2) FE=> Parse(stmt=S_1,query=BEGIN,oids={})
20:10:54.731 (2) FE=> Bind(stmt=S_1,portal=null)
20:10:54.731 (2) FE=> Execute(portal=null,limit=0)
20:10:54.732 (2) FE=> Parse(stmt=null,query=SELECT ...
20:10:54.733 (2) FE=> Bind(stmt=null,portal=null)
20:10:54.733 (2) FE=> Describe(portal=null)
20:10:54.733 (2) FE=> Execute(portal=null,limit=0)
20:10:54.733 (2) FE=> Sync
20:10:54.734 (2) <=BE ParseComplete [S_1]
20:10:54.734 (2) <=BE BindComplete [null]
20:10:54.734 (2) <=BE CommandStatus(BEGIN)
20:10:54.735 (2) <=BE ParseComplete [null]
20:10:54.735 (2) <=BE BindComplete [null]
20:10:54.735 (2) <=BE RowDescription(15)

Notice that the JDBC driver sends Parse, Bind and Execute without a following Sync, then immediately sends another Parse message. I wonder if this violates our extended query protocol. From the manual: "At completion of each series of extended-query messages, the frontend should issue a Sync message." "Each series of extended-query messages" is a little vague here, but to me it seems to refer to a sequence of messages starting with a Parse and ending with an Execute message. If so, I think the above message sequence violates the protocol. The application code that produces this packet sequence is essentially:

connection.setAutoCommit(false);
stmt = connection.createStatement();
rs = stmt.executeQuery(sql);

Comments? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] Extended query protocol violation?
Tatsuo Ishii is...@postgresql.org writes: Notice that the JDBC driver sends Parse, Bind and Execute without a following Sync, then immediately sends another Parse message. I wonder if this violates our extended query protocol. It does not. "At completion of each series of extended-query messages, the frontend should issue a Sync message." That's a "should", not a "must". The important point here is that if the BEGIN were to fail for some reason, the backend would skip the second command altogether, since it would skip to the Sync before resuming processing messages. If the JDBC driver isn't expecting that behavior, that's a bug in the driver --- but it's not a protocol violation. In fact, the protocol is intentionally designed to let you stack up commands like that. It gives you a tradeoff of potential concurrency vs possible complexity in error recovery handling. regards, tom lane
Re: [HACKERS] checkpointer continuous flushing
On Mon, Aug 24, 2015 at 12:45 PM, Fabien COELHO coe...@cri.ensmp.fr wrote: Also check the file:

sh> file ./avg.py
./avg.py: Python script, UTF-8 Unicode text executable

There were some CRLF line terminators; after removing those, it worked fine, and here are the results of some of the tests done for the sorting patch (checkpoint-continuous-flush-10-a):

Config Used
-----------
M/c details: IBM POWER-8, 24 cores, 192 hardware threads, RAM = 492GB

Test details
------------
warmup=60
scale=300
max_connections=150
shared_buffers=8GB
checkpoint_timeout=2min
time=7200
synchronous_commit=on
max_wal_size=5GB
parallelism - 128 clients, 128 threads

Sort - off
avg over 7200: 8256.382528 ± 6218.769282 [0.00, 76.05, 10975.50, 13105.95, 21729.00]
percent of values below 10.0: 19.5%

Sort - on
avg over 7200: 8375.930639 ± 6148.747366 [0.00, 84.00, 10946.00, 13084.00, 20289.90]
percent of values below 10.0: 18.6%

Before going to the conclusion, let me try to explain the above data (I am explaining it again even though Fabien has explained it, to make it clear for anyone who has not read his mail). Let's try to understand the data for the sorting - off option:

avg over 7200: 8256.382528 ± 6218.769282
8256.382528 - average tps for the 7200s pgbench run
6218.769282 - standard deviation of the per-second figures

[0.00, 76.05, 10975.50, 13105.95, 21729.00]
These 5 values can be read as minimum TPS, q1, median TPS, q3, maximum TPS over the 7200s pgbench run. As far as I understand, q1 and q3 are medians of subsets of the values, which I didn't focus on much.

percent of values below 10.0: 19.5%
The above means the percentage of time the result is below 10 tps.

Now about the test results: these tests are done for pgbench full-speed runs, and the above results indicate that there is approximately a 1.5% improvement in avg. TPS and a ~1% improvement in the fraction of tps values below 10 with sorting on, and there is almost no improvement in the median or maximum TPS values; instead they are slightly lower when sorting is on, which could be due to run-to-run variation. I have done more tests as well by varying time and number of clients, keeping the other configuration the same as above, but the results are quite similar. The results of the sorting patch for the tests done indicate that the win is not big enough with just doing sorting during checkpoints; we should consider the flush patch along with sorting. I would like to perform some tests with both patches together (sort + flush), unless somebody thinks that the sorting patch alone is beneficial and we should test some other kinds of scenarios to see its benefit. The reason for the tablespace balancing is that in the current postgres buffers are written more or less randomly, so the load is (probably) implicitly and statistically balanced over tablespaces because of this randomness, and indeed, AFAIK, people with multi-tablespace setups have not complained that postgres was using the disks sequentially. However, once the buffers are sorted per file, the order becomes deterministic and there is no more implicit balancing, which means that if someone has a pg setup with several disks it will write sequentially on these instead of in parallel. What if tablespaces are not on separate disks, or there is not enough hardware support to make writes parallel? I think for such cases it might be better to do it sequentially. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
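[Editor's note: to make the sorting-plus-balancing mechanics concrete, a hedged C sketch follows; the type and function names are illustrative, not the patch's actual identifiers.]

#include "postgres.h"
#include <stdlib.h>   /* qsort */

/* One dirty buffer scheduled for the checkpoint write pass. */
typedef struct CkptSortItem
{
    Oid         tsId;       /* tablespace */
    Oid         relNode;    /* relation file node */
    int         forkNum;
    BlockNumber blockNum;
    int         buf_id;
} CkptSortItem;

/* Order writes by tablespace, then file, then block, so each
 * tablespace's writes form one sequential run. */
static int
ckpt_item_cmp(const void *pa, const void *pb)
{
    const CkptSortItem *a = (const CkptSortItem *) pa;
    const CkptSortItem *b = (const CkptSortItem *) pb;

    if (a->tsId != b->tsId)
        return a->tsId < b->tsId ? -1 : 1;
    if (a->relNode != b->relNode)
        return a->relNode < b->relNode ? -1 : 1;
    if (a->forkNum != b->forkNum)
        return a->forkNum < b->forkNum ? -1 : 1;
    if (a->blockNum != b->blockNum)
        return a->blockNum < b->blockNum ? -1 : 1;
    return 0;
}

/*
 * After qsort(items, n, sizeof(CkptSortItem), ckpt_item_cmp), issuing
 * writes round-robin across the per-tablespace runs is what keeps
 * independent disks busy in parallel; with a single tablespace (or
 * tablespaces sharing one disk, Amit's case) it degenerates to plain
 * sequential order.
 */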
Re: [HACKERS] Horizontal scalability/sharding
On Sun, Aug 30, 2015 at 7:47 AM, Bruce Momjian br...@momjian.us wrote: I have recently increased my public statements about the idea of adding horizontal scaling/sharding to Postgres. I wanted to share with hackers a timeline of how we got here, and where I think we are going in the short term: 2012-2013: As part of writing my scaling talk (http://momjian.us/main/presentations/overview.html#scaling), studying Oracle RAC, and talking to users, it became clear that an XC-like architecture (sharding) was the only architecture that was going to allow for write scaling. I think a sharding-like architecture is quite useful for certain kinds of workloads where users can manage to arrange queries and data layout in an optimized way, something which I hope users might agree to adjust if required. One thing to consider here is what kind of scaling we are expecting from such a system, and whether it is sufficient, considering we will stay focused on this architecture for horizontal scalability. Generally speaking, the scaling in such systems is limited by the number of profitable partitions a user can create based on the data, and then cross-partition transactions suck the performance/scalability out of such systems. I understand that there is definitely a benefit in proceeding with a sharding-like architecture, as there are already some PostgreSQL-based forks which use such an architecture, so if we go the same way, we can save some effort rather than inventing or following some other architecture; however, there is no harm in discussing the pros and cons of other architectures like Oracle RAC, Google F1 or others. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] On-demand running query plans using auto_explain and signals
2015-08-30 10:34 GMT+02:00 Pavel Stehule pavel.steh...@gmail.com: 2015-08-30 10:30 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de: On Aug 29, 2015 7:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2015-08-29 18:36 GMT+02:00 Andres Freund and...@anarazel.de: On 2015-08-29 18:27:59 +0200, Pavel Stehule wrote: 2015-08-29 18:25 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de Good point. There's still hope to set a flag and process it later on. Will have to check if it's possible to stay in the scope of a loaded module though. I had a workable prototype - and it was implemented very similarly to the handling of CANCEL. Where did you put the handling of that kind of interrupt? Directly into ProcessInterrupts()? Probably. I don't remember it well, but it needs hacked core code - it cannot be done from an extension. Do you still have the code somewhere around? Did it see production use? I sent it to the mailing list a year ago: http://www.postgresql.org/message-id/cafj8praxcs9b8abgim-zauvggqdhpzoarz5ysp1_nhv9hp8...@mail.gmail.com Regards Pavel I am not sure I am able to find it - I'll try. We didn't use it in production. Thanks! -- Alex
Re: [HACKERS] Horizontal scalability/sharding
At PGCon we agreed to have such a meeting in Vienna at least. But I think we should be prepared and try to clear up all our issues before then. It looks like we are already out of time, but perhaps we could meet in Hong Kong? Honestly, I still don't know which approach is better; we already played with XL (ported to 9.4) and identified some very strong issues with inconsistency, which scared us, especially taking into account how easily we found them. XC people have fixed them, but I'm not sure whether they were fundamental, and whether we could construct more sophisticated tests and find more issues in XC/XL. We were also a bit disappointed by Huawei's position about the CSN patch, which we hoped to use for our XTM. The FDW approach has been actively criticized by pg_shard people, and that also made me a bit suspicious. It looks like we are doomed to continue several development forks, so we decided to work on a very important common project, XTM, which we hoped could be accepted by all parties and eventually committed to 9.6. Now I see we were right, unfortunately. A distributed transaction manager should support at least three things: 1. atomic commit, 2. atomic visibility, 3. consistent snapshots (e.g. required for repeatable read and higher isolation levels). I have submitted a patch implementing the first of these for FDWs. The patch adds infrastructure to be used by all FDWs, including postgres_fdw. It also adds postgres_fdw code to use this infrastructure. The same can be used to achieve atomic commit in postgres_fdw-based sharding. Please see if XTM can benefit from it. If there are things that are required by XTM, please post the requirements on that thread and I will work on those. You can find the latest patch at http://www.postgresql.org/message-id/CAFjFpRfANWL53+x2HdM9TCNe5pup=opkqssj-kgfr-d2efj...@mail.gmail.com Again, could we organize a meeting somewhere in September? The US is not good for us, but other places should be OK. I want to have an agreement at least on XTM. We are still testing various approaches, though. We could present the results of our experiments and are open to discussion. It's not an easy project, but it's something we could do for 9.6. I'm very glad Bruce started this discussion on -hackers, since it seems silly to me to participate in both threads :) Let's meet in September! -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
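[Editor's note: for readers new to the thread, "atomic commit" here boils down to two-phase commit driven across the participating servers. A hedged SQL illustration of the two phases follows; the FDW patch drives this through the FDW API rather than user-visible SQL, the accounts table and the 'xact_shard1' identifier are made up, and max_prepared_transactions must be nonzero for this to run.]

-- Phase 1, on each participating server: persist a ready-to-commit state.
BEGIN;
INSERT INTO accounts VALUES (42, 100.00);
PREPARE TRANSACTION 'xact_shard1';

-- Phase 2, once every server has prepared successfully:
COMMIT PREPARED 'xact_shard1';

-- If any server failed to prepare, the coordinator instead issues:
-- ROLLBACK PREPARED 'xact_shard1';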
Re: [HACKERS] [PATCH] SQL function to report log message
On Sun, Aug 30, 2015 at 4:52 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hi, I am starting to work on a review of this patch. Hi Pavel, Thanks for your review. 2015-07-13 9:54 GMT+02:00 dinesh kumar dineshkuma...@gmail.com: Hi All, Greetings for the day. I would like to discuss the below feature here.

Feature: Having an SQL function to write messages to the log destination.

Justification: As of now, we don't have an SQL function to write custom/application messages to the log destination. We have the RAISE clause, which is controlled by log_ parameters. If we had an SQL function which works irrespective of log settings, that would be good for many log parsers. What I mean is, from a DBA point of view, if we route all our native OS stats to log files in a proper format, then we can have our log reporting tools give more effective reports. Also, applications can log their own messages to postgres log files, which can be monitored by DBAs too.

Implementation: Implemented a new function pg_report_log which takes one argument as text, and returns void. I took the LOG prefix for all the reporting messages. I wasn't sure whether to go with a new prefix for this, since these are normal LOG messages. Let me know if I am wrong here. Here is the attached patch.

This patch is not complex, but the implementation doesn't cover ereport well. Although this functionality could be replaced by a custom function in any PL (now or in the near future), I am not against having this function in core. There are a lot of companies with strong resistance against stored procedures - and sometimes this functionality can help with SQL debugging. Issues:

1. Supports only the MESSAGE field of the exception - I would expect support for all fields: HINT, DETAIL, ...
2. Missing regression tests.
3. The parsing of the ereport level should be a public function shared with PL/pgSQL and other PLs.
4. Should hidestmt be a mandatory parameter?
5. The function declaration is strange:

postgres=# \sf pg_report_log (text, anyelement, boolean)
CREATE OR REPLACE FUNCTION pg_catalog.pg_report_log(text, anyelement, boolean)
 RETURNS void
 LANGUAGE sql
 STABLE STRICT COST 1
AS $function$SELECT pg_report_log($1::pg_catalog.text, $2::pg_catalog.text, $3::boolean)$function$

Why polymorphic? It is useless on any modern release.

postgres=# \sf pg_report_log (text, text, boolean)
CREATE OR REPLACE FUNCTION pg_catalog.pg_report_log(text, text, boolean)
 RETURNS void
 LANGUAGE internal
 IMMUTABLE STRICT
AS $function$pg_report_log$function$

Why stable, why immutable? This function should be volatile.

6. Using the elog level enum as an errcode is a wrong idea - errcodes are defined in the table at http://www.postgresql.org/docs/9.4/static/errcodes-appendix.html

Let me go through each concern and I will update you on this. Regards, Dinesh manojadinesh.blogspot.com Regards Pavel Regards, Dinesh manojadinesh.blogspot.com
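[Editor's note: Pavel's aside that a custom PL function could stand in for this today can be made concrete with a minimal, hedged sketch. It covers only the MESSAGE field (issue 1 above), and unlike the proposed built-in it remains subject to log_min_messages; the function name is arbitrary to avoid clashing with the patch.]

-- Minimal PL/pgSQL stand-in for the proposed pg_report_log().
CREATE OR REPLACE FUNCTION report_log(msg text)
RETURNS void
LANGUAGE plpgsql
VOLATILE                    -- per the review: it has a side effect
AS $$
BEGIN
    RAISE LOG '%', msg;     -- LOG severity goes to the server log by default
END;
$$;

SELECT report_log('nightly batch finished');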