Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
On 08/06/2015 01:10 PM, Simon Riggs wrote:
> Given a user-stated probability of accessing a block of P and N total
> blocks, there are a few ways to implement block sampling.
>
> 1. Test P for each block individually. This gives a range of possible
> results, with 0 blocks being a possible outcome, though one decreasing in
> probability as P increases for fixed N. This is the same way BERNOULLI
> works; we just do it for blocks rather than rows.
>
> 2. We calculate P/N at the start of the scan and deliver this number of
> blocks by random selection from the N available blocks.
>
> At present we do (1), exactly as documented. (2) is slightly harder, since
> we'd need to track which blocks have been selected already so we can use a
> random-selection-without-replacement algorithm. On a table with an uneven
> distribution of rows this would still return a variable sample size, so it
> didn't seem worth changing.

Aha, thanks! So it seems like this is just a doc issue? That is, we just
need to document that using SYSTEM with very small sample sizes may return
unexpected numbers of results ... and maybe also how the algorithm actually
works.

I agree that implementing (2) makes more sense as an additional algorithm
for someone to write in the future.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
Petr Jelinek wrote:
> On 2015-08-06 22:25, Josh Berkus wrote:
>> If there is no appropriate place, I'll just write a blog.
>
> There is a blog post on the 2ndQ blog page which tries to describe the
> sampling methods visually; not sure if it's more obvious from that or not.
> It's somewhat broken on planet though (only the title shows there).

http://blog.2ndquadrant.com/tablesample-in-postgresql-9-5/

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
On 6 August 2015 at 20:14, Josh Berkus <j...@agliodbs.com> wrote:
> This table has around 185 rows per page. As the sample size goes up, the
> number of times I get zero rows goes down, but zero-row results still
> occur. For example, here's a series of results from a 0.04% sample
> against the million-row table:
>
> 370 370 370 555 555 185 0 925
>
> Since this is a synthetic table I just generated, it contains almost
> exactly 185 rows per data page for every data page. So on a 0.04% sample,
> the variation between 370 rows and 555 rows (whether we get 2 or 3 data
> pages) is expected, since 0.04% of 5406 data pages is 2.16 pages. The
> results of 0, 185 and 925 are not. It really seems like SYSTEM is
> treating 0.04% as a maximum, but taking a random number of data pages
> somewhere around that maximum, using math which can choose numbers of
> pages far outside the percentage requested by the user, and which
> includes 0.
>
> Speaking from a user perspective, SYSTEM seems broken to me. I can't
> imagine using it for anything with that degree of variation in the number
> of results returned, especially if it's possible to return zero rows from
> a populated table.

The docs say...

    The SYSTEM method does block-level sampling with each block having the
    specified chance of being selected; all rows in each selected block
    are returned.

The SQL standard suggests the parameter value should be an integer, which
would limit it to a 1% sample. Allowing a non-integer sample size is a
PostgreSQL extension. There is no guidance in the standard on how to
implement SYSTEM.

Given a user-stated probability of accessing a block of P and N total
blocks, there are a few ways to implement block sampling.

1. Test P for each block individually. This gives a range of possible
results, with 0 blocks being a possible outcome, though one decreasing in
probability as P increases for fixed N. This is the same way BERNOULLI
works; we just do it for blocks rather than rows.

2. We calculate P/N at the start of the scan and deliver this number of
blocks by random selection from the N available blocks.

At present we do (1), exactly as documented. (2) is slightly harder since
we'd need to track which blocks have been selected already so we can use a
random-selection-without-replacement algorithm. On a table with an uneven
distribution of rows this would still return a variable sample size, so it
didn't seem worth changing.

This is exactly why TABLESAMPLE was designed to allow you to write your
own sampling algorithm, so you can do it any way you want.

The context here is that SELECT count(*) from the test table takes 10ms.
The intended use case for this feature is tables that would cause some
kind of performance problem, which isn't the case here. On a table 100x
larger, the sample is 1000 +/- 40, which seems more acceptable. The
variability of BERNOULLI is much less, but then it is roughly 100x slower.
On a 1TB table, SYSTEM is going to be your only choice.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
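The practical difference between the two implementations Simon describes can be seen with a quick simulation. This is an illustrative Python sketch, not PostgreSQL code; the function names are invented for the example, and the page and row counts are taken from Josh's test table:

```python
import random

ROWS_PER_PAGE = 185   # from Josh's synthetic million-row table
N_PAGES = 5406
P = 0.0004            # a 0.04% sample

def sample_per_block(rng):
    """Method (1): test P for each block independently, as SYSTEM does."""
    pages = sum(1 for _ in range(N_PAGES) if rng.random() < P)
    return pages * ROWS_PER_PAGE

def sample_fixed_count(rng):
    """Method (2): choose round(P*N) distinct blocks without replacement."""
    k = round(P * N_PAGES)  # 2 pages for these numbers
    chosen = rng.sample(range(N_PAGES), k)
    return len(chosen) * ROWS_PER_PAGE

rng = random.Random(2015)
print([sample_per_block(rng) for _ in range(8)])  # e.g. 0, 185, 370, 555, ...
print(sample_fixed_count(rng))                    # always 370 here
```

Method (1) yields a binomially distributed page count, so 0, 185, and 925 rows are all legitimate outcomes; method (2) would pin this perfectly uniform table at 370 rows, but on a table with uneven rows per page it would still return a variable number of rows.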
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
On 08/06/2015 01:14 PM, Josh Berkus wrote:
> On 08/06/2015 01:10 PM, Simon Riggs wrote:
>> Given a user-stated probability of accessing a block of P and N total
>> blocks, there are a few ways to implement block sampling. [...]
>> At present we do (1), exactly as documented. [...]
>
> Aha, thanks! So it seems like this is just a doc issue? That is, we just
> need to document that using SYSTEM with very small sample sizes may
> return unexpected numbers of results ... and maybe also how the
> algorithm actually works.

Following up on this ... where is TABLESAMPLE documented other than in the
SELECT command? Doc search on the website is having issues right now. I'm
happy to write a doc patch.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Autonomous Transaction is back
On Mon, Aug 3, 2015 at 9:09 AM, Merlin Moncure <mmonc...@gmail.com> wrote:
> hm. OK, what's the behavior of:
>
> BEGIN
>   UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>
>   BEGIN WITH AUTONOMOUS TRANSACTION
>     UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>   END;
>
>   RAISE EXCEPTION ...;
> EXCEPTION ...
> END;
>
> Sure, so that case might need a little bit of special handling. That
> doesn't mean it's a good idea for heavyweight locks to conflict in
> general.

I think you're going to find that implementing the latter is an extremely
unrewarding task, and that the benefits are seriously negative. For
example, consider:

BEGIN
  UPDATE foo SET x = x + 1 WHERE foo_id = 1;

  BEGIN WITH AUTONOMOUS TRANSACTION
    UPDATE foo SET x = x + 1 WHERE foo_id = 2;
  END;
END;

Now suppose that a concurrent session does LOCK TABLE foo after the first
UPDATE and before the second one. That's now a soft deadlock. But the only
way the deadlock detector can see that is if the main transaction and the
autonomous transaction have separate PGPROC entries, which is a design we
explicitly rejected because it puts a tight limit on the number of ATs
that can be in progress and the level to which those ATs can be nested.
But let's say you don't care, so we go back to that design. The deadlock
detector will have to be taught that the outer transaction can't help but
wait for the inner transaction, so we teach it that. Now it can see that
the only way to resolve the deadlock without aborting any transactions is
to reorder the lock request from the autonomous transaction ahead of the
concurrent session that is seeking a full table lock. So the autonomous
transaction acquires the lock without blocking after all. You have exactly
the same result that you would have had anyway, but with a phenomenal
amount of additional code and complexity. And for what?

In the original example, the way the deadlock is going to be reported is
like this:

ERROR: deadlock detected
DETAIL: Process 12345 waits for ShareLock on transaction 1000; blocked by
process 12345.
Process 12345 waits for ShareLock on transaction 1001; blocked by process
12345.

That is not a model of clarity. On the other hand, if you just make a rule
that attempting to update or delete a tuple that an outer transaction has
already updated throws a bespoke error, you can do something like this:

ERROR: tuple to be updated was already modified by a suspended outer
transaction

...which has precedent in an existing message in trigger.c. Similarly, if
you try to drop a table that the outer transaction has locked, the natural
thing is for CheckTableNotInUse() to catch that and report it this way:

ERROR: cannot DROP TABLE foo because it is being used by active queries in
this session

If you work hard enough, you can instead make that generate a deadlock
error message, but you're going to have to work pretty hard, and the
result is worse.

I'd really like to hear some more *specific* scenarios where it's valuable
for locks to conflict between the outer transaction and the AT. I grant
that tuple updates are a case where the conflict has to be detected
somehow, but I don't accept that the lock manager is the best way to do
that, and I don't accept that there are a large number of other cases that
will need similar handling.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] 9.5 release notes
Bruce Momjian wrote:
> On Fri, Jun 19, 2015 at 08:21:19PM +0200, Andres Freund wrote:
>> I think 647248e3708, 4fe384bd85, 4f85fde8, 59f71a0d0 should also be
>
> I couldn't look up 647248e3708; I got "unknown revision or path not in
> the working tree".

A 6 is missing: 6647248e3708

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] 9.5 release notes
On Fri, Jun 26, 2015 at 3:39 PM, Peter Geoghegan <p...@heroku.com> wrote:
> I attach a compatibility note that is clearly needed; adding this is an
> open item of mine for 9.5. This concerns foreign data wrappers and
> UPSERT.

Can you look at this please, Bruce?

--
Peter Geoghegan
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 06:44:40PM -0300, Alvaro Herrera wrote:
> Bruce Momjian wrote:
>> On Fri, Jun 19, 2015 at 08:21:19PM +0200, Andres Freund wrote:
>>> I think 647248e3708, 4fe384bd85, 4f85fde8, 59f71a0d0 should also be
>>
>> I couldn't look up 647248e3708; I got "unknown revision or path not in
>> the working tree".
>
> A 6 is missing: 6647248e3708

Thanks. It was this one:

    Don't allow immediate interrupts during authentication anymore.

It falls into the same category, but if someone can write up an entry that
covers them all, I think it would make sense to add it.

--
Bruce Momjian  <br...@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com

+ Everyone has their own god. +
Re: [HACKERS] Thinko in processing of SHM message size info?
Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Aug 6, 2015 at 1:24 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Thu, Aug 6, 2015 at 9:04 AM, Antonin Houska <a...@cybertec.at> wrote:
>>> Can anyone please explain why the following patch shouldn't be applied?
>>>
>>> diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
>>> index 126cb07..4cd52ac 100644
>>> --- a/src/backend/storage/ipc/shm_mq.c
>>> +++ b/src/backend/storage/ipc/shm_mq.c
>>> @@ -584,7 +584,7 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)
>>>             if (mqh->mqh_partial_bytes + rb > sizeof(Size))
>>>                 lengthbytes = sizeof(Size) - mqh->mqh_partial_bytes;
>>>             else
>>> -               lengthbytes = rb - mqh->mqh_partial_bytes;
>>> +               lengthbytes = rb;
>>>             memcpy(&mqh->mqh_buffer[mqh->mqh_partial_bytes], rawdata,
>>>                    lengthbytes);
>>>             mqh->mqh_partial_bytes += lengthbytes;
>>>
>>> I'm failing to understand why anything should be subtracted. Note that
>>> the previous iteration must have called shm_mq_inc_bytes_read(), so rb
>>> should not include anything of mqh->mqh_partial_bytes.
>>
>> Thanks. Hmm, I think you are correct. This would matter in the case
>> where the message length word was read in more than two chunks. I don't
>> *think* that's possible right now because I believe the only systems
>> where MAXIMUM_ALIGNOF < sizeof(Size) are those with MAXIMUM_ALIGNOF == 4
>> and sizeof(Size) == 8. However, if we had systems where MAXIMUM_ALIGNOF
>> == 4 and sizeof(Size) == 16, or systems where MAXIMUM_ALIGNOF == 2 and
>> sizeof(Size) == 8, this would be a live bug.

I have to admit that I didn't think about the specific combinations of
MAXIMUM_ALIGNOF and sizeof(Size), and considered the problem rather rare
(maybe also because it can't happen on my workstation). But your next
consideration makes sense to me:

> Hmm, actually, maybe it is a live bug anyway, because the if statement
> tests > rather than >=. If we've read 4 bytes and exactly 4 more bytes
> are available, we'd set lengthbytes to 0 instead of 4. Oops.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
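The effect of the two expressions is easy to demonstrate outside the server. The following is a hedged Python model of the length-word accumulation loop in shm_mq_receive() — `consume_length_word` is an invented name, and the logic is paraphrased from the C code above, not the code itself:

```python
SIZEOF_SIZE = 8  # assume an 8-byte length word

def consume_length_word(chunks, patched=True):
    """Model accumulating a message-length word that arrives in chunks.

    Returns the lengthbytes value computed for each chunk, mirroring the
    mqh->mqh_partial_bytes bookkeeping in shm_mq_receive().
    """
    partial = 0                # mqh->mqh_partial_bytes
    taken = []
    for rb in chunks:          # rb = bytes readable this iteration
        if partial + rb > SIZEOF_SIZE:
            lengthbytes = SIZEOF_SIZE - partial
        elif patched:
            lengthbytes = rb               # patched: take the whole chunk
        else:
            lengthbytes = rb - partial     # original: wrongly re-subtracts
        taken.append(lengthbytes)
        partial += lengthbytes
        if partial >= SIZEOF_SIZE:
            break
    return taken

# Robert's scenario: 4 bytes already consumed, exactly 4 more available.
print(consume_length_word([4, 4], patched=False))  # [4, 0] -- stuck
print(consume_length_word([4, 4], patched=True))   # [4, 4] -- completes
```

With the original expression the second chunk contributes zero bytes and the length word never completes; the patched expression consumes it in full.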
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
On 6 August 2015 at 21:14, Josh Berkus <j...@agliodbs.com> wrote:
> On 08/06/2015 01:10 PM, Simon Riggs wrote:
>> Given a user-stated probability of accessing a block of P and N total
>> blocks, there are a few ways to implement block sampling.
>>
>> 1. Test P for each block individually. [...]
>>
>> 2. We calculate P/N at the start of the scan and deliver this number of
>> blocks by random selection from the N available blocks.

(My mistake, that would be P*N)

>> At present we do (1), exactly as documented. [...]
>
> Aha, thanks! So it seems like this is just a doc issue? That is, we just
> need to document that using SYSTEM with very small sample sizes may
> return unexpected numbers of results ... and maybe also how the
> algorithm actually works.

For me, the docs seem exactly correct; the mathematical implications of
that just aren't recorded explicitly. I will try to reword or add
something to make it clear that this can return a variable number of
blocks and thus produces a result with greater variability in the number
of rows returned.

It's documented on the SELECT page only; plus there is a whole new section
on writing tablesample functions.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
On 2015-08-06 22:25, Josh Berkus wrote:
> On 08/06/2015 01:19 PM, Simon Riggs wrote:
>> For me, the docs seem exactly correct; the mathematical implications of
>> that just aren't recorded explicitly.
>
> Well, for the SELECT page, all we need is the following (one changed
> sentence):
>
>     The SYSTEM method is significantly faster than the BERNOULLI method
>     when small sampling percentages are specified, but it may return a
>     less-random sample of the table as a result of clustering effects,
>     and may return a highly variable number of results for very small
>     sample sizes.

BTW, this was one of the motivations for making the tsm_system_rows
contrib module; that one will give you an exact number of tuples while
still doing page-level sampling. But since it does linear probing, it's
only useful if you want those really small numbers of tuples, because it
will always do random I/O even if you are scanning a large part of the
table.

>> I will try to reword or add something to make it clear that this can
>> return a variable number of blocks and thus produces a result with
>> greater variability in the number of rows returned.
>>
>> It's documented on the SELECT page only; plus there is a whole new
>> section on writing tablesample functions.
>
> Seems like it would be nice to have more detailed user docs somewhere
> which explain the sampling algos we have, especially if we get more in
> the future. Not sure where would be appropriate for that, though. If
> there is no appropriate place, I'll just write a blog.

There is a blog post on the 2ndQ blog page which tries to describe the
sampling methods visually; not sure if it's more obvious from that or not.
It's somewhat broken on planet though (only the title shows there).

--
Petr Jelinek                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 3:06 PM, Bruce Momjian <br...@momjian.us> wrote:
>> Below performance improvement in the General Performance category is
>> missing:
>>
>>     Reduce btree scan overhead for < and > strategies
>>
>>     For <, <=, > and >= strategies, mark the first scan key as already
>>     matched if scanning in an appropriate direction. If the index tuple
>>     contains no nulls we can skip the first re-check for each tuple.
>
> While this is a nice 9.5 feature, it really is "btree is faster", and
> users usually don't need to know that, unless the change is so massive
> that they would change their use of the feature.

I think that Rajeev is entitled to be credited for his work, especially as
a less experienced contributor. Sure, users are not likely to care too
much about incremental progress like this (however, I would be willing to
bet that more users care about this item than about existing items like
"Make initdb issue a warning about placing the data directory at the top
of a file system mount point"). IMV it is the role of the release notes to
document what went into a release fairly methodically. I often look at
release notes to determine what might have gone wrong in a part of the
code that I'm less familiar with, for example. Users mostly only
specifically care about one or two big-ticket items, and in any case are
not likely to learn about them from the release notes.

The release notes seem shorter than in previous years, and I don't think
it's because 9.5 is a smaller release.

--
Peter Geoghegan
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 01:48:26PM -0400, Robert Haas wrote:
> On Thu, Aug 6, 2015 at 10:33 AM, Bruce Momjian <br...@momjian.us> wrote:
>> On Sun, Jun 14, 2015 at 12:05:54PM +0100, Dean Rasheed wrote:
>>> On 11 June 2015 at 05:15, Bruce Momjian <br...@momjian.us> wrote:
>>>> I have committed the first draft of the 9.5 release notes. You can
>>>> view the output here:
>>>>
>>>> http://momjian.us/pgsql_docs/release-9-5.html
>>>
>>> I think it's worth mentioning dcbf5948e12aa60b4d6ab65b6445897dfc971e01,
>>> probably under General Performance. It's an optimisation, and also a
>>> user-visible change to the way LEAKPROOF works. Perhaps something like:
>>>
>>>     Allow pushdown of non-leakproof functions into security barrier
>>>     views if the function is not passed any arguments from the view.
>>>     Previously only functions marked as LEAKPROOF could be pushed down
>>>     into security barrier views.
>>
>> Sorry, just looking at this now. Since RLS is new for 9.5, we wouldn't
>> mention the RLS change in the release notes because it is part of the
>> RLS new features, but we could mention the SB change --- the new text
>> would be:
>>
>>     Allow non-LEAKPROOF functions to be passed into security barrier
>>     views if the function does not reference any table columns (Dean
>>     Rasheed)
>>
>> However, this is usually a level of detail that I do not cover in the
>> release notes, so I need someone else to tell me it should be added.
>
> +1 for including it. That's a security-relevant backward incompatibility.

Thanks, done. I was not aware of the complexity of this issue. Applied
patch attached.

--
Bruce Momjian  <br...@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com

+ Everyone has their own god. +

diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
new file mode 100644
index 722c8bd..2ed7b01
*** a/doc/src/sgml/release-9.5.sgml
--- b/doc/src/sgml/release-9.5.sgml
*** FIXME: Add Andres
*** 412,417
--- 412,428
     <listitem>
      <!--
+ 2015-04-27 [dcbf594] Stephe..: Improve qual pushdown for RLS and SB views
+ -->
+     <para>
+      Allow non-LEAKPROOF functions to be passed into security barrier
+      views if the function does not reference any table columns
+      (Dean Rasheed)
+     </para>
+    </listitem>
+
+    <listitem>
+     <!--
  2014-11-04 [5028f22] Heikki..: Switch to CRC-32C in WAL and other places.
  2015-02-10 [025c024] Heikki..: Speed up CRC calculation using slicing-by-8 alg..
  2015-04-14 [3dc2d62] Heikki..: Use Intel SSE 4.2 CRC instructions where availa..
Re: [HACKERS] 9.5 release notes
On Fri, Jun 19, 2015 at 08:21:19PM +0200, Andres Freund wrote:
> Hi,
>
> On 2015-06-11 00:15:21 -0400, Bruce Momjian wrote:
>> I have committed the first draft of the 9.5 release notes. You can view
>> the output here:
>
> So, I did a pass through master's state:
>
>    <listitem>
>     <para>
>      Add <link linkend="BRIN">Block Range Indexes</> (<acronym>BRIN</>)
>      (&Aacute;lvaro Herrera, Heikki Linnakangas, Emre Hasegeli)
>     </para>
>
>     <para>
>      <acronym>BRIN</> indexes are very compact and store the min/max
>      values for a range of heap blocks.
>     </para>
>    </listitem>
>
> Maybe we should mention that they're cheap to maintain?

I saw how you were able to get that idea into the release notes with
little additional text --- nice. :-) I couldn't figure out how to do that.

>    <listitem>
>     <para>
>      Improve in-memory hash performance (Tomas Vondra, Robert Haas)
>     </para>
>    </listitem>
>
> "hash performance" is pretty general; there are lots of places where we
> use hashing that aren't affected.

Well, how many of our users even know what a hash is, or when we use
hashing internally, and care where the speedup is, or will understand it?
The guiding principle is that these release notes are written for our
users, not for us. I don't see any way to give more details here without
being confusing or overly verbose.

>    <listitem>
>     <para>
>      Improve concurrency of <link linkend="guc-shared-buffers">shared
>      buffer</> replacement (Robert Haas, Amit Kapila)
>     </para>
>    </listitem>
>
> I think in the end that patch was enhanced to a significant degree by
> making it lockless in d72731a70450b. I think the three (?) involved
> patches should just be combined under one entry.

Did you address this already in c0b0501925eacbf2d9c10cd231bf8a14e7c9ef4c?
I can't tell.

>    <listitem>
>     <para>
>      Improve concurrent locking and buffer scan performance (Andres
>      Freund, Kevin Grittner)
>     </para>
>    </listitem>
>
> If this is ab5194e6f, I don't think it makes sense to mention "buffer
> scan" - it's just any lwlock, and buffer locks aren't the primary
> benefit (ProcArrayLock, buffer mapping lock probably are that). I also
> don't think Kevin was involved?
> I think ed127002d8 and 4b4b680c should be mentioned in this section as
> well. 4b4b680c will considerably reduce the per-backend memory usage for
> servers with large shared buffers.

I think you addressed this, right?

>    <sect4>
>     <title>Server Settings</title>
>
>     <itemizedlist>
>      <listitem>
>       <para>
>        Replace <varname>checkpoint_segments</> with
>        <link linkend="guc-min-wal-size"><varname>min_wal_size</></> and
>        <link linkend="guc-max-wal-size"><varname>max_wal_size</></>
>        (Heikki Linnakangas)
>       </para>
>
>       <para>
>        This allows the allocation of a large number of <acronym>WAL</>
>        files without keeping them if they are not needed.
>       </para>
>      </listitem>
>
> Hm. This affects performance significantly; should we also list it
> there?

Uh, I saw this as more of a configuration improvement, in that you don't
need to waste lots of WAL files if you don't need them.

>      <listitem>
>       <para>
>        Add <acronym>GUC</>
>        <link linkend="guc-wal-compression"><varname>wal_compression</></>
>        to enable compression of full page images stored in
>        <acronym>WAL</> (Rahila Syed, Michael Paquier)
>       </para>
>      </listitem>
>
> Also rather performance-relevant?

Well, it is both, but we already have a configuration section, so it seems
most logical there.

>      <listitem>
>       <para>
>        Archive <acronym>WAL</> files with suffix <literal>.partial</>
>        during standby promotion (Heikki Linnakangas)
>       </para>
>      </listitem>
>
> This should be expanded; will mention to Heikki. Possibly also needs to
> be mentioned in the backward-incompat section.

Uh, it seemed obscure enough not to mention it as a backward
incompatibility. I am a little afraid of briefly trying to explain why we
did it.

>      <listitem>
>       <para>
>        Allow the
>        <link linkend="pg-replication-origin-create">labeling</> of the
>        origin of logical replication changes (Andres Freund)
>       </para>
>
>       <para>
>        This helps with change tracking.
>       </para>
>      </listitem>
>
> I think it should be 'origin and progress'. The explanation should
> probably rather be 'This is helpful when implementing replication
> solutions' or something like it.

Done.
>      <listitem>
>       <para>
>        Allow control of table <acronym>WAL</> logging after table
>        creation with <link linkend="SQL-ALTERTABLE"><command>ALTER TABLE
>        .. SET LOGGED / UNLOGGED</></> (Fabr&iacute;zio de Royes Mello)
>       </para>
>      </listitem>
>
> This sounds a bit confusing. Maybe "Allow converting a WAL-logged table
> to an UNLOGGED one, and the other way round"?

I saw you used your wording, but I changed it to this:

    Allow changing of the <acronym>WAL</acronym> logging status of a table
    after creation with <link
Re: [HACKERS] Race conditions in shm_mq.c
Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Aug 6, 2015 at 2:38 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Thu, Aug 6, 2015 at 10:10 AM, Antonin Houska <a...@cybertec.at> wrote:
>>> During my experiments with parallel workers I sometimes saw the master
>>> and worker process blocked. The master uses a shm queue to send data
>>> to the worker, both sides nowait==false. I concluded that the
>>> following happened:
>>>
>>> The worker process set itself as a receiver on the queue after
>>> shm_mq_wait_internal() had completed its first check of ptr, so this
>>> function left the sender's procLatch in a reset state. But before the
>>> procLatch was reset, the receiver still managed to read some data and
>>> set the sender's procLatch to signal the reading, and eventually
>>> called its (receiver's) WaitLatch(). So the sender effectively missed
>>> the receiver's notification and called WaitLatch() too. (If the
>>> receiver already waits on its latch, it does not help for the sender
>>> to call shm_mq_notify_receiver(): the receiver won't do anything
>>> because there's no new data in the queue.)
>>>
>>> Below is my patch proposal.
>>
>> Another good catch. However, I would prefer to fix this without
>> introducing a "continue", as I think that will make the control flow
>> clearer. Therefore, I propose the attached variant of your idea.
>
> Err, that doesn't work at all. Have a look at this version instead.

This makes sense to me. One advantage of "continue" was that I could apply
the patch to my test code (containing the appropriate sleep() calls, to
simulate the race conditions) with no conflicts and see the difference.
The restructuring you do does not allow for such mechanical testing, but
it's clear enough.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
On 08/06/2015 12:45 PM, Tom Lane wrote:
> Simon Riggs <si...@2ndquadrant.com> writes:
>> On 6 August 2015 at 20:14, Josh Berkus <j...@agliodbs.com> wrote:
>>> Speaking from a user perspective, SYSTEM seems broken to me. I can't
>>> imagine using it for anything with that degree of variation in the
>>> number of results returned, especially if it's possible to return zero
>>> rows from a populated table.
>>
>> Please bear in mind you have requested a very small random sample of
>> blocks.
>
> Indeed. My expectation about it is that you'd get the requested number
> of rows *on average* over many tries (which is pretty much what Josh's
> results show). Since what SYSTEM actually returns must be a multiple of
> the number of rows per page, if you make a request that's less than that
> number of rows, you must get zero rows some of the time. Otherwise the
> sampling logic is cheating.

Except that I'm getting zero results when requesting rows which equate to
2-3 pages, not just less than one page. Please do read the full bug
report.

And why should the *number* of pages sampled be random? That's not what
the documentation says SYSTEM does. But it's pretty clearly what's
happening; this table has exactly 185 rows per page on every page, so the
only reason for there to be variation in the number of rows returned is
that SYSTEM is varying the number of pages sampled.

On testing, the zero results don't go away until the number of pages
requested goes up to 5 (0.095). So that suggests that SYSTEM is varying
the number of pages retrieved by +/- 4. Why? What purpose does that serve?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
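For what it's worth, the apparent "+/- 4" variation is exactly what independent per-page selection predicts: the number of sampled pages is binomially distributed, not fixed. A back-of-envelope check in Python, using the 5406-page table and 0.04% sample from this thread (the function name is invented for the illustration):

```python
from math import comb

N_PAGES = 5406
P = 0.0004  # a 0.04% sample

def prob_pages(k):
    """P(exactly k pages chosen) if each page is tested independently."""
    return comb(N_PAGES, k) * P**k * (1 - P)**(N_PAGES - k)

# Row counts seen in the bug report are just k * 185 for small k, and
# k = 0 keeps a noticeable probability (roughly 11-12% at this sample size).
for k in range(6):
    print(f"{k} pages -> {k * 185:4d} rows, probability {prob_pages(k):.3f}")
```

So zero rows, 185 rows, and 925 rows are all expected outcomes at this sample size; no extra randomization of the page count is involved.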
Re: [HACKERS] 9.5 release notes
On Fri, Jun 19, 2015 at 09:44:04PM +0200, Andres Freund wrote: Hi, On 2015-06-11 00:15:21 -0400, Bruce Momjian wrote: I have committed the first draft of the 9.5 release notes. You can view the output here: I'm looking through all the commits, checking which I think should possibly be mentioned additionally: - 9f03ca91 - Speed up CREATE INDEX by avoiding to copy index tuples I can't see this information being useful to our general users. - 9da86753 - Reject duplicate column names in foreign key referenced-columns lists. behavioural change, should be documented as intentional Uh, well, I didn't think such uses were common, and we now issue a very clear error message, so I didn't think it was worth mentioning. - 680513ab - Break out OpenSSL-specific code to separate files. should perhaps be mentioned in the developer section. I usually only mention code changes that would affect extension authors, or massive changes. - 0709b7ee7 - Change the spinlock primitives to function as compiler barriers. Significant behavioural change for developers. Uh, again, not really for the group of developers I usually worry about in the release notes. - 94028691 - Advance backend's advertised xmin more aggressively. Pretty good feature imo. Yeah, but again, do our general users understand enough to even understand the explaination, and even if they do, do they care. - 5028f22f6 - Switch to CRC-32C in WAL and other places. This might have compability impact in some external tools. Agreed, text updated: Speed up acronymCRC/ (cyclic redundancy check) computations and switch to CRC-32C (Abhijit Menon-Sen, Heikki Linnakangas) - 4f1b890b1 - Unsure if this should be mentioned, at least it's a fairly large change. This is: Merge the various forms of transaction commit abort records. Again, much too internal-focused for our users to care. - 27846f02 - Optimize locking a tuple already locked by another subxact Several users ran into this slowness, so I think we should mention the optimization. 
I usually don't mention speedups of edge cases. There are probably a few people who care about it, though, but this will be read by thousands of people. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
Simon Riggs si...@2ndquadrant.com writes: On 6 August 2015 at 20:14, Josh Berkus j...@agliodbs.com wrote: Speaking from a user perspective, SYSTEM seems broken to me. I can't imagine using it for anything with that degree of variation in the number of results returned, especially if it's possible to return zero rows from a populated table. Please bear in mind you have requested a very small random sample of blocks. Indeed. My expectation about it is that you'd get the requested number of rows *on average* over many tries (which is pretty much what Josh's results show). Since what SYSTEM actually returns must be a multiple of the number of rows per page, if you make a request that's less than that number of rows, you must get zero rows some of the time. Otherwise the sampling logic is cheating. I do *not* think that we should force the sample to contain at least one page, which is the only way that we could satisfy the complaint as stated. Perhaps we need to adjust the documentation to make it clearer that block-level sampling is not the thing to use if you want a sample that doesn't amount to a reasonable number of blocks. But I see absolutely no evidence here that the sampling isn't behaving exactly as expected. regards, tom lane
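Tom's expectation is easy to check numerically. A minimal sketch, assuming pure per-block Bernoulli sampling (which is how SYSTEM is documented to behave; this is illustrative Python, not the server's sampler): with a 100-block table and a 1% sample, the mean really is one block, yet roughly 0.99^100, about 37%, of runs select no blocks at all, hence zero rows.

```python
import random

# A toy model (not the server's actual sampler) of how TABLESAMPLE
# SYSTEM behaves: every block is kept independently with probability p.
def system_sample_blocks(n_blocks, pct, rng):
    p = pct / 100.0
    return [b for b in range(n_blocks) if rng.random() < p]

rng = random.Random(42)
n_blocks, pct, trials = 100, 1.0, 20000
sizes = [len(system_sample_blocks(n_blocks, pct, rng)) for _ in range(trials)]

mean_blocks = sum(sizes) / trials       # close to 1 block "on average"
zero_rate = sizes.count(0) / trials     # close to 0.99**100, about 0.366
```

So the zero-row results Josh observed are exactly what the documented algorithm predicts for very small samples.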
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
On 2015-08-06 22:17, Josh Berkus wrote: On 08/06/2015 01:14 PM, Josh Berkus wrote: On 08/06/2015 01:10 PM, Simon Riggs wrote: Given a user-stated probability of accessing a block of P and N total blocks, there are a few ways to implement block sampling. 1. Test P for each block individually. This gives a range of possible results, with 0 blocks being a possible outcome, though decreasing in probability as P increases for fixed N. This is the same way BERNOULLI works, we just do it for blocks rather than rows. 2. We calculate P/N at start of scan and deliver this number of blocks by random selection from N available blocks. At present we do (1), exactly as documented. (2) is slightly harder since we'd need to track which blocks have been selected already so we can use a random selection with no replacement algorithm. On a table with an uneven distribution of rows this would still return a variable sample size, so it didn't seem worth changing. Aha, thanks! So, it seems like this is just a doc issue? That is, we just need to document that using SYSTEM on very small sample sizes may return unexpected numbers of results ... and maybe also how the algorithm actually works. Yes, it's expected behavior on very small sample sizes, so a doc patch seems the best fix. Following up on this ... where is TABLESAMPLE documented other than in the SELECT command? Doc search on the website is having issues right now. I'm happy to write a doc patch. The user documentation is only on the SELECT page, the rest is API docs. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive
On 08/05/2015 09:33 PM, Robert Haas wrote: On Wed, Aug 5, 2015 at 1:10 PM, Ildus Kurbangaliev i.kurbangal...@postgrespro.ru wrote: About `memcpy`, the PgBackendStatus struct already has a bunch of multi-byte variables, so it will not be consistent anyway if somebody wants to copy it that way. On the other hand, two bytes in this case give less overhead because we can avoid the offset calculations. And as I've mentioned before, the wait class will be useful when monitoring of waits is extended. You're missing the point. Those multi-byte fields have additional synchronization requirements, as I explained in some detail in my previous email. You can't just wave that away. I see that now. Thank you for pointing that out. I've looked deeper and I found PgBackendStatus to be an unsuitable place for keeping information about low-level waits. Really, PgBackendStatus is used to track high-level information about a backend. This is why auxiliary processes don't have a PgBackendStatus: they don't have such information to expose. But when we come to low-level wait events, auxiliary processes are as useful for monitoring as backends are. The WAL writer, checkpointer, bgwriter etc. use LWLocks as well. It is certainly unclear why they shouldn't be monitorable. This is why I think we shouldn't place the wait event into PgBackendStatus. It could be placed into PGPROC, or even a separate data structure with a different concurrency model which would be most suitable for monitoring. -- Ildus Kurbangaliev Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Priority table or Cache table
On Mon, Jun 30, 2014 at 11:08 PM, Beena Emerson memissemer...@gmail.com wrote: I also ran the test script after making the same configuration changes that you have specified. I found that I was not able to get the same performance difference that you have reported. The following table lists the tps in each scenario and the % increase in performance.

Threads  Head  Patched  Diff
      1  1669     1718    3%
      2  2844     3195   12%
      4  3909     4915   26%
      8  7332     8329   14%

Coming back to this old thread. I just tried a new approach for this priority table: instead of an entirely separate buffer pool, just try to use some portion of shared buffers for priority tables, using a GUC variable buffer_cache_ratio (0-75) to specify what percentage of shared buffers to be used. Syntax: create table tbl(f1 int) with(buffer_cache=true); Compared to the earlier approach, I thought this approach would be easier to implement. But during the performance run, it didn't show much improvement in performance. Here are the test results.

Threads   Head  Patched   Diff
      1   3123     3238  3.68%
      2   5997     6261  4.40%
      4  11102    11407  2.75%

I am suspecting that this may be because of buffer locks that are causing the problem, whereas in the older approach of different buffer pools, each buffer pool has its own locks. I will try to collect the profile output and analyze it. Any better ideas? Here I attached a proof-of-concept patch. Regards, Hari Babu Fujitsu Australia cache_table_poc.patch Description: Binary data
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
On 08/06/2015 01:19 PM, Simon Riggs wrote: On 6 August 2015 at 21:14, Josh Berkus j...@agliodbs.com wrote: On 08/06/2015 01:10 PM, Simon Riggs wrote: Given a user-stated probability of accessing a block of P and N total blocks, there are a few ways to implement block sampling. 1. Test P for each block individually. This gives a range of possible results, with 0 blocks being a possible outcome, though decreasing in probability as P increases for fixed N. This is the same way BERNOULLI works, we just do it for blocks rather than rows. 2. We calculate P/N at start of scan and deliver this number of blocks by random selection from N available blocks. (My mistake, that would be P*N) At present we do (1), exactly as documented. (2) is slightly harder since we'd need to track which blocks have been selected already so we can use a random selection with no replacement algorithm. On a table with an uneven distribution of rows this would still return a variable sample size, so it didn't seem worth changing. Aha, thanks! So, it seems like this is just a doc issue? That is, we just need to document that using SYSTEM on very small sample sizes may return unexpected numbers of results ... and maybe also how the algorithm actually works. For me, the docs seem exactly correct. The mathematical implications of that just aren't recorded explicitly. Well, for the SELECT page, all we need is the following (one changed sentence): The SYSTEM method is significantly faster than the BERNOULLI method when small sampling percentages are specified, but it may return a less-random sample of the table as a result of clustering effects, and may return a highly variable number of results for very small sample sizes. I will try to reword or add something to make it clear that this can return a variable number of blocks and thus produces a result with greater variability in the number of rows returned.
It's documented on the SELECT page only; plus there is a whole new section on writing tablesample functions. Seems like it would be nice to have more detailed user docs somewhere which explain the sampling algos we have, especially if we get more in the future. Not sure where would be appropriate for that, though. If there is no appropriate place, I'll just write a blog. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
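The difference between Simon's two options can be shown with a toy sketch (plain illustrative Python with hypothetical helper names, not the server code), ignoring rows-per-block: method (2) makes the *block* count deterministic at round(P*N), while method (1) lets it vary, zero included. As Simon notes, even method (2) would still return a variable *row* count on tables with uneven rows per block.

```python
import random

def method1(n_blocks, p, rng):
    # (1) Bernoulli-test each block: block count varies, zero is possible.
    return [b for b in range(n_blocks) if rng.random() < p]

def method2(n_blocks, p, rng):
    # (2) Choose round(P*N) distinct blocks without replacement:
    # the block count is fixed for a given table size.
    return rng.sample(range(n_blocks), round(p * n_blocks))

rng = random.Random(7)
n_blocks, p, trials = 1000, 0.05, 500
sizes1 = {len(method1(n_blocks, p, rng)) for _ in range(trials)}
sizes2 = {len(method2(n_blocks, p, rng)) for _ in range(trials)}
# sizes1 contains many distinct counts; sizes2 is always {50}
```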
Re: [HACKERS] 9.5 release notes
On Wed, Jun 24, 2015 at 11:34:27AM +0200, Andres Freund wrote: On 2015-06-23 21:08:36 -0400, Robert Haas wrote: On Tue, Jun 23, 2015 at 5:48 PM, Kevin Grittner kgri...@ymail.com wrote: Andres Freund and...@anarazel.de wrote: <listitem> <para> Improve concurrent locking and buffer scan performance (Andres Freund, Kevin Grittner) </para> </listitem> If this is ab5194e6f, I don't think it makes sense to mention buffer scan - it's just any lwlock, and buffer locks aren't the primary benefit (ProcArrayLock, buffer mapping lock probably are that). I also don't think Kevin was involved? It seems likely that 2ed5b87f9 was combined with something else in this reference. By reducing buffer pins and buffer content locking during btree index scans it shows a slight performance gain in btree scans and avoids some blocking of btree index vacuuming. Oh. That's what it was combined with. I don't think it makes sense to throw these three items together into one note. Their benefit/risk potential is pretty different. I believe Andres has made all these suggested adjustments. If not, please let me know. I think maybe we should separate that back out. The list needs to be user-accessible, but if it's hard to understand what it's referring to, that's not good either. Yea. And if then Bruce goes and compares feature counts... :) :-) -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] 9.5 release notes
On Mon, Jun 22, 2015 at 09:14:01PM +, Rajeev rastogi wrote: On 11 June 2015 09:45, Bruce Momjian Wrote: I have committed the first draft of the 9.5 release notes. You can view the output here: http://momjian.us/pgsql_docs/release-9-5.html and it will eventually appear here: http://www.postgresql.org/docs/devel/static/release.html I am ready to make suggested adjustments, though I am traveling for conferences for the next ten days so there might be a delay in my replies. Below performance improvement in the General Performance category is missing: Reduce btree scan overhead for < and > strategies For <, <=, > and >= strategies, mark the first scan key as already matched if scanning in an appropriate direction. If the index tuple contains no nulls we can skip the first re-check for each tuple. While this is a nice 9.5 feature, it really is "btree is faster", and users usually don't need to know that, unless the change is so massive that they would change their use of the feature. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: Fwd: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 11:19:46AM -0700, Jeff Janes wrote: On Wed, Jun 10, 2015 at 9:15 PM, Bruce Momjian br...@momjian.us wrote: I have committed the first draft of the 9.5 release notes. You can view the output here: http://momjian.us/pgsql_docs/release-9-5.html and it will eventually appear here: http://www.postgresql.org/docs/devel/static/release.html I am ready to make suggested adjustments, though I am traveling for conferences for the next ten days so there might be a delay in my replies. Improve the speed of sorting character and numeric fields Here, does numeric refer specifically to the type numeric, or to all fields which are numeric in nature? If the former (which I think it is, as other numeric types are pass-by-value to start with), it should probably get some sgml markup. You are right, it is only NUMERIC. It was this commit: commit abd94bcac4582903765be7be959d1dbc121df0d0 Author: Robert Haas rh...@postgresql.org Date: Thu Apr 2 14:02:55 2015 -0400 Use abbreviated keys for faster sorting of numeric datums. Andrew Gierth, reviewed by Peter Geoghegan, with further tweaks by me. I changed it to <type>NUMERIC</type>. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
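The abbreviated-keys idea behind commit abd94bca can be sketched as follows. This is illustrative Python using strings instead of numeric datums for brevity (the commit applies the technique to numeric), and it is not the actual C implementation: sort on a cheap fixed-width prefix first, and fall back to the full value only to break ties.

```python
import struct

def abbrev(s):
    # Pack the first 8 bytes, zero-padded; comparing these big-endian
    # ints agrees with comparing the full strings except on ties.
    b = s.encode()[:8].ljust(8, b"\0")
    return struct.unpack(">Q", b)[0]

values = ["numeric", "num", "abbreviated", "keys", "abbrev", "datum"]
# Cheap integer comparisons first; the full string resolves prefix ties.
fast = sorted(values, key=lambda s: (abbrev(s), s))
```

The win comes from most comparisons being a single machine-word compare rather than a full datum comparison.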
Re: [HACKERS] Reduce ProcArrayLock contention
On Thu, Aug 6, 2015 at 9:36 PM, Robert Haas robertmh...@gmail.com wrote: OK, committed. Thank you. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Raising our compiler requirements for 9.6
On Thu, Aug 06, 2015 at 05:34:50PM +0200, Andres Freund wrote: On 2015-08-06 12:29:15 -0300, Alvaro Herrera wrote: Andres Freund wrote: @@ -0,0 +1,56 @@ +/*- + * + * lockdefs.h + * Frontend exposed parts of postgres' low level lock mechanism + * + * The split between lockdefs.h and lock.h is not very principled. No kidding! Do you have a good suggestion about the split? I wanted to expose the minimal amount necessary, and those were the ones. lock.h includes lwlock.h only for the benefit of the three LockHashPartition* macros. Those macros are pieces of the lock.c implementation that cross into proc.c, not pieces of the lock.c public API. I suggest instead making a lock_internals.h for those macros and for anything else private to the various files of the lock implementation. For example, the PROCLOCK struct and the functions that take arguments of that type would fit in lock_internals.h. With that, indirect frontend lock.h inclusion will work fine. Thanks, nm
Re: [HACKERS] Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On Tue, Aug 4, 2015 at 1:24 AM, Heikki Linnakangas hlinn...@iki.fi wrote: Yeah, I don't think there's a big performance difference between the two approaches. I'm not wedded to either approach. Whichever approach we use, my main point was that it would be better to handle this in the logical tape abstraction. In my approach, you would have those in-memory tails attached to the last two tapes. In Peter's approach, you would have one tape that's completely in memory, backed by the array. In either case, the tapes would look like normal tapes to most of tuplesort.c. There would be no extra TSS state, it would be TSS_SORTEDONTAPE or TSS_FINALMERGE as usual. TBH, it's not clear to me why making the logical tape abstraction manage some fraction of memtuples has any advantage at all. The only case that I can imagine where this could be useful is where a logical tape does asynchronous I/O, and has the benefit of some portion of memtuples (probably half) as scratch space (most I/O probably occurs while tuplesort quicksorts the other half of memtuples). But that has nothing to do with building a better abstraction. The more expansive patch that I'm currently working on, that covers every external sorting case -- the patch to *also* quicksort runs past the first run, regardless of whether or not we can get away with a quicksort with spillover -- is going very well. I haven't had a solid block of time to work on it this week due to other commitments, but it isn't very complicated. Performance benefits are considerable even without saving any I/O. It can be as much as twice as fast for merge sorts in some cases. So not quite as nice as quicksort with spillover, but still a significant improvement considering writing everything out is inevitable for the cases helped. As I said before, the upcoming patch has tuplesort give up on memtuples *ever* being a heap after the first run, whatever happens. I just quicksort and dump in batches past the first run.
Since I give up on replacement selection sort only after the first run, I still have most of the upside of replacement selection, but little of the big downside of heap maintenance. This will result in smaller runs on average past the first run. I can give you at least 3 very strong arguments for why this is totally worth it in every case, but I'll wait till I'm asked for them. :-) One useful saving made in this upcoming multi-tape-run patch is that it never treats any non-current run as part of the heap beyond its run number, even when currentRun is the first (run 0). So no comparisons occur beyond the first run to maintain the heap invariant *even when the first run is current* -- tuples are simply appended that belong to the second run (we only do an extra comparison to determine that that's the run they belong in). So the second run (run 1) is not trusted to be heapified by dumptuples(), and is quicksorted (either by quicksort with spillover, along with much of the first run, or on its own, when there are multiple conventional on-tape runs; it doesn't matter which way it is quicksorted). From there on, every run is quicksorted when memtuples fills, and written out entirely in memtuple sized batches. The logical tape abstraction is currently too low-level for that. It's just a tape of bytes, and tuplesort.c determines where a tuple begins and ends. That needs to be changed so that the logical tape abstraction works tuple-at-a-time instead. For example, instead of LogicalTapeRead(N) which reads N bytes, you would have LogicalTapeReadTuple(), which reads next tuple, and returns its length and the tuple itself. But that would be quite sensible anyway. Why would it be sensible? I honestly wonder why you want to do things that way. What is the advantage of not having what I call the in-memory subrun managed by a logical tape? It's already nothing like any other type of run in several ways. Aside from being all in-memory, it is often much larger. 
It's special in that it kind of rejects the preliminary determination that some tuples within memtuples need to be in a second, traditional, on-tape run (because we can just quicksort everything and merge with the existing single on-tape run). Also, we now need tuplesort_gettuple_common() to ask READTUP() what to tell its caller about freeing memory that is allocated within tuplesort.c directly. The memtuples array is already treated as an array, a heap, the head of each tape that is merged, and maybe one other thing that I forgot about offhand. The field SortTuple.tupindex has a total of 3 context-dependent meanings. Playing these kinds of games with the memtuples array is very much something that happens already. More than anything else, I think that the new TSS_MEMTAPEMERGE state is justified as a special case because quicksort with spillover is legitimately a special case. Users will want to know how close they were to an internal sort when looking at EXPLAIN ANALYZE and so on. When cost_sort() is
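Stripped of tapes, replacement selection, and the in-memory tail, the run-formation scheme being discussed (quicksort each run when the buffer fills, spill it, then merge all runs at the end) reduces to the classic external-sort pattern below. This is plain illustrative Python, not tuplesort.c:

```python
import heapq
import random

def external_sort(items, work_mem=4):
    # Toy external sort: fill a "memtuples" buffer, quicksort (here,
    # sorted()) it into a run, "spill" the run, and finally merge all
    # sorted runs, as in tuplesort's final merge.
    runs, buf = [], []
    for x in items:
        buf.append(x)
        if len(buf) == work_mem:
            runs.append(sorted(buf))   # quicksort-and-dump a full run
            buf = []
    if buf:
        runs.append(sorted(buf))       # last (possibly in-memory) run
    return list(heapq.merge(*runs))

rng = random.Random(0)
data = [rng.randrange(100) for _ in range(23)]
result = external_sort(data)
```

The trade-off debated in the thread is exactly about this shape: quicksorted runs are shorter than replacement selection's heap-built runs, but forming them avoids the per-tuple heap maintenance cost.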
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 06:42:38PM -0700, Peter Geoghegan wrote: On Thu, Aug 6, 2015 at 6:08 PM, Bruce Momjian br...@momjian.us wrote: I thought about this, and it is really an issue for FDW authors, not for end users, so I put this text in the Source Code changes section: I carefully considered where to put it, and chose the compatibility section based on the precedent of this 9.4 item: Foreign data wrappers that support updating foreign tables must consider the possible presence of AFTER ROW triggers (Noah Misch) I don't suppose it matters much, though. I can close out that open item now. Oh, I think that 9.4 item is in the wrong section, but it's good you researched it. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Autonomous Transaction is back
On Thu, Aug 6, 2015 at 4:15 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Aug 3, 2015 at 9:09 AM, Merlin Moncure mmonc...@gmail.com wrote: hm. OK, what's the behavior of: BEGIN UPDATE foo SET x = x + 1 WHERE foo_id = 1; BEGIN WITH AUTONOMOUS TRANSACTION UPDATE foo SET x = x + 1 WHERE foo_id = 1; END; RAISE EXCEPTION ...; EXCEPTION ... END; Sure, so that case might need a little bit of special handling. That doesn't mean it's a good idea for heavyweight locks to conflict in general. I think you're going to find that implementing the latter is an extremely unrewarding task, and that the benefits are seriously negative. For example, consider: BEGIN UPDATE foo SET x = x + 1 WHERE foo_id = 1; BEGIN WITH AUTONOMOUS TRANSACTION UPDATE foo SET x = x + 1 WHERE foo_id = 2; END; END; Now, suppose that a concurrent session does LOCK TABLE foo after the first UPDATE and before the second one. That's now a soft deadlock. But the only way the deadlock detector can see that is if the main transaction and the autonomous transaction have separate PGPROC entries, which is a design we explicitly rejected because it puts a tight limit on the number of ATs that can be in progress and the level to which those ATs can be nested. But let's say you don't care, so we go back to that design. The deadlock detector will have to be taught that the outer transaction can't help but wait for the inner transaction, so we teach it that. Now it can see that the only way to resolve the deadlock without aborting any transactions is to reorder the lock request from the autonomous transaction ahead of the concurrent session that is seeking a full table lock. So the autonomous transaction acquires the lock without blocking after all. You have exactly the same result that you would have had anyway but with a phenomenal amount of additional code and complexity. And for what? 
In the original example, the way the deadlock is going to be reported is like this: ERROR: deadlock detected DETAIL: Process 12345 waits for ShareLock on transaction 1000; blocked by process 12345. Process 12345 waits for ShareLock on transaction 1001; blocked by process 12345. That is not a model of clarity. On the other hand, if you just make a rule that attempting to update or delete a tuple that an outer transaction has already updated throws a bespoke error, you can do something like this: ERROR: tuple to be updated was already modified by a suspended outer transaction ...which has precedent in an existing message in trigger.c. Similarly, if you try to drop a table that the outer transaction has locked, the natural thing is for CheckTableNotInUse() to catch that and report it this way: ERROR: cannot DROP TABLE foo because it is being used by active queries in this session If you work hard enough, you can instead make that generate a deadlock error message, but you're going to have to work pretty hard, and the result is worse. I'd really like to hear some more *specific* scenarios where it's valuable for locks to conflict between the outer transaction and the AT. I grant that tuple updates are a case where the conflict has to be detected somehow, but I don't accept that the lock manager is the best way to do that, and I don't accept that there are a large number of other cases that will need similar handling. I don't necessarily disagree with what you're saying, but it's not clear to me what the proposed behavior is. Since the AT can commit before the outer, ISTM *any* ungranted lock requested by the AT but held by the outer leads to either A: functional deadlock (regardless of implementation details) or B: special behavior. Deadlocks would certainly require some acrobatics to detect and resolve due to the fact that one party to the lock is not in fact blocked on a lock but on the outer's execution state. 
So maybe the right thing to do is to simply ignore the problem and hang both transactions until timeout or cancel; this isn't really much different vs. ghetto dblink style AT that is done today in my experience. merlin
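Robert's point that the detector could only see such a deadlock if the outer transaction's implicit wait on its autonomous transaction is modeled as an edge in the wait-for graph can be sketched like this. Illustrative Python only, nothing like the real deadlock detector; the node names are made up:

```python
# Toy wait-for graph for the LOCK TABLE scenario: a cycle appears only
# once the implicit outer -> autonomous edge is added to the graph.
def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    def dfs(node, path):
        if node in path:
            return True   # revisited a node on the current path: cycle
        return any(dfs(nxt, path | {node}) for nxt in graph.get(node, ()))
    return any(dfs(n, frozenset()) for n in list(graph))

edges = [
    ("outer", "autonomous"),   # outer can't proceed until its AT finishes
    ("autonomous", "locker"),  # AT waits behind the concurrent LOCK TABLE
    ("locker", "outer"),       # LOCK TABLE waits on outer's existing lock
]
# With all three edges there is a cycle; drop the implicit outer->AT
# edge (edges[1:]) and the detector sees no cycle at all.
```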
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 3:39 PM, Bruce Momjian br...@momjian.us wrote: I am using the same criteria I have always used. If you would like it changed, we need to discuss it at a macro level, not for individual cases where we feel someone didn't get enough _credit_. I don't know how you can say that no *user* cares about Rajeev's B-Tree contribution on the one hand, but on the other hand add items about things like hash index crash-safety warnings, or an entire section on obscure source code changes, with commentary about memory ordering considerations, for example. And, I will restate this again, the release notes are not for _us_, or for _credit_ --- they are for our general users. If you would like that changed, you need to make a case for that change. Certainly, that's mostly true. But if it was entirely true, then no individual would be named in the release notes at all. That would be a mistake, but not because any ordinary user would care one bit. I am not going to comment further on this matter. -- Peter Geoghegan
Re: [HACKERS] 9.5 release notes
On Mon, Jun 29, 2015 at 05:05:59PM -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: I'm working on integrating the suggestions I made last week to the release notes. Would anybody mind if I start to add commit ids in comments, similar to what Tom has done for minor releases, to news items? +1. Helps confirm which items are meant to correspond to which commits. In case you didn't realize it already, the stuff I put into the minor release notes is from src/tools/git_changelog. Dunno whether we need to use that exact format for major releases though; there's no need to identify branches in major release notes. Gee, if I had known we wanted to do this for major releases I could have done it at the time I created the 9.5 release notes. It has to be harder to do it after the fact. Should I do it for 9.6? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 03:32:43PM -0700, Peter Geoghegan wrote: On Thu, Aug 6, 2015 at 3:06 PM, Bruce Momjian br...@momjian.us wrote: Below performance improvement in the General Performance category is missing: Reduce btree scan overhead for < and > strategies For <, <=, > and >= strategies, mark the first scan key as already matched if scanning in an appropriate direction. If the index tuple contains no nulls we can skip the first re-check for each tuple. While this is a nice 9.5 feature, it really is "btree is faster", and users usually don't need to know that, unless the change is so massive that they would change their use of the feature. I think that Rajeev is entitled to be credited for his work, especially as a less experienced contributor. Sure, users are not likely to care too much about incremental progress like this (however, I would be willing to bet that more users care about this item than about existing items like Make initdb issue a warning about placing the data directory at the top of a file system mount point). IMV it is the role of the release notes to document what went into a release fairly methodically. I often look at release notes to determine what might have gone wrong in a part of the code that I'm less familiar with, for example. Users mostly only specifically care about one or two big ticket items, and in any case are not likely to learn about them from the release notes. The release notes seem shorter than previous years, and I don't think it's because 9.5 is a smaller release. I am using the same criteria I have always used. If you would like it changed, we need to discuss it at a macro level, not for individual cases where we feel someone didn't get enough _credit_. And, I will restate this again, the release notes are not for _us_, or for _credit_ --- they are for our general users. If you would like that changed, you need to make a case for that change.
-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
Andreas Seltenreich seltenre...@gmx.de writes: Tom Lane writes: On 08/01/2015 05:59 PM, Tom Lane wrote: Well, I certainly think all of these represent bugs: 1 | ERROR: could not find pathkey item to sort Hmm ... I see no error with these queries as of today's HEAD or back-branch tips. I surmise that this was triggered by one of the other recently-fixed bugs, though the connection isn't obvious offhand. I still see this error in master as of b8cbe43, but the queries are indeed a pita to reproduce. The one below is the only one so far that is robust against running ANALYZE on the regression db, and also reproduces when I run it as an EXTRA_TEST with make check. Got that one, thanks! But we've seen this error message before in all sorts of weird situations, so I wouldn't assume that all the cases you've come across had the same cause. regards, tom lane
Re: [HACKERS] 9.5 release notes
On Tue, Jun 30, 2015 at 12:12:16AM +0200, Andres Freund wrote: On 2015-06-29 17:58:54 -0400, Tom Lane wrote: Yeah we are. The only places you'll find where we aren't formatting to 77 or 78 columns or so are where it would require breaking SGML tags in weird places. Which isn't exactly infrequent... Anyway, how about: format:%cd [%h] %(8,trunc)%cN: %(49,trunc)%s (which you can configure as pretty.pgmajor or so in .gitconfig btw.) Should we add this to src/tools/git_changelog? It currently uses git log --format=fuller --date=iso. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] 9.5 release notes
Bruce Momjian br...@momjian.us writes: On Tue, Jun 30, 2015 at 12:12:16AM +0200, Andres Freund wrote: Anyway, how about: format:%cd [%h] %(8,trunc)%cN: %(49,trunc)%s (which you can configure as pretty.pgmajor or so in .gitconfig btw.) Should we add this to src/tools/git_changelog? It currently uses git log --format=fuller --date=iso. I don't think that format can easily be merged into what git_changelog does: there would be no place to put branch or release annotations. But it was a real PITA to add that format into the alpha2 release notes, so we should probably think of a way to produce a suitable report. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release notes
On Fri, Jun 26, 2015 at 03:39:19PM -0700, Peter Geoghegan wrote: On Wed, Jun 10, 2015 at 9:15 PM, Bruce Momjian br...@momjian.us wrote: I am ready to make suggested adjustments. I attach a compatibility note that is clearly needed; adding this is an open item of mine for 9.5. This concerns foreign data wrappers and UPSERT. I thought about this, and it is really an issue for FDW authors, not for end users, so I put this text in the Source Code changes section: +<para> + Foreign tables can now take part in <command>INSERT ... ON CONFLICT + DO NOTHING</> queries (Peter Geoghegan, Heikki Linnakangas, + Andres Freund) +</para> + +<para> + Foreign data wrappers must be modified to handle this. + <command>INSERT ... ON CONFLICT DO UPDATE</> is not supported on + foreign tables. +</para> + </listitem> -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 6:08 PM, Bruce Momjian br...@momjian.us wrote: I thought about this, and it is really an issue for FDW authors, not for end users, so I put this text in the Source Code changes section: I carefully considered where to put it, and chose the compatibility section based on the precedent of this 9.4 item: Foreign data wrappers that support updating foreign tables must consider the possible presence of AFTER ROW triggers (Noah Misch) I don't suppose it matters much, though. I can close out that open item now. Thanks -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in analyzejoins.c
Andreas Seltenreich seltenre...@gmx.de writes: this one was in today's sqlsmith harvest. It triggers an assertion in current master (c030dc4). Fixed, thanks. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release notes
On Tue, Jun 30, 2015 at 09:00:44PM +0200, Andres Freund wrote: I've gone through the release notes and added comments referencing commits as discussed earlier. Additionally I've improved and added a bunch of items. Further stuff that came up while looking: * 2014-09-25 [b0d81ad] Heikki..: Add -D option to specify data directory to pg_c.. new options, should possibly be documented? As I remember it was added just for consistency with other tools and is not expected to be used because the data directory is required, so documenting it seems like it would encourage nonsensical use. * 2014-10-02 [3acc10c9] Robert..: Increase the number of buffer mapping partition.. should we mention this? This has been patched by a number of downstream vendors and users, so it's probably worth calling out? Uh, I am not sure why general users would care. * 2014-11-18 [606c012] Simon ..: Reduce btree scan overhead for < and > strategies Worthy of a note in the performance section. I commented on this already. * 2014-11-22 [eca2b9b] Andrew..: Rework echo_hidden for \sf and \ef from commit .. Seems far too minor in comparison to other stuff left out. This is a user-visible change. * 2014-11-07 [7516f52] Alvaro..: BRIN: Block Range Indexes Looks to me like that should just be Alvaro. OK, other names removed. * 2014-11-22 [b62f94c] Tom Lane: Allow simplification of EXISTS() subqueries con.. Planner change, I think it's good to mention those. I have talked to Tom about this in the past and unless the item has a user-visible component, in general he doesn't think they should be mentioned. I added a bunch of planner items in the major release notes years ago and he thought they should be removed, and they were. * 2014-11-28 [e384ed6] Tom Lane: Improve typcache: cache negative lookup results.. Should perhaps, together with other cache enhancing entries, be mentioned? Uh, I am not seeing it. 
* 2014-12-08 [519b075] Simon ..: Use GetSystemTimeAsFileTime directly in win32 2014-12-08 [8001fe6] Simon ..: Windows: use GetSystemTimePreciseAsFileTime if .. Timer resolution isn't an unimportant thing for people using explain? This all seemed very internals-only, e.g.: On most Windows systems this change will actually have no significant effect on timestamp resolution as the system timer tick is typically between 1ms and 15ms depending on what timer resolution currently running applications have requested. You can check this with clockres.exe from sysinternals. Despite the platform limitation this change still permits capture of finer timestamps where the system is capable of producing them and it gets rid of an unnecessary syscall. Was I wrong? * 2014-12-12 [7e354ab] Andrew..: Add several generator functions for jsonb that .. 2015-05-12 [c694701] Andrew..: Additional functions and operators for jsonb 2015-05-31 [37def42] Andrew..: Rename jsonb_replace to jsonb_set and allow it .. 2014-12-12 [237a882] Andrew..: Add json_strip_nulls and jsonb_strip_nulls fun.. should probably be merged? Similar set of authors and too many similar release note entries. They all do different types of things, hence the different entries. * 2014-12-23 [d7ee82e] Alvaro..: Add SQL-callable pg_get_object_address * 2014-12-30 [a676201] Alvaro..: Add pg_identify_object_as_address should be merged. OK, merged. * 2015-01-13 [4bad60e] Andres..: Allow latches to wait for socket writability wi.. 2015-01-14 [59f71a0] Andres..: Add a default local latch for use in signal han.. 2015-01-17 [ff44fba] Andres..: Replace walsender's latch with the general shar.. 2015-02-03 [387da18] Andres..: Use a nonblocking socket for FE/BE communicatio.. 2015-02-03 [4f85fde] Andres..: Introduce and use infrastructure for interrupt .. 2015-02-03 [4fe384b] Andres..: Process 'die' interrupts while reading/writing .. 2015-02-03 [6647248] Andres..: Don't allow immediate interrupts during authent.. 
2015-02-03 [675] Andres..: Move deadlock and other interrupt handling in p.. 2015-02-13 [80788a4] Heikki..: Simplify waiting logic in reading from/writin.. Again, I am willing to add a combined entry for these, but I can't figure out accurate text for it. * 2015-01-17 [9402869] Heikki..: Advance backend's advertised xmin more aggressi.. This is a pretty big efficiency boon for systems with longer nontrivial transactions. What is the user-visible behavior here? More aggressive cleanup? * 2015-01-29 [ed12700] Andres..: Align buffer descriptors to cache line boundari.. Maybe worthwhile to mention? Uh, I think you and I worked on that. Odd I didn't list it, but I don't think it fit in with any user-visible behavior, and was rare in that it only happened on larger systems. * 2015-02-16 [9e3ad1a] Tom Lane: Use fast path in plpgsql's RETURN/RETURN NEXT i.. 2015-02-28 [e524cbd] Tom Lane: Track typmods in
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 10:21:29PM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: On Tue, Jun 30, 2015 at 12:12:16AM +0200, Andres Freund wrote: Anyway, how about: format:%cd [%h] %(8,trunc)%cN: %(49,trunc)%s (which you can configure as pretty.pgmajor or so in .gitconfig btw.) Should we add this to src/tools/git_changelog? It currently uses git log --format=fuller --date=iso. I don't think that format can easily be merged into what git_changelog does: there would be no place to put branch or release annotations. But it was a real PITA to add that format into the alpha2 release notes, so we should probably think of a way to produce a suitable report. Agreed. If it is already output by git_changelog it would be much easier to add them to the release notes. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] libpq: Allow specifying multiple host names to try to connect to
On Thu, Aug 6, 2015 at 03:15 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Wed, Aug 5, 2015 at 11:53 PM, Bruce Momjian br...@momjian.us wrote: On Wed, Jul 8, 2015 at 12:24:37PM -0400, Robbie Harwood wrote: You update the documentation just for psql but your change affects any libpq application; if we go forward with this patch we should update the documentation for libpq as well. This approach seems to work with the url style of conninfo. For example postgres://some-down-host.info,some-other-host.org:5435/test1 seems to work as expected but I don't like that syntax. I would rather see postgres://some-down-host.info:5435/test1,postgres://some-other-host.org:5435/test1 This would be a more invasive change but I think the syntax is more usable. I agree with this; it seems to me that it's more powerful to be able to specify complete urls for when they may differ. For the non-url case though, I don't see a clean way of doing this. We could always, e.g., locally bind port specification to the closest host specification, but that seems nasty, and is still less powerful than passing urls (or we could just do the same for all parameters, but that's just a mess). Might it be reasonable to only allow the multi-host syntax in the url-style and not otherwise? First, I agree this is a very useful feature that we want. Many NoSQL databases are promoting multi-host client libraries as HA, which is kind of humorous, and also makes sense because many NoSQL solutions are multi-host. I can see this feature benefitting us for clients to auto-failover without requiring a pooler or virtual IP reassignment, and also useful for read-only connections that want to connect to a read-only slave, but don't care which one. The idea of randomly selecting a host from the list might be a future feature. Yep. The JDBC driver is doing it as well. I added similar feature support to the JDBC driver. 
Currently it supports the following tuning parameters given a list of hostname/port combinations to connect to: targetServerType=any|master|slave|preferSlave loadBalanceHosts=false|true For example, in a 2-node master/replica setup one would open write connections with host1,host2 targetServerType=master and read-only connections with host1,host2 targetServerType=preferSlave. I realize this is libpq-feature-creep, but considering the complexities of a pooler and virtual IP address reassignment, I think adding this The fact that other DBs are doing it, including I think VMWare's libpq, supports the idea of adding this simple specification. Because the feature at its simplest is a for loop in libpq. I would not think it much of a feature creep, especially since my original patch to libpq showed the loop already has been hidden in libpq for a long time, it just needed a special dns record for the postgresql hosts that returned dns records for all hosts. Even if there are poolers in front of postgres, they can be set up in a much simpler and more reliable non-cluster mode when libpq can be given multiple pooler addresses to connect to. -Mikko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Raising our compiler requirements for 9.6
On 2015-08-05 23:18:08 -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: ... I'm going to reshuffle things in that direction tomorrow. I'll wait for other fallout first though. So far only gcc, xlc and clang (via gcc frontend) have run... In the department of other fallout, pademelon is not happy: cc -Ae -g +O0 -Wp,-H16384 -I../../../src/include -D_XOPEN_SOURCE_EXTENDED -I/usr/local/libxml2-2.6.23/include/libxml2 -I/usr/local/include -c -o pg_resetxlog.o pg_resetxlog.c cc -Ae -g +O0 -Wp,-H16384 pg_resetxlog.o -L../../../src/port -L../../../src/common -L/usr/local/libxml2-2.6.23/lib -L/usr/local/lib -Wl,+b -Wl,'/home/bfarm/bf-data/HEAD/inst/lib' -Wl,-z -lpgcommon -lpgport -lxnet -lxml2 -lz -lreadline -ltermcap -lm -o pg_resetxlog /usr/ccs/bin/ld: Unsatisfied symbols: pg_atomic_clear_flag_impl (code) pg_atomic_init_flag_impl (code) pg_atomic_compare_exchange_u32_impl (code) pg_atomic_fetch_add_u32_impl (code) pg_atomic_test_set_flag_impl (code) pg_atomic_init_u32_impl (code) make[3]: *** [pg_resetxlog] Error 1 I'd have thought that port/atomics.c would be included in libpgport, but it seems not. Given that it requires spinlocks for the fallback, which in turn may require semaphores, that didn't seem like a good idea. Thus atomics.c is in src/backend/port not src/port (just like other locking stuff like spinlocks, semaphores etc). (But is pademelon really the only buildfarm critter relying on the fallback atomics implementation?) I don't think so. At least that didn't use to be the case. My guess is that less ancient compilers don't emit code for unreferenced functions and this thus doesn't show up. Another possible angle is: what the heck does pg_resetxlog need with atomic ops, anyway? It really doesn't. It's just fallout from indirectly including lwlock.h which includes an atomic variable. 
The include path leading to it is: In file included from /home/andres/src/postgresql/src/include/storage/lwlock.h:19:0, from /home/andres/src/postgresql/src/include/storage/lock.h:18, from /home/andres/src/postgresql/src/include/access/tuptoaster.h:18, from /home/andres/src/postgresql/src/bin/pg_resetxlog/pg_resetxlog.c:49: /home/andres/src/postgresql/src/include/port/atomics.h:41:2: error: #error THOU SHALL NOT REQUIRE ATOMICS #error THOU SHALL NOT REQUIRE ATOMICS There are other paths (slot.h) as well if that were fixed. Should these things have a different, stub implementation for FRONTEND code? I'm honestly not yet sure what the best approach here would be. One approach is to avoid including lwlock.h/slot.h in frontend code. That'll require some minor surgery and adding a couple includes, but it doesn't look that bad. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Thinko in processing of SHM message size info?
Can anyone please explain why the following patch shouldn't be applied? diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c index 126cb07..4cd52ac 100644 --- a/src/backend/storage/ipc/shm_mq.c +++ b/src/backend/storage/ipc/shm_mq.c @@ -584,7 +584,7 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait) if (mqh->mqh_partial_bytes + rb > sizeof(Size)) lengthbytes = sizeof(Size) - mqh->mqh_partial_bytes; else - lengthbytes = rb - mqh->mqh_partial_bytes; + lengthbytes = rb; memcpy(&mqh->mqh_buffer[mqh->mqh_partial_bytes], rawdata, lengthbytes); mqh->mqh_partial_bytes += lengthbytes; I'm failing to understand why anything should be subtracted. Note that the previous iteration must have called shm_mq_inc_bytes_read(), so rb should not include anything of mqh->mqh_partial_bytes. Thanks. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing unreferenced files
On 08/05/2015 11:44 AM, Ron Farrer wrote: Initial questions that had no consensus in previous discussions: 1. Approach on file handling undecided 2. Startup vs standalone tool I think it should be on startup and perhaps also have a function that will do it from user space. If this problem persists, we shouldn't expect users to go down to clean it up. 3. If startup: how to determine when to run Outstanding problems (point-for-point with original 2005 post): 1. Does not work with non-standard tablespaces. The check will even report all files are stale, etc. 2. Has issues with stale subdirs of a tablespace (subdirs corresponding to a nonexistent database) [appears related to #1 because of maintenance mode and not failing] 3. Assumes relfilenode is unique database-wide when it’s only safe tablespace-wide 4. Does not examine table segment files such as “nnn.1” - it should instead complain when “nnn” does not match a hash entry 5. It loads every value of relfilenode in pg_class into the hash table without checking that it is meaningful or not - needs to check. 6. strtol vs strspn (or other) [not sure what the problem here is. If errors are handled correctly this should not be an issue] 7. No checks for readdir failure [this should be easy to check for] Ron, Do you have suggestions on how to resolve any of 1-7? Other thoughts: 1. What to do if problem happens during drop table/index and the files that should be removed are still there.. the DBA needs to know when this happens somehow Why? If the drop table/index happens, should it not just remove those files? 2. What happened to pgfsck: was that a better approach? why was that abandoned? 3. What to do about stale files and missing files IMO, unless there is a prevailing reason not to we should remove them. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. 
Announcing I'm offended is basically telling the world you can't control your own emotions, so everyone else should do it for you. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
On 6 August 2015 at 20:14, Josh Berkus j...@agliodbs.com wrote: The results of 0, 185 and 925 are not. It really seems like SYSTEM is treating 0.04% as a maximum, but taking a random number of data pages somewhere around that maximum, using math which can choose numbers of pages far outside of the % requested by the user, and which includes 0. Thanks for the report, I'll look at this now. Speaking from a user perspective, SYSTEM seems broken to me. I can't imagine using it for anything with that degree of variation in the number of results returned, especially if it's possible to return zero rows from a populated table. Please bear in mind you have requested a very small random sample of blocks. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Wed, Aug 5, 2015 at 9:21 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Aug 6, 2015 at 3:06 AM, Fabrízio de Royes Mello wrote: On Wed, Aug 5, 2015 at 9:31 AM, Robert Haas wrote: Agreed. I think we're making a mountain out of a molehill here. As long as the locks that are actually used are monotonic, just use and stick a comment in there explaining that it could need adjustment if we use other lock levels in the future. I presume all the lock-levels used for DDL are, and will always be, self-exclusive, so why all this hand-wringing? New version attached with suggested changes. Thanks! +# SET autovacuum_* options needs a ShareUpdateExclusiveLock +# so we mix reads with it to see what works or waits s/needs/need/ and I think you mean mixing writes, not reads. Those are minor things though, and from my point of view a committer can look at it. Fixed. Thanks for your review. Regards, *** This work is funded by Zenvia Mobile Results (http://www.zenvia.com.br) -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index 1c1c181..ad985cd 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -543,6 +543,10 @@ ALTER TABLE ALL IN TABLESPACE <replaceable class="PARAMETER">name</replaceable> of <command>ALTER TABLE</> that forces a table rewrite. </para> + <para> + Changing autovacuum storage parameters acquires a <literal>SHARE UPDATE EXCLUSIVE</literal> lock. 
+ </para> + <note> <para> While <command>CREATE TABLE</> allows <literal>OIDS</> to be specified diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index 180f529..c39b878 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -57,7 +57,8 @@ static relopt_bool boolRelOpts[] = { "autovacuum_enabled", "Enables autovacuum in this relation", - RELOPT_KIND_HEAP | RELOPT_KIND_TOAST + RELOPT_KIND_HEAP | RELOPT_KIND_TOAST, + ShareUpdateExclusiveLock }, true }, @@ -65,7 +66,8 @@ static relopt_bool boolRelOpts[] = { "user_catalog_table", "Declare a table as an additional catalog table, e.g. for the purpose of logical replication", - RELOPT_KIND_HEAP + RELOPT_KIND_HEAP, + AccessExclusiveLock }, false }, @@ -73,7 +75,8 @@ static relopt_bool boolRelOpts[] = { "fastupdate", "Enables \"fast update\" feature for this GIN index", - RELOPT_KIND_GIN + RELOPT_KIND_GIN, + AccessExclusiveLock }, true }, @@ -81,7 +84,8 @@ static relopt_bool boolRelOpts[] = { "security_barrier", "View acts as a row security barrier", - RELOPT_KIND_VIEW + RELOPT_KIND_VIEW, + AccessExclusiveLock }, false }, @@ -95,7 +99,8 @@ static relopt_int intRelOpts[] = { "fillfactor", "Packs table pages only to this percentage", - RELOPT_KIND_HEAP + RELOPT_KIND_HEAP, + AccessExclusiveLock }, HEAP_DEFAULT_FILLFACTOR, HEAP_MIN_FILLFACTOR, 100 }, @@ -103,7 +108,8 @@ static relopt_int intRelOpts[] = { "fillfactor", "Packs btree index pages only to this percentage", - RELOPT_KIND_BTREE + RELOPT_KIND_BTREE, + AccessExclusiveLock }, BTREE_DEFAULT_FILLFACTOR, BTREE_MIN_FILLFACTOR, 100 }, @@ -111,7 +117,8 @@ static relopt_int intRelOpts[] = { "fillfactor", "Packs hash index pages only to this percentage", - RELOPT_KIND_HASH + RELOPT_KIND_HASH, + AccessExclusiveLock }, HASH_DEFAULT_FILLFACTOR, HASH_MIN_FILLFACTOR, 100 }, @@ -119,7 +126,8 @@ static relopt_int intRelOpts[] = { "fillfactor", "Packs gist index pages only to this percentage", - RELOPT_KIND_GIST + RELOPT_KIND_GIST, + 
AccessExclusiveLock }, GIST_DEFAULT_FILLFACTOR, GIST_MIN_FILLFACTOR, 100 }, @@ -127,7 +135,8 @@ static relopt_int intRelOpts[] = { "fillfactor", "Packs spgist index pages only to this percentage", - RELOPT_KIND_SPGIST + RELOPT_KIND_SPGIST, + AccessExclusiveLock }, SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100 }, @@ -135,7 +144,8 @@ static relopt_int intRelOpts[] = { "autovacuum_vacuum_threshold", "Minimum number of tuple updates or deletes prior to vacuum", - RELOPT_KIND_HEAP | RELOPT_KIND_TOAST + RELOPT_KIND_HEAP | RELOPT_KIND_TOAST, + ShareUpdateExclusiveLock }, -1, 0, INT_MAX }, @@ -143,7 +153,8 @@ static relopt_int intRelOpts[] = { "autovacuum_analyze_threshold", "Minimum number of tuple inserts, updates or deletes prior to analyze", - RELOPT_KIND_HEAP + RELOPT_KIND_HEAP, + ShareUpdateExclusiveLock }, -1, 0, INT_MAX }, @@ -151,7 +162,8 @@ static relopt_int intRelOpts[] = { "autovacuum_vacuum_cost_delay", Vacuum cost delay
Re: [HACKERS] Race conditions in shm_mq.c
On Thu, Aug 6, 2015 at 10:10 AM, Antonin Houska a...@cybertec.at wrote: During my experiments with parallel workers I sometimes saw the master and worker process blocked. The master uses shm queue to send data to the worker, both sides nowait==false. I concluded that the following happened: The worker process set itself as a receiver on the queue after shm_mq_wait_internal() has completed its first check of ptr, so this function left sender's procLatch in reset state. But before the procLatch was reset, the receiver still managed to read some data and set sender's procLatch to signal the reading, and eventually called its (receiver's) WaitLatch(). So sender has effectively missed the receiver's notification and called WaitLatch() too (if the receiver already waits on its latch, it does not help for sender to call shm_mq_notify_receiver(): receiver won't do anything because there's no new data in the queue). Below is my patch proposal. Another good catch. However, I would prefer to fix this without introducing a continue as I think that will make the control flow clearer. Therefore, I propose the attached variant of your idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c index e765cea..8e10822 100644 --- a/src/backend/storage/ipc/shm_mq.c +++ b/src/backend/storage/ipc/shm_mq.c @@ -777,7 +777,7 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data, return SHM_MQ_DETACHED; } - if (available == 0) + if (available == 0 && !mqh->mqh_counterparty_attached) { shm_mq_result res; @@ -805,6 +805,13 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data, mqh->mqh_counterparty_attached = true; } + /* + * The receiver may have read some data after attaching, so we + * must not wait without rechecking the queue state. + */ + } + else if (available == 0) + { /* Let the receiver know that we need them to read some data. 
*/ res = shm_mq_notify_receiver(mq); if (res != SHM_MQ_SUCCESS) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
Version: 9.5alpha2 Issue: when requesting small samples, SYSTEM often returns zero rows, and sometimes returns unexpected numbers of rows. Example: create table thous ( id int, val text ); insert into thous select i, i::text || '-val' from generate_series(1,100000) as gs(i); analyze; This is a 100,000 row table, so a 0.01% sample should be 10 rows. Since 10 rows is far less than what's on one data page, the documentation suggests that I should get 1 data page worth of rows. However: postgres=# select * from thous tablesample system ( 0.01 ); id | val +- (0 rows) Time: 0.636 ms ... this query consistently returns 0 rows, in 20 runs. Ok, let's try a million-row table, which is the example we have in the docs. postgres=# select * from mil tablesample system ( 0.01 ); id | val +- (0 rows) Hmmm? On testing, the query against the million-row table returns 0 rows around 50% of the time. This table has around 185 rows per page. As the sample size goes up, the number of times I get zero rows goes down, but those results seem to still include data pages with zero rows. For example, here's a series of results from a 0.04% sample against the million-row table. 370 370 370 555 555 185 0 925 Since this is a synthetic table I just generated, it contains almost exactly 185 rows per data page for every data page. So on a 0.04% sample, the variation between 370 rows and 555 rows (whether we have 2 or 3 data pages) is expected, since 0.04% of 5406 data pages is 2.16 pages. The results of 0, 185 and 925 are not. It really seems like SYSTEM is treating 0.04% as a maximum, but taking a random number of data pages somewhere around that maximum, using math which can choose numbers of pages far outside of the % requested by the user, and which includes 0. Speaking from a user perspective, SYSTEM seems broken to me. I can't imagine using it for anything with that degree of variation in the number of results returned, especially if it's possible to return zero rows from a populated table. 
BERNOULLI works as expected. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thinko in processing of SHM message size info?
On Thu, Aug 6, 2015 at 1:24 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Aug 6, 2015 at 9:04 AM, Antonin Houska a...@cybertec.at wrote: Can anyone please explain why the following patch shouldn't be applied? diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c index 126cb07..4cd52ac 100644 --- a/src/backend/storage/ipc/shm_mq.c +++ b/src/backend/storage/ipc/shm_mq.c @@ -584,7 +584,7 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait) if (mqh->mqh_partial_bytes + rb > sizeof(Size)) lengthbytes = sizeof(Size) - mqh->mqh_partial_bytes; else - lengthbytes = rb - mqh->mqh_partial_bytes; + lengthbytes = rb; memcpy(&mqh->mqh_buffer[mqh->mqh_partial_bytes], rawdata, lengthbytes); mqh->mqh_partial_bytes += lengthbytes; I'm failing to understand why anything should be subtracted. Note that the previous iteration must have called shm_mq_inc_bytes_read(), so rb should not include anything of mqh->mqh_partial_bytes. Thanks. Hmm, I think you are correct. This would matter in the case where the message length word was read in more than two chunks. I don't *think* that's possible right now because I believe the only systems where MAXIMUM_ALIGNOF < sizeof(Size) are those with MAXIMUM_ALIGNOF == 4 and sizeof(Size) == 8. However, if we had systems where MAXIMUM_ALIGNOF == 4 and sizeof(Size) == 16, or systems where MAXIMUM_ALIGNOF == 2 and sizeof(Size) == 8, this would be a live bug. Hmm, actually, maybe it is a live bug anyway, because the if statement tests > rather than >=. If we've read 4 bytes and exactly 4 more bytes are available, we'd set lengthbytes to 0 instead of 4. Oops. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thinko in processing of SHM message size info?
On Thu, Aug 6, 2015 at 9:04 AM, Antonin Houska a...@cybertec.at wrote: Can anyone please explain why the following patch shouldn't be applied? diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c index 126cb07..4cd52ac 100644 --- a/src/backend/storage/ipc/shm_mq.c +++ b/src/backend/storage/ipc/shm_mq.c @@ -584,7 +584,7 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait) if (mqh->mqh_partial_bytes + rb > sizeof(Size)) lengthbytes = sizeof(Size) - mqh->mqh_partial_bytes; else - lengthbytes = rb - mqh->mqh_partial_bytes; + lengthbytes = rb; memcpy(&mqh->mqh_buffer[mqh->mqh_partial_bytes], rawdata, lengthbytes); mqh->mqh_partial_bytes += lengthbytes; I'm failing to understand why anything should be subtracted. Note that the previous iteration must have called shm_mq_inc_bytes_read(), so rb should not include anything of mqh->mqh_partial_bytes. Thanks. Hmm, I think you are correct. This would matter in the case where the message length word was read in more than two chunks. I don't *think* that's possible right now because I believe the only systems where MAXIMUM_ALIGNOF < sizeof(Size) are those with MAXIMUM_ALIGNOF == 4 and sizeof(Size) == 8. However, if we had systems where MAXIMUM_ALIGNOF == 4 and sizeof(Size) == 16, or systems where MAXIMUM_ALIGNOF == 2 and sizeof(Size) == 8, this would be a live bug. Thanks for reviewing; I'll go push this change. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 10:33 AM, Bruce Momjian br...@momjian.us wrote: On Sun, Jun 14, 2015 at 12:05:54PM +0100, Dean Rasheed wrote: On 11 June 2015 at 05:15, Bruce Momjian br...@momjian.us wrote: I have committed the first draft of the 9.5 release notes. You can view the output here: http://momjian.us/pgsql_docs/release-9-5.html I think it's worth mentioning dcbf5948e12aa60b4d6ab65b6445897dfc971e01, probably under General Performance. It's an optimisation, and also a user-visible change to the way LEAKPROOF works. Perhaps something like "Allow pushdown of non-leakproof functions into security barrier views if the function is not passed any arguments from the view. Previously only functions marked as LEAKPROOF could be pushed down into security barrier views." Sorry, just looking at this now. Since RLS is new for 9.5, we wouldn't mention the RLS change in the release notes because it is part of the RLS new features, but we could mention the SB change --- the new text would be: "Allow non-LEAKPROOF functions to be passed into security barrier views if the function does not reference any table columns (Dean Rasheed)" However, this is usually a level of detail that I do not cover in the release notes, so I need someone else to tell me it should be added. +1 for including it. That's a security-relevant backward incompatibility. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] cache invalidation skip logic
In cache invalidation logic, we have the following comment:

/*
 * Now that we have the lock, check for invalidation messages, so that we
 * will update or flush any stale relcache entry before we try to use it.
 * RangeVarGetRelid() specifically relies on us for this. We can skip
 * this in the not-uncommon case that we already had the same type of lock
 * being requested, since then no one else could have modified the
 * relcache entry in an undesirable way. (In the case where our own xact
 * modifies the rel, the relcache update happens via
 * CommandCounterIncrement, not here.)
 */
if (res != LOCKACQUIRE_ALREADY_HELD)
	AcceptInvalidationMessages();

It is true that after we hold the lock, nobody will further modify the relation, but there could be some left-over invalidation messages we still have to accept before we can continue. This can be demonstrated with the following invalidation sequence: { 1: inval A; 2: inval B; ...; 10: inval pg_class } After step 10, another session may encounter a lock and replay this sequence: Step 1: RelationBuildDesc(A) does heap_open(pg_class); the pg_class lock is not acquired yet, so it acquires the lock, recursively replays the sequence, and goes to step 2. Step 2: RelationBuildDesc(B) does heap_open(pg_class), but this time we already have LOCKACQUIRE_ALREADY_HELD for pg_class, so we now access pg_class through a stale relcache entry. Users may end up with a "could not open file ..." error. Is the above sequence possible? Regards, Qingqing -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [sqlsmith] Failed assertion in analyzejoins.c
Hi, this one was in today's sqlsmith harvest. It triggers an assertion in current master (c030dc4). regards, Andreas

--
TRAP: FailedAssertion("!(!bms_is_empty(phinfo->ph_eval_at))", File: "analyzejoins.c", Line: 474)

select rel_141618057.srvfdw as c0,
       rel_141618057.srvversion as c1
from public.tv as rel_141617979
  right join pg_catalog.pg_foreign_server as rel_141618057
    left join pg_catalog.pg_roles as rel_141618058
    on (rel_141618057.srvname = rel_141618058.rolname)
  on (rel_141617979.type = rel_141618058.rolpassword)
where rel_141618057.srvfdw is not NULL;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Freeze avoidance of very large table.
On 5 August 2015 at 18:46, Alvaro Herrera alvhe...@2ndquadrant.com wrote: What do others think? Wow, everything moves when you blink, eh? Sorry, I wasn't watching this. Mainly because I was working on some other related thoughts; separate post coming. 1. Most importantly, it needs to be somewhere where we can use the function in a regression test. As I said before, I would not commit this without a formal proof of correctness. 2. I'd also like to be able to make checks on this while we're in production, to ensure we have no bugs. I was trying to learn from earlier mistakes and make sure we are ready with diagnostic tools to allow run-time checks and confirm everything is good. If people feel that means I've asked for something in the wrong place, I am happy to skip that request and place it wherever requested. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Race conditions in shm_mq.c
On Thu, Aug 6, 2015 at 2:38 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Aug 6, 2015 at 10:10 AM, Antonin Houska a...@cybertec.at wrote: During my experiments with parallel workers I sometimes saw the master and worker process blocked. The master uses a shm queue to send data to the worker, both sides nowait==false. I concluded that the following happened: The worker process set itself as a receiver on the queue after shm_mq_wait_internal() had completed its first check of ptr, so this function left the sender's procLatch in reset state. But before the procLatch was reset, the receiver still managed to read some data and set the sender's procLatch to signal the reading, and eventually called its (receiver's) WaitLatch(). So the sender effectively missed the receiver's notification and called WaitLatch() too (if the receiver already waits on its latch, it does not help for the sender to call shm_mq_notify_receiver(): the receiver won't do anything because there's no new data in the queue). Below is my patch proposal.

Another good catch. However, I would prefer to fix this without introducing a "continue", as I think that will make the control flow clearer. Therefore, I propose the attached variant of your idea.

Err, that doesn't work at all. Have a look at this version instead. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index e765cea..0e60dbc 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -777,33 +777,37 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data,
 			return SHM_MQ_DETACHED;
 		}

-		if (available == 0)
+		if (available == 0 && !mqh->mqh_counterparty_attached)
 		{
-			shm_mq_result res;
-
 			/*
 			 * The queue is full, so if the receiver isn't yet known to be
 			 * attached, we must wait for that to happen.
 			 */
-			if (!mqh->mqh_counterparty_attached)
+			if (nowait)
 			{
-				if (nowait)
+				if (shm_mq_get_receiver(mq) == NULL)
 				{
-					if (shm_mq_get_receiver(mq) == NULL)
-					{
-						*bytes_written = sent;
-						return SHM_MQ_WOULD_BLOCK;
-					}
-				}
-				else if (!shm_mq_wait_internal(mq, &mq->mq_receiver,
-											   mqh->mqh_handle))
-				{
-					mq->mq_detached = true;
 					*bytes_written = sent;
-					return SHM_MQ_DETACHED;
+					return SHM_MQ_WOULD_BLOCK;
 				}
-				mqh->mqh_counterparty_attached = true;
 			}
+			else if (!shm_mq_wait_internal(mq, &mq->mq_receiver,
+										   mqh->mqh_handle))
+			{
+				mq->mq_detached = true;
+				*bytes_written = sent;
+				return SHM_MQ_DETACHED;
+			}
+			mqh->mqh_counterparty_attached = true;
+
+			/*
+			 * The receiver may have read some data after attaching, so we
+			 * must not wait without rechecking the queue state.
+			 */
+		}
+		else if (available == 0)
+		{
+			shm_mq_result res;

 			/* Let the receiver know that we need them to read some data. */
 			res = shm_mq_notify_receiver(mq);

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
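The interleaving described in this thread can be replayed deterministically. The sketch below is a toy model under my own names (Model, bytes_in_queue, and so on), not PostgreSQL code: it only shows why the sender must re-examine the queue after its wait-for-attach loop has consumed latch resets.

```c
#include <stdbool.h>
#include <stddef.h>

/* One latch plus the sender's view of the world. */
typedef struct
{
	bool	sender_latch_set;	/* has the receiver signalled the sender? */
	size_t	bytes_in_queue;		/* unread bytes; the queue started out full */
	bool	receiver_attached;
} Model;

/*
 * Receiver attaches, drains the queue, and sets the sender's latch --
 * all while the sender is still inside its wait-for-attach loop.
 */
static void
receiver_attach_and_read(Model *m)
{
	m->receiver_attached = true;
	m->bytes_in_queue = 0;			/* reads everything */
	m->sender_latch_set = true;		/* SetLatch() on the sender's procLatch */
}

/*
 * The sender's wait-for-attach loop resets its latch on the way out,
 * silently discarding the read notification above.
 */
static void
sender_finish_wait_for_attach(Model *m)
{
	m->sender_latch_set = false;	/* ResetLatch() */
}

/*
 * Does the sender now sleep forever?  It cached "queue full" before waiting;
 * without re-checking the queue it waits on a latch that no one will set
 * again (the receiver is itself waiting for new data).
 */
static bool
sender_deadlocks(const Model *m, bool recheck_queue_first)
{
	if (recheck_queue_first && m->bytes_in_queue == 0)
		return false;				/* sees the free space, keeps sending */
	return !m->sender_latch_set;	/* would block with no wakeup pending */
}
```

Stepping the model through the reported interleaving, the sender deadlocks without the recheck and proceeds with it, which is the essence of both Antonin's `continue` and Robert's restructured version.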
Fwd: [HACKERS] 9.5 release notes
On Wed, Jun 10, 2015 at 9:15 PM, Bruce Momjian br...@momjian.us wrote: I have committed the first draft of the 9.5 release notes. You can view the output here: http://momjian.us/pgsql_docs/release-9-5.html and it will eventually appear here: http://www.postgresql.org/docs/devel/static/release.html I am ready to make suggested adjustments, though I am traveling for conferences for the next ten days so there might be a delay in my replies. "Improve the speed of sorting character and numeric fields" Here, does numeric refer specifically to the type numeric, or to all fields which are numeric in nature? If the former (which I think it is, as other numeric types are pass-by-value to start with), it should probably get some sgml markup. (copy to list, accidentally sent just to Bruce initially) Thanks, Jeff
Re: [HACKERS] Raising our compiler requirements for 9.6
Andres Freund and...@anarazel.de writes: One approach is to avoid including lwlock.h/slot.h in frontend code. That'll require some minor surgery and adding a couple of includes, but it doesn't look that bad. Patch doing that attached. This seems kinda messy. Looking at the contents of lock.h, it seems like getting rid of its dependency on lwlock.h is not really very appropriate, because there is boatloads of other backend-only stuff in there. Why is any frontend code including lock.h at all? If there is a valid reason, should we refactor lock.h into two separate headers, one that is safe to expose to frontends and one with the rest of the stuff? Also, I think the reason for the #include is that lock.h has a couple of macros that reference MainLWLockArray. The fact that you can omit the #include of lwlock.h is therefore only accidental: if we had done those as static inline functions, this wouldn't work at all. So I think this solution is not very future-proof either. As a consequence I think we should actually add a bunch of #ifdef FRONTEND #error checks in code we definitely do not want to be included in frontend code. The attached patch so far adds a check to atomics.h, lwlock.h and s_lock.h. I agree with that idea anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
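The #ifdef FRONTEND guard discussed here is easy to demonstrate in miniature. The snippet below uses a made-up header body (`backend_only_op` is a placeholder of mine, not a real PostgreSQL symbol), sketched under the assumption that frontend builds define FRONTEND as they do in PostgreSQL.

```c
/*
 * Miniature version of the guard: a backend-only header refuses to compile
 * when FRONTEND is defined (as it is for src/bin/... programs), so an
 * accidental include fails loudly at build time instead of silently
 * dragging atomics/spinlock machinery into frontend binaries.
 */
#ifdef FRONTEND
#error "this header may not be included from frontend code"
#endif

/* Placeholder for the backend-only machinery the guard protects. */
static inline int
backend_only_op(void)
{
	return 1;
}
```

Compiled normally, the placeholder is usable; compiled with -DFRONTEND, the translation unit fails immediately at the #error line, which is exactly the diagnostic behavior the patch is after.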
Re: [HACKERS] Performance improvement for joins where outer side is unique
On Wednesday 08 July 2015 12:29:38 David Rowley wrote: On 8 July 2015 at 02:00, Alexander Korotkov a.korot...@postgrespro.ru wrote: Patch doesn't apply to current master. Could you, please, rebase it? Attached. Thanks. Regards David Rowley -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services Hello. What is the current status of the patch? -- YUriy Zhuravlev Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Race conditions in shm_mq.c
During my experiments with parallel workers I sometimes saw the master and worker process blocked. The master uses a shm queue to send data to the worker, both sides nowait==false. I concluded that the following happened: The worker process set itself as a receiver on the queue after shm_mq_wait_internal() had completed its first check of ptr, so this function left the sender's procLatch in reset state. But before the procLatch was reset, the receiver still managed to read some data and set the sender's procLatch to signal the reading, and eventually called its (receiver's) WaitLatch(). So the sender effectively missed the receiver's notification and called WaitLatch() too (if the receiver already waits on its latch, it does not help for the sender to call shm_mq_notify_receiver(): the receiver won't do anything because there's no new data in the queue). Below is my patch proposal. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at

diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index 126cb07..39ea673 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -803,6 +808,22 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data,
 					return SHM_MQ_DETACHED;
 				}
 				mqh->mqh_counterparty_attached = true;
+
+				/*
+				 * Re-check if the queue is still full.
+				 *
+				 * While we were using our procLatch to detect receiver's
+				 * connection, the receiver could have connected and started
+				 * reading from it - that includes concurrent manipulation of
+				 * the latch, in order to report on reading activity. Thus we
+				 * could miss the information that some data has already been
+				 * consumed, and cause a deadlock by calling SetLatch() below.
+				 *
+				 * (If the receiver starts waiting on its latch soon enough,
+				 * our call of shm_mq_notify_receiver() will have no effect.)
+				 */
+				if (!nowait)
+					continue;
 			}

 			/* Let the receiver know that we need them to read some data. */

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Raising our compiler requirements for 9.6
On Thu, Aug 6, 2015 at 3:09 AM, Andres Freund and...@anarazel.de wrote: It really doesn't. It's just fallout from indirectly including lwlock.h which includes an atomic variable. The include path leading to it is In file included from /home/andres/src/postgresql/src/include/storage/lwlock.h:19:0, from /home/andres/src/postgresql/src/include/storage/lock.h:18, from /home/andres/src/postgresql/src/include/access/tuptoaster.h:18, from /home/andres/src/postgresql/src/bin/pg_resetxlog/pg_resetxlog.c:49: /home/andres/src/postgresql/src/include/port/atomics.h:41:2: error: #error THOU SHALL NOT REQUIRE ATOMICS #error THOU SHALL NOT REQUIRE ATOMICS Isn't that #include entirely superfluous? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Raising our compiler requirements for 9.6
On 2015-08-06 10:29:39 -0400, Robert Haas wrote: On Thu, Aug 6, 2015 at 3:09 AM, Andres Freund and...@anarazel.de wrote: It really doesn't. It's just fallout from indirectly including lwlock.h which includes an atomic variable. The include path leading to it is In file included from /home/andres/src/postgresql/src/include/storage/lwlock.h:19:0, from /home/andres/src/postgresql/src/include/storage/lock.h:18, from /home/andres/src/postgresql/src/include/access/tuptoaster.h:18, from /home/andres/src/postgresql/src/bin/pg_resetxlog/pg_resetxlog.c:49: /home/andres/src/postgresql/src/include/port/atomics.h:41:2: error: #error THOU SHALL NOT REQUIRE ATOMICS #error THOU SHALL NOT REQUIRE ATOMICS Isn't that #include entirely superfluous? Which one? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance improvement for joins where outer side is unique
On 2015-08-06 15:36, Uriy Zhuravlev wrote: On Wednesday 08 July 2015 12:29:38 David Rowley wrote: On 8 July 2015 at 02:00, Alexander Korotkov a.korot...@postgrespro.ru wrote: Patch doesn't apply to current master. Could you, please, rebase it? Attached. Thanks. Regards David Rowley -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services Hello. What is the current status of the patch? [unique_joins_9e6d4e4_2015-07-08.patch]

FWIW, I just happened to be looking at the latest patch:
  applies OK (against current HEAD)
  compiles OK
  make check fail: join ... FAILED
  installs OK

I was just going to repeat David's 2 queries (hardware: x86_64, Linux 2.6.18-402.el5)

-- with these tables:
create table t1 (id int primary key);
create table t2 (id int primary key);
insert into t1 select x.x from generate_series(1,100) x(x);
insert into t2 select x.x from generate_series(1,100) x(x);
create table s1 (id varchar(32) primary key);
insert into s1 select x.x::varchar from generate_series(1,100) x(x);
create table s2 (id varchar(32) primary key);
insert into s2 select x.x::varchar from generate_series(1,100) x(x);
vacuum analyze;

I do not see much difference between patched and unpatched after running both David's statements (unijoin.sql and unijoin2.sql):

-- pgbench -f unijoin.sql -n -T 300 -P30 testdb
select count(t2.id) from t1 left outer join t2 on t1.id=t2.id;
tps = 2.323222 - unpatched
tps = 2.356906 - patched

-- pgbench -f unijoin2.sql -n -T 300 -P30 testdb
tps = 1.257656 - unpatched
tps = 1.225758 - patched

So as far as I can tell it does not look very promising. Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release notes
On Sun, Jun 14, 2015 at 12:05:54PM +0100, Dean Rasheed wrote: On 11 June 2015 at 05:15, Bruce Momjian br...@momjian.us wrote: I have committed the first draft of the 9.5 release notes. You can view the output here: http://momjian.us/pgsql_docs/release-9-5.html I think it's worth mentioning dcbf5948e12aa60b4d6ab65b6445897dfc971e01, probably under General Performance. It's an optimisation, and also a user-visible change to the way LEAKPROOF works. Perhaps something like "Allow pushdown of non-leakproof functions into security barrier views if the function is not passed any arguments from the view. Previously only functions marked as LEAKPROOF could be pushed down into security barrier views." Sorry, just looking at this now. Since RLS is new for 9.5, we wouldn't mention the RLS change in the release notes because it is part of the RLS new features, but we could mention the SB change --- the new text would be: "Allow non-LEAKPROOF functions to be passed into security barrier views if the function does not reference any table columns (Dean Rasheed)" However, this is usually a level of detail that I do not cover in the release notes, so I need someone else to tell me it should be added. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Raising our compiler requirements for 9.6
On 2015-08-06 10:27:52 -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: One approach is to avoid including lwlock.h/slot.h in frontend code. That'll require some minor surgery and adding a couple includes, but it doesn't look that bad. Patch doing that attached. This seems kinda messy. Looking at the contents of lock.h, it seems like getting rid of its dependency on lwlock.h is not really very appropriate, because there is boatloads of other backend-only stuff in there. Why is any frontend code including lock.h at all? If there is a valid reason, should we refactor lock.h into two separate headers, one that is safe to expose to frontends and one with the rest of the stuff? I think the primary reason for lock.h being included pretty widely is to have the declaration of LOCKMODE. That's pretty widely used in headers included by clients for various reasons. There's also a bit of fun around xl_standby_locks. I can have a go at trying to separate them, but I'm not convinced it's going to look particularly pretty. Also, I think the reason for the #include is that lock.h has a couple of macros that reference MainLWLockArray. The fact that you can omit the #include lwlock.h is therefore only accidental: if we had done those as static inline functions, this wouldn't work at all. So I think this solution is not very future-proof either. Yea, I saw that and concluded I'm not particularly bothered by it. Most of the other similar macros are in lwlock.h itself, and if it became a problem we could just move them alongside. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Raising our compiler requirements for 9.6
On Thu, Aug 6, 2015 at 10:31 AM, Andres Freund and...@anarazel.de wrote: On 2015-08-06 10:29:39 -0400, Robert Haas wrote: On Thu, Aug 6, 2015 at 3:09 AM, Andres Freund and...@anarazel.de wrote: It really doesn't. It's just fallout from indirectly including lwlock.h which includes an atomic variable. The include path leading to it is In file included from /home/andres/src/postgresql/src/include/storage/lwlock.h:19:0, from /home/andres/src/postgresql/src/include/storage/lock.h:18, from /home/andres/src/postgresql/src/include/access/tuptoaster.h:18, from /home/andres/src/postgresql/src/bin/pg_resetxlog/pg_resetxlog.c:49: /home/andres/src/postgresql/src/include/port/atomics.h:41:2: error: #error THOU SHALL NOT REQUIRE ATOMICS #error THOU SHALL NOT REQUIRE ATOMICS Isn't that #include entirely superfluous? Which one? Never mind, I'm confused. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] nodes/*funcs.c inconsistencies
Noah, * Noah Misch (n...@leadboat.com) wrote: Rather than commit on an emergency basis in the few hours between these +1's and the wrap, I kept to my original schedule. FYI, if it hadn't required emergency procedures (cancelling the day's plans so I could get to a notebook computer), I would have committed early based on the three +1s. No worries, fully agreed that this was not an emergency in any way. Thanks! Stephen
Re: [HACKERS] Raising our compiler requirements for 9.6
Andres Freund wrote: On 2015-08-06 10:27:52 -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: One approach is to avoid including lwlock.h/slot.h in frontend code. That'll require some minor surgery and adding a couple of includes, but it doesn't look that bad. Patch doing that attached. This seems kinda messy. Looking at the contents of lock.h, it seems like getting rid of its dependency on lwlock.h is not really very appropriate, because there is boatloads of other backend-only stuff in there. Why is any frontend code including lock.h at all? If there is a valid reason, should we refactor lock.h into two separate headers, one that is safe to expose to frontends and one with the rest of the stuff? I think the primary reason for lock.h being included pretty widely is to have the declaration of LOCKMODE. That's pretty widely used in headers included by clients for various reasons. There's also a bit of fun around xl_standby_locks. I think it is a good idea to split up LOCKMODE so that most headers do not need to include lock.h at all; you will need to add an explicit #include "storage/lock.h" to a lot of C files, but to me that's a good thing. See http://doxygen.postgresql.org/lock_8h.html Funnily enough, the "included by" graph is so large that my browser (arguably a bit dated) fails to display it and shows a black rectangle instead. Chromium shows it, though it's illegible. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Raising our compiler requirements for 9.6
On 2015-08-06 12:05:24 -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2015-08-06 10:27:52 -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: One approach is to avoid including lwlock.h/slot.h in frontend code. That'll require some minor surgery and adding a couple of includes, but it doesn't look that bad. Patch doing that attached. This seems kinda messy. Looking at the contents of lock.h, it seems like getting rid of its dependency on lwlock.h is not really very appropriate, because there is boatloads of other backend-only stuff in there. Why is any frontend code including lock.h at all? If there is a valid reason, should we refactor lock.h into two separate headers, one that is safe to expose to frontends and one with the rest of the stuff? I think the primary reason for lock.h being included pretty widely is to have the declaration of LOCKMODE. That's pretty widely used in headers included by clients for various reasons. There's also a bit of fun around xl_standby_locks. I think it is a good idea to split up LOCKMODE so that most headers do not need to include lock.h at all; you will need to add an explicit #include "storage/lock.h" to a lot of C files, but to me that's a good thing. I had to split off three things: LOCKMASK, the individual lock levels and xl_standby_lock, to be able to prohibit lock.h from being included by frontend code. lockdefs.h works for me; counter-proposals? There weren't any places that needed additional lock.h includes. But hashfn.c somewhat hilariously missed utils/hsearch.h ;) Regards, Andres

From bc9dc1910058a3e4638576261f24cb12471e7b15 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 6 Aug 2015 12:53:56 +0200
Subject: [PATCH] Don't include low level locking code from frontend code.

To avoid that happening again, put some checks against this happening
into atomics.h, lock.h, lwlock.h and s_lock.h.
---
 src/backend/utils/hash/hashfn.c     |  1 +
 src/include/access/genam.h          |  2 +-
 src/include/access/hash.h           |  2 +-
 src/include/access/tuptoaster.h     |  2 +-
 src/include/catalog/objectaddress.h |  2 +-
 src/include/port/atomics.h          |  4 +++
 src/include/storage/lock.h          | 43 ++++----
 src/include/storage/lockdefs.h      | 56 ++++++++++
 src/include/storage/lwlock.h        |  4 +++
 src/include/storage/procarray.h     |  1 +
 src/include/storage/s_lock.h        |  4 +++
 src/include/storage/standby.h       |  2 +-
 12 files changed, 80 insertions(+), 43 deletions(-)
 create mode 100644 src/include/storage/lockdefs.h

diff --git a/src/backend/utils/hash/hashfn.c b/src/backend/utils/hash/hashfn.c
index 260d880..bdd438d 100644
--- a/src/backend/utils/hash/hashfn.c
+++ b/src/backend/utils/hash/hashfn.c
@@ -22,6 +22,7 @@
 #include "postgres.h"

 #include "access/hash.h"
+#include "utils/hsearch.h"

 /*
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index d86590a..d9d05a0 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -17,7 +17,7 @@
 #include "access/sdir.h"
 #include "access/skey.h"
 #include "nodes/tidbitmap.h"
-#include "storage/lock.h"
+#include "storage/lockdefs.h"
 #include "utils/relcache.h"
 #include "utils/snapshot.h"
diff --git a/src/include/access/hash.h b/src/include/access/hash.h
index 93cc8af..97cb859 100644
--- a/src/include/access/hash.h
+++ b/src/include/access/hash.h
@@ -24,7 +24,7 @@
 #include "fmgr.h"
 #include "lib/stringinfo.h"
 #include "storage/bufmgr.h"
-#include "storage/lock.h"
+#include "storage/lockdefs.h"
 #include "utils/relcache.h"

 /*
diff --git a/src/include/access/tuptoaster.h b/src/include/access/tuptoaster.h
index 7d18535..77f637e 100644
--- a/src/include/access/tuptoaster.h
+++ b/src/include/access/tuptoaster.h
@@ -14,8 +14,8 @@
 #define TUPTOASTER_H

 #include "access/htup_details.h"
+#include "storage/lockdefs.h"
 #include "utils/relcache.h"
-#include "storage/lock.h"

 /*
  * This enables de-toasting of index entries.  Needed until VACUUM is
diff --git a/src/include/catalog/objectaddress.h b/src/include/catalog/objectaddress.h
index 37808c0..0fc16ed 100644
--- a/src/include/catalog/objectaddress.h
+++ b/src/include/catalog/objectaddress.h
@@ -14,7 +14,7 @@
 #define OBJECTADDRESS_H

 #include "nodes/pg_list.h"
-#include "storage/lock.h"
+#include "storage/lockdefs.h"
 #include "utils/acl.h"
 #include "utils/relcache.h"
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
index bb87945..65cfa3f 100644
--- a/src/include/port/atomics.h
+++ b/src/include/port/atomics.h
@@ -37,6 +37,10 @@
 #ifndef ATOMICS_H
 #define ATOMICS_H

+#ifdef FRONTEND
+#error "atomics.h may not be included from frontend code"
+#endif
+
 #define INSIDE_ATOMICS_H

 #include <limits.h>
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 96fe3a6..a9cd08c 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -14,6 +14,11
Re: [HACKERS] raw output from copy
Hi, a psql-based implementation needs new infrastructure (more than a few lines). Missing: * binary mode support * parametrized query support. I am not against them, but I proposed both points, and both were rejected. So why don't we use the current infrastructure? Raw copy is a trivial patch. On 6 Aug 2015 at 0:09, Andrew Dunstan and...@dunslane.net wrote: On 08/05/2015 04:59 PM, Heikki Linnakangas wrote: On 07/27/2015 02:28 PM, Pavel Stehule wrote: 2015-07-27 10:41 GMT+02:00 Heikki Linnakangas hlinn...@iki.fi: What about input? This is a whole new feature, but it would be nice to be able to pass the file contents as a query parameter. Something like: \P /tmp/foo binary INSERT INTO foo VALUES (?); The example of input is a strong reason not to do it via inserts. Just parsing some special ? symbol needs a lot of new code. Sorry, I meant $1 in place of the ?. No special parsing needed; psql can send the query to the server as is, with the parameters that are given by this new mechanism. In this case, I don't see any advantage of a psql-based solution. COPY is the standard interface for input/output from/to files, and it should be used here. I'm not too happy with the COPY approach, although I won't object if one of the other committers feels more comfortable with it. However, we don't seem to be making progress here, so I'm going to mark this as Returned with Feedback. I don't feel good about that either, because I don't actually have any great suggestions on how to move this forward. Which is a pity, because this is a genuine problem for users. This is really only a psql problem, IMNSHO. Inserting and extracting binary data is pretty trivial for most users of client libraries (e.g. it's a couple of lines of code in a DBD::Pg program), but it's hard in psql. I do agree that the COPY approach feels more than a little klunky. cheers andrew
Re: [HACKERS] deparsing utility commands
Jim Nasby wrote: FWIW, my interest in this is largely due to the problems I've had with this in the variant type. In particular, using the same resolution rules for functions and operators. So I'm wondering if there's a bigger issue here. I'd be glad to review your variant stuff, but I doubt it's the same thing at play. Please don't hijack this thread. Deparse is playing some pretty nasty games with the backend innards to make the deparsed representation usable generally. We were just missing a simple trick. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Raising our compiler requirements for 9.6
On 2015-08-06 12:29:15 -0300, Alvaro Herrera wrote: Andres Freund wrote: I had to split off three things: LOCKMASK, the individual lock levels and xl_standby_lock, to be able to prohibit lock.h from being included by frontend code. lockdefs.h works for me; counter-proposals? There weren't any places that needed additional lock.h includes. Ah, but that's because you cheated and didn't remove the include from namespace.h ... Well, it's not included from frontend code, so I didn't see the need? Going through all the backend code and replacing lock.h by lockdefs.h and some other includes doesn't seem particularly beneficial to me. FWIW, removing it from namespace.h is relatively easy. It starts to get a lot more noisy when you want to touch heapam.h.

diff --git a/src/include/storage/lockdefs.h b/src/include/storage/lockdefs.h
new file mode 100644
index 000..bfbcdba
--- /dev/null
+++ b/src/include/storage/lockdefs.h
@@ -0,0 +1,56 @@
+/*-
+ *
+ * lockdefs.h
+ *	  Frontend exposed parts of postgres' low level lock mechanism
+ *
+ * The split between lockdefs.h and lock.h is not very principled.

No kidding! Do you have a good suggestion about the split? I wanted to expose the minimal amount necessary, and those were the ones. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] obsolete comments for InitializeMaxBackends
While hacking today, I realized that a couple of comments related to InitializeMaxBackends are obsolete. Originally, the number of background workers was determined just after processing shared_preload_libraries, but I changed that in commit 6bc8ef0b7f1f1df3998745a66e1790e27424aa0c with the introduction of max_worker_processes. The attached patch fixes up the comments so that they no longer offer up an obsolete rationale for what the code does. Assuming nobody minds, I plan to commit this, and, since the comments are actually wrong and might confuse someone, back-patch this change as far as 9.4, where the above-mentioned commit first appeared. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 000524d..0c8bc8f 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -923,8 +923,9 @@ PostmasterMain(int argc, char *argv[]) process_shared_preload_libraries(); /* - * Now that loadable modules have had their chance to register background - * workers, calculate MaxBackends. + * Calculate the maximum possible number of backends. This must happen + * after we have finished initializing GUCs and before shared memory size + * is determined. */ InitializeMaxBackends(); diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c index 7b19714..c99e7c0 100644 --- a/src/backend/utils/init/postinit.c +++ b/src/backend/utils/init/postinit.c @@ -482,9 +482,8 @@ pg_split_opts(char **argv, int *argcp, const char *optstr) /* * Initialize MaxBackends value from config options. * - * This must be called after modules have had the chance to register background - * workers in shared_preload_libraries, and before shared memory size is - * determined. + * This must happen after we finish setting the initial values for GUCs, + * and before shared memory size is determined. 
* * Note that in EXEC_BACKEND environment, the value is passed down from * postmaster to subprocesses via BackendParameters in SubPostmasterMain; only
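To make the ordering constraint in the revised comment concrete, here is a minimal sketch of the arithmetic InitializeMaxBackends performs in this era of the code: regular backends, autovacuum workers, the autovacuum launcher, and background workers. The GUC names mirror the real ones, but the numeric defaults and the function name are illustrative only, not the actual implementation.

```c
#include <assert.h>

/* Illustrative stand-ins for the GUCs involved; real values come from
 * postgresql.conf and must be final before this computation runs. */
static int MaxConnections = 100;
static int autovacuum_max_workers = 3;
static int max_worker_processes = 8;

static int MaxBackends;

/* Sketch of what InitializeMaxBackends() computes: one slot per regular
 * backend, per autovacuum worker, the autovacuum launcher (+1), and per
 * background worker.  Shared memory is sized from this total, which is
 * why it must be computed before shared memory size is determined. */
static void
initialize_max_backends_sketch(void)
{
	MaxBackends = MaxConnections + autovacuum_max_workers + 1
				+ max_worker_processes;
}
```

With the illustrative defaults above this yields 112 slots; the point of the comment change is that every input here is an ordinary GUC, so only GUC initialization, not shared_preload_libraries processing per se, has to precede the call.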
Re: [HACKERS] deparsing utility commands
On 8/5/15 9:55 PM, Alvaro Herrera wrote: Jim Nasby wrote: On 7/31/15 8:45 AM, Shulgin, Oleksandr wrote: STATEMENT: create view v1 as select * from t1 ; ERROR: operator does not exist: pg_catalog.oid = pg_catalog.oid at character 52 HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. QUERY: SELECT * FROM pg_catalog.pg_rewrite WHERE ev_class = $1 AND rulename = $2 CONTEXT: PL/pgSQL function test_ddl_deparse() line 1 at FOR over SELECT rows I'm not sure what test_ddl_deparse is doing, is that where the oid = oid is coming from? It might be enlightening to replace = with OPERATOR(pg_catalog.=) and see if that works. That worked, thanks! Good, but why weren't operator resolution rules working correctly here to begin with? FWIW, my interest in this is largely due to the problems I've had with this in the variant type. In particular, using the same resolution rules for functions and operators. So I'm wondering if there's a bigger issue here. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Raising our compiler requirements for 9.6
Andres Freund wrote: I had to split off three things: LOCKMASK, the individual lock levels, and xl_standby_lock, to be able to prohibit lock.h from being included by frontend code. lockdefs.h works for me; counter-proposals? There weren't any places that needed additional lock.h includes. Ah, but that's because you cheated and didn't remove the include from namespace.h ... But hashfn.c somewhat hilariously missed utils/hsearch.h ;) hah. diff --git a/src/include/storage/lockdefs.h b/src/include/storage/lockdefs.h new file mode 100644 index 000..bfbcdba --- /dev/null +++ b/src/include/storage/lockdefs.h @@ -0,0 +1,56 @@ +/*- + * + * lockdefs.h + * Frontend exposed parts of postgres' low level lock mechanism + * + * The split between lockdefs.h and lock.h is not very principled. No kidding! -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Freeze avoidance of very large table.
On 8/5/15 1:47 PM, Petr Jelinek wrote: On 2015-08-05 20:09, Alvaro Herrera wrote: Josh Berkus wrote: On 08/05/2015 10:46 AM, Alvaro Herrera wrote: 1. Add the functions as builtins. This is what the current patch does. Simon seems to prefer this, because he wants the function to be always available in production; but I don't like this option because adding functions as builtins makes it impossible to move later to extensions. Bruce doesn't like this option either. Why would we want to move them later to extensions? Because it's not nice to have random stuff as builtins. Extensions have one nice property: they provide namespacing, so not everything has to be in pg_catalog, which already has about a gazillion functions. It's nice to have stuff you don't need for day-to-day operations separate but still available (which is why src/extensions is better than contrib). They also provide a level of control over what is and isn't installed in a cluster. Personally, I'd prefer that most users not even be aware of the existence of things like pageinspect. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Raising our compiler requirements for 9.6
Andres Freund wrote: On 2015-08-06 12:29:15 -0300, Alvaro Herrera wrote: Ah, but that's because you cheated and didn't remove the include from namespace.h ... Well, it's not included from frontend code, so I didn't see the need? Going through all the backend code and replacing lock.h with lockdefs.h and some other includes doesn't seem particularly beneficial to me. FWIW, removing it from namespace.h is relatively easy. It starts to get a lot more noisy when you want to touch heapam.h. Ah, heapam.h is the granddaddy of all header messes, isn't it. (Actually it's execnodes.h or something like that.) diff --git a/src/include/storage/lockdefs.h b/src/include/storage/lockdefs.h new file mode 100644 index 000..bfbcdba --- /dev/null +++ b/src/include/storage/lockdefs.h @@ -0,0 +1,56 @@ +/*- + * + * lockdefs.h + * Frontend exposed parts of postgres' low level lock mechanism + * + * The split between lockdefs.h and lock.h is not very principled. No kidding! Do you have a good suggestion about the split? I wanted to expose the minimal amount necessary, and those were the ones. Nope, nothing useful, sorry. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
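For readers trying to picture the split being discussed: the frontend-safe subset a lockdefs.h-style header carries is essentially the LOCKMASK/LOCKMODE typedefs and the lock-level numbering, with the lock manager's shared-memory structures left behind in lock.h. A sketch of those definitions follows; the numeric values match the long-standing lock-level ordering, but the per-level comments are illustrative shorthand, not the header's actual text.

```c
#include <assert.h>

/* Sketch of the frontend-exposed subset split out into lockdefs.h.
 * What matters is the numbering: higher modes conflict with more. */
typedef int LOCKMASK;
typedef int LOCKMODE;

#define NoLock					0
#define AccessShareLock			1	/* SELECT */
#define RowShareLock			2	/* SELECT FOR UPDATE/FOR SHARE */
#define RowExclusiveLock		3	/* INSERT, UPDATE, DELETE */
#define ShareUpdateExclusiveLock 4	/* VACUUM (non-FULL), ANALYZE */
#define ShareLock				5	/* CREATE INDEX */
#define ShareRowExclusiveLock	6	/* like EXCLUSIVE, but allows ROW SHARE */
#define ExclusiveLock			7	/* blocks ROW SHARE */
#define AccessExclusiveLock		8	/* ALTER TABLE, DROP TABLE, etc. */

/* Bitmask form, used to build per-mode conflict tables */
#define LOCKBIT_ON(lockmode) (1 << (lockmode))
```

Frontend code that only needs to name lock levels (e.g. when consuming deparsed or replicated DDL) can include something like this without dragging in hsearch.h, shared-memory types, or the rest of the lock manager.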
Re: [HACKERS] Reduce ProcArrayLock contention
On Wed, Aug 5, 2015 at 10:59 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Aug 4, 2015 at 8:59 PM, Robert Haas robertmh...@gmail.com wrote: I'm not entirely happy with the name nextClearXidElem but apart from that I'm fairly happy with this version. We should probably test it to make sure I haven't broken anything; I have verified the patch and it is fine. I have tested it via manual tests; for long pgbench tests, results are quite similar to previous versions of the patch. A few changes I have made in the patch: 1. +static void +ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid) +{ + volatile PROC_HDR *procglobal = ProcGlobal; + uint32 nextidx; + uint32 wakeidx; + int extraWaits = -1; + + /* We should definitely have an XID to clear. */ + Assert(TransactionIdIsValid(pgxact->xid)); Here the Assert is using pgxact, which is wrong. 2. Made ProcArrayEndTransactionInternal an inline function as suggested by you. OK, committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
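For anyone skimming the thread, the committed patch's central trick is that instead of every backend taking ProcArrayLock exclusively to clear its own xid, backends push themselves onto an atomic list and a single "leader" clears the whole group in one pass. A much-simplified, single-threaded sketch of that list discipline follows; the names (groupHead, FakeProc, push_self) are stand-ins for the patch's nextClearXidElem bookkeeping, not the real PGPROC fields.

```c
#include <stdatomic.h>
#include <assert.h>

#define INVALID_IDX (-1)
#define NPROCS 4

/* Hypothetical stand-in for the PGPROC fields the group-clear dance uses */
typedef struct
{
	int			xid;	/* pretend transaction id; 0 = cleared */
	_Atomic int next;	/* next proc index in the pending list */
} FakeProc;

static FakeProc procs[NPROCS];
static _Atomic int groupHead = INVALID_IDX;	/* like nextClearXidElem */

/* A backend pushes itself onto the list with a CAS loop; it becomes the
 * leader if the list was empty at the moment it pushed (returns 1). */
static int
push_self(int idx)
{
	int			head = atomic_load(&groupHead);

	for (;;)
	{
		atomic_store(&procs[idx].next, head);
		if (atomic_compare_exchange_weak(&groupHead, &head, idx))
			return head == INVALID_IDX;
		/* CAS failed: head was reloaded; retry with the new value */
	}
}

/* The leader detaches the entire list atomically and clears every
 * member's xid, so only one ProcArrayLock acquisition covers them all. */
static void
leader_clear_all(void)
{
	int			idx = atomic_exchange(&groupHead, INVALID_IDX);

	while (idx != INVALID_IDX)
	{
		procs[idx].xid = 0;		/* the per-proc "clear xid" work */
		idx = atomic_load(&procs[idx].next);
	}
}
```

The real code additionally wakes the followers via their semaphores (the wakeidx/extraWaits machinery visible in the quoted fragment); this sketch only shows the lock-free list push and the leader's batch traversal.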