Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Sergey Konoplev <gray...@gmail.com> writes:
>> On Wed, Oct 23, 2013 at 11:03 PM, Abhijit Menon-Sen <a...@2ndquadrant.com> wrote:
>>> This is a slightly reworked version of the patch submitted by Richard
>>> Poole last month, which was based on Christian Kruse's earlier patch.
>>
>> Is it possible that this patch will be included in a minor version of
>> 9.3? IMHO hugepages is a very important ability that postgres lost in
>> 9.3, and it would be great to have it back ASAP.
>
> Say what? There's never been any hugepages support in Postgres.

There was an ability to back shared memory with hugepages when using
<= 9.2. I have used it on ~30 servers for several years, and it brings an
8-17% performance gain depending on the memory size. Here you will find
several paragraphs describing how to do it:
https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
Just search for the word 'hugepages' on the page.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
On Tue, Oct 29, 2013 at 11:08:05PM -0700, Sergey Konoplev wrote:
> On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Sergey Konoplev <gray...@gmail.com> writes:
>>> On Wed, Oct 23, 2013 at 11:03 PM, Abhijit Menon-Sen <a...@2ndquadrant.com> wrote:
>>>> This is a slightly reworked version of the patch submitted by Richard
>>>> Poole last month, which was based on Christian Kruse's earlier patch.
>>>
>>> Is it possible that this patch will be included in a minor version of
>>> 9.3? IMHO hugepages is a very important ability that postgres lost in
>>> 9.3, and it would be great to have it back ASAP.
>>
>> Say what? There's never been any hugepages support in Postgres.
>
> There was an ability to back shared memory with hugepages when using
> <= 9.2. I have used it on ~30 servers for several years, and it brings
> an 8-17% performance gain depending on the memory size. Here you will
> find several paragraphs describing how to do it:
> https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
> Just search for the word 'hugepages' on the page.

For better or worse, we add new features exactly and only in .0 releases.
It's what's made it possible for people to plan deployments, given us a
deserved reputation for stability, etc., etc.

I guess what I'm saying here is that awesome as any particular feature
might be to back-patch, that benefit is overwhelmed by the cost of having
unstable releases. -infinity from me to any proposal that gets us into
"are you using PostgreSQL x.y.z or x.y.w?" when it comes to features.

Cheers,
David.
--
David Fetter <da...@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
On Wed, Oct 30, 2013 at 10:16:57AM +0530, Abhijit Menon-Sen wrote:
> At 2013-10-24 16:06:19 +0300, hlinnakan...@vmware.com wrote:
>> Let's get rid of the rounding.
>
> I share Andres's concern that the bug is present in various recent
> kernels that are going to stick around for quite some time. Given the
> rather significant performance gain, I think it's worth doing
> something, though I'm not a big fan of the directory-scanning code
> myself.
>
> As a compromise, perhaps we can unconditionally round the size up to
> be a multiple of 2MB?

How about documenting that 2MB is the quantum (OK, we'll say
"indivisible unit" or "smallest division" or something) and failing with
a message to that effect if someone tries to set it otherwise?

Cheers,
David.
--
David Fetter <da...@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
At 2013-10-30 00:10:39 -0700, da...@fetter.org wrote:
> How about documenting that 2MB is the quantum (OK, we'll say
> "indivisible unit" or "smallest division" or something) and failing
> with a message to that effect if someone tries to set it otherwise?

I don't think you understand the problem. We're not discussing a user
setting here. The size that is passed to PGSharedMemoryCreate is based on
shared_buffers and our estimates of how much memory we need for other
things like the ProcArray (see ipci.c:CreateSharedMemoryAndSemaphores).

If this calculated size is not a multiple of a page size supported by the
hardware (usually 2/4/16MB etc.), the allocation will fail under some
commonly-used kernels. We can either ignore the problem and let the
allocation fail, or try to discover the smallest supported huge page size
(what the patch does now), or assume that 2MB pages can be used if any
huge pages can be used and align accordingly.

We could use a larger size, e.g. if we aligned to 16MB then it would work
on hardware that supported 2/4/8/16MB pages, but we'd waste the extra
memory unless we also increased NBuffers after the rounding up (which is
also something Andres suggested earlier).

I don't have a strong opinion on the available options, other than not
liking the "do nothing" approach.

-- Abhijit
Re: [HACKERS] Fast insertion indexes: why no developments
> Hmm, you realise Alvaro is working on MinMax indexes in this release?
> They are very efficient with regard to index inserts and specially
> designed for use on large tables.
>
> Prior work by Heikki on Grouped Item Tuples was a way of reducing the
> size of indexes, yet still allowing uniqueness checks. That is
> implemented in SQLServer already and is very useful.

Reading the implementation of those features, I don't think they can help
in the cases handled by the index types I mentioned (insertions of random
values in big tables).
Re: [HACKERS] appendStringInfo vs appendStringInfoString
On Mon, Sep 30, 2013 at 10:10 PM, David Rowley <dgrowle...@gmail.com> wrote:
> On Sat, Sep 28, 2013 at 9:44 PM, David Rowley <dgrowle...@gmail.com> wrote:
>> I did some benchmarking earlier in the week for the new patch which
>> was just committed to allow formatting in the log_line_prefix string.
>> In version 0.4 of the patch there was some performance regression, as
>> I was doing appendStringInfo(buf, "%*s", padding, variable); instead
>> of appendStringInfoString(buf, variable); This regression was fixed
>> in a later version of the patch by only using appendStringInfo when
>> the padding was > 0. More details here:
>> http://www.postgresql.org/message-id/CAApHDvreSGYvtXJvqHcXZL8_tXiKKiFXhQyXgqtnQ5Yo=me...@mail.gmail.com
>
> I've attached the cleanup patch for this. This one just converts
> instances of appendStringInfo into appendStringInfoString where
> appendStringInfo does no formatting or just has the format "%s".

I've attached a re-based version of this.

David Rowley

appendStringInfo_cleanup_v0.5.patch.gz
Description: GNU Zip compressed data
Re: [HACKERS] Fast insertion indexes: why no developments
On 30 October 2013 07:55, Leonardo Francalanci <m_li...@yahoo.it> wrote:
>> Hmm, you realise Alvaro is working on MinMax indexes in this release?
>> They are very efficient with regard to index inserts and specially
>> designed for use on large tables.
>>
>> Prior work by Heikki on Grouped Item Tuples was a way of reducing the
>> size of indexes, yet still allowing uniqueness checks. That is
>> implemented in SQLServer already and is very useful.
>
> Reading the implementation of those features, I don't think they can
> help in the cases handled by the index types I mentioned (insertions
> of random values in big tables).

Presumably the data you are inserting isn't actually random. Please
describe the use case you are considering in more detail and some view
on how frequent that is, with some examples. Once we understand the use
case and agree it is important, we might solve problems.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Fast insertion indexes: why no developments
> Presumably the data you are inserting isn't actually random. Please
> describe the use case you are considering in more detail and some view
> on how frequent that is, with some examples. Once we understand the
> use case and agree it is important, we might solve problems.

Collecting call data for mobile network operators (and no, I don't work
for the NSA...). Easily 5000-10000 inserts per second. Indexes on
timestamp and ID (not a problem: always increasing, so no btree issues)
and on called #, calling #, IMSI, IMEI. The last four obviously are
random, out of millions of possible values.

After the first few million records, the disks can't keep up with the
amount of random writing in the indexes. Workaround: the table is
partitioned every 15 minutes, and the indexes are created in bulk after
we start the new 15-minute partition. Searches on the current 15 minutes
are not allowed (as it is not indexed), and searches on older data are
K*log(N) (where K is the number of partitions).

Yes, I could throw more disks at it, use SSDs, shard more, etc. etc. But
I still think that btrees just aren't fit for this kind of problem. I
don't delete data, I don't update data, and there's not that much
concurrency going on. I would sacrifice search speed (K*log(N) is
already much slower than regular btree usage) for realtime insertion.

I don't think I'm the only one having a big system to be indexed by
random values. In fact, I didn't want to turn this thread into a "help
me with this workload" thread. I just wanted to know if there were some
other known issues with these different indexes, other than "not enough
time to implement them correctly": I was afraid that someone had already
dismissed them as "good in theory, bad in practice"...
Re: [HACKERS] Fast insertion indexes: why no developments
On 30 October 2013 10:35, Leonardo Francalanci <m_li...@yahoo.it> wrote:
>> Presumably the data you are inserting isn't actually random. Please
>> describe the use case you are considering in more detail and some
>> view on how frequent that is, with some examples. Once we understand
>> the use case and agree it is important, we might solve problems.
>
> Collecting call data for mobile network operators (and no, I don't
> work for the NSA...). Easily 5000-10000 inserts per second. Indexes on
> timestamp and ID (not a problem: always increasing, so no btree
> issues) and on called #, calling #, IMSI, IMEI. The last four
> obviously are random, out of millions of possible values.
>
> After the first few million records, the disks can't keep up with the
> amount of random writing in the indexes. Workaround: the table is
> partitioned every 15 minutes, and the indexes are created in bulk
> after we start the new 15-minute partition. Searches on the current 15
> minutes are not allowed (as it is not indexed), and searches on older
> data are K*log(N) (where K is the number of partitions).
>
> Yes, I could throw more disks at it, use SSDs, shard more, etc. etc.
> But I still think that btrees just aren't fit for this kind of
> problem. I don't delete data, I don't update data, and there's not
> that much concurrency going on. I would sacrifice search speed
> (K*log(N) is already much slower than regular btree usage) for
> realtime insertion.
>
> I don't think I'm the only one having a big system to be indexed by
> random values. In fact, I didn't want to turn this thread into a "help
> me with this workload" thread. I just wanted to know if there were
> some other known issues with these different indexes, other than "not
> enough time to implement them correctly": I was afraid that someone
> had already dismissed them as "good in theory, bad in practice"...

What is the reason for needing such fast access to individual groups of
records? Sure sounds like the NSA or similar ;-)

Sacrificing timeliness for efficiency is a common solution.

I'm seeing lots of areas where being able to specify the timeliness that
is acceptable in a query leads to various optimisations of this and
similar. Indexes are a declarative solution. We would need to be able to
specify the tolerances to be able to do this. (You can write your own
index...)

In terms of generality, do you think it's worth a man-year of developer
effort to replicate what you have already achieved? Who would pay?

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Fast insertion indexes: why no developments
> What is the reason for needing such fast access to individual groups
> of records? Sure sounds like the NSA or similar ;-)

Users need to search all calls originated from/to a user or from/to a
specific mobile phone to answer/analyze customers' probl... ok, I give
up: I work for the NSA ;)

> In terms of generality, do you think it's worth a man-year of
> developer effort to replicate what you have already achieved? Who
> would pay?

1) I haven't achieved what I need: realtime indexing. I can't query the
current 15 minutes table efficiently. Plus, K*log(N) is not that great
when you have a lot of K.

2) I'm not suggesting that this is top priority. I'm asking if there's
something else, other than "we don't have time for this", that I don't
know. In fact, I don't even know if those index types would really help
in my (specific) case. That's why my original question was "why aren't
there developments in this area": I didn't mean to imply someone should
do it. I just wanted to know if those indexes were already discussed
(and maybe dismissed for some reason) in the past...
Re: [HACKERS] Fast insertion indexes: why no developments
On 2013-10-30 12:08, Simon Riggs wrote:
> effort to replicate what you have already achieved? Who would pay?

The NSA, obviously ;-)
Re: [HACKERS] Something fishy happening on frogmouth
On Wed, Oct 30, 2013 at 8:22 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Oct 30, 2013 at 1:22 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> On Wed, Oct 30, 2013 at 12:42 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> The last two buildfarm runs on frogmouth have failed in initdb, like this:
>>>
>>> creating directory d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data ... ok
>>> creating subdirectories ... ok
>>> selecting default max_connections ... 100
>>> selecting default shared_buffers ... 128MB
>>> selecting dynamic shared memory implementation ... windows
>>> creating configuration files ... ok
>>> creating template1 database in d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data/base/1 ... FATAL:  could not open shared memory segment "Global/PostgreSQL.851401618": Not enough space
>>> child process exited with exit code 1
>>
>> In the Windows implementation of dynamic shared memory, the size
>> calculation for creating a dynamic shared memory segment assumes that
>> the requested size is uint64, which was changed by commit d2aecae, so
>> we need to change that calculation as well. Please find the attached
>> patch to fix this problem.
>
> I find it hard to believe this is the right fix. I know we have
> similar code in win32_shmem.c, but surely if size is a 32-bit unsigned
> quantity then size >> 0 is simply 0 anyway.

Err, rather, size >> 32 is simply 0 anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Something fishy happening on frogmouth
On Wed, Oct 30, 2013 at 1:22 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Wed, Oct 30, 2013 at 12:42 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> The last two buildfarm runs on frogmouth have failed in initdb, like this:
>>
>> creating directory d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data ... ok
>> creating subdirectories ... ok
>> selecting default max_connections ... 100
>> selecting default shared_buffers ... 128MB
>> selecting dynamic shared memory implementation ... windows
>> creating configuration files ... ok
>> creating template1 database in d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data/base/1 ... FATAL:  could not open shared memory segment "Global/PostgreSQL.851401618": Not enough space
>> child process exited with exit code 1
>
> In the Windows implementation of dynamic shared memory, the size
> calculation for creating a dynamic shared memory segment assumes that
> the requested size is uint64, which was changed by commit d2aecae, so
> we need to change that calculation as well. Please find the attached
> patch to fix this problem.

I find it hard to believe this is the right fix. I know we have similar
code in win32_shmem.c, but surely if size is a 32-bit unsigned quantity
then size >> 0 is simply 0 anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?
Hello,

For a certain reason, I need to build PostgreSQL on Windows with OSSP
UUID support to use UUIDs for primary keys. I have to use Visual Studio
2010. The original source code for OSSP UUID and its fork for Windows
can be downloaded from:

http://www.ossp.org/pkg/lib/uuid/
http://sourceforge.jp/projects/sfnet_osspuuidwin32/

I built the latest version, 1.6.2, and used uuid_generate_v4() as the
default value for a primary key column of type uuid. However, when I
insert 200-300 rows, a unique constraint violation error is reported. It
occurs repeatedly in several retries.

I found some people hit the same problem in the past, but they didn't
seem to get a solution:

http://www.postgresql.org/message-id/039e01c8b63c$d0c95e90$60096cdb@IBMC9A0F63B40D
http://www.postgresql.org/message-id/018d01c8dec6$2a14b470$0b01a8c0@IBMC9A0F63B40D
http://www.postgresql.org/message-id/98B20744191F464DB598A25AB00BC125@acer08f817a9b5

When I copy lib\uuid_ossp.dll from the PostgreSQL community package,
which I guess Dave Page san maintains, to my installation, the error
does not occur. So, the OSSP UUID library is the cause.

Saito san published a patch for OSSP UUID on his site:

http://winpg.jp/~saito/pg_work/OSSP_win32/msvc/

And Dave san showed interest in it:

http://www.postgresql.org/message-id/snt127-w18c42125e5fd719ba514ceff...@phx.gbl

How can I build an OSSP UUID library that functions properly on Windows?
Is OSSP UUID 1.6.2 + Saito san's patch the exact source for the
uuid-ossp.dll in the community binary?

Regards
MauMau
Re: [HACKERS] Something fishy happening on frogmouth
On Tue, Oct 29, 2013 at 3:12 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The last two buildfarm runs on frogmouth have failed in initdb, like this:
>
> creating directory d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data ... ok
> creating subdirectories ... ok
> selecting default max_connections ... 100
> selecting default shared_buffers ... 128MB
> selecting dynamic shared memory implementation ... windows
> creating configuration files ... ok
> creating template1 database in d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data/base/1 ... FATAL:  could not open shared memory segment "Global/PostgreSQL.851401618": Not enough space
> child process exited with exit code 1
>
> It shouldn't be failing like that, considering that we just finished
> probing for acceptable max_connections and shared_buffers without
> hitting any apparent limit. I suppose it's possible that the final shm
> segment size is a bit larger than what was tested at the
> shared_buffers step, but that doesn't seem very likely to be the
> explanation. What seems considerably more probable is that the probe
> for a shared memory implementation is screwing up the system state
> somehow. It may not be unrelated that this machine was happy before
> commit d2aecae went in.

If I'm reading this correctly, the last three runs on frogmouth have all
failed, and all of them have failed with a complaint about, specifically,
Global/PostgreSQL.851401618. Now, that really shouldn't be happening,
because the code to choose that number looks like this:

    dsm_control_handle = random();

One possibility that occurs to me is that if, for some reason, we're
using the same handle every time on Windows, and if Windows takes a bit
of time to reclaim the segment after the postmaster exits (which is not
hard to believe given some previous Windows behavior I've seen), then
running the postmaster lots of times in quick succession (as initdb does)
might fail. I dunno what that has to do with the patch, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Something fishy happening on frogmouth
On 2013-10-30 08:45:03 -0400, Robert Haas wrote:
> If I'm reading this correctly, the last three runs on frogmouth have
> all failed, and all of them have failed with a complaint about,
> specifically, Global/PostgreSQL.851401618. Now, that really shouldn't
> be happening, because the code to choose that number looks like this:
>
>     dsm_control_handle = random();
>
> One possibility that occurs to me is that if, for some reason, we're
> using the same handle every time on Windows, and if Windows takes a
> bit of time to reclaim the segment after the postmaster exits (which
> is not hard to believe given some previous Windows behavior I've
> seen), then running the postmaster lots of times in quick succession
> (as initdb does) might fail. I dunno what that has to do with the
> patch, though.

Could it be that we haven't primed the random number generator with the
time or something like that yet?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Something fishy happening on frogmouth
On Wed, Oct 30, 2013 at 8:22 AM, Robert Haas <robertmh...@gmail.com> wrote:
> I find it hard to believe this is the right fix. I know we have
> similar code in win32_shmem.c, but surely if size is a 32-bit unsigned
> quantity then size >> 0 is simply 0 anyway.

Gosh, I stand corrected. According to
http://msdn.microsoft.com/en-us/library/336xbhcz.aspx --

    The result is undefined if the right operand of a shift expression
    is negative or if the right operand is greater than or equal to the
    number of bits in the (promoted) left operand. No shift operation
    is performed if the right operand is zero (0).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Something fishy happening on frogmouth
On Wed, Oct 30, 2013 at 8:47 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-10-30 08:45:03 -0400, Robert Haas wrote:
>> If I'm reading this correctly, the last three runs on frogmouth have
>> all failed, and all of them have failed with a complaint about,
>> specifically, Global/PostgreSQL.851401618. Now, that really shouldn't
>> be happening, because the code to choose that number looks like this:
>>
>>     dsm_control_handle = random();
>>
>> One possibility that occurs to me is that if, for some reason, we're
>> using the same handle every time on Windows, and if Windows takes a
>> bit of time to reclaim the segment after the postmaster exits (which
>> is not hard to believe given some previous Windows behavior I've
>> seen), then running the postmaster lots of times in quick succession
>> (as initdb does) might fail. I dunno what that has to do with the
>> patch, though.
>
> Could it be that we haven't primed the random number generator with
> the time or something like that yet?

Yeah, I think that's probably what it is. There's PostmasterRandom() to
initialize the random-number generator on first use, but that doesn't
help if some other module calls random(). I wonder if we ought to just
get rid of PostmasterRandom() and instead have the postmaster run that
initialization code very early in startup. That'd make the timing of the
random number generator being initialized a bit more predictable,
perhaps, but if the dynamic shared memory code is going to grab a random
number during startup, it's basically going to be nailed to that event
anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] appendStringInfo vs appendStringInfoString
On Wed, Oct 30, 2013 at 4:51 AM, David Rowley <dgrowle...@gmail.com> wrote:
> I've attached a re-based version of this.

I don't see any compelling reason not to commit this. Does anyone wish
to object?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Something fishy happening on frogmouth
Robert Haas <robertmh...@gmail.com> writes:
> If I'm reading this correctly, the last three runs on frogmouth have
> all failed, and all of them have failed with a complaint about,
> specifically, Global/PostgreSQL.851401618. Now, that really shouldn't
> be happening, because the code to choose that number looks like this:
>
>     dsm_control_handle = random();

Isn't this complaining about the main shm segment, not a DSM extension?
Also, why is the error "not enough space", rather than something about a
collision? And if this is the explanation, why didn't the previous runs
probing for allowable shmem size fail?

BTW, regardless of the specific properties of random(), surely you ought
to have code in there that would cope with a name collision.

            regards, tom lane
Re: [HACKERS] Something fishy happening on frogmouth
On 2013-10-30 09:26:42 -0400, Tom Lane wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> If I'm reading this correctly, the last three runs on frogmouth have
>> all failed, and all of them have failed with a complaint about,
>> specifically, Global/PostgreSQL.851401618. Now, that really shouldn't
>> be happening, because the code to choose that number looks like this:
>>
>>     dsm_control_handle = random();
>
> Isn't this complaining about the main shm segment, not a DSM extension?

Don't think so; that has a ":" in the name. But I think this touches a
fair point: we need to make all the dsm error messages more distinctive.
The history since this has been committed makes it likely that there
will be more errors.

> Also, why is the error "not enough space", rather than something about
> a collision? And if this is the explanation, why didn't the previous
> runs probing for allowable shmem size fail?

Yea, I don't think this explains the issue, but it's something
independent that needs to be fixed.

> BTW, regardless of the specific properties of random(), surely you
> ought to have code in there that would cope with a name collision.

There actually is code that retries, but only for EEXIST.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Something fishy happening on frogmouth
Andres Freund <and...@2ndquadrant.com> writes:
> On 2013-10-30 09:26:42 -0400, Tom Lane wrote:
>> Isn't this complaining about the main shm segment, not a DSM
>> extension?
>
> Don't think so; that has a ":" in the name.

If it *isn't* about the main memory segment, what the hell are we doing
creating random addon segments during bootstrap? None of the DSM code
should even get control at this point, IMO.

            regards, tom lane
Re: [HACKERS] appendStringInfo vs appendStringInfoString
Robert Haas escribió: On Wed, Oct 30, 2013 at 4:51 AM, David Rowley dgrowle...@gmail.com wrote: I've attached a re-based version of this. I don't see any compelling reason not to commit this. Does anyone wish to object? I think a blanket substitution of places that currently have %s might cause bugs, particularly if the string is user-supplied. It might be right for many cases, but did you/someone review each such callsite? No objection to the other half, which substitutes calls that don't have %s. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Fast insertion indexes: why no developments
On 30 October 2013 11:23, Leonardo Francalanci m_li...@yahoo.it wrote: What is the reason for needing such fast access to individual groups of records? Sure sounds like the NSA or similar ;-) Users need to search all calls originated from/to a user or from/to a specific mobile phone to answer/analyze customers' probl... ok, I give up: I work for the NSA ;) In terms of generality, do you think its worth a man year of developer effort to replicate what you have already achieved? Who would pay? 1) I haven't achieved what I need: realtime indexing. I can't query the current 15 minutes table efficiently. Plus, K*log(N) is not that great when you have a lot of K. 2) I'm not suggesting that this is top priority. I'm asking if there's something else, other than we don't have time for this, that I don't know. In fact, I don't even know if those indexes types would really help in my (specific) case. That's why my original question was why aren't there developments in this area: I didn't mean to imply someone should do it. I just wanted to know if those indexes were already discussed (and maybe dismissed for some reason) in the past... OK, I understand now. LSM-trees seem patent free, since open source implementations exist. The main concept is to partition the index into multiple trees, so that the current insertion sub-tree fits more easily in memory. That sounds good and was exactly the solution I'd come up with as well, which is a good cross check. It leads to a slow increase in index response times, but we could offset that by having minimum values on each subtree and using partitioning logic as with a minmax index. LSM-tree also covers the goal of maintaining just 2 sub-trees and a concurrent process of merging sub-trees. That sounds like it would take a lot of additional time to get right and would need some off-line process to perform the merge. Please somebody advise patent status of Y-trees otherwise I wouldn't bother. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] appendStringInfo vs appendStringInfoString
On 2013-10-30 10:52:05 -0300, Alvaro Herrera wrote: Robert Haas escribió: On Wed, Oct 30, 2013 at 4:51 AM, David Rowley dgrowle...@gmail.com wrote: I've attached a re-based version of this. I don't see any compelling reason not to commit this. Does anyone wish to object? I think a blanket substitution of places that currently have %s might cause bugs, particularly if the string is user-supplied. It might be right for many cases, but did you/someone review each such callsite? No objection to the other half, that substitute calls that don't have %s. appendStringInfoString() doesn't expand format strings, so I am not sure what you're worried about? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast insertion indexes: why no developments
LSM-trees seem patent free I'm no expert, and I only gave it a look some time ago: it looked very complicated to get right... and as far as I remember you don't get that much gain unless you go multi-level, which would complicate things further. Please somebody advise patent status of Y-trees otherwise I wouldn't bother. y-trees look much easier to get right... (and to me they also make more sense, but I'm not skilled enough to judge). There's also the FD-tree, which looks a lot like the (patented...) fractal tree: http://www.ntu.edu.sg/home/bshe/fdtree_pvldb.pdf
Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?
Hi MauMau-san, my bug patch was applied to ver 1.6.2, sorry. Then, I made the next patch; it is there, please see http://winpg.jp/~saito/pg_work/OSSP_win32/ I will adjust it with Ralf-san again. best regards, Hiroshi Saito (2013/10/30 21:45), MauMau wrote: Hello, # Sorry, Saito san's address doesn't seem to exist, so excuse me for sending again For a certain reason, I need to build PostgreSQL on Windows with OSSP UUID support to use UUID for primary keys. I have to use Visual Studio 2010. The original source code for OSSP UUID and its fork for Windows can be downloaded from: http://www.ossp.org/pkg/lib/uuid/ http://sourceforge.jp/projects/sfnet_osspuuidwin32/ I built the latest version 1.6.2 and used uuid_generate_v4() as the default value for a primary key column of type uuid. However, when I insert 200-300 rows, a unique constraint violation error is reported. It occurs repeatedly across several retries. I found some people hit the same problem in the past, but they didn't seem to get the solution: http://www.postgresql.org/message-id/039e01c8b63c$d0c95e90$60096cdb@IBMC9A0F63B40D http://www.postgresql.org/message-id/018d01c8dec6$2a14b470$0b01a8c0@IBMC9A0F63B40D http://www.postgresql.org/message-id/98B20744191F464DB598A25AB00BC125@acer08f817a9b5 When I copy lib\uuid_ossp.dll from the PostgreSQL community package, which I guess Dave Page san maintains, to my installation, the error does not occur. So, the OSSP UUID library is the cause. Saito san published a patch for OSSP UUID on his site: http://winpg.jp/~saito/pg_work/OSSP_win32/msvc/ And Dave san showed interest in it: http://www.postgresql.org/message-id/snt127-w18c42125e5fd719ba514ceff...@phx.gbl How can I build an OSSP UUID library that functions properly on Windows? Is OSSP UUID 1.6.2 + Saito-san's patch the exact source for the uuid-ossp.dll in the community binary?
Regards, MauMau
Re: [HACKERS] Fast insertion indexes: why no developments
On 30 October 2013 11:23, Leonardo Francalanci m_li...@yahoo.it wrote: In terms of generality, do you think it's worth a man year of developer effort to replicate what you have already achieved? Who would pay? I work on an application that does exactly what Leonardo described. We hit the exact same problem, and came up with the exact same solution (down to the 15-minute interval). But I have also worked on various other datamarts (all using Oracle), and they are all subject to this problem in some form: B-tree indexes slow down bulk data inserts too much and need to be disabled or dropped and then recreated after the load. In some cases this is done easily enough, in others it's more complicated (example: every day, a process imports from 1 million to 1 billion records into a table partition that may contain from 0 to 1 billion records. To be as efficient as possible, you need some logic to compare the number of rows to insert to the number of rows already present, in order to decide whether to drop the indexes or not). Basically, my point is that this is a common problem for datawarehouses and datamarts. In my view, indexes that don't require developers to work around poor insert performance would be a significant feature in a datawarehouse-ready DBMS. Yann
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
Abhijit Menon-Sen a...@2ndquadrant.com writes: As a compromise, perhaps we can unconditionally round the size up to be a multiple of 2MB? That way, we can use huge pages more often, but also avoid putting in a lot of code and effort into the workaround and waste only a little space (if any at all). That sounds reasonably painless to me. Note that at least in our main shmem segment, extra space is not useless, because it allows slop for the main hash tables, notably the locks table. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
Sergey Konoplev gray...@gmail.com writes: On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: Say what? There's never been any hugepages support in Postgres. There were an ability to back shared memory with hugepages when using =9.2. I use it on ~30 servers for several years and it brings 8-17% of performance depending on the memory size. Here you will find several paragraphs of the description about how to do it https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md. What this describes is how to modify Postgres to request huge pages. That's hardly built-in support. In any case, as David already explained, we don't do feature additions in minor releases. We'd be especially unlikely to make an exception for this, since it has uncertain portability and benefits. Anything that carries portability risks has got to go through a beta testing cycle before we'll unleash it on the masses. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast insertion indexes: why no developments
On Wed, Oct 30, 2013 at 9:23 AM, Leonardo Francalanci m_li...@yahoo.it wrote: LSM-trees seem patent free I'm no expert, and I gave it just a look some time ago: it looked to me very complicated to get right... and as far as I remember you don't get that much gain, unless you go multi-level which would complicate things further Please somebody advise patent status of Y-trees otherwise I wouldn't bother. y-trees look much more easier to get right... (and to me they also make more sense, but I'm not skilled enough to judge). There's also the FD-tree, which looks a lot like the (patented...) fractal tree: http://www.ntu.edu.sg/home/bshe/fdtree_pvldb.pdf Skimming the white paper, it's clear right from the start that they make assumptions about the hardware that appear to be already obsolete -- extremely poor write performance vs read performance of SSD. Later generation SSDs are increasingly using hardware optimizations to convert random writes to sequential writes using various tricks. Point being: hardware is marching along pretty fast (after 20+ years of stagnation) and it's dangerous (IMO) to make big software investments based on the situation on the ground *today*. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast insertion indexes: why no developments
On Wed, Oct 30, 2013 at 3:35 AM, Leonardo Francalanci m_li...@yahoo.it wrote: Presumably the data you are inserting isn't actually random. Please describe the use case you are considering in more detail and some view on how frequent that is, with some examples. Once we understand the use case and agree it is important, we might solve problems. Collecting call data for mobile network operators (and no, I don't work for the NSA...) Easily 5000-1 inserts per second. Indexes on timestamp and ID (not a problem, always increasing so no btree issues) and on called #, calling #, imsi, imei. The last four obviously are random, out of millions of possible values. After the first few million records, the disks can't keep up with the amount of random writing in the indexes. So, like, 3 minutes worth? How much RAM and shared_buffers do you have? The index insertions should be fast until the size of the active part of the indexes being inserted into exceeds shared_buffers by some amount (what that amount is would depend on how much dirty data the kernel is willing to allow in the page cache before it starts suffering anxiety about it). If you have enough shared_buffers to make that last for 15 minutes, then you shouldn't have a problem inserting with live indexes. Cheers, Jeff
Re: [HACKERS] appendStringInfo vs appendStringInfoString
Andres Freund escribió: On 2013-10-30 10:52:05 -0300, Alvaro Herrera wrote: Robert Haas escribió: On Wed, Oct 30, 2013 at 4:51 AM, David Rowley dgrowle...@gmail.com wrote: I've attached a re-based version of this. I don't see any compelling reason not to commit this. Does anyone wish to object? I think a blanket substitution of places that currently have %s might cause bugs, particularly if the string is user-supplied. It might be right for many cases, but did you/someone review each such callsite? No objection to the other half, that substitute calls that don't have %s. appendStringInfoString() doesn't expand format strings, so I am not sure what you're worried about? Um. Blame my lack of decent breakfast this morning. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?
Hiroshi Saito escribió: Hi MauMau-san, my bug patch was applied to ver 1.6.2, sorry. Then, I made the next patch; it is there, please see http://winpg.jp/~saito/pg_work/OSSP_win32/ I will adjust it with Ralf-san again. At this point, I think we need to consider ossp-uuid as dead code. Last release was in 2008 and there are lots of bug reports still open, even some with trivial patches attached. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Fast insertion indexes: why no developments
On Wed, Oct 30, 2013 at 4:23 AM, Leonardo Francalanci m_li...@yahoo.it wrote: 1) I haven't achieved what I need: realtime indexing. I can't query the current 15 minutes table efficiently. Plus, K*log(N) is not that great when you have a lot of K. Are partitions read-only once time has moved on, or can stragglers show up that need to be inserted into older partitions? You could periodically merge older partitions into larger tables, index those aggregated tables, then transactionally disinherit the old partitions and inherit the new aggregated one. This would keep the value of K down, at the expense of re-writing data multiple times (but all methods write data multiple times, some just hide it from you). 2) I'm not suggesting that this is top priority. I'm asking if there's something else, other than we don't have time for this, that I don't know. In fact, I don't even know if those indexes types would really help in my (specific) case. That's why my original question was why aren't there developments in this area: I didn't mean to imply someone should do it. I just wanted to know if those indexes were already discussed (and maybe dismissed for some reason) in the past... There have been several ideas discussed, but not a whole lot of time to implement them. By the way, what is the transaction structure of your inserts? Are they large batches between commits, or is each row committed? Cheers, Jeff
Re: [HACKERS] Something fishy happening on frogmouth
On Wed, Oct 30, 2013 at 9:26 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: If I'm reading this correctly, the last three runs on frogmouth have all failed, and all of them have failed with a complaint about, specifically, Global/PostgreSQL.851401618. Now, that really shouldn't be happening, because the code to choose that number looks like this: dsm_control_handle = random(); Isn't this complaining about the main shm segment, not a DSM extension? No. That's why the identifier being assigned to has dsm in it. I'll respond to this in more detail in a separate post. Also, why is the error not enough space, rather than something about a collision? And if this is the explanation, why didn't the previous runs probing for allowable shmem size fail? Good questions. I think that my previous theory was wrong, and that the patch from Amit which I pushed a while ago should fix the breakage. BTW, regardless of the specific properties of random(), surely you ought to have code in there that would cope with a name collision. I do have code in there to cope with a name collision. However, that doesn't mean it's good for it to choose the same name for the segment by default every time. If we were going to do it that way I ought to have just made it serial (PostgreSQL.0, 1, 2, 3, ...) instead of using random numbers to name them. The reason I didn't do that is to minimize the chances of collisions actually happening - and especially to minimize the chances of a large number of collisions happening. Especially for System V shared memory, the namespace is rather constrained, so bouncing around randomly through the namespace makes it unlikely that we'll hit a whole bunch of identifiers in a row that are all already in use by some other postmaster or, indeed, a process unrelated to PostgreSQL. A linear scan starting at any fixed value wouldn't have that desirable property. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Something fishy happening on frogmouth
Robert Haas robertmh...@gmail.com writes: On Wed, Oct 30, 2013 at 9:26 AM, Tom Lane t...@sss.pgh.pa.us wrote: Also, why is the error not enough space, rather than something about a collision? And if this is the explanation, why didn't the previous runs probing for allowable shmem size fail? Good questions. I think that my previous theory was wrong, and that the patch from Amit which I pushed a while ago should fix the breakage. Indeed, I see frogmouth just went green, so Amit nailed it. I'm still wondering why we try to create a DSM segment in bootstrap. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?
Alvaro Herrera alvhe...@2ndquadrant.com writes: At this point, I think we need to consider ossp-uuid as dead code. Yeah, but what shall we replace it with? And can we preserve the API contrib/uuid-ossp offers? (Maybe we shouldn't even try, but just deprecate that module and start fresh.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Something fishy happening on frogmouth
Robert Haas robertmh...@gmail.com writes: Yeah, I think that's probably what it is. There's PostmasterRandom() to initialize the random-number generator on first use, but that doesn't help if some other module calls random(). I wonder if we ought to just get rid of PostmasterRandom() and instead have the postmaster run that initialization code very early in startup. You could do arbitrary rearrangement of the postmaster's code and not succeed in affecting this behavior in the slightest, because the postmaster isn't running during bootstrap. I continue to doubt that there's a good reason to be creating DSM segment(s) here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Something fishy happening on frogmouth
On Wed, Oct 30, 2013 at 9:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2013-10-30 09:26:42 -0400, Tom Lane wrote: Isn't this complaining about the main shm segment, not a DSM extension? Don't think so, that has a : in the name. If it *isn't* about the main memory segment, what the hell are we doing creating random addon segments during bootstrap? None of the DSM code should even get control at this point, IMO. Here's a short summary of what I posted back in August: at system startup time, the postmaster creates one dynamic shared memory segment, called the control segment. That segment sticks around for the lifetime of the server and records the identity of any *other* dynamic shared memory segments that are subsequently created. If the server dies a horrible death (e.g. kill -9), the next postmaster will find the previous control segment (whose ID is written to a file in the data directory) and remove any leftover shared memory segments from the previous run; without this, such segments would live until the next server reboot unless manually removed by the user (which isn't even practical on all platforms; e.g. there doesn't seem to be any way to list all extant POSIX shared memory segments on MacOS X, so a user wouldn't know which segments to remove). For my previous posting on this topic, see the following link, particularly the paragraph which begins "The actual implementation is split up into two layers" and the following one. http://www.postgresql.org/message-id/CA+TgmoaDqDUgt=4Zs_QPOnBt=EstEaVNP+5t+m=fpnwship...@mail.gmail.com Now, you might ask why not store this control information that we need for cleanup purposes in the *main* shared memory segment rather than in a dynamic shared memory segment. The basic problem is that I don't know how to dig it out of there in any reasonable way. 
The dsm control segment is small and has a very simple structure; when the postmaster uses the previous postmaster's leftover control segment to clean up orphaned shared memory segments, it will ignore that old control segment unless it passes various sanity tests. But even if it passes those sanity tests while being corrupted somehow otherwise, nothing that happens as a result will cause a fatal error, let alone a server crash. You're of course welcome to critique that logic, but I tried my best to make it bulletproof. See dsm_cleanup_using_control_segment(). The structure of the main shared memory segment is way more complicated. If we latched onto an old main shared memory segment, we'd presumably need to traverse ShmemIndex to even find that portion of the shared memory segment where the DSM control information was slated to be stored. And there's no way that's going to be robust in the face of a possibly-corrupted shared memory segment left over from a previous run. And that's actually making the assumption that we could even do it that way, which we really can't: as of 9.3, things like ShmemIndex are stored in the MAP_SHARED anonymous mapping, and the System V shared memory segment is small and fixed-size. We could try to refactor the code so that we merge the control segment data into the residual System V segment, but I think it'd be ugly and I'm not sure what it really buys us. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?
On 10/30/2013 12:43 PM, Tom Lane wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: At this point, I think we need to consider ossp-uuid as dead code. Yeah, but what shall we replace it with? And can we preserve the API contrib/uuid-ossp offers? (Maybe we shouldn't even try, but just deprecate that module and start fresh.) An alternative would be for someone to put up a fork on, say, github, and maintain it. (No, I'm not volunteering.) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast insertion indexes: why no developments
Jeff Janes wrote The index insertions should be fast until the size of the active part of the indexes being inserted into exceeds shared_buffers by some amount (what that amount is would depend on how much dirty data the kernel is willing to allow in the page cache before it starts suffering anxiety about it). If you have enough shared_buffers to make that last for 15 minutes, then you shouldn't have a problem inserting with live indexes. Sooner or later you'll have to checkpoint those shared_buffers... and we are talking about GB of data (my understanding is that we change basically every btree page, resulting in re-writing of the whole index).
Re: [HACKERS] Something fishy happening on frogmouth
On Wed, Oct 30, 2013 at 12:51 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Yeah, I think that's probably what it is. There's PostmasterRandom() to initialize the random-number generator on first use, but that doesn't help if some other module calls random(). I wonder if we ought to just get rid of PostmasterRandom() and instead have the postmaster run that initialization code very early in startup. You could do arbitrary rearrangement of the postmaster's code and not succeed in affecting this behavior in the slightest, because the postmaster isn't running during bootstrap. Well, if you're telling me that it's not possible to find a way to arrange things so that the random number is initialized before first use, I'm gonna respectfully disagree. If you're just critiquing my particular suggestion about where to put that code - fair enough. Maybe it really ought to live in our src/port implementation of random() or pg_lrand48(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast insertion indexes: why no developments
Jeff Janes wrote Are partitions read-only once time has moved on, or can stragglers show up that need to be inserted into older partitions? You could periodically merge older partitions into larger tables, index those aggregated tables, then transactionally disinherit the old partitions and inherit the new aggregated one. This would keep the value of K down, at the expense of re-writing data multiple times (but all methods write data multiple times, some just hide it from you). Yes, we could merge the partitions: the idea was to merge them during night hours, when traffic is low ( and NSA people are sleeping ;) ) Jeff Janes wrote By the way, what is the transaction structure of your inserts? Are they large batches between commits, or is each row committed? Of course large batches (using COPY)
Re: [HACKERS] Fast insertion indexes: why no developments
Point being: hardware is marching along pretty fast (after 20+ years of stagnation) and it's dangerous (IMO) to make big software investments based on the situation on the ground *today*. Yes, that's a good point.
Re: [HACKERS] Fast insertion indexes: why no developments
Jeff Janes wrote You could periodically merge older partitions into larger tables, index those aggregated tables, then transactionally disinherit the old partitions and inherit the new aggregated one. This would keep the value of K down, at the expense of re-writing data multiple times (but all methods write data multiple times, some just hide it from you). Forgot to add: I thought also that we could: - use the RAM as tablespace for indexes, and move the indexes later (but postgresql doesn't handle a machine crash very well in this case... it would be cool to be able to create an index as "recreate on crash"...) - use unlogged tables and turn them to logged to speed up the insertion somehow; I actually started to write a patch for it, but making it work for replication was too hard (not that I'm using replication, but it wouldn't be accepted for wal_level = minimal). But this wouldn't solve the problem anyway.
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
At 2013-10-30 11:04:36 -0400, t...@sss.pgh.pa.us wrote:
> As a compromise, perhaps we can unconditionally round the size up to be a multiple of 2MB? […]

That sounds reasonably painless to me. Here's a patch that does that, and adds a DEBUG1 log message when we try with MAP_HUGETLB, fail, and fall back to ordinary mmap.

-- Abhijit

diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 707edf1..7d9d0a8 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -113,6 +113,7 @@
 #shared_buffers = 32MB			# min 128kB
 					# (change requires restart)
+#huge_tlb_pages = try			# try to map memory with MAP_HUGETLB (on, off, try)
 #temp_buffers = 8MB			# min 800kB
 #max_prepared_transactions = 0		# zero disables the feature
 					# (change requires restart)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77a9303..e4ded7a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1052,6 +1052,49 @@ include 'filename'
      </listitem>
     </varlistentry>
+
+    <varlistentry id="guc-huge-tlb-pages" xreflabel="huge_tlb_pages">
+     <term><varname>huge_tlb_pages</varname> (<type>enum</type>)</term>
+     <indexterm>
+      <primary><varname>huge_tlb_pages</> configuration parameter</primary>
+     </indexterm>
+     <listitem>
+      <para>
+       Enables/disables the use of huge TLB pages. Valid values are
+       <literal>try</literal> (the default), <literal>on</literal>,
+       and <literal>off</literal>.
+      </para>
+
+      <para>
+       At present, this feature is supported only on Linux. The setting
+       is ignored on other systems.
+      </para>
+
+      <para>
+       The use of huge TLB pages results in smaller page tables and
+       less CPU time spent on memory management. For more details, see
+       <ulink url="https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt">hugepages.txt</ulink>
+       in the Linux kernel documentation.
+      </para>
+
+      <para>
+       With <varname>huge_tlb_pages</varname> set to <literal>try</literal>,
+       the server will try to use huge pages, but fall back to using
+       normal allocation if the first attempt fails.
+      </para>
+
+      <para>
+       With <varname>huge_tlb_pages</varname> set to <literal>on</literal>,
+       the server will try to use huge pages, and treat failure as a
+       FATAL error.
+      </para>
+
+      <para>
+       With <varname>huge_tlb_pages</varname> set to <literal>off</literal>,
+       the server will not try to use huge pages.
+      </para>
+     </listitem>
+    </varlistentry>
+
    <varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
     <term><varname>temp_buffers</varname> (<type>integer</type>)</term>
     <indexterm>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index b604407..fc0d74b 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -32,6 +32,7 @@
 #include "portability/mem.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "utils/guc.h"

 typedef key_t IpcMemoryKey;		/* shared memory key passed to shmget(2) */
@@ -367,14 +368,31 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 	 */
 #ifndef EXEC_BACKEND
 	{
-		long		pagesize = sysconf(_SC_PAGE_SIZE);
+		int			flags = PG_MMAP_FLAGS;
+		long		pagesize = 2*1024*1024;
+
+#ifdef MAP_HUGETLB
+		if (huge_tlb_pages == HUGE_TLB_ON || huge_tlb_pages == HUGE_TLB_TRY)
+		{
+			flags |= MAP_HUGETLB;
+		}
+#endif

 		/*
 		 * Ensure request size is a multiple of pagesize.
 		 *
-		 * pagesize will, for practical purposes, always be a power of two.
-		 * But just in case it isn't, we do it this way instead of using
-		 * TYPEALIGN().
+		 * By doing this ourselves, we maximise the chances of being
+		 * able to use huge TLB pages even on kernels that do not round
+		 * up the request size correctly, for example due to this bug:
+		 * https://bugzilla.kernel.org/show_bug.cgi?id=56881
+		 *
+		 * The default value of 2MB for pagesize is chosen based on the
+		 * most common supported huge page size. Rounding up to a larger
+		 * value (e.g. 16MB) would use even larger pages if the hardware
+		 * supported them, but would potentially waste more space.
+		 *
+		 * We round up by hand instead of using TYPEALIGN(), but for all
+		 * practical purposes, pagesize will always be a power of two.
 		 */
 		if (pagesize > 0 && size % pagesize != 0)
 			size += pagesize - (size % pagesize);
@@ -386,8 +404,21 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 		 * out to be false, we might need to add a run-time test here and do
 		 * this only if the running kernel supports it.
 		 */
-		AnonymousShmem = mmap(NULL, size, PROT_READ | PROT_WRITE, PG_MMAP_FLAGS,
-							  -1, 0);
+
+		AnonymousShmem = mmap(NULL, size, PROT_READ|PROT_WRITE, flags, -1, 0);
+
+#ifdef MAP_HUGETLB
+		if (huge_tlb_pages == HUGE_TLB_TRY && AnonymousShmem == MAP_FAILED)
+		{
+			elog(DEBUG1, "mmap(%lu) with
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
On 2013-10-30 22:39:20 +0530, Abhijit Menon-Sen wrote: At 2013-10-30 11:04:36 -0400, t...@sss.pgh.pa.us wrote: As a compromise, perhaps we can unconditionally round the size up to be a multiple of 2MB? […] That sounds reasonably painless to me. Here's a patch that does that and adds a DEBUG1 log message when we try with MAP_HUGETLB and fail and fallback to ordinary mmap. But it's in no way guaranteed that the smallest hugepage size is 2MB. It'll be on current x86 hardware, but not on any other platform... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
On Wed, Oct 30, 2013 at 8:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: Sergey Konoplev gray...@gmail.com writes: On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: Say what? There's never been any hugepages support in Postgres. There was an ability to back shared memory with hugepages when using <= 9.2. I have used it on ~30 servers for several years, and it brings 8-17% better performance depending on the memory size. Here you will find several paragraphs describing how to do it: https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md. What this describes is how to modify Postgres to request huge pages. That's hardly built-in support. I wasn't talking about built-in support. It was about an ability (a way) to back sh_buf with hugepages. In any case, as David already explained, we don't do feature additions in minor releases. We'd be especially unlikely to make an exception for this, since it has uncertain portability and benefits. Anything that carries portability risks has got to go through a beta testing cycle before we'll unleash it on the masses. Yes, I got the idea. Thanks to both of you for the clarification. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast insertion indexes: why no developments
On Wed, Oct 30, 2013 at 9:54 AM, Leonardo Francalanci m_li...@yahoo.it wrote: Jeff Janes wrote The index insertions should be fast until the size of the active part of the indexes being inserted into exceeds shared_buffers by some amount (what that amount is would depend on how much dirty data the kernel is willing to allow in the page cache before it starts suffering anxiety about it). If you have enough shared_buffers to make that last for 15 minutes, then you shouldn't have a problem inserting with live indexes. Sooner or later you'll have to checkpoint those shared_buffers... True, but that is also true of indexes created in bulk. It all has to reach disk eventually--either the checkpointer writes it out and fsyncs it, or the background writer or user backends write it out and the checkpoint fsyncs it. If bulk creation uses a ring buffer strategy (I don't know if it does), then it might kick the buffers to the kernel in more or less physical order, which would help the kernel get them to disk in long sequential writes. Or not. I think that this is where sorted checkpoint could really help. and we are talking about GB of data (my understanding is that we change basically every btree page, resulting in re-writing of the whole index). If the checkpoint interval is as long as the partitioning period, then hopefully the active index buffers get re-dirtied while protected in shared_buffers, and only get written to disk once. If the buffers get read, dirtied, and evicted from a small shared_buffers over and over again, then you are almost guaranteed that they will get written to disk multiple times while they are still hot, unless your kernel is very aggressive about caching dirty data (which will cause other problems). Cheers, Jeff
Re: [HACKERS] Fast insertion indexes: why no developments
On Wed, Oct 30, 2013 at 10:53 AM, Simon Riggs si...@2ndquadrant.com wrote: LSM-tree also covers the goal of maintaining just 2 sub-trees and a concurrent process of merging sub-trees. That sounds like it would take a lot of additional time to get right and would need some off-line process to perform the merge. Not necessarily. Merging means applying insertions/deletions from one subtree to another. While it's normally preferable and more efficient to do it in batches, I've successfully implemented in-memory versions that use other writers to perform the task, amortizing the cost of merging across many operations. In essence, when there's a need to merge two subtrees, an inserting process also merges one entry, so slowly trees get merged. That works in-memory very well, it's quite clear that it's not necessarily generalizable to external storage, but it's a technique to have in mind. Alternatively, vacuum could do it. It's quite clearly a vacuuming task anyway.
Re: [HACKERS] Fast insertion indexes: why no developments
On 31/10/13 06:46, Jeff Janes wrote: [...] How about being able to mark indexes 'MEMORY ONLY' to make them not go to disk, and 'PERSISTENT | TRANSIENT' to mark whether they should be recreated on machine bootup? Or something similar. Cheers, Gavin
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
Sergey Konoplev escribió: On Wed, Oct 30, 2013 at 8:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: Sergey Konoplev gray...@gmail.com writes: There were an ability to back shared memory with hugepages when using =9.2. I use it on ~30 servers for several years and it brings 8-17% of performance depending on the memory size. Here you will find several paragraphs of the description about how to do it https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md. What this describes is how to modify Postgres to request huge pages. That's hardly built-in support. I wasn't talking about a built-in support. It was about an ability (a way) to back sh_buf with hugepages. Then what you need is to set dynamic_shared_memory_type = sysv in postgresql.conf. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
On Wed, Oct 30, 2013 at 11:50 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: There was an ability to back shared memory with hugepages when using <= 9.2. I use it on ~30 servers for several years and it brings 8-17% of performance depending on the memory size. Here you will find several paragraphs of the description about how to do it https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md. What this describes is how to modify Postgres to request huge pages. That's hardly built-in support. I wasn't talking about a built-in support. It was about an ability (a way) to back sh_buf with hugepages. Then what you need is to set dynamic_shared_memory_type = sysv in postgresql.conf.

I neither found this parameter in the docs, nor does it work when I specify it in postgresql.conf:

LOG: unrecognized configuration parameter "dynamic_shared_memory_type" in file "/etc/postgresql/9.3/main/postgresql.conf" line 114
FATAL: configuration file "/etc/postgresql/9.3/main/postgresql.conf" contains errors

-- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
Alvaro Herrera escribió: Sergey Konoplev escribió: I wasn't talking about a built-in support. It was about an ability (a way) to back sh_buf with hugepages. Then what you need is to set dynamic_shared_memory_type = sysv in postgresql.conf. The above is mistaken -- there's no way to disable the mmap() segment in 9.3, other than recompiling with EXEC_BACKEND which is probably undesirable for other reasons. I don't think I had ever heard of that recipe to use huge pages in previous versions; since the win is probably significant in some systems, we could have made this configurable. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
On Wed, Oct 30, 2013 at 12:17 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I wasn't talking about a built-in support. It was about an ability (a way) to back sh_buf with hugepages. Then what you need is to set dynamic_shared_memory_type = sysv in postgresql.conf. The above is mistaken -- there's no way to disable the mmap() segment in 9.3, other than recompiling with EXEC_BACKEND which is probably undesirable for other reasons. Alternatively, I assume it could be linked with libhugetlbfs, in which case you don't need any source modifications. However, I am not sure it will work with shared memory. I don't think I had ever heard of that recipe to use huge pages in previous versions; since the win is probably significant in some systems, we could have made this configurable. There are several articles on the web describing how to do this, besides mine. And the win becomes most significant when you have 64GB or more on your server. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?
From: Hiroshi Saito hiro...@winpg.jp it is my bug; the patch applied to ver 1.6.2, sorry.. then, the next patch I made is there, please see, http://winpg.jp/~saito/pg_work/OSSP_win32/ I will adjust it with Ralf-san again. Thanks. Yes, I wrote the wrong URL and meant this one. Is this patch (uuid-1.6.2_win32_patch2) the one that EnterpriseDB is using to publish the community PostgreSQL package? Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?
From: Tom Lane t...@sss.pgh.pa.us Alvaro Herrera alvhe...@2ndquadrant.com writes: At this point, I think we need to consider ossp-uuid as dead code. Yeah, but what shall we replace it with? And can we preserve the API contrib/uuid-ossp offers? (Maybe we shouldn't even try, but just deprecate that module and start fresh.) Would it be welcomed in this community if a patch for built-in UUID generation functions were provided? (I'm not saying I can provide them soon.) According to the following page, Linux has libuuid.so. Microsoft provides UuidCreate() as a Win32 API. I don't know about other UNIXes. MySQL, Oracle (and perhaps other DBMSs) offer UUID generation functions. It would be convenient to be able to use such a function as the default value for a column. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Shave a few instructions from child-process startup sequence
Just a small patch; hopefully useful.

Best regards,
-- Gurjeet Singh http://gurjeet.singh.im/ EnterpriseDB Inc.

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index ccb8b86..48dc7af 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2236,6 +2236,8 @@ ClosePostmasterPorts(bool am_syslogger)
 			StreamClose(ListenSocket[i]);
 			ListenSocket[i] = PGINVALID_SOCKET;
 		}
+		else
+			break;
 	}

 	/* If using syslogger, close the read side of the pipe */

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
On Wed, Oct 30, 2013 at 12:51 PM, Sergey Konoplev gray...@gmail.com wrote: On Wed, Oct 30, 2013 at 12:17 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I wasn't talking about a built-in support. It was about an ability (a way) to back sh_buf with hugepages. Then what you need is to set dynamic_shared_memory_type = sysv in postgresql.conf. The above is mistaken -- there's no way to disable the mmap() segment in 9.3, other than recompiling with EXEC_BACKEND which is probably undesirable for other reasons. Alternatively, I assume it could be linked with libhugetlbfs and you don't need any source modifications in this case. However I am not sure it will work with shared memory.

BTW, I managed to run 9.3 backed with hugepages after I put HUGETLB_MORECORE (see man libhugetlbfs) into the environment yesterday, but, after some time of working, it failed with the messages shown below.

syslog:
Oct 29 17:53:13 grayhemp kernel: [150579.903875] PID 7584 killed due to inadequate hugepage pool

postgres: libhugetlbfslibhugetlbfs
2013-10-29 17:53:21 PDT LOG: server process (PID 7584) was terminated by signal 7: Bus error
2013-10-29 17:53:21 PDT LOG: terminating any other active server processes
2013-10-29 17:53:21 PDT WARNING: terminating connection because of crash of another server process
2013-10-29 17:53:21 PDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

My theory is that it happened after the amount of huge pages (vm.nr_overcommit_hugepages + vm.nr_hugepages) was exceeded, but I might be wrong. Does anybody have thoughts on why this happened and how to work around it?
-- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?
MauMau maumau...@gmail.com writes: From: Tom Lane t...@sss.pgh.pa.us Yeah, but what shall we replace it with? And can we preserve the API contrib/uuid-ossp offers? (Maybe we shouldn't even try, but just deprecate that module and start fresh.) Would it be welcomed in this community if a patch for built-in UUID generation functions are provided? Based on what? The reason we've held this code at arms length (in a contrib module) is exactly that we weren't too satisfied with the portability or adequacy of the generation methods. Moving the functions into core doesn't do anything to make that better; rather, it raises the portability bar even higher, as well as making it almost impossible to contemplate further API changes. Note the lack of enthusiasm for taking on maintainership of the OSSP code. Pushing it into core would mean that we're buying into that maintainership, hook line and sinker. I don't think that such a proposal would fly. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Something fishy happening on frogmouth
Robert Haas robertmh...@gmail.com writes: On Wed, Oct 30, 2013 at 9:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: If it *isn't* about the main memory segment, what the hell are we doing creating random addon segments during bootstrap? None of the DSM code should even get control at this point, IMO. Here's a short summary of what I posted back in August: at system startup time, the postmaster creates one dynamic shared segment, called the control segment. Well, as I've pointed out already in this thread, the postmaster does not execute during bootstrap, which makes me think this code is getting called from the wrong place. What possible reason is there to create add-on shm segments in bootstrap mode? I'm even dubious that we should create them in standalone backends, because there will be no other process to share them with. I'm inclined to think this initialization should be moved to the actual postmaster (and I mean postmaster.c) from wherever it is now. That might fix the not-so-random name choice in itself, but if it doesn't, then we could consider where to move the random-seed-initialization step to. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] appendStringInfo vs appendStringInfoString
On Wed, Oct 30, 2013 at 12:12 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Andres Freund escribió: On 2013-10-30 10:52:05 -0300, Alvaro Herrera wrote: Robert Haas escribió: On Wed, Oct 30, 2013 at 4:51 AM, David Rowley dgrowle...@gmail.com wrote: I've attached a re-based version of this. I don't see any compelling reason not to commit this. Does anyone wish to object? I think a blanket substitution of places that currently have %s might cause bugs, particularly if the string is user-supplied. It might be right for many cases, but did you/someone review each such callsite? No objection to the other half, that substitute calls that don't have %s. appendStringInfoString() doesn't expand format strings, so I am not sure what you're worried about? Um. Blame my lack of decent breakfast this morning. So, does that mean we're good to go? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Something fishy happening on frogmouth
On Wed, Oct 30, 2013 at 9:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Oct 30, 2013 at 9:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: If it *isn't* about the main memory segment, what the hell are we doing creating random addon segments during bootstrap? None of the DSM code should even get control at this point, IMO. Here's a short summary of what I posted back in August: at system startup time, the postmaster creates one dynamic shared segment, called the control segment. Well, as I've pointed out already in this thread, the postmaster does not execute during bootstrap, which makes me think this code is getting called from the wrong place. What possible reason is there to create add-on shm segments in bootstrap mode? I'm even dubious that we should create them in standalone backends, because there will be no other process to share them with. I'm inclined to think this initialization should be moved to the actual postmaster (and I mean postmaster.c) from wherever it is now. That might fix the not-so-random name choice in itself, but if it doesn't, then we could consider where to move the random-seed-initialization step to. The initialization code is currently called from CreateSharedMemoryAndSemaphores(), like this:

/* Initialize dynamic shared memory facilities. */
if (!IsUnderPostmaster)
	dsm_postmaster_startup();

The reason I put it there is that if the postmaster does a crash-and-restart cycle, we need to create a new control segment just as we need to create a new main shared memory segment. (We also need to make sure all dynamic shared memory segments left over from the previous postmaster lifetime get nuked, but that happens earlier, as part of the shmem_exit sequence.) There may be a good reason to move it elsewhere, but by and large I have not had good luck deviating from the pattern laid down for the main shared memory segment.
My respect for that battle-tested code is growing daily; every time I think I know better, I get burned. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Long paths for tablespace leads to uninterruptible hang in Windows
On Wed, Oct 16, 2013 at 1:44 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Oct 15, 2013 at 6:28 PM, Magnus Hagander mag...@hagander.net wrote: On Tue, Oct 15, 2013 at 2:55 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Oct 14, 2013 at 1:30 PM, Tom Lane t...@sss.pgh.pa.us wrote: Well, that sucks. So it's a Windows bug. I agree we'll probably want to work around it in the end, but I still think it should be put to Microsoft PSS if we can. The usual - have we actually produced a self-contained example that does just this (and doesn't include the full postgres support) and submitted it to *microsoft* for comments? I have written a self contained win32 console application with which the issue can be reproduced. The application project is attached with this mail. Logged a support ticket with Microsoft, they could reproduce the issue with the sample application (it is same what I had posted on hackers in this thread) and working on it. Progress on ticket can be checked at below link: https://support.microsoft.com/oas/default.aspx?st=1as=1iid=113iguid=42d48223-e81d-4693-a7b2-2e70186f06b2_1_1c=SMCln=en-inincno=113102310885322 I could view above link using my Microsoft account. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ is not a valid Win32 application
On Tue, Oct 29, 2013 at 12:46 PM, Naoya Anzai anzai-na...@mxu.nes.nec.co.jp wrote: Hi Sandeep I think, you should change the subject line to "Unquoted service path containing space is vulnerable and can be exploited on Windows" to get the attention.. :) Thank you for advice! I'll try to post to pgsql-bugs again. I could also reproduce this issue. The situation is very rare, in that an exe with a name matching the first part of the directory path must exist in the installation path. I suggest you post your patch in the next commit fest. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ is not a valid Win32 application
On Thu, Oct 31, 2013 at 10:17 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Oct 29, 2013 at 12:46 PM, Naoya Anzai anzai-na...@mxu.nes.nec.co.jp wrote: Hi Sandeep I think, you should change the subject line to Unquoted service path containing space is vulnerable and can be exploited on Windows to get the attention.. :) Thank you for advice! I'll try to post to pgsql-bugs again. I could also reproduce this issue. The situation is very rare such that an exe with name same as first part of directory should exist in installation path. I believe it is a security risk with a bigger impact, as it is related to the Windows environment and installers rely on it. I suggest you can post your patch in next commit fest. Yes. Aren't vulnerabilities/security risks taken care of on a more urgent basis? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com