Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Sergey Konoplev
On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Sergey Konoplev gray...@gmail.com writes:
 On Wed, Oct 23, 2013 at 11:03 PM, Abhijit Menon-Sen a...@2ndquadrant.com 
 wrote:
 This is a slightly reworked version of the patch submitted by Richard
 Poole last month, which was based on Christian Kruse's earlier patch.

 Is it possible that this patch will be included in a minor version of
 9.3? IMHO hugepages is a very important ability that postgres lost in
 9.3, and it would be great to have it back ASAP.

 Say what?  There's never been any hugepages support in Postgres.

There was an ability to back shared memory with hugepages when using
<=9.2. I have used it on ~30 servers for several years, and it brings an
8-17% performance gain depending on the memory size. Here you will find
several paragraphs describing how to do it:
https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
Just search for the word 'hugepages' on the page.

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread David Fetter
On Tue, Oct 29, 2013 at 11:08:05PM -0700, Sergey Konoplev wrote:
 On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Sergey Konoplev gray...@gmail.com writes:
  On Wed, Oct 23, 2013 at 11:03 PM, Abhijit Menon-Sen a...@2ndquadrant.com 
  wrote:
  This is a slightly reworked version of the patch submitted by Richard
  Poole last month, which was based on Christian Kruse's earlier patch.
 
  Is it possible that this patch will be included in a minor version of
  9.3? IMHO hugepages is a very important ability that postgres lost in
  9.3, and it would be great to have it back ASAP.
 
  Say what?  There's never been any hugepages support in Postgres.
 
 There was an ability to back shared memory with hugepages when using
 <=9.2. I have used it on ~30 servers for several years, and it brings an
 8-17% performance gain depending on the memory size. Here you will find
 several paragraphs describing how to do it:
 https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
 Just search for the word 'hugepages' on the page.

For better or worse, we add new features exactly and only in .0
releases.  It's what has made it possible for people to plan
deployments, given us a deserved reputation for stability, etc., etc.

I guess what I'm saying here is that however awesome any particular
feature might be to back-patch, that benefit is overwhelmed by the
cost of having unstable releases.

-infinity from me to any proposal that gets us into "are you using
PostgreSQL x.y.z or x.y.w?" when it comes to features.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread David Fetter
On Wed, Oct 30, 2013 at 10:16:57AM +0530, Abhijit Menon-Sen wrote:
 At 2013-10-24 16:06:19 +0300, hlinnakan...@vmware.com wrote:
 
  Let's get rid of the rounding.
 
 I share Andres's concern that the bug is present in various recent
 kernels that are going to stick around for quite some time. Given
 the rather significant performance gain, I think it's worth doing
 something, though I'm not a big fan of the directory-scanning code
 myself.
 
 As a compromise, perhaps we can unconditionally round the size up to be
 a multiple of 2MB?

How about documenting that 2MB is the quantum (OK, we'll say
"indivisible unit" or "smallest division" or something) and failing
with a message to that effect if someone tries to set it otherwise?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Abhijit Menon-Sen
At 2013-10-30 00:10:39 -0700, da...@fetter.org wrote:

 How about documenting that 2MB is the quantum (OK, we'll say
 indivisible unit or smallest division or something) and failing
 with a message to that effect if someone tries to set it otherwise?

I don't think you understand the problem. We're not discussing a user
setting here. The size that is passed to PGSharedMemoryCreate is based
on shared_buffers and our estimates of how much memory we need for other
things like ProcArray (see ipci.c:CreateSharedMemoryAndSemaphores).

If this calculated size is not a multiple of a page size supported by
the hardware (usually 2/4/16MB etc.), the allocation will fail under
some commonly-used kernels. We can either ignore the problem and let
the allocation fail, or try to discover the smallest supported huge
page size (what the patch does now), or assume that 2MB pages can be
used if any huge pages can be used and align accordingly.

We could use a larger size, e.g. if we aligned to 16MB then it would
work on hardware that supported 2/4/8/16MB pages, but we'd waste the
extra memory unless we also increased NBuffers after the rounding up
(which is also something Andres suggested earlier).

I don't have a strong opinion on the available options, other than not
liking the do nothing approach.

-- Abhijit




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
 Hmm, you realise Alvaro is working on MinMax indexes in this release?
 They are very efficient with regard to index inserts and specially
 designed for use on large tables.
 
 Prior work by Heikki on Grouped Item Tuples was a way of reducing the
 size of indexes, yet still allowing uniqueness checks. That is
 implemented in SQLServer already and is very useful.


Reading the implementation of those features, I don't think they can help in 
the cases handled by the index types I mentioned (insertions of random values 
in big tables).





Re: [HACKERS] appendStringInfo vs appendStringInfoString

2013-10-30 Thread David Rowley
On Mon, Sep 30, 2013 at 10:10 PM, David Rowley dgrowle...@gmail.com wrote:

 On Sat, Sep 28, 2013 at 9:44 PM, David Rowley dgrowle...@gmail.comwrote:

 I did some benchmarking earlier in the week for the new patch which was
 just commited to allow formatting in the log_line_prefix string. In version
 0.4 of the patch there was some performance regression as I was doing
 appendStringInfo(buf, %*s, padding, variable); instead of
 appendStringInfoString(buf, variable); This regression was fixed in a later
 version of the patch by only using appendStringInfo when the padding was 0.

 More details here:
 http://www.postgresql.org/message-id/CAApHDvreSGYvtXJvqHcXZL8_tXiKKiFXhQyXgqtnQ5Yo=me...@mail.gmail.com

 I've attached the cleanup patch for this. This one just converts
 instances of appendStringInfo into appendStringInfoString where
 appendStringInfo does no formatting or just has the format "%s".


I've attached a re-based version of this.




 David Rowley





appendStringInfo_cleanup_v0.5.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Simon Riggs
On 30 October 2013 07:55, Leonardo Francalanci m_li...@yahoo.it wrote:
 Hmm, you realise Alvaro is working on MinMax indexes in this release?
 They are very efficient with regard to index inserts and specially
 designed for use on large tables.

 Prior work by Heikki on Grouped Item Tuples was a way of reducing the
 size of indexes, yet still allowing uniqueness checks. That is
 implemented in SQLServer already and is very useful.


 Reading the implementation of those features, I don't think they can help in 
 the cases handled by the index types I mentioned (insertions of random values 
 in big tables).

Presumably the data you are inserting isn't actually random. Please
describe the use case you are considering in more detail and some view
on how frequent that is, with some examples. Once we understand the
use case and agree it is important, we might solve problems.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
 Presumably the data you are inserting isn't actually random. Please
 describe the use case you are considering in more detail and some view
 on how frequent that is, with some examples. Once we understand the
 use case and agree it is important, we might solve problems.


Collecting call data for mobile network operators (and no, I don't work for 
the NSA...)
Easily 5000-1 inserts per second. Indexes on timestamp and ID (not a 
problem, always increasing, so no btree issues) and on called #, calling #, 
imsi, and imei. The last four obviously are random, out of millions of possible 
values.
After the first few million records, the disks can't keep up with the 
amount of random writing in the indexes. Workaround: the table is partitioned 
every 15 minutes, and the indexes are created in bulk after we start the new 
15-minute partition. Searches on the current 15 minutes are not allowed (as it 
is not indexed), and searches on older data are K*log(N) (where K is the number 
of partitions). 
Yes, I could throw more disks at it, use SSDs, shard more, etc. But I still 
think that btrees just aren't fit for this kind of problem. I don't delete 
data, I don't update data, and there's not that much concurrency going on. I 
would sacrifice search speed (K*log(N) is already much slower than regular 
btree usage) for realtime insertion.

I don't think I'm the only one with a big system that needs to be indexed by 
random values. 

In fact, I didn't want to turn this thread into a "help me with this workload" 
thread. I just wanted to know if there were other known issues with these 
different index types beyond "not enough time to implement them correctly": I 
was afraid that someone had already dismissed them as "good in theory, bad in 
practice"...





Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Simon Riggs
On 30 October 2013 10:35, Leonardo Francalanci m_li...@yahoo.it wrote:
 Presumably the data you are inserting isn't actually random. Please
 describe the use case you are considering in more detail and some view
 on how frequent that is, with some examples. Once we understand the
 use case and agree it is important, we might solve problems.


 Collecting call data for mobile network operators (and no, I don't work for 
 the NSA...)
 Easily 5000-1 inserts per second. Indexes on timestamp and ID (not a 
 problem, always increasing, so no btree issues) and on called #, calling #, 
 imsi, and imei. The last four obviously are random, out of millions of 
 possible values.
 After the first few million records, the disks can't keep up with the 
 amount of random writing in the indexes. Workaround: the table is partitioned 
 every 15 minutes, and the indexes are created in bulk after we start the new 
 15-minute partition. Searches on the current 15 minutes are not allowed (as 
 it is not indexed), and searches on older data are K*log(N) (where K is the 
 number of partitions).
 Yes, I could throw more disks at it, use SSDs, shard more, etc. But I still 
 think that btrees just aren't fit for this kind of problem. I don't delete 
 data, I don't update data, and there's not that much concurrency going on. I 
 would sacrifice search speed (K*log(N) is already much slower than regular 
 btree usage) for realtime insertion.

 I don't think I'm the only one with a big system that needs to be indexed by 
 random values.

 In fact, I didn't want to turn this thread into a "help me with this 
 workload" thread. I just wanted to know if there were other known issues 
 with these different index types beyond "not enough time to implement them 
 correctly": I was afraid that someone had already dismissed them as "good in 
 theory, bad in practice"...

What is the reason for needing such fast access to individual groups
of records? Sure sounds like the NSA or similar ;-)

Sacrificing timeliness for efficiency is a common solution. I'm seeing
lots of areas where being able to specify the timeliness that is
acceptable in a query leads to various optimisations of this kind.

Indexes are a declarative solution.  We would need to be able to
specify the tolerances to be able to do this. (You can write your own
index...)

In terms of generality, do you think it's worth a man-year of developer
effort to replicate what you have already achieved? Who would pay?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
 What is the reason for needing such fast access to individual groups
 of records? Sure sounds like the NSA or similar ;-)


Users need to search all calls originated from/to a user or from/to a specific 
mobile phone to answer/analyze customers' probl... ok, I give up: I work for 
the NSA ;)

 In terms of generality, do you think it's worth a man-year of developer
 effort to replicate what you have already achieved? Who would pay?


1) I haven't achieved what I need: realtime indexing. I can't query the 
current 15-minute table efficiently. Plus, K*log(N) is not that great when 
you have a lot of K.
2) I'm not suggesting that this is top priority. I'm asking if there's 
something else, other than "we don't have time for this", that I don't know. 
In fact, I don't even know if those index types would really help in my 
(specific) case. That's why my original question was why there are no 
developments in this area: I didn't mean to imply someone should do it. I just 
wanted to know if those indexes were already discussed (and maybe dismissed 
for some reason) in the past...





Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Kaare Rasmussen

On 2013-10-30 12:08, Simon Riggs wrote:
effort to replicate what you have already achieved? Who would pay? 


The NSA, obviously ;-)




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 8:22 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Oct 30, 2013 at 1:22 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Wed, Oct 30, 2013 at 12:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The last two buildfarm runs on frogmouth have failed in initdb,
 like this:

 creating directory 
 d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data ... ok
 creating subdirectories ... ok
 selecting default max_connections ... 100
 selecting default shared_buffers ... 128MB
 selecting dynamic shared memory implementation ... windows
 creating configuration files ... ok
 creating template1 database in 
 d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data/base/1 
 ... FATAL:  could not open shared memory segment 
 Global/PostgreSQL.851401618: Not enough space
 child process exited with exit code 1

 In the Windows implementation of dynamic shared memory, the size
 calculation for creating a dynamic shared memory segment assumes that the
 requested size is uint64, which was changed by commit d2aecae, so we need
 to change that calculation as well.
 Please find the attached patch to fix this problem.

 I find it hard to believe this is the right fix.  I know we have
 similar code in win32_shmem.c, but surely if size is a 32-bit unsigned
 quantity then size >> 0 is simply 0 anyway.

Err, rather, size >> 32 is simply 0 anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 1:22 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Wed, Oct 30, 2013 at 12:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The last two buildfarm runs on frogmouth have failed in initdb,
 like this:

 creating directory 
 d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data ... ok
 creating subdirectories ... ok
 selecting default max_connections ... 100
 selecting default shared_buffers ... 128MB
 selecting dynamic shared memory implementation ... windows
 creating configuration files ... ok
 creating template1 database in 
 d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data/base/1 
 ... FATAL:  could not open shared memory segment 
 Global/PostgreSQL.851401618: Not enough space
 child process exited with exit code 1

 In the Windows implementation of dynamic shared memory, the size
 calculation for creating a dynamic shared memory segment assumes that the
 requested size is uint64, which was changed by commit d2aecae, so we need
 to change that calculation as well.
 Please find the attached patch to fix this problem.

I find it hard to believe this is the right fix.  I know we have
similar code in win32_shmem.c, but surely if size is a 32-bit unsigned
quantity then size >> 0 is simply 0 anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?

2013-10-30 Thread MauMau

Hello,

For a certain reason, I need to build PostgreSQL on Windows with OSSP UUID 
support to use UUID for primary keys.  I have to use Visual Studio 2010.


The original source code for OSSP UUID and its fork for Windows can be 
downloaded from:


http://www.ossp.org/pkg/lib/uuid/
http://sourceforge.jp/projects/sfnet_osspuuidwin32/

I built the latest version 1.6.2 and used uuid_generate_v4() as the default 
value for a primary key column of type uuid.  However, when I insert 200-300 
rows, a unique constraint violation error is reported.  It occurs 
repeatedly across several retries.


I found that some people hit the same problem in the past, but they didn't 
seem to find a solution:


http://www.postgresql.org/message-id/039e01c8b63c$d0c95e90$60096cdb@IBMC9A0F63B40D
http://www.postgresql.org/message-id/018d01c8dec6$2a14b470$0b01a8c0@IBMC9A0F63B40D
http://www.postgresql.org/message-id/98B20744191F464DB598A25AB00BC125@acer08f817a9b5

When I copy lib\uuid_ossp.dll from the PostgreSQL community package, which I 
guess Dave Page-san maintains, into my installation, the error does not occur. 
So, the OSSP UUID library is the cause.


Saito san published a patch for OSSP UUID on his site:

http://winpg.jp/~saito/pg_work/OSSP_win32/msvc/

And Dave san showed interest in it:

http://www.postgresql.org/message-id/snt127-w18c42125e5fd719ba514ceff...@phx.gbl

How can I build an OSSP UUID library that functions properly on Windows? Is 
OSSP UUID 1.6.2 + Saito-san's patch the exact source for the uuid-ossp.dll 
in the community binary?



Regards
MauMau





Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Robert Haas
On Tue, Oct 29, 2013 at 3:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The last two buildfarm runs on frogmouth have failed in initdb,
 like this:

 creating directory 
 d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data ... ok
 creating subdirectories ... ok
 selecting default max_connections ... 100
 selecting default shared_buffers ... 128MB
 selecting dynamic shared memory implementation ... windows
 creating configuration files ... ok
 creating template1 database in 
 d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data/base/1 ... 
 FATAL:  could not open shared memory segment Global/PostgreSQL.851401618: 
 Not enough space
 child process exited with exit code 1

 It shouldn't be failing like that, considering that we just finished
 probing for acceptable max_connections and shared_buffers without hitting
 any apparent limit.  I suppose it's possible that the final shm segment
 size is a bit larger than what was tested at the shared_buffer step,
 but that doesn't seem very likely to be the explanation.  What seems
 considerably more probable is that the probe for a shared memory
 implementation is screwing up the system state somehow.  It may not be
 unrelated that this machine was happy before commit d2aecae went in.

If I'm reading this correctly, the last three runs on frogmouth have
all failed, and all of them have failed with a complaint about,
specifically, Global/PostgreSQL.851401618.  Now, that really shouldn't
be happening, because the code to choose that number looks like this:

dsm_control_handle = random();

One possibility that occurs to me is that if, for some reason, we're
using the same handle every time on Windows, and if Windows takes a
bit of time to reclaim the segment after the postmaster exits (which
is not hard to believe given some previous Windows behavior I've
seen), then running the postmaster lots of times in quick succession
(as initdb does) might fail.  I dunno what that has to do with the
patch, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Andres Freund
On 2013-10-30 08:45:03 -0400, Robert Haas wrote:
 If I'm reading this correctly, the last three runs on frogmouth have
 all failed, and all of them have failed with a complaint about,
 specifically, Global/PostgreSQL.851401618.  Now, that really shouldn't
 be happening, because the code to choose that number looks like this:
 
 dsm_control_handle = random();
 
 One possibility that occurs to me is that if, for some reason, we're
 using the same handle every time on Windows, and if Windows takes a
 bit of time to reclaim the segment after the postmaster exits (which
 is not hard to believe given some previous Windows behavior I've
 seen), then running the postmaster lots of times in quick succession
 (as initdb does) might fail.  I dunno what that has to do with the
 patch, though.

Could it be that we haven't primed the random number generator with the
time or something like that yet?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 8:22 AM, Robert Haas robertmh...@gmail.com wrote:
 I find it hard to believe this is the right fix.  I know we have
 similar code in win32_shmem.c, but surely if size is a 32-bit unsigned
 quantity then size >> 0 is simply 0 anyway.

Gosh, I stand corrected.  According to
http://msdn.microsoft.com/en-us/library/336xbhcz.aspx --

"The result is undefined if the right operand of a shift expression is
negative or if the right operand is greater than or equal to the
number of bits in the (promoted) left operand. No shift operation is
performed if the right operand is zero (0)."

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 8:47 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-10-30 08:45:03 -0400, Robert Haas wrote:
 If I'm reading this correctly, the last three runs on frogmouth have
 all failed, and all of them have failed with a complaint about,
 specifically, Global/PostgreSQL.851401618.  Now, that really shouldn't
 be happening, because the code to choose that number looks like this:

 dsm_control_handle = random();

 One possibility that occurs to me is that if, for some reason, we're
 using the same handle every time on Windows, and if Windows takes a
 bit of time to reclaim the segment after the postmaster exits (which
 is not hard to believe given some previous Windows behavior I've
 seen), then running the postmaster lots of times in quick succession
 (as initdb does) might fail.  I dunno what that has to do with the
 patch, though.

 Could it be that we haven't primed the random number generator with the
 time or something like that yet?

Yeah, I think that's probably what it is.  There's PostmasterRandom()
to initialize the random-number generator on first use, but that
doesn't help if some other module calls random().  I wonder if we
ought to just get rid of PostmasterRandom() and instead have the
postmaster run that initialization code very early in startup.  That'd
make the timing of the random number generator being initialized a bit
more predictable, perhaps, but if the dynamic shared memory code is
going to grab a random number during startup it's basically going to
be nailed to that event anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] appendStringInfo vs appendStringInfoString

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 4:51 AM, David Rowley dgrowle...@gmail.com wrote:
 I've attached a re-based version of this.

I don't see any compelling reason not to commit this.  Does anyone
wish to object?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 If I'm reading this correctly, the last three runs on frogmouth have
 all failed, and all of them have failed with a complaint about,
 specifically, Global/PostgreSQL.851401618.  Now, that really shouldn't
 be happening, because the code to choose that number looks like this:

 dsm_control_handle = random();

Isn't this complaining about the main shm segment, not a DSM extension?

Also, why is the error "not enough space", rather than something about
a collision?  And if this is the explanation, why didn't the previous
runs probing for allowable shmem size fail?

BTW, regardless of the specific properties of random(), surely you ought
to have code in there that would cope with a name collision.

regards, tom lane




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Andres Freund
On 2013-10-30 09:26:42 -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  If I'm reading this correctly, the last three runs on frogmouth have
  all failed, and all of them have failed with a complaint about,
  specifically, Global/PostgreSQL.851401618.  Now, that really shouldn't
  be happening, because the code to choose that number looks like this:
 
  dsm_control_handle = random();
 
 Isn't this complaining about the main shm segment, not a DSM extension?

Don't think so; that has a ':' in the name. But this touches a fair
point: I think we need to make all the dsm error messages more
distinctive. The history since this was committed makes it likely
that there will be more errors.

 Also, why is the error not enough space, rather than something about
 a collision?  And if this is the explanation, why didn't the previous
 runs probing for allowable shmem size fail?

Yea, I don't think this explains the issue but something independent
that needs to be fixed.

 BTW, regardless of the specific properties of random(), surely you ought
 to have code in there that would cope with a name collision.

There actually is code that retries, but only for EEXIST.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-10-30 09:26:42 -0400, Tom Lane wrote:
 Isn't this complaining about the main shm segment, not a DSM extension?

 Don't think so, that has a : in the name.

If it *isn't* about the main memory segment, what the hell are we doing
creating random addon segments during bootstrap?  None of the DSM code
should even get control at this point, IMO.

regards, tom lane




Re: [HACKERS] appendStringInfo vs appendStringInfoString

2013-10-30 Thread Alvaro Herrera
Robert Haas escribió:
 On Wed, Oct 30, 2013 at 4:51 AM, David Rowley dgrowle...@gmail.com wrote:
  I've attached a re-based version of this.
 
 I don't see any compelling reason not to commit this.  Does anyone
 wish to object?

I think a blanket substitution of places that currently have %s might
cause bugs, particularly if the string is user-supplied.  It might be
right for many cases, but did you/someone review each such callsite?

No objection to the other half, that substitute calls that don't have
%s.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Simon Riggs
On 30 October 2013 11:23, Leonardo Francalanci m_li...@yahoo.it wrote:
 What is the reason for needing such fast access to individual groups
 of records? Sure sounds like the NSA or similar ;-)


 Users need to search all calls originated from/to a user or from/to a 
 specific mobile phone to answer/analyze customers' probl... ok, I give up: I 
 work for the NSA ;)

 In terms of generality, do you think its worth a man year of developer
 effort to replicate what you have already achieved? Who would pay?


 1) I haven't achieved what I need: realtime indexing. I can't query the 
 current 15 minutes table efficiently. Plus, K*log(N) is not that great when 
 you have a lot of K.
 2) I'm not suggesting that this is top priority. I'm asking if there's 
 something else, other than we don't have time for this, that I don't know. 
 In fact, I don't even know if those indexes types would really help in my 
 (specific) case. That's why my original question was why aren't there 
 developments in this area: I didn't mean to imply someone should do it. I 
 just wanted to know if those indexes were already discussed (and maybe 
 dismissed for some reason) in the past...

OK, I understand now.

LSM-trees seem patent free, since open source implementations exist.
The main concept is to partition the index into multiple trees, so
that the current insertion sub-tree fits more easily in memory.  That
sounds good and was exactly the solution I'd come up with as well,
which is a good cross check. It leads to a slow increase in index
response times, but we could offset that by having minimum values on
each subtree and using partitioning logic as with a minmax index.
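
As a toy illustration of that concept (Python, with invented names; nothing
here is PostgreSQL code): inserts go to a small current sub-tree that fits in
memory, which is flushed as an immutable sorted run when full, and a lookup
probes the current sub-tree plus every run — the K*log(N) cost mentioned
elsewhere in this thread.

```python
import bisect

class LSMIndex:
    """Toy sketch of the LSM idea: a small insertion sub-tree in memory,
    flushed as immutable sorted runs. Illustrative only."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}   # current insertion sub-tree
        self.runs = []       # older sub-trees, each a sorted list of (key, value)
        self.limit = memtable_limit

    def insert(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            # Flush: the full sub-tree becomes an immutable sorted run.
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def lookup(self, key):
        # K*log(N): probe the memtable, then each of the K runs.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):   # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

A background merge process, as in the LSM-tree paper, would periodically
combine runs to keep K bounded.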

LSM-tree also covers the goal of maintaining just 2 sub-trees and a
concurrent process of merging sub-trees. That sounds like it would
take a lot of additional time to get right and would need some
off-line process to perform the merge.

Please, somebody advise on the patent status of Y-trees; otherwise I wouldn't bother.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] appendStringInfo vs appendStringInfoString

2013-10-30 Thread Andres Freund
On 2013-10-30 10:52:05 -0300, Alvaro Herrera wrote:
 Robert Haas escribió:
  On Wed, Oct 30, 2013 at 4:51 AM, David Rowley dgrowle...@gmail.com wrote:
   I've attached a re-based version of this.
  
  I don't see any compelling reason not to commit this.  Does anyone
  wish to object?
 
 I think a blanket substitution of places that currently have %s might
 cause bugs, particularly if the string is user-supplied.  It might be
 right for many cases, but did you/someone review each such callsite?
 
 No objection to the other half, that substitute calls that don't have
 %s.

appendStringInfoString() doesn't expand format strings, so I am not sure
what you're worried about?


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
 LSM-trees seem patent free

I'm no expert, and I only gave it a quick look some time ago: it looked very 
complicated to get right... and as far as I remember you don't get that much 
gain unless you go multi-level, which would complicate things further

 Please somebody advise patent status of Y-trees otherwise I wouldn't bother.
 
y-trees look much easier to get right... (and to me they also make more 
sense, but I'm not skilled enough to judge). 

There's also the FD-tree, which looks a lot like the (patented...) fractal tree:
http://www.ntu.edu.sg/home/bshe/fdtree_pvldb.pdf




Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?

2013-10-30 Thread Hiroshi Saito
Hi MauMau-san

My patch applied to ver 1.6.2 had a bug... sorry.
I have made a new patch; it is here, please see:
http://winpg.jp/~saito/pg_work/OSSP_win32/

I will coordinate with Ralf-san again.

best regards,
Hiroshi Saito

(2013/10/30 21:45), MauMau wrote:
 Hello,
 
 # Sorry, Saito-san's address doesn't seem to exist, so excuse me for
 sending again
 
 For a certain reason, I need to build PostgreSQL on Windows with OSSP
 UUID support to use UUID for primary keys.  I have to use Visual Studio
 2010.
 
 The original source code for OSSP UUID and its fork for Windows can be
 downloaded from:
 
 http://www.ossp.org/pkg/lib/uuid/
 http://sourceforge.jp/projects/sfnet_osspuuidwin32/
 
 I built the latest version 1.6.2 and used uuid_generate_v4() as the
 default value for a primary key column of type uuid.  However, when I
 insert 200-300 rows, a unique constraint violation error is reported. 
 It occurs repeatedly in several retries.
 
 I found some people hit the same problem in the past, but they didn't
 seem to get the solution:
 
 http://www.postgresql.org/message-id/039e01c8b63c$d0c95e90$60096cdb@IBMC9A0F63B40D
 
 http://www.postgresql.org/message-id/018d01c8dec6$2a14b470$0b01a8c0@IBMC9A0F63B40D
 
 http://www.postgresql.org/message-id/98B20744191F464DB598A25AB00BC125@acer08f817a9b5
 
 
 When I copy lib\uuid_ossp.dll in the PostgreSQL community package, which
 I guess Dave Page san maintains, to my installation, the error does not
 occur. So, OSSP UUID library is the cause.
 
 Saito san published a patch for OSSP UUID on his site:
 
 http://winpg.jp/~saito/pg_work/OSSP_win32/msvc/
 
 And Dave san showed interest in it:
 
 http://www.postgresql.org/message-id/snt127-w18c42125e5fd719ba514ceff...@phx.gbl
 
 
 How can I build the OSSP UUID library that functions properly on
 Windows? Is OSSP UUID 1.6.2 + Saito-san's patch the exact source for the
 uuid-ossp.dll in the community binary?
 
 
 Regards
 MauMau
 
 
 





Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Yann Fontana
On 30 October 2013 11:23, Leonardo Francalanci m_li...@yahoo.it wrote:

  In terms of generality, do you think its worth a man year of developer
  effort to replicate what you have already achieved? Who would pay?


I work on an application that does exactly what Leonardo described. We hit
the exact same problem, and came up with the same exact same solution (down
to the 15 minutes interval). But I have also worked on other various
datamarts (all using Oracle), and they are all subject to this problem in
some form: B-tree indexes slow down bulk data inserts too much and need to
be disabled or dropped and then recreated after the load. In some cases
this is done easily enough, in others it's more complicated (example: every
day, a process imports from 1 million to 1 billion records into a table
partition that may contain from 0 to 1 billion records. To be as efficient
as possible, you need some logic to compare the number of rows to insert to
the number of rows already present, in order to decide whether to drop the
indexes or not).

Basically, my point is that this is a common problem for datawarehouses and
datamarts. In my view, indexes that don't require developers to work around
poor insert performance would be a significant feature in a
datawarehouse-ready DBMS.

Yann


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Tom Lane
Abhijit Menon-Sen a...@2ndquadrant.com writes:
 As a compromise, perhaps we can unconditionally round the size up to be
 a multiple of 2MB? That way, we can use huge pages more often, but also
 avoid putting in a lot of code and effort into the workaround and waste
 only a little space (if any at all).

That sounds reasonably painless to me.  Note that at least in our main
shmem segment, extra space is not useless, because it allows slop for
the main hash tables, notably the locks table.
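
The rounding itself is one line of alignment arithmetic; a sketch (Python
here for brevity, not the actual C patch, and assuming the common 2MB huge
page size on x86-64):

```python
HUGE_PAGE_SIZE = 2 * 1024 * 1024  # assumed: 2MB huge pages (x86-64 default)

def round_up_to_huge_page(size):
    """Round a requested shared memory size up to a whole number of huge
    pages; the bit trick works because HUGE_PAGE_SIZE is a power of two."""
    return (size + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1)
```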

regards, tom lane




Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Tom Lane
Sergey Konoplev gray...@gmail.com writes:
 On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Say what?  There's never been any hugepages support in Postgres.

 There were an ability to back shared memory with hugepages when using
 =9.2. I use it on ~30 servers for several years and it brings 8-17%
 of performance depending on the memory size. Here you will find
 several paragraphs of the description about how to do it
 https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.

What this describes is how to modify Postgres to request huge pages.
That's hardly built-in support.

In any case, as David already explained, we don't do feature additions
in minor releases.  We'd be especially unlikely to make an exception
for this, since it has uncertain portability and benefits.  Anything
that carries portability risks has got to go through a beta testing
cycle before we'll unleash it on the masses.

regards, tom lane




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Merlin Moncure
On Wed, Oct 30, 2013 at 9:23 AM, Leonardo Francalanci m_li...@yahoo.it wrote:
 LSM-trees seem patent free

 I'm no expert, and I gave it just a look some time ago: it looked to me very 
 complicated to get right... and as far as I remember you don't get that much 
 gain, unless you go multi-level which would complicate things further

 Please somebody advise patent status of Y-trees otherwise I wouldn't bother.

 y-trees look much more easier to get right... (and to me they also make more 
 sense, but I'm not skilled enough to judge).

 There's also the FD-tree, which looks a lot like the (patented...) fractal 
 tree:
 http://www.ntu.edu.sg/home/bshe/fdtree_pvldb.pdf

Skimming the white paper, it's clear right from the start that they
make assumptions about the hardware that appear to be already obsolete
-- extremely poor write performance vs read performance of SSD.
Later generation SSDs are increasingly using hardware optimizations to
convert random writes to sequential writes using various tricks.

Point being: hardware is marching along pretty fast (after 20+ years
of stagnation) and it's dangerous (IMO) to make big software
investments based on the situation on the ground *today*.

merlin




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Jeff Janes
On Wed, Oct 30, 2013 at 3:35 AM, Leonardo Francalanci m_li...@yahoo.it wrote:

  Presumably the data you are inserting isn't actually random. Please
  describe the use case you are considering in more detail and some view
  on how frequent that is, with some examples. Once we understand the
  use case and agree it is important, we might solve problems.


 Collecting calls data for mobile network operators (and no, I don't work
 for the NSA...)
 Easily 5000-1 inserts per second. Indexes in timestamp and ID (not a
 problem, always increasing so no btree issues) and in called #, calling #,
 imsi, imei. The last four obviously are random, out of millions of possible
 values.
 After the few first millions of records, the disks can't keep up with the
 amount of random writing in the indexes.


So, like, 3 minutes worth?  How much RAM and shared_buffers do you have?
 The index insertions should be fast until the size of the active part of
the indexes being inserted into exceeds shared_buffers by some amount (what
that amount is would depend on how much dirty data the kernel is willing to
allow in the page cache before it starts suffering anxiety about it).  If
you have enough shared_buffers to make that last for 15 minutes, then you
shouldn't have a problem inserting with live indexes.

Cheers,

Jeff


Re: [HACKERS] appendStringInfo vs appendStringInfoString

2013-10-30 Thread Alvaro Herrera
Andres Freund escribió:
 On 2013-10-30 10:52:05 -0300, Alvaro Herrera wrote:
  Robert Haas escribió:
   On Wed, Oct 30, 2013 at 4:51 AM, David Rowley dgrowle...@gmail.com 
   wrote:
I've attached a re-based version of this.
   
   I don't see any compelling reason not to commit this.  Does anyone
   wish to object?
  
  I think a blanket substitution of places that currently have %s might
  cause bugs, particularly if the string is user-supplied.  It might be
  right for many cases, but did you/someone review each such callsite?
  
  No objection to the other half, that substitute calls that don't have
  %s.
 
 appendStringInfoString() doesn't expand format strings, so I am not sure
 what you're worried about?

Um.  Blame my lack of decent breakfast this morning.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?

2013-10-30 Thread Alvaro Herrera
Hiroshi Saito escribió:
 Hi MauMau-san
 
 it my bug patch applied to ver 1.6.2sorry..
 then, I made the next patch is there, please see,
 http://winpg.jp/~saito/pg_work/OSSP_win32/
 
 I will be adjusted and Ralf-san again.

At this point, I think we need to consider ossp-uuid as dead code.
Last release was in 2008 and there are lots of bug reports still open,
even some with trivial patches attached.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Jeff Janes
On Wed, Oct 30, 2013 at 4:23 AM, Leonardo Francalanci m_li...@yahoo.it wrote:



 1) I haven't achieved what I need: realtime indexing. I can't query the
 current 15 minutes table efficiently. Plus, K*log(N) is not that great
 when you have a lot of K.


Are partitions read-only once time has moved on, or can stragglers show up
that need to be inserted into older partitions?

You could periodically merge older partitions into larger tables, index
those aggregated tables, then transactionally disinherit the old partitions
and inherit the new aggregated one.  This would keep the value of K down,
at the expense of re-writing data multiple times (but all methods write
data multiple times; some just hide it from you).



 2) I'm not suggesting that this is top priority. I'm asking if there's
 something else, other than we don't have time for this, that I don't
 know. In fact, I don't even know if those indexes types would really help
 in my (specific) case. That's why my original question was why aren't
 there developments in this area: I didn't mean to imply someone should do
 it. I just wanted to know if those indexes were already discussed (and
 maybe dismissed for some reason) in the past...


There have been several ideas discussed, but not a whole lot of time to
implement them.

By the way, what is the transaction structure of your inserts?  Are they
large batches between commits, or is each row committed?

Cheers,

Jeff


Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 9:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 If I'm reading this correctly, the last three runs on frogmouth have
 all failed, and all of them have failed with a complaint about,
 specifically, Global/PostgreSQL.851401618.  Now, that really shouldn't
 be happening, because the code to choose that number looks like this:

 dsm_control_handle = random();

 Isn't this complaining about the main shm segment, not a DSM extension?

No.  That's why the identifier being assigned to has dsm in it.
I'll respond to this in more detail in a separate post.

 Also, why is the error not enough space, rather than something about
 a collision?  And if this is the explanation, why didn't the previous
 runs probing for allowable shmem size fail?

Good questions.  I think that my previous theory was wrong, and that
the patch from Amit which I pushed a while ago should fix the
breakage.

 BTW, regardless of the specific properties of random(), surely you ought
 to have code in there that would cope with a name collision.

I do have code in there to cope with a name collision.  However, that
doesn't mean it's good for it to choose the same name for the segment
by default every time.  If we were going to do it that way I ought to
have just made it serial (PostgreSQL.0, 1, 2, 3, ...) instead of using
random numbers to name them.  The reason I didn't do that is to
minimize the chances of collisions actually happening - and especially
to minimize the chances of a large number of collisions happening.
Especially for System V shared memory, the namespace is rather
constrained, so bouncing around randomly through the namespace makes
it unlikely that we'll hit a whole bunch of identifiers in a row that
are all already in use by some other postmaster or, indeed, a process
unrelated to PostgreSQL.  A linear scan starting at any fixed value
wouldn't have that desirable property.
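
A minimal sketch of that strategy (Python, with an invented `choose_handle`
helper; the real dsm code retries when the OS reports EEXIST rather than
consulting a set):

```python
import random

def choose_handle(existing, max_tries=100):
    """Pick a random nonzero 31-bit segment handle, retrying on collision.
    Random probing makes a long run of collisions unlikely even in a
    small namespace, unlike a linear scan from a fixed starting point."""
    for _ in range(max_tries):
        handle = random.randrange(1, 2**31)
        if handle not in existing:   # stand-in for the OS's EEXIST check
            existing.add(handle)
            return handle
    raise RuntimeError("could not find a free segment handle")
```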

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Oct 30, 2013 at 9:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Also, why is the error not enough space, rather than something about
 a collision?  And if this is the explanation, why didn't the previous
 runs probing for allowable shmem size fail?

 Good questions.  I think that my previous theory was wrong, and that
 the patch from Amit which I pushed a while ago should fix the
 breakage.

Indeed, I see frogmouth just went green, so Amit nailed it.

I'm still wondering why we try to create a DSM segment in bootstrap.

regards, tom lane




Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?

2013-10-30 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 At this point, I think we need to consider ossp-uuid as dead code.

Yeah, but what shall we replace it with?  And can we preserve the
API contrib/uuid-ossp offers?  (Maybe we shouldn't even try, but
just deprecate that module and start fresh.)

regards, tom lane




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Yeah, I think that's probably what it is.  There's PostmasterRandom()
 to initialize the random-number generator on first use, but that
 doesn't help if some other module calls random().  I wonder if we
 ought to just get rid of PostmasterRandom() and instead have the
 postmaster run that initialization code very early in startup.

You could do arbitrary rearrangement of the postmaster's code and not
succeed in affecting this behavior in the slightest, because the
postmaster isn't running during bootstrap.  I continue to doubt that
there's a good reason to be creating DSM segment(s) here.

regards, tom lane




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 9:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2013-10-30 09:26:42 -0400, Tom Lane wrote:
 Isn't this complaining about the main shm segment, not a DSM extension?

 Don't think so, that has a : in the name.

 If it *isn't* about the main memory segment, what the hell are we doing
 creating random addon segments during bootstrap?  None of the DSM code
 should even get control at this point, IMO.

Here's a short summary of what I posted back in August: at system
startup time, the postmaster creates one dynamic shared segment,
called the control segment.  That segment sticks around for the
lifetime of the server and records the identity of any *other* dynamic
shared memory segments that are subsequently created.  If the server
dies a horrible death (e.g. kill -9), the next postmaster will find
the previous control segment (whose ID is written to a file in the
data directory) and remove any leftover shared memory segments from
the previous run; without this, such segments would live until the
next server reboot unless manually removed by the user (which isn't
even practical on all platforms; e.g. there doesn't seem to be any way
to list all extant POSIX shared memory segments on Mac OS X, so a user
wouldn't know which segments to remove).

For my previous posting on this topic, see the following link,
particularly the paragraph which begins The actual implementation is
split up into two layers and the following one.

http://www.postgresql.org/message-id/CA+TgmoaDqDUgt=4Zs_QPOnBt=EstEaVNP+5t+m=fpnwship...@mail.gmail.com

Now, you might ask why not store this control information that we need
for cleanup purposes in the *main* shared memory segment rather than
in a dynamic shared memory segment.  The basic problem is that I don't
know how to dig it out of there in any reasonable way.  The dsm
control segment is small and has a very simple structure; when the
postmaster uses the previous postmaster's leftover control segment to
clean up orphaned shared memory segments, it will ignore that old
control segment unless it passes various sanity tests.  But even if
passes those sanity tests but is corrupted somehow otherwise, nothing
that happens as a result will cause a fatal error, let alone a server
crash.  You're of course welcome to critique that logic, but I tried
my best to make it bulletproof.  See
dsm_cleanup_using_control_segment().

The structure of the main shared memory segment is way more
complicated.  If we latched onto an old main shared memory segment,
we'd presumably need to traverse ShmemIndex to even find that portion
of the shared memory segment where the DSM control information was
slated to be stored.  And there's no way that's going to be robust in
the face of a possibly-corrupted shared memory segment left over from
a previous run.   And that's actually making the assumption that we
could even do it that way, which we really can't: as of 9.3, things
like ShmemIndex are stored in the MAP_SHARED anonymous mapping, and
the System V shared memory segment is small and fixed-size.  We could
try to refactor the code so that we merge the control segment data
into the residual System V segment, but I think it'd be ugly and I'm
not sure what it really buys us.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?

2013-10-30 Thread Andrew Dunstan


On 10/30/2013 12:43 PM, Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  At this point, I think we need to consider ossp-uuid as dead code.

 Yeah, but what shall we replace it with?  And can we preserve the
 API contrib/uuid-ossp offers?  (Maybe we shouldn't even try, but
 just deprecate that module and start fresh.)

An alternative would be for someone to put up a fork on, say, GitHub,
and maintain it. (No, I'm not volunteering.)


cheers

andrew




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
Jeff Janes wrote
 The index insertions should be fast until the size of the active part of
 the indexes being inserted into exceeds shared_buffers by some amount
 (what
 that amount is would depend on how much dirty data the kernel is willing
 to
 allow in the page cache before it starts suffering anxiety about it).  If
 you have enough shared_buffers to make that last for 15 minutes, then you
 shouldn't have a problem inserting with live indexes.

Sooner or later you'll have to checkpoint those shared_buffers... and we are
talking about GB of data (my understanding is that we change basically every
btree page, resulting in re-writing of the whole index).






--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5776413.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 12:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Yeah, I think that's probably what it is.  There's PostmasterRandom()
 to initialize the random-number generator on first use, but that
 doesn't help if some other module calls random().  I wonder if we
 ought to just get rid of PostmasterRandom() and instead have the
 postmaster run that initialization code very early in startup.

 You could do arbitrary rearrangement of the postmaster's code and not
 succeed in affecting this behavior in the slightest, because the
 postmaster isn't running during bootstrap.

Well, if you're telling me that it's not possible to find a way to
arrange things so that the random number is initialized before first
use, I'm gonna respectfully disagree.  If you're just critiquing my
particular suggestion about where to put that code - fair enough.
Maybe it really ought to live in our src/port implementation of
random() or pg_lrand48().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
Jeff Janes wrote
 Are partitions read-only once time has moved on, or can stragglers show up
 that need to be inserted into older partitions?
 
 You could periodically merge older partitions into larger tables, index
 those aggregated tables, then transactionally disinherit the old
 partitions
 and inherit the new aggregated one.  This would keep the value of K down,
 at the expense of re-writing data multiple times (but all method write
 data
 multiple times, some just hide it from you).

Yes, we could merge the partitions: the idea was to merge them during
night hours, when traffic is low (and NSA people are sleeping ;) )



Jeff Janes wrote
 By the way, what is the transaction structure of your inserts?  Are they
 large batches between commits, or is each row committed?

Of course large batches (using COPY)




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5776416.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
 Point being: hardware is marching along pretty fast (after 20+ years
 of stagnation) and it's dangerous (IMO) to make big software
 investments based on the situation on the ground *today*.


Yes, that's a good point.




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Leonardo Francalanci
Jeff Janes wrote
 You could periodically merge older partitions into larger tables, index
 those aggregated tables, then transactionally disinherit the old
 partitions
 and inherit the new aggregated one.  This would keep the value of K down,
  at the expense of re-writing data multiple times (but all methods write
  data multiple times, some just hide it from you).

Forgot to add:

I thought also that we could:

- use the RAM as tablespace for indexes, and move the indexes later (but
PostgreSQL doesn't handle a machine crash very well in this case... it would
be cool to be able to mark an index as recreate on crash...)
- use unlogged tables and turn them into logged tables to speed up the
insertion somehow; I actually started to write a patch for it, but making it
work for replication was too hard (not that I'm using replication, but it
wouldn't be accepted for wal_level = minimal). But this wouldn't solve the
problem anyway.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5776418.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.




Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Abhijit Menon-Sen
At 2013-10-30 11:04:36 -0400, t...@sss.pgh.pa.us wrote:

  As a compromise, perhaps we can unconditionally round the size up to be
  a multiple of 2MB? […]
 
 That sounds reasonably painless to me. 

Here's a patch that does that and adds a DEBUG1 log message when we try
with MAP_HUGETLB, fail, and fall back to ordinary mmap.

-- Abhijit
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 707edf1..7d9d0a8 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -113,6 +113,7 @@
 
 #shared_buffers = 32MB			# min 128kB
 	# (change requires restart)
+#huge_tlb_pages = try			# try to map memory with MAP_HUGETLB (on, off, try)
 #temp_buffers = 8MB			# min 800kB
 #max_prepared_transactions = 0		# zero disables the feature
 	# (change requires restart)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77a9303..e4ded7a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1052,6 +1052,49 @@ include 'filename'
  </listitem>
 </varlistentry>
 
+ <varlistentry id="guc-huge-tlb-pages" xreflabel="huge_tlb_pages">
+  <term><varname>huge_tlb_pages</varname> (<type>enum</type>)</term>
+  <indexterm>
+   <primary><varname>huge_tlb_pages</> configuration parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+    Enables/disables the use of huge TLB pages. Valid values are
+    <literal>try</literal> (the default), <literal>on</literal>,
+    and <literal>off</literal>.
+   </para>
+
+   <para>
+    At present, this feature is supported only on Linux. The setting
+    is ignored on other systems.
+   </para>
+
+   <para>
+    The use of huge TLB pages results in smaller page tables and
+    less CPU time spent on memory management. For more details, see
+    <ulink url="https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt">hugepages.txt</ulink>
+    in the Linux kernel documentation.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>try</literal>,
+    the server will try to use huge pages, but fall back to using
+    normal allocation if the first attempt fails.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>on</literal>,
+    the server will try to use huge pages, and treat failure as a
+    FATAL error.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>off</literal>,
+    the server will not try to use huge pages.
+   </para>
+  </listitem>
+ </varlistentry>
+
 <varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
  <term><varname>temp_buffers</varname> (<type>integer</type>)</term>
  <indexterm>

diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index b604407..fc0d74b 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -32,6 +32,7 @@
 #include "portability/mem.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "utils/guc.h"
 
 
 typedef key_t IpcMemoryKey;		/* shared memory key passed to shmget(2) */
@@ -367,14 +368,31 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 	 */
 #ifndef EXEC_BACKEND
 	{
-		long		pagesize = sysconf(_SC_PAGE_SIZE);
+		int			flags = PG_MMAP_FLAGS;
+		long		pagesize = 2*1024*1024;
+
+#ifdef MAP_HUGETLB
+		if (huge_tlb_pages == HUGE_TLB_ON || huge_tlb_pages == HUGE_TLB_TRY)
+		{
+			flags |= MAP_HUGETLB;
+		}
+#endif
 
 		/*
 		 * Ensure request size is a multiple of pagesize.
 		 *
-		 * pagesize will, for practical purposes, always be a power of two.
-		 * But just in case it isn't, we do it this way instead of using
-		 * TYPEALIGN().
+		 * By doing this ourselves, we maximise the chances of being
+		 * able to use huge TLB pages even on kernels that do not round
+		 * up the request size correctly, for example due to this bug:
+		 * https://bugzilla.kernel.org/show_bug.cgi?id=56881
+		 *
+		 * The default value of 2MB for pagesize is chosen based on the
+		 * most common supported huge page size. Rounding up to a larger
+		 * value (e.g. 16MB) would use even larger pages if the hardware
+		 * supported them, but would potentially waste more space.
+		 *
+		 * We round up by hand instead of using TYPEALIGN(), but for all
+		 * practical purposes, pagesize will always be a power of two.
 		 */
 		if (pagesize > 0 && size % pagesize != 0)
 			size += pagesize - (size % pagesize);
@@ -386,8 +404,21 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 		 * out to be false, we might need to add a run-time test here and do
 		 * this only if the running kernel supports it.
 		 */
-		AnonymousShmem = mmap(NULL, size, PROT_READ | PROT_WRITE, PG_MMAP_FLAGS,
-			  -1, 0);
+
+		AnonymousShmem = mmap(NULL, size, PROT_READ|PROT_WRITE, flags, -1, 0);
+
+#ifdef MAP_HUGETLB
+		if (huge_tlb_pages == HUGE_TLB_TRY && AnonymousShmem == MAP_FAILED)
+		{
+			elog(DEBUG1, "mmap(%lu) with 

Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Andres Freund
On 2013-10-30 22:39:20 +0530, Abhijit Menon-Sen wrote:
 At 2013-10-30 11:04:36 -0400, t...@sss.pgh.pa.us wrote:
 
   As a compromise, perhaps we can unconditionally round the size up to be
   a multiple of 2MB? […]
  
  That sounds reasonably painless to me. 
 
 Here's a patch that does that and adds a DEBUG1 log message when we try
 with MAP_HUGETLB, fail, and fall back to ordinary mmap.

But it's in no way guaranteed that the smallest hugepage size is
2MB. It'll be on current x86 hardware, but not on any other platform...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Sergey Konoplev
On Wed, Oct 30, 2013 at 8:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Sergey Konoplev gray...@gmail.com writes:
 On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Say what?  There's never been any hugepages support in Postgres.

 There was a way to back shared memory with hugepages when using
 <=9.2. I have used it on ~30 servers for several years, and it brings an
 8-17% performance improvement depending on the memory size. Here you
 will find several paragraphs describing how to do it:
 https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.

 What this describes is how to modify Postgres to request huge pages.
 That's hardly built-in support.

I wasn't talking about built-in support. It was about an ability (a
way) to back shared_buffers with hugepages.

 In any case, as David already explained, we don't do feature additions
 in minor releases.  We'd be especially unlikely to make an exception
 for this, since it has uncertain portability and benefits.  Anything
 that carries portability risks has got to go through a beta testing
 cycle before we'll unleash it on the masses.

Yes, I got the idea. Thanks both of you for clarification.

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Jeff Janes
On Wed, Oct 30, 2013 at 9:54 AM, Leonardo Francalanci m_li...@yahoo.it wrote:

 Jeff Janes wrote
  The index insertions should be fast until the size of the active part of
  the indexes being inserted into exceeds shared_buffers by some amount
  (what that amount is would depend on how much dirty data the kernel is
  willing to allow in the page cache before it starts suffering anxiety
  about it).  If you have enough shared_buffers to make that last for 15
  minutes, then you shouldn't have a problem inserting with live indexes.

 Sooner or later you'll have to checkpoint those shared_buffers...


True, but that is also true of indexes created in bulk.  It all has to
reach disk eventually--either the checkpointer writes it out and fsyncs it,
or the background writer or user backends write it out and the checkpoint
fsyncs it.  If bulk creation uses a ring buffer strategy (I don't know if
it does), then it might kick the buffers to kernel in more or less physical
order, which would help the kernel get them to disk in long sequential
writes.  Or not.  I think that this is where sorted checkpoint could really
help.

 and we are talking about GB of data (my understanding is that we change
 basically every btree page, resulting in re-writing of the whole index).

If the checkpoint interval is as long as the partitioning period, then
hopefully the active index buffers get re-dirtied while protected in
shared_buffers, and only get written to disk once.  If the buffers get
read, dirtied, and evicted from a small shared_buffers over and over again
then you are almost guaranteed that they will get written to disk multiple
times while they are still hot, unless your kernel is very aggressive about
caching dirty data (which will cause other problems).

Cheers,

Jeff


Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Claudio Freire
On Wed, Oct 30, 2013 at 10:53 AM, Simon Riggs si...@2ndquadrant.com wrote:

 LSM-tree also covers the goal of maintaining just 2 sub-trees and a
 concurrent process of merging sub-trees. That sounds like it would
 take a lot of additional time to get right and would need some
 off-line process to perform the merge.



Not necessarily.

Merging means applying insertions/deletions from one subtree to another.
While it's normally preferable and more efficient to do it in batches, I've
successfully implemented in-memory versions that use other writers to
perform the task, amortizing the cost of merging across many operations. In
essence, when there's a need to merge two subtrees, an inserting process
also merges one entry, so the trees slowly get merged. That works very well
in memory; it's quite clear that it's not necessarily generalizable to
external storage, but it's a technique to keep in mind.

Alternatively, vacuum could do it. It's quite clearly a vacuuming task
anyway.


Re: [HACKERS] Fast insertion indexes: why no developments

2013-10-30 Thread Gavin Flower

On 31/10/13 06:46, Jeff Janes wrote:
On Wed, Oct 30, 2013 at 9:54 AM, Leonardo Francalanci 
m_li...@yahoo.it wrote:


Jeff Janes wrote
 The index insertions should be fast until the size of the active part of
 the indexes being inserted into exceeds shared_buffers by some amount
 (what that amount is would depend on how much dirty data the kernel is
 willing to allow in the page cache before it starts suffering anxiety
 about it).  If you have enough shared_buffers to make that last for 15
 minutes, then you shouldn't have a problem inserting with live indexes.

Sooner or later you'll have to checkpoint those shared_buffers...


True, but that is also true of indexes created in bulk.  It all has to 
reach disk eventually--either the checkpointer writes it out and 
fsyncs it, or the background writer or user backends writes it out and 
the checkpoint fsyncs it.  If bulk creation uses a ring buffer 
strategy (I don't know if it does), then it might kick the buffers to 
kernel in more or less physical order, which would help the kernel get 
them to disk in long sequential writes.  Or not.  I think that this is 
where sorted checkpoint could really help.


 and we are talking about GB of data (my understanding is that we change 
 basically every btree page, resulting in re-writing of the whole index).

If the checkpoint interval is as long as the partitioning period, then 
hopefully the active index buffers get re-dirtied while protected in 
shared_buffers, and only get written to disk once.  If the buffers get 
read, dirtied, and evicted from a small shared_buffers over and over 
again then you are almost guaranteed that they will get written to disk 
multiple times while they are still hot, unless your kernel is very 
aggressive about caching dirty data (which will cause other problems).


Cheers,

Jeff

How about being able to mark indexes:
'MEMORY ONLY' to make them not go to disk
and
'PERSISTENT | TRANSIENT' to mark whether they should be recreated on 
machine bootup?


or something similar


Cheers,
Gavin


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Alvaro Herrera
Sergey Konoplev escribió:
 On Wed, Oct 30, 2013 at 8:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Sergey Konoplev gray...@gmail.com writes:

  There were an ability to back shared memory with hugepages when using
  <=9.2. I use it on ~30 servers for several years and it brings 8-17%
  of performance depending on the memory size. Here you will find
  several paragraphs of the description about how to do it
  https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
 
  What this describes is how to modify Postgres to request huge pages.
  That's hardly built-in support.
 
 I wasn't talking about a built-in support. It was about an ability (a
 way) to back sh_buf with hugepages.

Then what you need is to set 
dynamic_shared_memory_type = sysv
in postgresql.conf.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Sergey Konoplev
On Wed, Oct 30, 2013 at 11:50 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
  There were an ability to back shared memory with hugepages when using
  <=9.2. I use it on ~30 servers for several years and it brings 8-17%
  of performance depending on the memory size. Here you will find
  several paragraphs of the description about how to do it
  https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
 
  What this describes is how to modify Postgres to request huge pages.
  That's hardly built-in support.

 I wasn't talking about a built-in support. It was about an ability (a
 way) to back sh_buf with hugepages.

 Then what you need is to set
 dynamic_shared_memory_type = sysv
 in postgresql.conf.

I neither found this parameter in the docs, nor does it work when I
specify it in postgresql.conf:

LOG:  unrecognized configuration parameter "dynamic_shared_memory_type"
in file "/etc/postgresql/9.3/main/postgresql.conf" line 114
FATAL:  configuration file "/etc/postgresql/9.3/main/postgresql.conf"
contains errors

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Alvaro Herrera
Alvaro Herrera escribió:
 Sergey Konoplev escribió:

  I wasn't talking about a built-in support. It was about an ability (a
  way) to back sh_buf with hugepages.
 
 Then what you need is to set 
 dynamic_shared_memory_type = sysv
 in postgresql.conf.

The above is mistaken -- there's no way to disable the mmap() segment in
9.3, other than recompiling with EXEC_BACKEND which is probably
undesirable for other reasons.

I don't think I had ever heard of that recipe to use huge pages in
previous versions; since the win is probably significant in some
systems, we could have made this configurable.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Sergey Konoplev
On Wed, Oct 30, 2013 at 12:17 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
  I wasn't talking about a built-in support. It was about an ability (a
  way) to back sh_buf with hugepages.

 Then what you need is to set
 dynamic_shared_memory_type = sysv
 in postgresql.conf.

 The above is mistaken -- there's no way to disable the mmap() segment in
 9.3, other than recompiling with EXEC_BACKEND which is probably
 undesirable for other reasons.

Alternatively, I assume it could be linked with libhugetlbfs, in which
case you don't need any source modifications. However, I am not sure it
will work with shared memory.

 I don't think I had ever heard of that recipe to use huge pages in
 previous versions; since the win is probably significant in some
 systems, we could have made this configurable.

There are several articles on the web describing how to do this,
besides mine. And the win becomes most significant when you have 64GB
or more on your server.

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?

2013-10-30 Thread MauMau

From: Hiroshi Saito hiro...@winpg.jp

It is my bug patch applied to ver 1.6.2, sorry..
Then I made the next patch; it is here, please see:
http://winpg.jp/~saito/pg_work/OSSP_win32/

I will adjust it with Ralf-san again.


Thanks.  Yes, I wrote the wrong URL and meant this one.  Is this patch 
(uuid-1.6.2_win32_patch2) the one that EnterpriseDB is using to publish the 
community PostgreSQL package?


Regards
MauMau





Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?

2013-10-30 Thread MauMau

From: Tom Lane t...@sss.pgh.pa.us

Alvaro Herrera alvhe...@2ndquadrant.com writes:

At this point, I think we need to consider ossp-uuid as dead code.


Yeah, but what shall we replace it with?  And can we preserve the
API contrib/uuid-ossp offers?  (Maybe we shouldn't even try, but
just deprecate that module and start fresh.)


Would it be welcomed in this community if a patch for built-in UUID 
generation functions were provided?  (I'm not saying I can provide them 
soon.)  According to the following page, Linux has libuuid.so.  Microsoft 
provides UuidCreate() as a Win32 API.  I don't know about other UNIXes.


MySQL, Oracle (and perhaps other DBMSs) offer UUID generation functions.  It 
would be convenient to be able to use such a function as the default value 
for a column.


Regards
MauMau







[HACKERS] Shave a few instructions from child-process startup sequence

2013-10-30 Thread Gurjeet Singh
Just a small patch; hopefully useful.

Best regards,
-- 
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index ccb8b86..48dc7af 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2236,6 +2236,8 @@ ClosePostmasterPorts(bool am_syslogger)
 			StreamClose(ListenSocket[i]);
 			ListenSocket[i] = PGINVALID_SOCKET;
 		}
+		else
+			break;
 	}
 
 	/* If using syslogger, close the read side of the pipe */



Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-10-30 Thread Sergey Konoplev
On Wed, Oct 30, 2013 at 12:51 PM, Sergey Konoplev gray...@gmail.com wrote:
 On Wed, Oct 30, 2013 at 12:17 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  I wasn't talking about a built-in support. It was about an ability (a
  way) to back sh_buf with hugepages.

 Then what you need is to set
 dynamic_shared_memory_type = sysv
 in postgresql.conf.

 The above is mistaken -- there's no way to disable the mmap() segment in
 9.3, other than recompiling with EXEC_BACKEND which is probably
 undesirable for other reasons.

 Alternatively, I assume it could be linked with libhugetlbfs and you
 don't need any source modifications in this case. However I am not
 sure it will work with shared memory.

BTW, I managed to run 9.3 backed with hugepages after I put
HUGETLB_MORECORE (see man libhugetlbfs) into the environment yesterday,
but after some time of working it failed with the messages shown below.

syslog:

Oct 29 17:53:13 grayhemp kernel: [150579.903875] PID 7584 killed due
to inadequate hugepage pool

postgres:

libhugetlbfs: 2013-10-29 17:53:21 PDT LOG:  server process
(PID 7584) was terminated by signal 7: Bus error
2013-10-29 17:53:21 PDT LOG:  terminating any other active server processes
2013-10-29 17:53:21 PDT WARNING:  terminating connection because of crash of
another server process
2013-10-29 17:53:21 PDT DETAIL:  The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.

My theory is that it happened after the amount of huge pages
(vm.nr_overcommit_hugepages + vm.nr_hugepages) was exceeded, but I
might be wrong.

Does anybody have any thoughts on why it happened and how to work around it?

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [HACKERS] How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?

2013-10-30 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 From: Tom Lane t...@sss.pgh.pa.us
 Yeah, but what shall we replace it with?  And can we preserve the
 API contrib/uuid-ossp offers?  (Maybe we shouldn't even try, but
 just deprecate that module and start fresh.)

 Would it be welcomed in this community if a patch for built-in UUID 
 generation functions are provided?

Based on what?  The reason we've held this code at arms length (in a
contrib module) is exactly that we weren't too satisfied with the
portability or adequacy of the generation methods.  Moving the functions
into core doesn't do anything to make that better; rather, it raises the
portability bar even higher, as well as making it almost impossible to
contemplate further API changes.

Note the lack of enthusiasm for taking on maintainership of the OSSP
code.  Pushing it into core would mean that we're buying into that
maintainership, hook line and sinker.  I don't think that such a
proposal would fly.

regards, tom lane




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Oct 30, 2013 at 9:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 If it *isn't* about the main memory segment, what the hell are we doing
 creating random addon segments during bootstrap?  None of the DSM code
 should even get control at this point, IMO.

 Here's a short summary of what I posted back in August: at system
 startup time, the postmaster creates one dynamic shared segment,
 called the control segment.

Well, as I've pointed out already in this thread, the postmaster does not
execute during bootstrap, which makes me think this code is getting called
from the wrong place.  What possible reason is there to create add-on shm
segments in bootstrap mode?  I'm even dubious that we should create them
in standalone backends, because there will be no other process to share
them with.

I'm inclined to think this initialization should be moved to the actual
postmaster (and I mean postmaster.c) from wherever it is now.  That might
fix the not-so-random name choice in itself, but if it doesn't, then we
could consider where to move the random-seed-initialization step to.

regards, tom lane




Re: [HACKERS] appendStringInfo vs appendStringInfoString

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 12:12 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Andres Freund escribió:
 On 2013-10-30 10:52:05 -0300, Alvaro Herrera wrote:
  Robert Haas escribió:
   On Wed, Oct 30, 2013 at 4:51 AM, David Rowley dgrowle...@gmail.com 
   wrote:
I've attached a re-based version of this.
  
   I don't see any compelling reason not to commit this.  Does anyone
   wish to object?
 
  I think a blanket substitution of places that currently have %s might
  cause bugs, particularly if the string is user-supplied.  It might be
  right for many cases, but did you/someone review each such callsite?
 
  No objection to the other half, that substitute calls that don't have
  %s.

 appendStringInfoString() doesn't expand format strings, so I am not sure
 what you're worried about?

 Um.  Blame my lack of decent breakfast this morning.

So, does that mean we're good to go?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Something fishy happening on frogmouth

2013-10-30 Thread Robert Haas
On Wed, Oct 30, 2013 at 9:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Oct 30, 2013 at 9:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 If it *isn't* about the main memory segment, what the hell are we doing
 creating random addon segments during bootstrap?  None of the DSM code
 should even get control at this point, IMO.

 Here's a short summary of what I posted back in August: at system
 startup time, the postmaster creates one dynamic shared segment,
 called the control segment.

 Well, as I've pointed out already in this thread, the postmaster does not
 execute during bootstrap, which makes me think this code is getting called
 from the wrong place.  What possible reason is there to create add-on shm
 segments in bootstrap mode?  I'm even dubious that we should create them
 in standalone backends, because there will be no other process to share
 them with.

 I'm inclined to think this initialization should be moved to the actual
 postmaster (and I mean postmaster.c) from wherever it is now.  That might
 fix the not-so-random name choice in itself, but if it doesn't, then we
 could consider where to move the random-seed-initialization step to.

The initialization code is currently called form
CreateSharedMemoryAndSemaphores(), like this:

	/* Initialize dynamic shared memory facilities. */
	if (!IsUnderPostmaster)
		dsm_postmaster_startup();

The reason I put it there is that if the postmaster does a
crash-and-restart cycle, we need create a new control segment just as
we need to create a new main shared memory segment.  (We also need to
make sure all dynamic shared memory segments left over from the
previous postmaster lifetime get nuked, but that happens earlier, as
part of the shmem_exit sequence.)

There may be a good reason to move it elsewhere, but by and large I
have not had good luck deviating from the pattern laid down for the
main shared memory segment.  My respect for that battle-tested code is
growing daily; every time I think I know better, I get burned.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Long paths for tablespace leads to uninterruptible hang in Windows

2013-10-30 Thread Amit Kapila
On Wed, Oct 16, 2013 at 1:44 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Tue, Oct 15, 2013 at 6:28 PM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Oct 15, 2013 at 2:55 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Oct 14, 2013 at 1:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, that sucks.  So it's a Windows bug.

 I agree we'll probably want to work around it in the end, but I still
 think it should be put to Microsoft PSS if we can. The usual - have we
 actually produced a self-contained example that does just this (and
 doesn't include the full postgres support) and submitted it to
 *microsoft* for comments?

   I have written a self-contained win32 console application with which
 the issue can be reproduced.
   The application project is attached to this mail.

Logged a support ticket with Microsoft; they could reproduce the issue
with the sample application (it is the same as what I had posted on
hackers in this thread) and are working on it.
Progress on the ticket can be checked at the link below:
https://support.microsoft.com/oas/default.aspx?st=1&as=1&iid=113&iguid=42d48223-e81d-4693-a7b2-2e70186f06b2_1_1&c=SMC&ln=en-in&incno=113102310885322

I could view the above link using my Microsoft account.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ is not a valid Win32 application

2013-10-30 Thread Amit Kapila
On Tue, Oct 29, 2013 at 12:46 PM, Naoya Anzai
anzai-na...@mxu.nes.nec.co.jp wrote:
 Hi Sandeep

 I think, you should change the subject line  to Unquoted service path 
 containing space is vulnerable and can be exploited on Windows to get the 
 attention..  :)
 Thank you for advice!
 I'll try to post to pgsql-bugs again.

I could also reproduce this issue. The situation is very rare: an exe
with the same name as the first part of the directory path must exist
in the installation path.
I suggest you post your patch in the next commitfest.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ is not a valid Win32 application

2013-10-30 Thread Asif Naeem
On Thu, Oct 31, 2013 at 10:17 AM, Amit Kapila amit.kapil...@gmail.com wrote:

 On Tue, Oct 29, 2013 at 12:46 PM, Naoya Anzai
 anzai-na...@mxu.nes.nec.co.jp wrote:
  Hi Sandeep
 
  I think, you should change the subject line  to Unquoted service path
 containing space is vulnerable and can be exploited on Windows to get the
 attention..  :)
  Thank you for advice!
  I'll try to post to pgsql-bugs again.

 I could also reproduce this issue. The situation is very rare such
 that an exe with name same as first part of directory should exist
 in installation path.


I believe it is a security risk with a bigger impact, as it is related to
the Windows environment and installers rely on it.


 I suggest you can post your patch in next commit fest.


Yes. Aren't vulnerabilities/security risks taken care of on a more urgent
basis?


 With Regards,
 Amit Kapila.
 EnterpriseDB: http://www.enterprisedb.com