Re: [HACKERS] Failure while inserting parent tuple to B-tree is not fun

2014-02-04 Thread Heikki Linnakangas

On 02/04/2014 02:40 AM, Peter Geoghegan wrote:

On Fri, Jan 31, 2014 at 9:09 AM, Heikki Linnakangas
 wrote:

I refactored the loop in _bt_moveright to, well, not have that bug anymore.
The 'page' and 'opaque' pointers are now fetched at the beginning of the
loop. Did I miss something?


I think so, yes. You still aren't assigning the value returned by
_bt_getbuf() to 'buf'.


D'oh, you're right.


Since, as I mentioned, _bt_finish_split() ultimately unlocks *and
unpins*, it may not be the same buffer as before, so even with the
refactoring there are race conditions.


Care to elaborate? Or are you just referring to the missing "buf = " ?


A closely related issue is that you haven't mentioned anything about
buffer pins/refcount side effects in comments above
_bt_finish_split(), even though I believe you should.


Ok.


A minor stylistic concern is that I think it would be better to only
have one pair of _bt_finish_split()/_bt_getbuf() calls regardless of
the initial value of 'access'.


Ok.

I also changed _bt_moveright to never return a write-locked buffer, when 
the caller asked for a read-lock (an issue you pointed out earlier in 
this thread).


Attached is a new version of the patch, with those issues fixed. 
btree-incomplete-split-4.patch is a complete patch against the latest 
fix-btree-page-deletion patch, and moveright-assign-fix.patch is just 
the changes to _bt_moveright, if you want to review just the changes 
since the previous patch I posted.
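
As a schematic sketch (not the patch itself: declarations, error handling and
the normal move-right step are omitted, and 'rel', 'stack', 'scankey' and
'access' are assumed from the surrounding function), the loop shape being
discussed looks roughly like this:

	for (;;)
	{
		page = BufferGetPage(buf);
		opaque = (BTPageOpaque) PageGetSpecialPointer(page);

		if (P_INCOMPLETE_SPLIT(opaque))
		{
			BlockNumber blkno = BufferGetBlockNumber(buf);

			/* releases the lock *and* the pin on buf ... */
			_bt_finish_split(rel, buf, stack);
			/* ... so we must re-acquire the page and assign the result
			 * back to 'buf' before looping */
			buf = _bt_getbuf(rel, blkno, access);
			continue;			/* page/opaque re-fetched at the top */
		}

		/*
		 * Normal logic goes here: if the scan key can be on this page,
		 * break; otherwise release this page, step right to
		 * opaque->btpo_next, and loop.
		 */
		break;
	}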


- Heikki
diff --git a/src/backend/access/nbtree/README b/src/backend/access/nbtree/README
index 03efc29..43ee75f 100644
--- a/src/backend/access/nbtree/README
+++ b/src/backend/access/nbtree/README
@@ -404,12 +404,34 @@ an additional insertion above that, etc).
 For a root split, the followon WAL entry is a "new root" entry rather than
 an "insertion" entry, but details are otherwise much the same.
 
-Because insertion involves multiple atomic actions, the WAL replay logic
-has to detect the case where a page split isn't followed by a matching
-insertion on the parent level, and then do that insertion on its own (and
-recursively for any subsequent parent insertion, of course).  This is
-feasible because the WAL entry for the split contains enough info to know
-what must be inserted in the parent level.
+Because splitting involves multiple atomic actions, it's possible that the
+system crashes between splitting a page and inserting the downlink for the
+new half to the parent. After recovery, the downlink for the new page will
+be missing. The search algorithm works correctly, as the page will be found
+by following the right-link from its left sibling, although if a lot of
+downlinks in the tree are missing, performance will suffer. A more serious
+consequence is that if the page without a downlink gets split again, the
+insertion algorithm will fail to find the location in the parent level to
+insert the downlink.
+
+Our approach is to create any missing downlinks on-the-fly, when
+searching the tree for a new insertion. It could be done during searches,
+too, but it seems best not to put any extra updates in what would otherwise
+be a read-only operation (updating is not possible in hot standby mode
+anyway). To identify missing downlinks, when a page is split, the left page
+is flagged to indicate that the split is not yet complete (INCOMPLETE_SPLIT).
+When the downlink is inserted to the parent, the flag is cleared atomically
+with the insertion. The child page is kept locked until the insertion in the
+parent is finished and the flag in the child cleared, but can be released
+immediately after that, before recursing up the tree, if the parent also
+needs to be split. This ensures that incompletely split pages should not be
+seen under normal circumstances; only when insertion to the parent fails
+for some reason.
+
+We flag the left page, even though it's the right page that's missing the
+downlink, because it's more convenient to know already when following the
+right-link from the left page to the right page that it will need to have
+its downlink inserted to the parent.
 
 When splitting a non-root page that is alone on its level, the required
 metapage update (of the "fast root" link) is performed and logged as part
@@ -422,6 +444,14 @@ page is a second record.  If vacuum is interrupted for some reason, or the
 system crashes, the tree is consistent for searches and insertions.  The next
 VACUUM will find the half-dead leaf page and continue the deletion.
 
+Before 9.4, we used to keep track of incomplete splits and page deletions
+during recovery and finish them immediately at end of recovery, instead of
+doing it lazily at the next insertion or vacuum. However, that made the
+recovery much more complicated, and only fixed the problem when crash
+recovery was performed. An incomplete split can also occur if an otherwise
+recoverable error, like out-of-memory or out-of-disk-space, happens while
+inserting the downlink to the parent.
+
 Sca

Re: [HACKERS] jsonb and nested hstore

2014-02-04 Thread Oleg Bartunov
Andrew provided us more information and we'll work on recv. What
do people think about testing this stuff? BTW, we don't have any
regression tests on this.

Oleg

On Wed, Feb 5, 2014 at 2:03 AM, Josh Berkus  wrote:
> On 02/03/2014 07:27 AM, Andres Freund wrote:
>> On 2014-02-03 09:22:52 -0600, Merlin Moncure wrote:
 I lost my stomach (or maybe it was the glass of red) somewhere in the
 middle, but I think this needs a lot of work. Especially the io code
 doesn't seem ready to me. I'd consider ripping out the send/recv code
 for 9.4, that seems the biggest can of worms. It will still be usable
 without.
>>>
>>> Not having type send/recv functions is somewhat dangerous; it can
>>> cause problems for libraries that run everything through the binary
>>> wire format.  I'd give jsonb a pass on that, being a new type, but
>>> would be concerned if hstore had that ability revoked.
>>
>> Yea, removing it for hstore would be a compat problem...
>>
>>> offhand note: hstore_send seems pretty simply written and clean; it's
>>> a simple nonrecursive iterator...
>>
>> But a send function is pretty pointless without the corresponding recv
>> function... And imo recv simply is too dangerous as it's currently
>> written.
>> I am not saying that it cannot be made to work, just that it's still nearly
>> as ugly as when I pointed out several of the dangers some weeks back.
>
> Oleg, Teodor, any comments on the above?
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Failback without rebuild

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 3:14 PM, Amit Kapila  wrote:
> On Wed, Feb 5, 2014 at 10:30 AM, James Sewell 
>> I've seen some proposals and a tool (pg_rewind), but all seem to have draw
>> backs.
>
> As far as I remember, one of the main drawbacks for pg_rewind was related to
> hint bits, which can be avoided by wal_log_hints. pg_rewind is not part of
> core PostgreSQL code; however, if you wish, you can try that tool to see if
> it can solve your purpose.
For 9.3, pg_rewind is only safe with page checksums enabled. For 9.4,
yes, wal_log_hints or checksums is mandatory. The code also contains
some safety checks to ensure that a node not using those
parameters cannot be rewound.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-04 Thread Heikki Linnakangas

On 02/03/2014 05:22 PM, Merlin Moncure wrote:

>I lost my stomach (or maybe it was the glass of red) somewhere in the
>middle, but I think this needs a lot of work. Especially the io code
>doesn't seem ready to me. I'd consider ripping out the send/recv code
>for 9.4, that seems the biggest can of worms. It will still be usable
>without.

Not having type send/recv functions is somewhat dangerous; it can
cause problems for libraries that run everything through the binary
wire format.  I'd give jsonb a pass on that, being a new type, but
would be concerned if hstore had that ability revoked.


send/recv functions are also needed for binary-format COPY. IMHO jsonb 
must have send/recv functions. All other built-in types have them, 
except for types like 'smgr', 'aclitem' and 'any*' that no-one should be 
using as column types.
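
As a generic illustration of the shape of such a pair -- a minimal sketch for
a toy fixed-width type, not the proposed jsonb or hstore code -- using the
pqformat helpers:

#include "postgres.h"
#include "fmgr.h"
#include "libpq/pqformat.h"

/* Serialize the datum into its binary wire representation. */
Datum
mytype_send(PG_FUNCTION_ARGS)
{
	int32		val = PG_GETARG_INT32(0);
	StringInfoData buf;

	pq_begintypsend(&buf);
	pq_sendint(&buf, val, sizeof(int32));
	PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
}

/* Rebuild the datum from client-supplied bytes; this is the side that
 * must treat its input as untrusted and validate it fully. */
Datum
mytype_recv(PG_FUNCTION_ARGS)
{
	StringInfo	buf = (StringInfo) PG_GETARG_POINTER(0);

	PG_RETURN_INT32(pq_getmsgint(buf, sizeof(int32)));
}

For a container type like jsonb or nested hstore the recv side is of course
far more involved, which is exactly the part Andres considers dangerous as
currently written.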


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Failback without rebuild

2014-02-04 Thread Amit Kapila
On Wed, Feb 5, 2014 at 10:30 AM, James Sewell 
wrote:
>
> Hello All,
>
> I have been reading through some of the recent discussions about failback
when in a streaming replication setup. I define failback as:
>
> Node A is master, Node B is slave
> Node A crashes || Node A is stopped || nothing happens
> Promote Node B to Master
> Attach Node A as slave
>
> My understanding is currently to achieve step three you need to take a
base backup of Node B and deploy it to Node A before starting streaming
replication (or use rsync etc...).

I think in above sentence you mean to say "to achieve step *four* .."


> This is very undesirable for many users, especially if they have a very
large database.
>
> From the discussions I can see that the problem is to do with Node A
writing changes to disk that are not streamed to Node B before Node A crashes.

Yes, this is right.


> Has there been any consensus on this issue? Are there any solutions which
might make it into 9.4 or 9.5?

As far as I know, there is still no solution provided in 9.4, and I can't say
anything for 9.5 with any certainty. However, in 9.4 there is a new parameter,
wal_log_hints, which can be useful to overcome a drawback of pg_rewind.

> I've seen some proposals and a tool (pg_rewind), but all seem to have
draw backs.

As far as I remember, one of the main drawbacks for pg_rewind was related to
hint bits, which can be avoided by wal_log_hints. pg_rewind is not part of
core PostgreSQL code; however, if you wish, you can try that tool to see if
it can solve your purpose.

Note - James, in my previous reply I missed cc'ing hackers, so sending it
again.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Amit Kapila
On Tue, Feb 4, 2014 at 11:58 PM, Robert Haas  wrote:
> On Tue, Feb 4, 2014 at 12:39 PM, Amit Kapila  wrote:
>> Now there is approximately 1.4~5% CPU gain for
>> "hundred tiny fields, half nulled" case

> Assuming that the logic isn't buggy, a point in need of further study,
> I'm starting to feel like we want to have this.  And I might even be
> tempted to remove the table-level off switch.

I have tried to stress the worst case more, as you are thinking of
removing the table-level switch, and found that even if we increase the
data by approx. 8 times ("ten long fields, all changed", each field containing
80 bytes of data), the CPU overhead is still < 5%, which clearly shows that
the overhead doesn't increase much even if the length of unmatched data
is increased by a much larger factor.
So the data for the worst case adds more weight to your statement
("remove table-level switch"); however, there is no harm in keeping the
table-level option with default 'true', and if some users are really sure
the updates in their system will have nothing in common, then they can
set this new option to 'false'.

Below is the data for the new case "ten long fields, all changed" added
in the attached script file:

Unpatched
           testname           | wal_generated |     duration
------------------------------+---------------+------------------
 ten long fields, all changed |    3473999520 | 45.0375978946686
 ten long fields, all changed |    3473999864 | 45.2536928653717
 ten long fields, all changed |    3474006880 | 45.1887288093567


After pgrb_delta_encoding_v8.patch
------------------------------------------------------------------
           testname           | wal_generated |     duration
------------------------------+---------------+------------------
 ten long fields, all changed |    3474006456 | 47.5744359493256
 ten long fields, all changed |    3474000136 | 47.3830440044403
 ten long fields, all changed |    3474002688 | 46.9923310279846



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


wal-update-testsuite.sh
Description: Bourne shell script

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Craig Ringer  writes:
> On 02/05/2014 06:29 AM, Tom Lane wrote:
>> I had been okay with the manual PGDLLIMPORT-sprinkling approach
>> (not happy with it, of course, but prepared to tolerate it) as long
>> as I believed the buildfarm would reliably tell us of the need for
>> it.  That assumption has now been conclusively disproven, though.

> I'm kind of horrified that the dynamic linker doesn't throw its toys
> when it sees this.

Indeed :-(.

The truly strange part of this is that it seems that the one Windows
buildfarm member that's telling the truth (or most nearly so, anyway)
is narwhal, which appears to have the oldest and cruftiest toolchain
of the lot.  I'd really like to come out the other end of this
investigation with a clear understanding of why the newer toolchains
are failing to report a link problem, and yet not building working
executables.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL Failback without rebuild

2014-02-04 Thread James Sewell
Hello All,

I have been reading through some of the recent discussions about failback
when in a streaming replication setup. I define failback as:


   1. Node A is master, Node B is slave
   2. Node A crashes || Node A is stopped || nothing happens
   3. Promote Node B to Master
   4. Attach Node A as slave

My understanding is currently to achieve step three you need to take a base
backup of Node B and deploy it to Node A before starting streaming
replication (or use rsync etc...).

This is very undesirable for many users, especially if they have a very
large database.

From the discussions I can see that the problem is to do with Node A writing
changes to disk that are not streamed to Node B before Node A crashes.

Has there been any consensus on this issue? Are there any solutions which
might make it into 9.4 or 9.5? I've seen some proposals and a tool
(pg_rewind), but all seem to have draw backs.

I've been looking mainly at these threads:

http://www.postgresql.org/message-id/CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=ggyu1kmt+s2xcq...@mail.gmail.com

http://www.postgresql.org/message-id/caf8q-gxg3pqtf71nvece-6ozraew5pwhk7yqtbjgwrfu513...@mail.gmail.com

http://www.postgresql.org/message-id/519df910.4020...@vmware.com

Cheers,

James Sewell,
PostgreSQL Team Lead / Solutions Architect
__


 Level 2, 50 Queen St, Melbourne VIC 3000

P (+61) 3 8370 8000  W www.lisasoft.com  F (+61) 3 8370 8099



[HACKERS] Re: Viability of text HISTORY/INSTALL/regression README files (was Re: [COMMITTERS] pgsql: Document a few more regression test hazards.)

2014-02-04 Thread Noah Misch
On Tue, Feb 04, 2014 at 03:28:45PM -0500, Robert Haas wrote:
> On Tue, Feb 4, 2014 at 1:38 AM, Tom Lane  wrote:
> > Noah Misch  writes:
> >>> Robert Haas  writes:
>  I wonder if these standalone things are really worthwhile.
> >
> >> I wonder how difficult it would be to make sufficient link data available 
> >> when
> >> building the standalone files.  There would be no linking per se; we would
> >> just need the referent's text fragment emitted where the <xref> tag 
> >> appears.
> >
> > IIRC, that's basically what the "workaround" is, except it's not very
> > automated.  Even if it were automated, though, there's still a problem:
> > such links aren't really *useful* in flat text format.  I think that
> > forcing the author to actually think about what to put there in the
> > flat text version is a good thing, if we're going to retain the flat
> > text version at all.
> 
> Right.  I mean, a lot of the links say things like "Section 26.2"
> which obviously makes no sense in a standalone text file.

For <xref>s normally displayed that way, text output could emit a URL, either
inline or in the form of a footnote.  For link targets (e.g. SQL commands)
having a friendly text fragment for <xref> sites, use the normal fragment.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Duplicate Tuple Elidation during External Sort for DISTINCT

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 10:33 AM, Jon Nelson  wrote:
> What - if anything - do I need to do to get this on the commitfest
> list for the next commitfest?
The list of instructions is here:
http://wiki.postgresql.org/wiki/Submitting_a_Patch#Patch_submission
Then the next commit fest (#1 for 9.5), will be in June and is here:
https://commitfest.postgresql.org/action/commitfest_view?id=22
Note that you will need an account on postgresql.org to register your patch.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Craig Ringer
On 02/05/2014 12:06 AM, Andrew Dunstan wrote:
> 
> On 02/04/2014 10:43 AM, Tom Lane wrote:
>> Andres Freund  writes:
>>> On 2014-02-04 02:10:47 -0500, Tom Lane wrote:
 Meh.  It might be that the DateStyle usage in postgres_fdw would
 accidentally fail to malfunction if it saw a bogus value of the
 variable.
 But it's hard to believe that this would be true of MainLWLockArray.
>>> There's not that much lwlock usage in contrib. It's just
>>> pg_stat_statements and pg_buffercache. Neither has tests... So it very
>>> well could be that breakage simply hasn't been observed.
>> Hm, you're right --- I'd have thought there were more of those.
>>
>> Ugh.  This problem was bad enough when I thought that it would only lead
>> to link-time errors detectable in the buildfarm.  If it can lead to
>> errors
>> only observable at runtime --- and maybe not obvious even then --- then
>> I think we *have to* do something about it.  By that I mean that we must
>> get rid of the need to manually plaster PGDLLIMPORT on global variables.
>>
>> Anybody with a Windows build environment want to test the "#define
>> extern"
>> trick?
>>
>>
> 
> 
> 
> We have details on how to build with Mingw/Msys on Windows on an Amazon
> VM  which is either
> free or very cheap. Do I need to give instructions on how to do this for
> MSVC builds too? It's really not terribly hard.

I've got some guidance on that here:

https://github.com/2ndQuadrant/pg_build_win

from setting up a clean Windows instance for builds, on to the build
process.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Craig Ringer
On 02/05/2014 06:29 AM, Tom Lane wrote:
> I had been okay with the manual PGDLLIMPORT-sprinkling approach
> (not happy with it, of course, but prepared to tolerate it) as long
> as I believed the buildfarm would reliably tell us of the need for
> it.  That assumption has now been conclusively disproven, though.
> The question therefore becomes, what are we going to do instead?
> "Keep on doing what we were doing" doesn't strike me as an acceptable
> answer.

I'm in complete agreement here. Silent failures we can't test for that
might sneak data corruption in are not cool.

I'll have a look into ways of making sure that globals with incorrect
linkage fail at runtime link time, as is the case for functions. I won't
be able to spend much time on it immediately; will take a quick look and
if I don't find anything, will follow up post-CF4.

I'm kind of horrified that the dynamic linker doesn't throw its toys
when it sees this.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 10:48 PM, Craig Ringer wrote:

On 02/05/2014 04:08 AM, Jeff Janes wrote:

  So doing a git bisect is just painful.  Is the MSVC
build faster?

Yes, but not on EC2.

I've found Windows EC2 instances so impossibly slow I just gave up
working with it. It took 1.5 hours to do a build and regression check
with msvc on a Medium EC2 instance; the same build takes 10 mins on my
tiny Intel i3 based Windows test machine.

It's possible that some of the larger instance types may perform better
as they use different approaches to virtualization than simple Xen HVM.
I haven't tested, as the cost of those instances rapidly becomes
problematic.




I typically use m1.medium. A spot instance for that is currently $0.033 
/ hour. When I was working on one such the other day it took nothing 
like 1.5 hours to build and test. I didn't time it so I can't tell you 
how long it took, but much less than that. Of course YMMV.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Craig Ringer
On 02/05/2014 02:53 AM, Tom Lane wrote:
> "Joshua D. Drake"  writes:
>> On 02/04/2014 09:34 AM, Tom Lane wrote:
>>> My own opinion is that I've already wasted untold man-hours thanks to
>>> the random porting problems induced by Windows, a platform that I never
>>> have and never will care about personally.  I will *not* spend my own
>>> time doing tests that someone else could do.  If we can't get some
>>> effort contributed by someone who does use that platform, I'm personally
>>> prepared to declare the entire damn thing no longer supported.
> 
>> Although that is obviously your prerogative it is important to remember 
>> that Windows is easily the second most used version of PostgreSQL out 
>> there (behind Linux).
> 
> [ shrug... ]  If it's so widely used, why is it so hard to find somebody
> who's willing to put in some development effort for it?

Sadly, I'm now convinced that Windows users are just much less likely to
contribute anything constructive to a project - code, documentation,
anything. It's a real "gimme" world, and has a really strong ethic that
the "vendor" does things with their software, you don't just go and get
involved.

That said, I think the fuss being made about the intrusiveness of
Windows support and its impact is overblown here. These are a few macros
that're noops on other platforms anyway, and some build code hidden away
in src/tools .

It's ugly. It's annoying. It's crap that users don't contribute back.
It's also just not that big a deal; there are many other things that are
similarly painful or more so.

Expecting folks to fire up an AMI and hand-control the build with a GUI
over a high latency connection is a waste of time better spent
elsewhere, though, and will result in everyone continuing to avoid any
sort of testing on Windows.

Personally what I think we need is a *public* Jenkins instance, or
similar, to which you can push a branch and have it automatically build
and "make check" on Windows. I've got that running for internal use, but
it's on a host I can't share access to (and an unreliable one, at that).

I'd be happy to share the setup for the Jenkins instance and the Windows
integration parts, along with the instructions I wrote on how to set up
the Windows build test node(s) and the tooling I'm using to automate the
Windows build.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Craig Ringer
On 02/05/2014 04:08 AM, Jeff Janes wrote:
>  So doing a git bisect is just painful.  Is the MSVC
> build faster?

Yes, but not on EC2.

I've found Windows EC2 instances so impossibly slow I just gave up
working with it. It took 1.5 hours to do a build and regression check
with msvc on a Medium EC2 instance; the same build takes 10 mins on my
tiny Intel i3 based Windows test machine.

It's possible that some of the larger instance types may perform better
as they use different approaches to virtualization than simple Xen HVM.
I haven't tested, as the cost of those instances rapidly becomes
problematic.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] could not create IPv6 socket (AI_ADDRCONFIG)

2014-02-04 Thread Tom Lane
Kyotaro HORIGUCHI  writes:
> getaddrinfo returned two identical entries having the same address
> AF_INET "127.0.0.1:14357". One of them is for "::1" in
> hosts. This is worse than the current behavior X-(

Yeah, the fundamental issue is that getaddrinfo tends to return bogus
info.

>>> How about just adding a HINT?

>> Hmm ... maybe, but how would you phrase the hint exactly?

> Putting the 'exactly' aside, is it something that means 'You will get
> this message when the feature to handle the address family is
> disabled', only for EAFNOSUPPORT?

>  Though I don't know whether such a hint is helpful for those who
> tend to mind that kind of message.

I still think the best thing might be to reduce the individual messages
to DEBUG-something, and only produce a LOG entry if we are unable to
bind to *any* of the addresses returned by getaddrinfo.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] could not create IPv6 socket (AI_ADDRCONFIG)

2014-02-04 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 04 Feb 2014 02:07:08 -0500, Tom Lane  wrote in 
<3176.1391497...@sss.pgh.pa.us>
> One good reason not to trust this too much is that getaddrinfo() is
> fundamentally a userspace DNS access function, and as such it has
> no very good way to know if there's currently an IPv4 or IPv6
> interface configured on the local system.  At minimum there are
> obvious race conditions in that.

A case which would be more common is "::1" in /etc/hosts. I had the
following error with this patch for such a case.

| LOG:  could not bind IPv4 socket: Address already in use
| HINT:  Is another postmaster already running on port 5432? If not, wait a few 
seconds and retry.

getaddrinfo returned two identical entries having the same address
AF_INET "127.0.0.1:14357". One of them is for "::1" in
hosts. This is worse than the current behavior X-(

At Tue, 04 Feb 2014 10:31:03 -0500, Tom Lane  wrote in 
<12552.1391527...@sss.pgh.pa.us>
> Alvaro Herrera  writes:
> > Tom Lane wrote:
> >> Kyotaro HORIGUCHI  writes:
> >>> LOG:  could not create IPv6 socket: Address family not supported by 
> >>> protocol
> 
> >> That's merely a harmless log message.
> 
> > How about just adding a HINT?
> 
> Hmm ... maybe, but how would you phrase the hint exactly?

Putting the 'exactly' aside, is it something that means 'You will get
this message when the feature to handle the address family is
disabled', only for EAFNOSUPPORT?

 Though I don't know whether such a hint is helpful for those who
tend to mind that kind of message.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] inherit support for foreign tables

2014-02-04 Thread Etsuro Fujita

(2014/02/04 20:56), Robert Haas wrote:

On Sun, Feb 2, 2014 at 10:15 PM, Etsuro Fujita
 wrote:

Allowing ALTER COLUMN SET STORAGE on foreign tables would make sense if, for
example, "SELECT * INTO local_table FROM foreign_table" created a new
local table whose columns had the storage types associated with those of a
foreign table?


Seems like a pretty weak argument.  It's not that we can't find
strange corner cases where applying SET STORAGE to a foreign table
doesn't do something; it's that they *are* strange corner cases.  The
options as we normally understand them just aren't sensible in
this context, and a good deal of work has been put into an alternative
options framework, which is what authors of FDWs ought to be using.


I just wanted to discuss the possibility of allowing SET STORAGE on a 
foreign table, but I've got the point.  I'll resume the patch review.


Thanks,

Best regards,
Etsuro Fujita


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Duplicate Tuple Elidation during External Sort for DISTINCT

2014-02-04 Thread Jon Nelson
What - if anything - do I need to do to get this on the commitfest
list for the next commitfest?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 9:38 AM, Michael Paquier
 wrote:
> On Wed, Feb 5, 2014 at 8:59 AM, Michael Paquier
>  wrote:
>>  I'll update the patches according to that.
> Here are the updated patches with the following changes (according to
> previous comments):
> - Datatype is renamed to pglsn, documentation, file names, regressions
> and APIs are updated as well.
> - The DatumGet* and *GetDatum APIs are renamed with PGLSN (Should be
> PgLsn? But that's a detail)
> - pg_create_physical_replication_slot uses PGLSNGetDatum for its 6th argument
> For pageinspect, only page_header is impacted and I think that this
> should be a separate patch as it makes it necessary to bump it to 1.2. I
> can write it later once the core parts are decided.
I just forgot to mention that the 2nd patch does not use context diffs
but git diffs because filterdiff is not able to catch all the new
content of slotfuncs.c.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 7:22 AM, Jeremy Harris  wrote:
> The attached patch replaces the existing siftup method for heapify with
> a siftdown method. Tested with random integers it does 18% fewer
> compares and takes 10% less time for the heapify, over the work_mem
> range 1024 to 1048576.
>
> Both algorithms appear to be O(n) (contradicting Wikipedia's claim
> that a siftup heapify is O(n log n)).
It looks interesting but it is too late to have that in 9.4... You
should definitely add this patch to the next commit fest so that it is
not lost in the void:
https://commitfest.postgresql.org/action/commitfest_view?id=22
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Misaligned BufferDescriptors causing major performance problems on AMD

2014-02-04 Thread Peter Geoghegan
On Mon, Feb 3, 2014 at 3:38 PM, Andres Freund  wrote:
>> > A quick hack (attached) making BufferDescriptor 64byte aligned indeed
>> > restored performance across all max_connections settings. It's not
>> > surprising that a misaligned buffer descriptor causes problems -
>> > there'll be plenty of false sharing of the spinlocks otherwise. Curious
>> that the Intel machine isn't hurt much by this.

>> What fiddling are you thinking of?
>
> Basically always doing a TYPEALIGN(CACHELINE_SIZE, addr) before
> returning from ShmemAlloc() (and thereby ShmemInitStruct).

There is something you have not drawn explicit attention to that is
very interesting. If we take REL9_3_STABLE tip to be representative
(built with full -O2 optimization, no assertions just debugging
symbols), setting max_connections to 91 from 90 does not have the
effect of making the BufferDescriptors array aligned; it has the
effect of making it *misaligned*. You reported that 91 was much better
than 90. I think that the problem actually occurs when the array *is*
aligned!

I suspect that the scenario described in this article accounts for the
quite noticeable effect reported: http://danluu.com/3c-conflict
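
A minimal sketch of the kind of fiddling being described -- assuming a
64-byte cache line (the constant here is made up for illustration) and
ignoring the small amount of wasted padding; this is the idea, not the
actual proposal:

#include "postgres.h"
#include "storage/shmem.h"

#define CACHELINE_SIZE	64		/* assumed; not an existing backend constant */

/*
 * Over-allocate by one cache line and round the address up, so that the
 * caller's array of spinlocked structs starts on a cache-line boundary
 * and doesn't share its first line with a neighbouring allocation.
 */
static void *
ShmemAllocAligned(Size size)
{
	void	   *ptr = ShmemAlloc(size + CACHELINE_SIZE);

	return (void *) TYPEALIGN(CACHELINE_SIZE, ptr);
}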

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2014-02-04 Thread Jeff Janes
On Mon, Feb 3, 2014 at 8:55 AM, Robert Haas  wrote:


> I've also had some further thoughts about the right way to drive
>  vacuum scheduling.  I think what we need to do is tightly couple the
> rate at which we're willing to do vacuuming to the rate at which we're
> incurring "vacuum debt".  That is, if we're creating 100kB/s of pages
> needing vacuum, we vacuum at 2-3MB/s (with default settings).


If we can tolerate 2-3MB/s without adverse impact on other work, then we
can tolerate it.  Do we gain anything substantial by sand-bagging it?



> If
> we're creating 10MB/s of pages needing vacuum, we *still* vacuum at
> 2-3MB/s.  Not shockingly, vacuum gets behind, the database bloats, and
> everything goes to heck.


(Your reference to bloat made me think your comments here are about
vacuuming in general, not specific to IOS.  If that isn't the case, then
please ignore.)

If we can only vacuum at 2-3MB/s without adversely impacting other
activity, but we are creating 10MB/s of future vacuum need, then there are
basically two possibilities I can think of.  Either the 10MB/s represents a
spike, and vacuum should tolerate it and hope to catch up on the debt
later.  Or it represents a new permanent condition, in which case I bought
too few hard drives for the work load, and no scheduling decision that
autovacuum can make will save me from my folly. Perhaps there is some
middle ground between those possibilities, but I don't see room for much
middle ground.

I guess there might be entirely different possibilities not between those
two; for example, I don't realize I'm doing something that is generating
10MB/s of vacuum debt, and would like to have this thing I'm doing be
automatically throttled to the point it doesn't interfere with other
processes (either directly, or indirectly by bloat)



> The rate of vacuuming needs to be tied
> somehow to the rate at which we're creating stuff that needs to be
> vacuumed.  Right now we don't even have a way to measure that, let
> alone auto-regulate the aggressiveness of autovacuum on that basis.
>

There is the formula used to decide when a table gets vacuumed.  Isn't the
time delta in this formula a measure of how fast we are creating stuff that
needs to be vacuumed for bloat reasons?  Is your objection that it doesn't
include other reasons we might want to vacuum, or that it just doesn't work
very well, or that is not explicitly exposed?
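
For reference, the formula in question (as documented for autovacuum) can be
sketched like this; the parameter names mirror the GUCs and per-table
statistics involved:

#include <stdbool.h>

/*
 * A table is chosen for VACUUM once its dead tuples exceed a fixed
 * threshold plus a fraction of the table size; the ANALYZE trigger is
 * analogous, using the count of changed tuples.
 */
static bool
table_needs_vacuum(double reltuples,			/* pg_class.reltuples */
				   double n_dead_tup,			/* pg_stat_all_tables.n_dead_tup */
				   double vacuum_threshold,		/* autovacuum_vacuum_threshold */
				   double vacuum_scale_factor)	/* autovacuum_vacuum_scale_factor */
{
	return n_dead_tup > vacuum_threshold + vacuum_scale_factor * reltuples;
}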




> Similarly, for marking of pages as all-visible, we currently make the
> same decision whether the relation is getting index-scanned (in which
> case the failure to mark those pages all-visible may be suppressing
> the use of index scans or making them less effective) or whether it's
> not being accessed at all (in which case vacuuming it won't help
> anything, and might hurt by pushing other pages out of cache).


If it is not getting accessed at all because the database is not very
active right now, that would be the perfect time to vacuum it.  Between "I
can accurately project current patterns of (in)activity into the future"
and "People don't build large tables just to ignore them forever", I think
the latter is more likely to be true.  If the system is busy but this
particular table is not, then that would be a better reason to
de-prioritise vacuuming that table.  But can this degree of reasoning
really be implemented in a practical way?  In core?


>  Again,
> if we had better statistics, we could measure this - counting heap
> fetches for actual index-only scans plus heap fetches for index scans
> that might have been planned index-only scans but for the relation
> having too few all-visible pages doesn't sound like an impossible
> metric to gather.


My experience has been that if too few pages are all visible, it generally
switches to a seq scan, not an index scan of a different index.  But many
things that are semantically possible to be index-only-scans would never be
planned that way even if allvisible were 100%, so I think it would have to
do two planning passes, one with the real allvisible, and a hypothetical
one with allvisible set to 100%.  And then there is the possibility that,
while a high allvisible would be useful, the table is so active that no
amount of vacuuming could ever keep it high.

Cheers,

Jeff


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 5:26 AM, Peter Eisentraut  wrote:
> Perhaps this type should be called pglsn, since it's an
> implementation-specific detail and not a universal concept like int,
> point, or uuid.
It makes sense. I'll update the patches according to that.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 05:47 PM, Tom Lane wrote:

Andrew Dunstan  writes:

On 02/04/2014 03:08 PM, Jeff Janes wrote:

Having an AMI would help, but even with an AMI in place, MinGW is
still insanely slow.  Running "make" on already made PostgreSQL (so
there was nothing to actually do) takes 1.5 minutes.  And a make after
a "make clean" takes half an hour.  This is on an actual desktop, not
an AWS micro instance.  So doing a git bisect is just painful.  Is the
MSVC build faster?

Would have to check with the same build options (cassert and debug have
major timing effects.) I agree it's not lightning fast like "make -j 4"
on a decent linux box.

I wonder if ccache exists for Mingw.  That thing makes a huge difference
in the perceived build speed ...





Indeed. But it doesn't really exist, AFAIK. Certainly it's not in the list of 
packages known to mingw-get on the machine I checked on (jacana).


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Misaligned BufferDescriptors causing major performance problems on AMD

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 4:21 AM, Andres Freund  wrote:
> Which imo means fixing this got more important...

I strongly agree.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Andrew Dunstan  writes:
> On 02/04/2014 03:08 PM, Jeff Janes wrote:
>> Having an AMI would help, but even with an AMI in place, MinGW is 
>> still insanely slow.  Running "make" on already made PostgreSQL (so 
>> there was nothing to actually do) takes 1.5 minutes.  And a make after 
>> a "make clean" takes half an hour.  This is on an actual desktop, not 
>> an AWS micro instance.  So doing a git bisect is just painful.  Is the 
>> MSVC build faster?

> Would have to check with the same build options (cassert and debug have 
> major timing effects.) I agree it's not lightning fast like "make -j 4" 
> on a decent linux box.

I wonder if ccache exists for Mingw.  That thing makes a huge difference
in the perceived build speed ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Andres Freund
On 2014-02-04 13:42:51 -0800, Peter Geoghegan wrote:
> On Tue, Feb 4, 2014 at 12:30 PM, Christian Kruse
>  wrote:
> > Ok, benchmark for baseline+alignment patch is running.
> 
> I see that you have enabled latency information. For this kind of
> thing I prefer to hack pgbench-tools to not collect this (i.e. to not
> pass the "-l" flag, "Per-Transaction Logging"). Just remove it and
> pgbench-tools rolls with it. It may well be that the overhead added is
> completely insignificant, but for something like this, where the
> latency information is unlikely to add any value, I prefer to not take
> the chance. This is a fairly minor point, however, especially since
> these are only 60 second runs where you're unlikely to accumulate
> enough transaction latency information to notice any effect.

Hm, I don't find that convincing. If you look at the results from the
last run the latency information is actually quite interesting.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Robert Haas  writes:
> You know, I would really prefer to just stick a PGDLLIMPORT on this
> place and any others that need it, and any others that come up, than
> turn this into a political football.  Having to sprinkle PGDLLIMPORT
> on the handful of variables that are accessed by contrib modules is,
> indeed, annoying.

I'm not actually trying to turn this into a political football.  What
I want is a solution that we can trust, ie, that will allow us to
ship Windows code that's not broken.  We have failed to do so for at
least the past year, and not even known it.

I had been okay with the manual PGDLLIMPORT-sprinkling approach
(not happy with it, of course, but prepared to tolerate it) as long
as I believed the buildfarm would reliably tell us of the need for
it.  That assumption has now been conclusively disproven, though.
The question therefore becomes, what are we going to do instead?
"Keep on doing what we were doing" doesn't strike me as an acceptable
answer.
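
For anyone following along, the marker in question is a one-word decoration
on the variable's extern declaration in a backend header. Purely as an
illustration (using DateStyle, which came up earlier in the thread):

/*
 * Without the PGDLLIMPORT marking, a contrib module built as a separate
 * DLL on Windows can silently bind to a bogus copy of the variable rather
 * than failing to link -- which is the failure mode being described.
 */
extern PGDLLIMPORT int DateStyle;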

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Minor performance improvement in transition to external sort

2014-02-04 Thread Jeremy Harris

The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random integers it does 18% fewer
compares and takes 10% less time for the heapify, over the work_mem
range 1024 to 1048576.

Both algorithms appear to be O(n) (contradicting Wikipedia's claim
that a siftup heapify is O(n log n)).

--
Cheers,
   Jeremy
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 8b520c1..2ea7b2d 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1211,7 +1211,8 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
 				(void) grow_memtuples(state);
 				Assert(state->memtupcount < state->memtupsize);
 			}
-			state->memtuples[state->memtupcount++] = *tuple;
+			state->memtuples[state->memtupcount] = *tuple;
+			state->memtuples[state->memtupcount++].tupindex = 0;
 
 			/*
 			 * Check if it's time to switch over to a bounded heapsort. We do
@@ -1884,23 +1885,47 @@ inittapes(Tuplesortstate *state)
 	state->tp_tapenum = (int *) palloc0(maxTapes * sizeof(int));
 
 	/*
-	 * Convert the unsorted contents of memtuples[] into a heap. Each tuple is
-	 * marked as belonging to run number zero.
-	 *
-	 * NOTE: we pass false for checkIndex since there's no point in comparing
-	 * indexes in this step, even though we do intend the indexes to be part
-	 * of the sort key...
+	 * Convert the unsorted contents of memtuples[] into a heap. Each tuple was
+	 * marked as belonging to run number zero on input; we don't compare that.
 	 */
 	ntuples = state->memtupcount;
-	state->memtupcount = 0;		/* make the heap empty */
-	for (j = 0; j < ntuples; j++)
 	{
-		/* Must copy source tuple to avoid possible overwrite */
-		SortTuple	stup = state->memtuples[j];
+		SortTuple * m = state->memtuples;
 
-		tuplesort_heap_insert(state, &stup, 0, false);
+		for (j = (ntuples-2)/2;		/* comb the array from the last heap parent */
+			 j >= 0;				/* to the start */
+			 j--)
+		{
+			int root = j;
+			int child, swap = 0;
+			SortTuple ptup = m[root];
+
+			while ((child = root*2 + 1) <= ntuples-1)
+			{	/* root has at least one child. Check left-child */
+				if (COMPARETUP(state, &ptup, &m[child]) < 0)
+				{								/* [root] < [lchild] */
+					if (  ++child <= ntuples-1		/* check right-child */
+					   && COMPARETUP(state, &ptup, &m[child]) > 0)
+						swap = child;
+					else
+						break;					/* [root] smallest */
+				}
+				else
+				{
+					swap = child;
+					if (  ++child <= ntuples-1		/* check right-child */
+					   && COMPARETUP(state, &m[swap], &m[child]) > 0)
+						swap = child;
+				}
+				/* [swap] is smallest of three; move into the parent slot */
+				m[root] = m[swap];
+				root = swap;				/* and repeat with the child subtree */
+			}
+			if (swap)
+				m[swap] = ptup;
+			/* This and all heap nodes after are now well-positioned */
+		}
 	}
-	Assert(state->memtupcount == ntuples);
 
 	state->currentRun = 0;
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] integrate pg_upgrade analyze_new_cluster.sh into vacuumdb

2014-02-04 Thread Jeff Janes
On Tue, Jan 21, 2014 at 9:06 AM, Oskari Saarenmaa  wrote:

> 09.01.2014 05:15, Peter Eisentraut kirjoitti:
>
>  pg_upgrade creates a script analyze_new_cluster.{sh|bat} that runs
>> vacuumdb --analyze-only in three stages with different statistics target
>> settings to get a fresh cluster analyzed faster.  I think this behavior
>> is also useful for clusters or databases freshly created by pg_restore
>> or any other loading mechanism, so it's suboptimal to have this
>> constrained to pg_upgrade.
>>
>
> I think the three stage analyze is a wrong solution to the "slow analyze"
> problem.


It certainly is not the best possible solution.  But it might be the best
one that can be arrived at within a reasonable amount of time.

If we really want to micromanage the process we could put a lot of work
into it.  Take the case of a small table of about 300 pages.  We read the
table 3 times (separated by enough time that it is probably no longer
cached), first keeping one tuple per page, then 10, then 100.   Instead it
should probably just jump directly to sampling at statistics target of 100
and then forget those small tables, at least if all we are concerned about
is IO costs.  (Since the selected tuples are sorted once for each column,
there might be a CPU reason to take a small sample at the first pass, if we
are more concerned with CPU than IO.)

But I do wonder what experience people have with the 3 stage process, how
useful is it empirically?  If you can't open the database for general use
until the 3rd phase is done, then you would just jump to doing that stage,
rather than working through all 3 of them.  If you can open the database
and muddle through without statistics for a while, why not muddle through
for the little bit longer that it would take to collect the full set right
off the bat, rather than making intermediate passes?

So I agree that the current system in not optimal.  But this patch is just
moving existing behavior from a less general location to a more general
one, so I don't think it should be held hostage to improvements that could
theoretically be made but which no one has offered to do.  I wouldn't want
to put in a change that forces users to learn something new for 9.4 only to
have it completely redone in 9.5 and then make them learn that.  But the
documentation and training burden of this change seems small enough that I
wouldn't worry about that.  (On the other hand, the benefit of the change
also seems pretty small.)

In my experience most of the analyze time goes to reading random blocks
> from the disk but we usually use only a small portion of that data (1 row
> per block.)
>
> If we were able to better utilize the data we read we could get good
> statistics with a lot less IO than we currently need.  This was discussed
> in length at
> http://www.postgresql.org/message-id/CAM-w4HOjRbNPMW=
> shjhw_qfapcuu5ege1tmdr0zqu+kqx8q...@mail.gmail.com but it hasn't turned
> into patches so far.


I don't think it is an accident that it hasn't turned into patches.  You
can't change the laws of statistics just by wanting it badly enough.  Our
current sampling method is already insufficiently random.  We aren't going
to fix things by making it even less random.   But I guess that that is an
empirical question again, have most statistics problems been due to the
sample being insufficiently large, or insufficiently random?  (Or
insufficiently recent?)

There could be a different stage-by-stage approach where
default_statistics_target is fixed but in the first pass you just take the
first 30,000 rows in each table, and then in second pass you take a random
30,000 rows.  But the knobs to do that currently do not exist, and I doubt
they would be welcomed if their only use is to support pg_upgrade.  So that
idea is not a blocker to this patch, either.

Cheers,

Jeff


Re: [HACKERS] jsonb and nested hstore

2014-02-04 Thread Josh Berkus
On 02/03/2014 07:27 AM, Andres Freund wrote:
> On 2014-02-03 09:22:52 -0600, Merlin Moncure wrote:
>>> I lost my stomach (or maybe it was the glass of red) somewhere in the
>>> middle, but I think this needs a lot of work. Especially the io code
>>> doesn't seem ready to me. I'd consider ripping out the send/recv code
>>> for 9.4, that seems the biggest can of worms. It will still be usable
>>> without.
>>
>> Not having type send/recv functions is somewhat dangerous; it can
>> cause problems for libraries that run everything through the binary
>> wire format.  I'd give jsonb a pass on that, being a new type, but
>> would be concerned if hstore had that ability revoked.
> 
> Yea, removing it for hstore would be a compat problem...
> 
>> offhand note: hstore_send seems pretty simply written and clean; it's
>> a simple nonrecursive iterator...
> 
> But a send function is pretty pointless without the corresponding recv
> function... And imo recv simply is too dangerous as it's currently
> written.
> I am not saying that it cannot be made to work, just that it's still nearly
> as ugly as when I pointed out several of the dangers some weeks back.

Oleg, Teodor, any comments on the above?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 1:58 PM, Andres Freund  wrote:
> I think there's zero overlap. They're completely complementary features. It's 
> not like normal WAL records have an irrelevant volume.

I'd have thought so too, but I would not like to assume. Like many
people commenting on this thread, I don't know very much about
compression.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Andres Freund
On February 4, 2014 10:50:10 PM CET, Peter Geoghegan  wrote:
>On Tue, Feb 4, 2014 at 11:11 AM, Andres Freund 
>wrote:
>>> Does this feature relate to compression of WAL page images at all?
>>
>> No.
>
>So the obvious question is: where, if anywhere, do the two efforts
>(this patch, and Fujii's patch) overlap? Does Fujii have any concerns
>about this patch as it relates to his effort to compress FPIs?

I think there's zero overlap. They're completely complementary features. It's 
not like normal WAL records have an irrelevant volume.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:11 AM, Andres Freund  wrote:
>> Does this feature relate to compression of WAL page images at all?
>
> No.

So the obvious question is: where, if anywhere, do the two efforts
(this patch, and Fujii's patch) overlap? Does Fujii have any concerns
about this patch as it relates to his effort to compress FPIs?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Bruce Momjian
On Tue, Feb  4, 2014 at 08:11:18PM +0100, Andres Freund wrote:
> On 2014-02-04 14:09:57 -0500, Bruce Momjian wrote:
> > On Tue, Feb  4, 2014 at 01:28:38PM -0500, Robert Haas wrote:
> > > Meanwhile, in friendlier cases, like "one short and one long field, no
> > > change", we're seeing big improvements.  That particular case shows a
> > > speedup of 21% and a WAL reduction of 36%.  That's a pretty big deal,
> > > and I think not unrepresentative of many real-world workloads.  Some
> > > might well do better, having either more or longer unchanged fields.
> > > Assuming that the logic isn't buggy, a point in need of further study,
> > > I'm starting to feel like we want to have this.  And I might even be
> > > tempted to remove the table-level off switch.
> > 
> > Does this feature relate to compression of WAL page images at all?
> 
> No.

I guess it bothers me we are working on compressing row change sets
while the majority(?) of WAL is page images.  I know we had a page image
compression patch that got stalled.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 12:30 PM, Christian Kruse
 wrote:
> Ok, benchmark for baseline+alignment patch is running.

I see that you have enabled latency information. For this kind of
thing I prefer to hack pgbench-tools to not collect this (i.e. to not
pass the "-l" flag, "Per-Transaction Logging"). Just remove it and
pgbench-tools rolls with it. It may well be that the overhead added is
completely insignificant, but for something like this, where the
latency information is unlikely to add any value, I prefer to not take
the chance. This is a fairly minor point, however, especially since
these are only 60 second runs where you're unlikely to accumulate
enough transaction latency information to notice any effect.


-- 
Peter Geoghegan




Re: [HACKERS] GIN improvements part2: fast scan

2014-02-04 Thread Alexander Korotkov
On Mon, Feb 3, 2014 at 6:31 PM, Alexander Korotkov wrote:

> On Mon, Jan 27, 2014 at 7:30 PM, Alexander Korotkov 
> wrote:
>
>> On Mon, Jan 27, 2014 at 2:32 PM, Alexander Korotkov > > wrote:
>>
>>> On Sun, Jan 26, 2014 at 8:14 PM, Heikki Linnakangas <
>>> hlinnakan...@vmware.com> wrote:
>>>
 On 01/26/2014 08:24 AM, Tomas Vondra wrote:

> Hi!
>
> On 25.1.2014 22:21, Heikki Linnakangas wrote:
>
>> Attached is a new version of the patch set, with those bugs fixed.
>>
>
> I've done a bunch of tests with all the 4 patches applied, and it seems
> to work now. I've done tests with various conditions (AND/OR, number of
> words, number of conditions) and I so far I did not get any crashes,
> infinite loops or anything like that.
>
> I've also compared the results to 9.3 - by dumping the database and
> running the same set of queries on both machines, and indeed I got 100%
> match.
>
> I also did some performance tests, and that's when I started to worry.
>
> For example, I generated and ran 1000 queries that look like this:
>
>SELECT id FROM messages
> WHERE body_tsvector @@ to_tsquery('english','(header & 53 & 32 &
> useful & dropped)')
> ORDER BY ts_rank(body_tsvector, to_tsquery('english','(header & 53
> &
> 32 & useful & dropped)')) DESC;
>
> i.e. in this case the query always was 5 words connected by AND. This
> query is a pretty common pattern for fulltext search - sort by a list
> of
> words and give me the best ranked results.
>
> On 9.3, the script was running for ~23 seconds, on patched HEAD it was
> ~40. It's perfectly reproducible, I've repeated the test several times
> with exactly the same results. The test is CPU bound, there's no I/O
> activity at all. I got the same results with more queries (~100k).
>
> Attached is a simple chart with x-axis used for durations measured on
> 9.3.2, y-axis used for durations measured on patched HEAD. It's obvious
> a vast majority of queries is up to 2x slower - that's pretty obvious
> from the chart.
>
> Only about 50 queries are faster on HEAD, and >700 queries are more
> than
> 50% slower on HEAD (i.e. if the query took 100ms on 9.3, it takes
> >150ms
> on HEAD).
>
> Typically, the EXPLAIN ANALYZE looks something like this (on 9.3):
>
>   http://explain.depesz.com/s/5tv
>
> and on HEAD (same query):
>
>   http://explain.depesz.com/s/1lI
>
> Clearly the main difference is in the "Bitmap Index Scan" which takes
> 60ms on 9.3 and 120ms on HEAD.
>
> On 9.3 the "perf top" looks like this:
>
>  34.79%  postgres [.] gingetbitmap
>  28.96%  postgres [.] ginCompareItemPointers
>   9.36%  postgres [.] TS_execute
>   5.36%  postgres [.] check_stack_depth
>   3.57%  postgres [.] FunctionCall8Coll
>
> while on 9.4 it looks like this:
>
>  28.20%  postgres [.] gingetbitmap
>  21.17%  postgres [.] TS_execute
>   8.08%  postgres [.] check_stack_depth
>   7.11%  postgres [.] FunctionCall8Coll
>   4.34%  postgres [.] shimTriConsistentFn
>
> Not sure how to interpret that, though. For example where did the
> ginCompareItemPointers go? I suspect it's thanks to inlining, and that
> it might be related to the performance decrease. Or maybe not.
>

 Yeah, inlining makes it disappear from the profile, and spreads that
 time to the functions calling it.

 The profile tells us that the consistent function is called a lot more
 than before. That is expected - with the fast scan feature, we're calling
 consistent not only for potential matches, but also to refute TIDs based on
 just a few entries matching. If that's effective, it allows us to skip many
 TIDs and avoid consistent calls, which compensates, but if it's not
 effective, it's just overhead.

 I would actually expect it to be fairly effective for that query, so
 that's a bit surprising. I added counters to see where the calls are coming
 from, and it seems that about 80% of the calls are actually coming from
 this little feature I explained earlier:


  In addition to that, I'm using the ternary consistent function to check
> if minItem is a match, even if we haven't loaded all the entries yet.
> That's less important, but I think for something like "rare1 | (rare2 &
> frequent)" it might be useful. It would allow us to skip fetching
> 'frequent', when we already know that 'rare1' matches for the current
> item. I'm not sure if that's worth the cycles, but it seemed like an
> obvious thing to do, no

Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 03:08 PM, Jeff Janes wrote:



Do you know about what it would cost?  Could official community funds 
be used for it (it seems like something that is cheap, but which you 
wouldn't want to be forgotten about some month.)


Having an AMI would help, but even with an AMI in place, MinGW is 
still insanely slow.  Running "make" on already made PostgreSQL (so 
there was nothing to actually do) takes 1.5 minutes.  And a make after 
a "make clean" takes half an hour.  This is on an actual desktop, not 
an AWS micro instance.  So doing a git bisect is just painful.  Is the 
MSVC build faster?






Would have to check with the same build options (cassert and debug have 
major timing effects.) I agree it's not lightning fast like "make -j 4" 
on a decent linux box.


cheers

andrew







Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Christian Kruse
Hi,

On 04/02/14 21:03, Andres Freund wrote:
> Christian, could you rerun with master (the commit on which the
> branch is based on), the alignment patch, and then the lwlock patch?
> Best with max_connections 200.  That's probably more important than
> the write tests as a first step..

Ok, benchmark for baseline+alignment patch is running. This will take
a couple of hours and since I have to get up at about 05:00 I won't be
able to post it before tomorrow.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





[HACKERS] Re: Viability of text HISTORY/INSTALL/regression README files (was Re: [COMMITTERS] pgsql: Document a few more regression test hazards.)

2014-02-04 Thread Robert Haas
On Tue, Feb 4, 2014 at 1:38 AM, Tom Lane  wrote:
> Noah Misch  writes:
>>> Robert Haas  writes:
 I wonder if these standalone things are really worthwhile.
>
>> I wonder how difficult it would be to make sufficient link data available when
>> building the standalone files.  There would be no linking per se; we would
>> just need the referent's text fragment emitted where the  tag appears.
>
> IIRC, that's basically what the "workaround" is, except it's not very
> automated.  Even if it were automated, though, there's still a problem:
> such links aren't really *useful* in flat text format.  I think that
> forcing the author to actually think about what to put there in the
> flat text version is a good thing, if we're going to retain the flat
> text version at all.

Right.  I mean, a lot of the links say things like "Section 26.2"
which obviously makes no sense in a standalone text file.

I agree with your comments upthread: INSTALL *might* still be useful
to somebody, but I would be pretty surprised if anyone uses HISTORY or
regress_README for anything any more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Peter Eisentraut
Perhaps this type should be called pglsn, since it's an
implementation-specific detail and not a universal concept like int,
point, or uuid.





Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Robert Haas
On Tue, Feb 4, 2014 at 12:34 PM, Tom Lane  wrote:
> Andrew Dunstan  writes:
>> If someone volunteered to pay for the storage, I'd be prepared to make
>> some time to create an AMI to reduce the startup time dramatically.
>> Basically it would be "boot the AMI and start testing your patches". I'd
>> even make it as friendly as possible for people who don't like to get
>> too far from unix-ish environments.
>
> My own opinion is that I've already wasted untold man-hours thanks to
> the random porting problems induced by Windows, a platform that I never
> have and never will care about personally.  I will *not* spend my own
> time doing tests that someone else could do.  If we can't get some
> effort contributed by someone who does use that platform, I'm personally
> prepared to declare the entire damn thing no longer supported.

You know, I would really prefer to just stick a PGDLLIMPORT on this
place and any others that need it, and any others that come up, than
turn this into a political football.  Having to sprinkle PGDLLIMPORT
on the handful of variables that are accessed by contrib modules is,
indeed, annoying.  But it's not any more annoying than twelve other
things that I have to do as a committer, like remembering to bump
whichever of catversion, the xlog page magic, and the control data
version are relevant to a particular commit, knowing which particular
SGML files have to build standalone, and enforcing the project's code
formatting, comment, and documentation conventions on everyone who
submits a patch.  So acting as if this particular annoyance is somehow
unique or a good reason to desupport what is despite all of our
misgivings one of our most popular platforms doesn't impress me one
bit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Christian Kruse
Hi,

On 04/02/14 12:02, Peter Geoghegan wrote:
> On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
>  wrote:
> > I'm doing some benchmarks regarding this problem: one set with
> > baseline and one set with your patch. Machine was a 32 core machine (4
> > CPUs with 8 cores), 252 gib RAM.
> 
> What CPU model? Can you post /proc/cpuinfo? The distinction between
> logical and physical cores matters here.

model name  : Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz

32 physical cores, 64 logical cores. /proc/cpuinfo is attached.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Jeff Janes
On Tue, Feb 4, 2014 at 9:26 AM, Andrew Dunstan  wrote:

>
> On 02/04/2014 11:30 AM, Andres Freund wrote:
>
>
>
>>>
>>> We have details on how to build with Mingw/Msys on Windows on an Amazon
>>>
>>> VM  which is
>>> either
>>> free or very cheap. Do I need to give instructions on how to do this
>>> for
>>> MSVC builds too? It's really not terribly hard.
>>>
>> Err. It might not be very hard but it certainly is time consuming. And
>> that for people not caring about windows.
>>
>> If there were usable, regularly refreshed, instances out there, it'd be
>> slightly less bad. But this is still by far the most annoying and intrusive
>> platform to care about.
>>
>>
>
> If someone volunteered to pay for the storage, I'd be prepared to make
> some time to create an AMI to reduce the startup time dramatically.
> Basically it would be "boot the AMI and start testing your patches". I'd
> even make it as friendly as possible for people who don't like to get too
> far from unix-ish environments.
>

Do you know about what it would cost?  Could official community funds be
used for it (it seems like something that is cheap, but which you wouldn't
want to be forgotten about some month.)

Having an AMI would help, but even with an AMI in place, MinGW is still
insanely slow.  Running "make" on already made PostgreSQL (so there was
nothing to actually do) takes 1.5 minutes.  And a make after a "make clean"
takes half an hour.  This is on an actual desktop, not an AWS micro
instance.  So doing a git bisect is just painful.  Is the MSVC build faster?

Cheers,

Jeff


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Andres Freund
On February 4, 2014 8:53:36 PM CET, Peter Geoghegan  wrote:
>On Tue, Feb 4, 2014 at 11:50 AM, Andres Freund 
>wrote:
>> I think he has applied the patch to hack around the alignment issue I
>> pushed to git for both branches. It's not nice enough to be applied
>yet,
>> but it should fix the issue.
>> I think the 201 is just a remembrance of debugging the issue.
>
>I guess that given that *both* cases tested had the patch applied,
>that makes sense. However, I would have liked to see a real master
>baseline.

Christian, could you rerun with master (the commit on which the branch is based 
on), the alignment patch, and then the lwlock patch? Best with max_connections 
200.
That's probably more important than the write tests as a first step..

Thanks,
Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
 wrote:
> I'm doing some benchmarks regarding this problem: one set with
> baseline and one set with your patch. Machine was a 32 core machine (4
> CPUs with 8 cores), 252 gib RAM.

What CPU model? Can you post /proc/cpuinfo? The distinction between
logical and physical cores matters here.


-- 
Peter Geoghegan




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Joshua D. Drake


On 02/04/2014 11:17 AM, Andrew Dunstan wrote:


prepared to declare the entire damn thing no longer supported.

Although that is obviously your prerogative it is important to remember
that Windows is easily the second most used version of PostgreSQL out
there (behind Linux).

[ shrug... ]  If it's so widely used, why is it so hard to find somebody
who's willing to put in some development effort for it?




Because Windows developers get paid and have better things to do than 
interact with a bunch of Open Source people and their opinions. Let's be 
honest, all of us here are not the easiest to work with and we are here 
largely due to an ideology. Why would a Windows developer bother?


I know lots of Windows Developers and most of them don't follow anywhere 
near the ideology we do. I know very few (read 0) Windows developers 
that do it for fun (I am not saying they don't have fun. I am saying 
they don't do it for fun). They do it for their career.


Heck, look back through the years at our archives. Nobody really thought 
that we would somehow bring windows developers into the fold if we 
ported to that platform. It was a market share play and it worked. 
However, now we are paying the price for it.


To be perfectly frank, the first thing I do when a customer calls and 
says we are on Windows is tell them I will spend our relationship 
pushing them to move PostgreSQL to Linux. It works over time for the 
most part but not on all.


Sincerely,

Joshua D. Drake




--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats




Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:50 AM, Andres Freund  wrote:
> I think he has applied the patch to hack around the alignment issue I
> pushed to git for both branches. It's not nice enough to be applied yet,
> but it should fix the issue.
> I think the 201 is just a remembrance of debugging the issue.

I guess that given that *both* cases tested had the patch applied,
that makes sense. However, I would have liked to see a real master
baseline.


-- 
Peter Geoghegan




Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Andres Freund
On 2014-02-04 11:48:14 -0800, Peter Geoghegan wrote:
> On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
>  wrote:
> > I added -M prepared to the pgbench call in the benchwarmer script.
> >
> > The read-only tests are finished, I come to similar results as yours:
> >
> > 
> 
> Note that Christian ran this test with max_connections=201, presumably
> to exercise the alignment problem.

I think he has applied the patch to hack around the alignment issue I
pushed to git for both branches. It's not nice enough to be applied yet,
but it should fix the issue.
I think the 201 is just a remembrance of debugging the issue.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
 wrote:
> I added -M prepared to the pgbench call in the benchwarmer script.
>
> The read-only tests are finished, I come to similar results as yours:
>
> 

Note that Christian ran this test with max_connections=201, presumably
to exercise the alignment problem.


-- 
Peter Geoghegan




Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
 wrote:
> I'm doing some benchmarks regarding this problem: one set with
> baseline and one set with your patch. Machine was a 32 core machine (4
> CPUs with 8 cores), 252 gib RAM. Both versions have the type align
> patch applied.

It certainly seems as if the interesting cases are where clients > cores.


-- 
Peter Geoghegan




Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Christian Kruse
Hi,

I'm doing some benchmarks regarding this problem: one set with
baseline and one set with your patch. Machine was a 32 core machine (4
CPUs with 8 cores), 252 gib RAM. Both versions have the type align
patch applied. pgbench-tools config:

SCALES="100"
SETCLIENTS="1 4 8 16 32 48 64 96 128"
SETTIMES=2

I added -M prepared to the pgbench call in the benchwarmer script.

The read-only tests are finished, I come to similar results as yours:



I think the small differences are caused by the fact that I use TCP
connections and not Unix domain sockets.

The results are pretty impressive… I will post the read-write results
as soon as they are finished.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] PQputCopyData dont signal error

2014-02-04 Thread Tom Lane
Jeff Davis  writes:
> On Thu, 2011-04-14 at 10:50 +0300, Heikki Linnakangas wrote:
>> I think you'll need to send all the data and finish the COPY until you 
>> get an error. If you have a lot of data to send, you might want to slice 
>> it into multiple COPY statements of say 50MB each, so that you can catch 
>> errors in between.

> If the remaining CopyData messages are dropped, I don't see why
> PQputCopyData can't return some kind of error indicating that further
> CopyData messages are useless so that it can stop sending them.

An error from PQputCopyData indicates a connection-level problem,
typically; that is it means that it couldn't send data not that the server
had detected a problem with some earlier data.  I'd not be surprised
if applications would respond to such an error indication by just
pre-emptively dropping the connection.  (A quick trawl through our
own code shows a precedent in pg_restore, in fact.)  So the idea
of having PQputCopyData start failing as soon as an error has arrived
from the server doesn't sound that attractive.

What'd be safer is to provide a way of detecting whether an error
has arrived (without actually consuming it, of course) so that the
case could be handled by adding something like

if (PQerrorIsPending(conn))
break;

to the send-data loop.  This would allow the application code to know
what is happening.  It would also not impose the penalty of checking
for errors on apps that prefer to optimize for the no-error case.
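
Something along these lines, purely as a sketch --- PQerrorIsPending() is the
hypothetical call being proposed here, not an existing libpq function, and
get_next_row() just stands in for wherever the application gets its data:

/*
 * Sketch only: assumes the usual <stdio.h>/<string.h>/<libpq-fe.h> includes,
 * an open PGconn *conn, and a "COPY ... FROM STDIN" already in progress.
 */
const char *row;

while ((row = get_next_row()) != NULL)
{
    if (PQerrorIsPending(conn))     /* hypothetical: an error has arrived */
        break;                      /* stop generating and sending rows */

    if (PQputCopyData(conn, row, strlen(row)) != 1)
    {
        fprintf(stderr, "could not send COPY data: %s", PQerrorMessage(conn));
        break;
    }
}

/* Finish the COPY either way, so the server's error (if any) can be read. */
if (PQputCopyEnd(conn, NULL) == 1)
{
    PGresult   *res = PQgetResult(conn);

    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
    PQclear(res);
}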

A different approach would be for PQputCopyData to just internally
suppress sending of further CopyData messages, *without* returning
an error; this would be transparent and it would save the network
traffic at least.  However, my reading of the original example is
that the OP was at least as interested in suppressing the creation
of further data as in suppressing the sending, so this wouldn't
really solve his problem completely.  And it's also optimizing
for the error case not the no-error case, in a way that doesn't give
the app any say in the matter.  So I'm not for this particularly,
just mentioning it for completeness.

regards, tom lane




Re: [HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Stephen Frost
* Erik Rijkers (e...@xs4all.nl) wrote:
> On Tue, February 4, 2014 18:56, Christian Kruse wrote:
> > On 04/02/14 17:41, Erik Rijkers wrote:
> >> 2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was 
> >> terminated by signal 9: Killed
> >
> > Did you check if this was the OOM killer? Should be logged in dmesg.
> 
> I would be surprised if it wasn't. (no access to that machine at the moment)
> 
> How do we regard such crashes?  It seems to me this was rather easily
> 'provoked' (for want of a better word).
> 
> I am inclined to blame the patch...

It sounds like there is at least some investigation which should happen
here to see why we're using so much memory (well beyond work_mem and
even maint_work_mem it sounds like), but it would also be good to have
the machine reconfigured to not allow OOM killing.

Thanks,

Stephen




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 01:53 PM, Tom Lane wrote:

"Joshua D. Drake"  writes:

On 02/04/2014 09:34 AM, Tom Lane wrote:

My own opinion is that I've already wasted untold man-hours thanks to
the random porting problems induced by Windows, a platform that I never
have and never will care about personally.  I will *not* spend my own
time doing tests that someone else could do.  If we can't get some
effort contributed by someone who does use that platform, I'm personally
prepared to declare the entire damn thing no longer supported.

Although that is obviously your prerogative it is important to remember
that Windows is easily the second most used version of PostgreSQL out
there (behind Linux).

[ shrug... ]  If it's so widely used, why is it so hard to find somebody
who's willing to put in some development effort for it?




I suspect that the open hostility towards it doesn't help much. I do put 
in quite alot of effort supporting it one way and another, committing 
patches and running buildfarm members among other things. And we have 
several others who put in significant amounts of effort. And no, it's 
not my platform of choice either.


Anyway, I will explore my suggestion upthread of a pre-prepared windows 
development AMI, when I have time. Right now between work for actual 
money and things like jsonb I am slammed.


cheers

andrew




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Adrian Klaver

On 02/04/2014 10:53 AM, Tom Lane wrote:

"Joshua D. Drake"  writes:

On 02/04/2014 09:34 AM, Tom Lane wrote:

My own opinion is that I've already wasted untold man-hours thanks to
the random porting problems induced by Windows, a platform that I never
have and never will care about personally.  I will *not* spend my own
time doing tests that someone else could do.  If we can't get some
effort contributed by someone who does use that platform, I'm personally
prepared to declare the entire damn thing no longer supported.



Although that is obviously your prerogative it is important to remember
that Windows is easily the second most used version of PostgreSQL out
there (behind Linux).


[ shrug... ]  If it's so widely used, why is it so hard to find somebody
who's willing to put in some development effort for it?


Well, from what I have seen of the Windows world there is a fairly sharp 
demarcation between users and developers. So use does not necessarily 
mean developer knowledge.





regards, tom lane





--
Adrian Klaver
adrian.kla...@gmail.com




Re: [HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Tom Lane
"Erik Rijkers"  writes:
> On Tue, February 4, 2014 18:56, Christian Kruse wrote:
>> Did you check if this was the OOM killer? Should be logged in dmesg.

> I would be surprised if it wasn't. (no access to that machine at the moment)

> How do we regard such crashes?  It seems to me this was rather easily
> 'provoked' (for want of a better word).

Well, it suggests that there may be a memory leak, which would be a bug
even though we'd assign it lower priority than a true crash.

regards, tom lane




Re: [HACKERS] PQputCopyData dont signal error

2014-02-04 Thread Jeff Davis
On Thu, 2011-04-14 at 10:50 +0300, Heikki Linnakangas wrote:
> On 14.04.2011 10:15, Pavel Stehule wrote:
> > Hello
> >
> > I have a problem with PQputCopyData function. It doesn't signal some error.
> >
> > while ((row = mysql_fetch_row(res)) != NULL)
> > {
> > snprintf(buffer, sizeof(buffer), "%s%s\n", row[0], 
> > row[1]);
> > copy_result = PQputCopyData(pconn, buffer, 
> > strlen(buffer));
> > printf(">>%s<<\n", PQerrorMessage(pconn));
> > printf("%d\n", copy_result);
> > if (copy_result != 1)
> > {
> > fprintf(stderr, "Copy to target table failed: 
> > %s",
> > PQerrorMessage(pconn));
> > EXIT;
> > }
> > }
> >
> > it returns 1 for broken values too :(
> >
> > Is necessary some special check?
> 
> The way COPY works is that PQputCopyData just sends the data to the 
> server, and the server will buffer it in its internal buffer and 
> processes it when it feels like it. The PQputCopyData() calls don't even 
> need to match line boundaries.
> 
> I think you'll need to send all the data and finish the COPY until you 
> get an error. If you have a lot of data to send, you might want to slice 
> it into multiple COPY statements of say 50MB each, so that you can catch 
> errors in between.

[ replying to old thread ]

According to the protocol docs[1]:

"In the event of a backend-detected error during copy-in mode (including
receipt of a CopyFail message), the backend will issue an ErrorResponse
message. If the COPY command was issued via an extended-query message,
the backend will now discard frontend messages until a Sync message is
received, then it will issue ReadyForQuery and return to normal
processing. If the COPY command was issued in a simple Query message,
the rest of that message is discarded and ReadyForQuery is issued. In
either case, any subsequent CopyData, CopyDone, or CopyFail messages
issued by the frontend will simply be dropped."

If the remaining CopyData messages are dropped, I don't see why
PQputCopyData can't return some kind of error indicating that further
CopyData messages are useless so that it can stop sending them.

Asking the client to break the copy into multiple COPY commands is bad,
because then the client needs to figure out the line breaks, which is a
burden in many cases.

Certainly we don't want to *guarantee* that the backend will issue an
error at any particular point, because of the buffering on the server
side. But from a practical standpoint, the server will let the client
know fairly quickly and it will avoid a lot of client-side work and
network traffic.

Would a change to PQputCopyData be welcome?

Regards,
Jeff Davis

[1] http://www.postgresql.org/docs/9.3/static/protocol-flow.html







Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Andres Freund
On 2014-02-04 14:09:57 -0500, Bruce Momjian wrote:
> On Tue, Feb  4, 2014 at 01:28:38PM -0500, Robert Haas wrote:
> > Meanwhile, in friendlier cases, like "one short and one long field, no
> > change", we're seeing big improvements.  That particular case shows a
> > speedup of 21% and a WAL reduction of 36%.  That's a pretty big deal,
> > and I think not unrepresentative of many real-world workloads.  Some
> > might well do better, having either more or longer unchanged fields.
> > Assuming that the logic isn't buggy, a point in need of further study,
> > I'm starting to feel like we want to have this.  And I might even be
> > tempted to remove the table-level off switch.
> 
> Does this feature relate to compression of WAL page images at all?

No.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Bruce Momjian
On Tue, Feb  4, 2014 at 01:28:38PM -0500, Robert Haas wrote:
> Meanwhile, in friendlier cases, like "one short and one long field, no
> change", we're seeing big improvements.  That particular case shows a
> speedup of 21% and a WAL reduction of 36%.  That's a pretty big deal,
> and I think not unrepresentative of many real-world workloads.  Some
> might well do better, having either more or longer unchanged fields.
> Assuming that the logic isn't buggy, a point in need of further study,
> I'm starting to feel like we want to have this.  And I might even be
> tempted to remove the table-level off switch.

Does this feature relate to compression of WAL page images at all?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
"Joshua D. Drake"  writes:
> On 02/04/2014 09:34 AM, Tom Lane wrote:
>> My own opinion is that I've already wasted untold man-hours thanks to
>> the random porting problems induced by Windows, a platform that I never
>> have and never will care about personally.  I will *not* spend my own
>> time doing tests that someone else could do.  If we can't get some
>> effort contributed by someone who does use that platform, I'm personally
>> prepared to declare the entire damn thing no longer supported.

> Although that is obviously your prerogative it is important to remember 
> that Windows is easily the second most used version of PostgreSQL out 
> there (behind Linux).

[ shrug... ]  If it's so widely used, why is it so hard to find somebody
who's willing to put in some development effort for it?

regards, tom lane




Re: [HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Erik Rijkers
On Tue, February 4, 2014 18:56, Christian Kruse wrote:
> Hi,
>
> On 04/02/14 17:41, Erik Rijkers wrote:
>
>> 2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was 
>> terminated by signal 9: Killed
>
> Did you check if this was the OOM killer? Should be logged in dmesg.
>

I would be surprised if it wasn't. (no access to that machine at the moment)

How do we regard such crashes?  It seems to me this was rather easily
'provoked' (for want of a better word).

I am inclined to blame the patch...











Re: [HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Christian Kruse
Hi,

On 04/02/14 17:41, Erik Rijkers wrote:

> 2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was 
> terminated by signal 9: Killed

Did you check if this was the OOM killer? Should be logged in dmesg.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Robert Haas
On Tue, Feb 4, 2014 at 12:39 PM, Amit Kapila  wrote:
> Now there is approximately 1.4~5% CPU gain for
> "hundred tiny fields, half nulled" case

I don't want to advocate too strongly for this patch because, number
one, Amit is a colleague and more importantly, number two, I can't
claim to be an expert on compression.  But that having been said, I
think these numbers are starting to look awfully good.  The only
remaining regressions are in the cases where a large fraction of the
tuple turns over, and they're not that big even then.  The two *worst*
tests now seem to be "hundred tiny fields, all changed" and "hundred
tiny fields, half changed".  For the "all changed" case, the median
unpatched time is 16.3172590732574 and the median patched time is
16.9294109344482, a <4% loss; for the "half changed" case, the median
unpatched time is 16.5795118808746 and the median patched time is
17.0454230308533, a <3% loss.  Both cases show minimal change in WAL
volume.

Meanwhile, in friendlier cases, like "one short and one long field, no
change", we're seeing big improvements.  That particular case shows a
speedup of 21% and a WAL reduction of 36%.  That's a pretty big deal,
and I think not unrepresentative of many real-world workloads.  Some
might well do better, having either more or longer unchanged fields.
Assuming that the logic isn't buggy, a point in need of further study,
I'm starting to feel like we want to have this.  And I might even be
tempted to remove the table-level off switch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Jeff Janes
On Tue, Feb 4, 2014 at 8:06 AM, Andrew Dunstan  wrote:

>
> On 02/04/2014 10:43 AM, Tom Lane wrote:
>
>>
>> Ugh.  This problem was bad enough when I thought that it would only lead
>> to link-time errors detectable in the buildfarm.  If it can lead to errors
>> only observable at runtime --- and maybe not obvious even then --- then
>> I think we *have to* do something about it.  By that I mean that we must
>> get rid of the need to manually plaster PGDLLIMPORT on global variables.
>>
>> Anybody with a Windows build environment want to test the "#define extern"
>> trick?
>>
>>
>>
>
>
>
> We have details on how to build with Mingw/Msys on Windows on an Amazon VM
>  which is either
> free or very cheap. Do I need to give instructions on how to do this for
> MSVC builds too? It's really not terribly hard.
>


If you gave step by step instructions like the ones for MinGW, I would at
least give it a try.  Last time a looked into it, I gave up after I
couldn't figure out which of the umpteen similarly-named products was the
one I needed to buy/download-for-free/register-and-download and then
install, and I tried a few of them at random without much success.

Cheers,

Jeff


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Joshua D. Drake


On 02/04/2014 09:34 AM, Tom Lane wrote:

Andrew Dunstan  writes:

If someone volunteered to pay for the storage, I'd be prepared to make
some time to create an AMI to reduce the startup time dramatically.
Basically it would be "boot the AMI and start testing your patches". I'd
even make it as friendly as possible for people who don't like to get
too far from unix-ish environments.


My own opinion is that I've already wasted untold man-hours thanks to
the random porting problems induced by Windows, a platform that I never
have and never will care about personally.  I will *not* spend my own
time doing tests that someone else could do.  If we can't get some
effort contributed by someone who does use that platform, I'm personally
prepared to declare the entire damn thing no longer supported.


Although that is obviously your prerogative it is important to remember 
that Windows is easily the second most used version of PostgreSQL out 
there (behind Linux).


I know many people that run it on Windows, especially in the medical 
field. I also know many people that embed it (Apple not withstanding).


Yes it is an obnoxious platform but it is THE platform, no matter how 
much we like to tout our Linux(Hiker) creds.


Sincerely,

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Andrew Dunstan  writes:
> If someone volunteered to pay for the storage, I'd be prepared to make 
> some time to create an AMI to reduce the startup time dramatically. 
> Basically it would be "boot the AMI and start testing your patches". I'd 
> even make it as friendly as possible for people who don't like to get 
> too far from unix-ish environments.

My own opinion is that I've already wasted untold man-hours thanks to
the random porting problems induced by Windows, a platform that I never
have and never will care about personally.  I will *not* spend my own
time doing tests that someone else could do.  If we can't get some
effort contributed by someone who does use that platform, I'm personally
prepared to declare the entire damn thing no longer supported.

regards, tom lane




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 11:30 AM, Andres Freund wrote:





We have details on how to build with Mingw/Msys on Windows on an Amazon

VM  which is
either
free or very cheap. Do I need to give instructions on how to do this
for
MSVC builds too? It's really not terribly hard.

Err. It might not be very hard but it certainly is time consuming. And that for 
people not caring about windows.

If there were usable, regularly refreshed, instances out there, it'd be
slightly less bad. But this is still by far the most annoying and intrusive
platform to care about.




If someone volunteered to pay for the storage, I'd be prepared to make 
some time to create an AMI to reduce the startup time dramatically. 
Basically it would be "boot the AMI and start testing your patches". I'd 
even make it as friendly as possible for people who don't like to get 
too far from unix-ish environments.


cheers

andrew




Re: [HACKERS] specifying repeatable read in PGOPTIONS

2014-02-04 Thread Tom Lane
Andres Freund  writes:
> On 2014-02-04 11:36:22 -0500, Tom Lane wrote:
>> -1.  This is not a general solution to the problem.  There are other
>> GUCs for which people might want spaces in the value.

> Sure, I didn't say it was. But I don't see any oother values that are
> likely being passed via PGOPTIONS that frequently contain spaces.

application_name --- weren't we just reading about people passing entire
command lines there?  (They must be using some other way of setting it
currently, but PGOPTIONS doesn't seem like an implausible source.)

>> Yeah.  See pg_split_opts(), which explicitly acknowledges that it'll fall
>> down for space-containing options.  Not sure what the most appropriate
>> quoting convention would be there, but I'm sure we can think of something.

> No argument against introducing it. What about simply allowing escaping
> of the next character using \?

The same thought had occurred to me.  Since it'll typically already be
inside some levels of quoting, any quoted-string convention seems like
it'd be a pain to use.  But a straight backslash-escapes-the-next-char
thing wouldn't be too awful, I think.
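
To make the escaping rule concrete, a sketch of a splitter that honors it
(this is illustrative only --- the function name is made up and it is not
the actual pg_split_opts() code):

#include <ctype.h>

/*
 * Split a space-separated option string into argv-style pieces, letting a
 * backslash escape the following character, so that a value such as
 * default_transaction_isolation=repeatable\ read survives as one argument.
 */
static void
split_opts_with_escapes(char **argv, int *argcp, char *optstr)
{
    while (*optstr)
    {
        char       *dst;

        while (isspace((unsigned char) *optstr))
            optstr++;
        if (*optstr == '\0')
            break;

        argv[(*argcp)++] = dst = optstr;

        /* copy until unescaped whitespace; '\' escapes the next character */
        while (*optstr && !isspace((unsigned char) *optstr))
        {
            if (*optstr == '\\' && optstr[1] != '\0')
                optstr++;
            *dst++ = *optstr++;
        }
        if (*optstr)
            optstr++;
        *dst = '\0';
    }
}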

regards, tom lane




Re: [HACKERS] extension_control_path

2014-02-04 Thread David E. Wheeler
On Jan 30, 2014, at 10:06 AM, Sergey Muraviov  
wrote:

> Now it looks fine for me.

Just as another data point, I recently submitted pgTAP to the Homebrew project.
This is the build-from-source system for OS X, used by a lot of web developers. 
In my build script, I originally had

   depends_on :postgresql

Which means, “require any version of PostgreSQL.” But then tests failed on OS X 
Server, which includes a system-distributed PostgreSQL. Homebrew installs 
everything in /usr/local, and not only does it disallow installing anything 
outside of that directory, it doesn’t have any permissions to do so. The 
install failed, of course, because extensions want to install in 
$PGROOT/share/extensions. For now, I had to change it to

   depends_on 'postgresql'

A subtle difference that means, “require the latest version of the 
Homebrew-built PostgreSQL in /usr/local.”

However, if extension_control_path was supported, I could change it back to 
requiring any Postgres and install pgTAP somewhere under /usr/local, as 
required for Homebrew. Then all the user would have to do to use it with their 
preferred Postgres would be to set extension_control_path.

In other words, I am strongly in favor of this patch, as it gives distribution 
systems a lot more flexibility (for better and for worse) in determining where 
extensions should be installed.

My $0.02.

Best,

David



Re: [HACKERS] specifying repeatable read in PGOPTIONS

2014-02-04 Thread Andres Freund
On 2014-02-04 11:36:22 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > PGOPTIONS='-c default_transaction_isolation=serializable' \
> > psql ... -c "SHOW default_transaction_isolation"
> > works well enough, but
> > PGOPTIONS='-c default_transaction_isolation=repeatable read' \
> > psql ... -c "SHOW default_transaction_isolation"
> > doesn't, because of the whitespace. I couldn't come up with any adequate
> > quoting.
> 
> > I'd like to propose adding aliases with dashes instead of spaces to the
> > isolation_level_options array? I'd even like to backport it, because it
> > makes benchmarking across versions unneccessarily hard.
> 
> -1.  This is not a general solution to the problem.  There are other
> GUCs for which people might want spaces in the value.

Sure, I didn't say it was. But I don't see any other values that are
likely being passed via PGOPTIONS that frequently contain spaces. Sure,
you can generate a search_path that does so, but that's just asking for
problems. Most other GUCs that can contain spaces are
PGC_SIGHUP/POSTMASTER.
And having to use quoting just makes it awkward to use from shell. Since
all the other option values try to take care not to force using spaces, I see
little reason not to do so here as well.

> > Additionally we might want to think about a bit better quoting support
> > for such options?
> 
> Yeah.  See pg_split_opts(), which explicitly acknowledges that it'll fall
> down for space-containing options.  Not sure what the most appropriate
> quoting convention would be there, but I'm sure we can think of something.

No argument against introducing it. What about simply allowing escaping
of the next character using \?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Erik Rijkers
CentOS Release 6.5 (final)
AMD FX(tm)-8120 Eight-Core
2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 x86_64 
x86_64 GNU/Linux
memory: 8GB

I am testing nested hstore, on a server with both with these patches:
  jsonb-9.patch.gz
  nested-hstore-9.patch.gz

One of the first tries brings down the server (gobbles up too much memory, I 
think).

When I run:


select version();

drop   table if exists t_1000 ;
create table   t_1000 (
idserial
  , hshstore primary key
  , hs2   hstore
);
insert into t_1000 (hs,hs2)
  select
   ( '['   || i || ','  || i|| ']' ) ::hstore
,  ( '{'   || i || '=>' || i|| '}' ) ::hstore
  from generate_series(1, 1000) as f(i)
;
-


I get:

$ time psql -af nestedhs.sql
\timing on
Timing is on.
select version();
 version
--
 PostgreSQL 9.4devel_nested_hstore_20140204_0814_00d4f2a on 
x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.2, 64-bit
(1 row)

Time: 1.288 ms
drop   table if exists t_1000 ;
DROP TABLE
Time: 0.808 ms
create table   t_1000 (
idserial
  , hshstore primary key
  , hs2   hstore
);
CREATE TABLE
Time: 111.397 ms
insert into t_1000 (hs,hs2)
  select
  ( '['   || i || ','  || i|| ']' ) ::hstore
, ( '{'   || i || '=>' || i|| '}' ) ::hstore
  from generate_series(1, 1000) as f(i)
;
[psql:nestedhs.sql:14: connection to server was lost

real5m4.780s
user0m0.005s
sys 0m0.009s

logging:

2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was 
terminated by signal 9: Killed
2014-02-04 10:34:25.854 CET 29133 DETAIL:  Failed process was running: insert 
into t_1000 (hs,hs2)
select
 ( '['   || i || ','  || i|| ']' ) ::hstore
  ,  ( '{'   || i || '=>' || i|| '}' ) ::hstore
from generate_series(1, 1000) as f(i)
;
2014-02-04 10:34:25.884 CET 29133 LOG:  terminating any other active server 
processes
2014-02-04 10:34:28.541 CET 29133 LOG:  all server processes terminated; 
reinitializing
2014-02-04 10:34:30.002 CET 29534 LOG:  database system was interrupted; last 
known up at 2014-02-04 10:30:42 CET
2014-02-04 10:34:30.933 CET 29535 FATAL:  the database system is in recovery 
mode
2014-02-04 10:34:31.150 CET 29534 LOG:  database system was not properly shut 
down; automatic recovery in progress
2014-02-04 10:34:31.344 CET 29534 LOG:  redo starts at 1/B1CC92F8
2014-02-04 10:34:46.681 CET 29534 LOG:  unexpected pageaddr 1/86F4A000 in log 
segment 0001000100CC, offset
16031744
2014-02-04 10:34:46.681 CET 29534 LOG:  redo done at 1/CCF49F50
2014-02-04 10:34:52.039 CET 29133 LOG:  database system is ready to accept 
connections



(and btw, I end up with a large but unusable table:

testdb=# \dt+ t_1000
             List of relations
 Schema |  Name  | Type  |  Owner   |  Size   | Description
--------+--------+-------+----------+---------+-------------
 public | t_1000 | table | aardvark | 291 MB |
(1 row)

testdb=# select count(*) from t_1000;
 count
---
 0
(1 row)

)

Main .conf settings:

         setting          |               current_setting
--------------------------+-----------------------------------------------
 autovacuum               | off
 port                     | 6541
 shared_buffers           | 512MB
 effective_cache_size     | 2GB
 work_mem                 | 50MB
 maintenance_work_mem     | 1GB
 checkpoint_segments      | 20
 server_version           | 9.4devel_nested_hstore_20140204_0814_00d4f2a
 pg_postmaster_start_time | 2014-02-04 10:12 (uptime: 0d 0h 33m 42s)
 data_checksums           | off

 feature_id |    feature_name    | is_supported | is_verified_by | comments
------------+--------------------+--------------+----------------+----------
 PKG100     | project name       | YES          | ej             | nested_hstore
 PKG101     | patched            | YES          | ej             | YES
 PKG102     | patch file         | YES          | ej             | /home/aardvark/download/pgpatches/0094/nested_hstore/20140130/jsonb-9.patch +
            |                    |              |                | /home/aardvark/download/pgpatches/0094/nested_hstore/20140130/nested-hstore-9.patch
 PKG103     | build time         | YES          | ej             | 2014-02-04 08:19:13.600371+01
 PKG104     | server_version     | YES          | ej             | 9.4devel_nested_hstore_20140204_0814_00d4f2a
 PKG105     | server_version_num | YES          | ej             | 90400
 PKG106     | port               | YES          | ej             | 6541
 PKG110     | commit hash        | YES          | ej             |

Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andres Freund
On February 4, 2014 5:06:52 PM CET, Andrew Dunstan  wrote:
>
>On 02/04/2014 10:43 AM, Tom Lane wrote:
>> Andres Freund  writes:
>>> On 2014-02-04 02:10:47 -0500, Tom Lane wrote:
 Meh.  It might be that the DateStyle usage in postgres_fdw would
 accidentally fail to malfunction if it saw a bogus value of the
>variable.
 But it's hard to believe that this would be true of
>MainLWLockArray.
>>> There's not that much lwlock usage in contrib. It's just
>>> pg_stat_statements and pg_buffercache. Neither has tests... So it
>very
>>> well could be that breakage simply hasn't been observed.
>> Hm, you're right --- I'd have thought there were more of those.
>>
>> Ugh.  This problem was bad enough when I thought that it would only
>lead
>> to link-time errors detectable in the buildfarm.  If it can lead to
>errors
>> only observable at runtime --- and maybe not obvious even then ---
>then
>> I think we *have to* do something about it.  By that I mean that we
>must
>> get rid of the need to manually plaster PGDLLIMPORT on global
>variables.
>>
>> Anybody with a Windows build environment want to test the "#define
>extern"
>> trick?
>>
>>  
>
>
>
>We have details on how to build with Mingw/Msys on Windows on an Amazon
>
>VM  which is
>either 
>free or very cheap. Do I need to give instructions on how to do this
>for 
>MSVC builds too? It's really not terribly hard.

Err. It might not be very hard but it certainly is time consuming. And that for 
people not caring about windows.

If there were usable, regularly refreshed, instances out there, it'd be
slightly less bad. But this is still by far the most annoying and intrusive
platform to care about.

Andres


-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] specifying repeatable read in PGOPTIONS

2014-02-04 Thread Tom Lane
Andres Freund  writes:
> PGOPTIONS='-c default_transaction_isolation=serializable' \
> psql ... -c "SHOW default_transaction_isolation"
> works well enough, but
> PGOPTIONS='-c default_transaction_isolation=repeatable read' \
> psql ... -c "SHOW default_transaction_isolation"
> doesn't, because of the whitespace. I couldn't come up with any adequate
> quoting.

> I'd like to propose adding aliases with dashes instead of spaces to the
> isolation_level_options array? I'd even like to backport it, because it
> makes benchmarking across versions unneccessarily hard.

-1.  This is not a general solution to the problem.  There are other
GUCs for which people might want spaces in the value.

> Additionally we might want to think about a bit better quoting support
> for such options?

Yeah.  See pg_split_opts(), which explicitly acknowledges that it'll fall
down for space-containing options.  Not sure what the most appropriate
quoting convention would be there, but I'm sure we can think of something.

regards, tom lane




Re: [HACKERS] [doc patch] extra_float_digits and casting from real to numeric

2014-02-04 Thread Christoph Berg
Re: To Tom Lane 2014-01-08 <20140108094017.ga20...@msgid.df7cb.de>
> What about this patch to mention this gotcha more explicitely in the
> documentation?
> 
> diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
> new file mode 100644
> index 0386330..968f4a7
> *** a/doc/src/sgml/datatype.sgml
> --- b/doc/src/sgml/datatype.sgml
> *** NUMERIC
> *** 689,694 
> --- 689,697 
> 0, the output is the same on every platform
> supported by PostgreSQL.  Increasing it will produce output that
> more accurately represents the stored value, but may be unportable.
> +   Casts to other numeric datatypes and the to_char
> +   function are not affected by this setting, it affects only the text
> +   representation.
>
>   
>   

Anyone for that patch?

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 10:43 AM, Tom Lane wrote:

Andres Freund  writes:

On 2014-02-04 02:10:47 -0500, Tom Lane wrote:

Meh.  It might be that the DateStyle usage in postgres_fdw would
accidentally fail to malfunction if it saw a bogus value of the variable.
But it's hard to believe that this would be true of MainLWLockArray.

There's not that much lwlock usage in contrib. It's just
pg_stat_statements and pg_buffercache. Neither has tests... So it very
well could be that breakage simply hasn't been observed.

Hm, you're right --- I'd have thought there were more of those.

Ugh.  This problem was bad enough when I thought that it would only lead
to link-time errors detectable in the buildfarm.  If it can lead to errors
only observable at runtime --- and maybe not obvious even then --- then
I think we *have to* do something about it.  By that I mean that we must
get rid of the need to manually plaster PGDLLIMPORT on global variables.

Anybody with a Windows build environment want to test the "#define extern"
trick?






We have details on how to build with Mingw/Msys on Windows on an Amazon 
VM, which is either free or very cheap. Do I need to give instructions on 
how to do this for MSVC builds too? It's really not terribly hard.



cheers

andrew





Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Andres Freund  writes:
> On 2014-02-04 02:10:47 -0500, Tom Lane wrote:
>> Meh.  It might be that the DateStyle usage in postgres_fdw would
>> accidentally fail to malfunction if it saw a bogus value of the variable.
>> But it's hard to believe that this would be true of MainLWLockArray.

> There's not that much lwlock usage in contrib. It's just
> pg_stat_statements and pg_buffercache. Neither has tests... So it very
> well could be that breakage simply hasn't been observed.

Hm, you're right --- I'd have thought there were more of those.

Ugh.  This problem was bad enough when I thought that it would only lead
to link-time errors detectable in the buildfarm.  If it can lead to errors
only observable at runtime --- and maybe not obvious even then --- then
I think we *have to* do something about it.  By that I mean that we must
get rid of the need to manually plaster PGDLLIMPORT on global variables.

Anybody with a Windows build environment want to test the "#define extern"
trick?
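
In case it helps anyone set up such a test, here is a rough, untested sketch of
what the trick might look like when compiling contrib/extension code on
Windows. Where to inject it (a forced include vs. a command-line define) is an
open question, and redefining a keyword is formally non-conforming, which is
exactly the part that needs testing:

/*
 * Hypothetical illustration, not tested code.  The idea is that every
 * 'extern' data declaration pulled in from backend headers gets
 * __declspec(dllimport) attached, so no per-variable PGDLLIMPORT marking
 * is needed.  BUILDING_DLL is the macro the backend build already uses to
 * choose between dllexport and dllimport.
 */
#if defined(WIN32) && !defined(BUILDING_DLL)
#define extern extern __declspec(dllimport)
#endif

#include "postgres.h"			/* extern symbols in here would now import */
#include "miscadmin.h"

#if defined(WIN32) && !defined(BUILDING_DLL)
#undef extern					/* restore the keyword for this file's own code */
#endif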

regards, tom lane




Re: [HACKERS] could not create IPv6 socket (AI_ADDRCONFIG)

2014-02-04 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Kyotaro HORIGUCHI  writes:
>>> Hello, I have often seen inquiries about a log message from
>>> PostgreSQL server.
>>> LOG:  could not create IPv6 socket: Address family not supported by protocol

>> That's merely a harmless log message.

> How about just adding a HINT?

Hmm ... maybe, but how would you phrase the hint exactly?

regards, tom lane




Re: [HACKERS] could not create IPv6 socket (AI_ADDRCONFIG)

2014-02-04 Thread Alvaro Herrera
Tom Lane wrote:
> Kyotaro HORIGUCHI  writes:
> > Hello, I have often seen inquiries about a log message from
> > PostgreSQL server.
> >> LOG:  could not create IPv6 socket: Address family not supported by 
> >> protocol
> 
> That's merely a harmless log message.


> If we're concerned about users worrying about log messages from
> this, I'd rather see us downgrade those log messages to DEBUG level
> than risk breaking the code with behaviors that were proven to be
> a bad idea a decade ago.  But TBH I see no strong need to do anything
> here.

How about just adding a HINT?
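
For concreteness, a sketch of how it might hang off the existing LOG message
(the hint wording here is made up for the sake of the example, and familyDesc
stands for the address-family string the surrounding code already has):

	ereport(LOG,
			(errcode_for_socket_access(),
			 errmsg("could not create %s socket: %m", familyDesc),
			 errhint("This is harmless if the machine has no %s support; "
					 "the server simply will not listen on %s addresses.",
					 familyDesc, familyDesc)));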

-- 
Álvaro Herrera      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] specifying repeatable read in PGOPTIONS

2014-02-04 Thread Andres Freund
Hi,

I recently had the need to bury the used isolation level in the
connection string, but it turns out that doesn't work that well...

PGOPTIONS='-c default_transaction_isolation=serializable' \
psql ... -c "SHOW default_transaction_isolation"
works well enough, but
PGOPTIONS='-c default_transaction_isolation=repeatable read' \
psql ... -c "SHOW default_transaction_isolation"
doesn't, because of the whitespace. I couldn't come up with any adequate
quoting.

I'd like to propose adding aliases with dashes instead of spaces to the
isolation_level_options array? I'd even like to backport it, because it
makes benchmarking across versions unnecessarily hard.
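
For illustration, the aliases could look roughly like the sketch below in
guc.c, assuming the array keeps its current config_enum_entry layout; the
hidden flag would keep the aliases out of the list of accepted values shown to
users. This is a sketch of the proposal, not a tested patch:

static const struct config_enum_entry isolation_level_options[] = {
	{"serializable", XACT_SERIALIZABLE, false},
	{"repeatable read", XACT_REPEATABLEREAD, false},
	{"repeatable-read", XACT_REPEATABLEREAD, true},		/* proposed alias */
	{"read committed", XACT_READ_COMMITTED, false},
	{"read-committed", XACT_READ_COMMITTED, true},		/* proposed alias */
	{"read uncommitted", XACT_READ_UNCOMMITTED, false},
	{"read-uncommitted", XACT_READ_UNCOMMITTED, true},	/* proposed alias */
	{NULL, 0}
};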

Additionally we might want to think about a bit better quoting support
for such options?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Misaligned BufferDescriptors causing major performance problems on AMD

2014-02-04 Thread Andres Freund
On 2014-02-04 00:38:19 +0100, Andres Freund wrote:
> > > A quick hack (attached) making BufferDescriptor 64byte aligned indeed
> > > restored performance across all max_connections settings. It's not
> > > surprising that a misaligned buffer descriptor causes problems -
> > > there'll be plenty of false sharing of the spinlocks otherwise. Curious
> > > that the intel machine isn't hurt much by this.
> >
> > I think that is explained here:
> >
> > http://www.agner.org/optimize/blog/read.php?i=142&v=t
> >
> > With Sandy Bridge, "Misaligned memory operands [are] handled efficiently".
> 
> No, I don't think so. Those improvements afair refer to unaligned
> accesses as in accessing a 4 byte variable at address % 4 != 0.

So, Christian did some benchmarking on the intel machine, and his
results were also lower than mine, and I've since confirmed that it's
also possible to reproduce the alignment problems on the intel machine.

Which imo means fixing this got more important...
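
To make the idea concrete, the shape of a fix could be something like the
padding sketch below: force each descriptor onto its own cache line so that
neighbouring descriptors (and their spinlocks) can never share one. This is an
illustration only; BufferDescStandIn stands in for the real struct in
buf_internals.h, and 64 bytes is just the line size of the machines discussed
here:

#define CACHE_LINE_SIZE 64

/* Stand-in for the real BufferDesc, which lives in buf_internals.h. */
typedef struct BufferDescStandIn
{
	int		buf_id;
	int		flags;
	/* tag, refcount, usage count, I/O lock, ... */
	char	spinlock;
} BufferDescStandIn;

/*
 * Pad every descriptor to a full cache line.  The shared array would then be
 * allocated as an array of BufferDescPadded, with the base address itself
 * aligned to CACHE_LINE_SIZE, so no two descriptors share a line and the
 * false sharing of their spinlocks goes away.
 */
typedef union BufferDescPadded
{
	BufferDescStandIn	desc;
	char				pad[CACHE_LINE_SIZE];
} BufferDescPadded;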

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Andres Freund
On 2014-02-04 21:04:13 +0900, Michael Paquier wrote:
> On Tue, Feb 4, 2014 at 7:22 PM, Andres Freund  wrote:
> > On 2014-02-04 19:17:51 +0900, Michael Paquier wrote:
> >> @@ -180,7 +175,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
> >> else
> >> nulls[i++] = true;
> >> if (restart_lsn != InvalidTransactionId)
> >> -   values[i++] = CStringGetTextDatum(restart_lsn_s);
> >> +   values[i++] = restart_lsn;
> >> else
> >> nulls[i++] = true;
> >
> > Isn't that missing a LSNGetDatum()?
> Oops yes. Will fix.
> 
> > Also, isn't it lacking the corresponding pg_proc change?
> restart_lsn is the 6th argument of pg_get_replication_slots, and the
> list of arguments of this function is already changed like that in my
> patch:
> {25,25,26,16,28,25} => {25,25,26,16,28,3220}
> Regards,

Ok.

I think the patch should also adapt pageinspect...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Michael Paquier
On Tue, Feb 4, 2014 at 7:22 PM, Andres Freund  wrote:
> On 2014-02-04 19:17:51 +0900, Michael Paquier wrote:
>> @@ -180,7 +175,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
>> else
>> nulls[i++] = true;
>> if (restart_lsn != InvalidTransactionId)
>> -   values[i++] = CStringGetTextDatum(restart_lsn_s);
>> +   values[i++] = restart_lsn;
>> else
>> nulls[i++] = true;
>
> Isn't that missing a LSNGetDatum()?
Oops yes. Will fix.

> Also, isn't it lacking the corresponding pg_proc change?
restart_lsn is the 6th argument of pg_get_replication_slots, and the
list of arguments of this function is already changed like that in my
patch:
{25,25,26,16,28,25} => {25,25,26,16,28,3220}
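
For clarity, the corrected hunk would then presumably read roughly like this
(sketch only; it also compares against InvalidXLogRecPtr, which seems to be
what was meant rather than InvalidTransactionId):

		if (restart_lsn != InvalidXLogRecPtr)
			values[i++] = LSNGetDatum(restart_lsn);
		else
			nulls[i++] = true;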
Regards,
-- 
Michael




Re: [HACKERS] [bug fix] postgres.exe fails to start on Windows Server 2012 due to ASLR

2014-02-04 Thread Craig Ringer
On 02/04/2014 07:28 PM, MauMau wrote:
>>
> 
> Please don't worry; I didn't misunderstand your intent.  I think we
> should apply this in the next minor release to avoid unnecessary
> confusion -- more new users will use PostgreSQL on Windows 8/2012 and
> hit this problem.
> 
> I added this patch to the CommitFest.

It's really a bugfix suitable for backpatching IMO.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] inherit support for foreign tables

2014-02-04 Thread Robert Haas
On Sun, Feb 2, 2014 at 10:15 PM, Etsuro Fujita
 wrote:
> Allowing ALTER COLUMN SET STORAGE on foreign tables would make sense if, for
> example, "SELECT * INTO local_table FROM foreign_table" created a new local
> table whose columns have the storage types associated with those of the
> foreign table?

Seems like a pretty weak argument.  It's not that we can't find
strange corner cases where applying SET STORAGE to a foreign table
doesn't do something; it's that they *are* strange corner cases.  The
options as we normally don't understand them just aren't sensible in
this context, and a good deal of work has been put into an alternative
options framework, which is what authors of FDWs ought to be using.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [bug fix] postgres.exe fails to start on Windows Server 2012 due to ASLR

2014-02-04 Thread MauMau

From: "Craig Ringer" 

I completely agree; just saying that any installer can set the key.

I'm convinced that setting this flag is appropriate, at least while Pg
relies on having the shared memory segment mapped in the same zone in
every executable. Just pointing out that there's a workaround in the
mean time.


Please don't worry; I didn't misunderstand your intent.  I think we should 
apply this in the next minor release to avoid unnecessary confusion -- more 
new users will use PostgreSQL on Windows 8/2012 and hit this problem.


I added this patch to the CommitFest.

Regards
MauMau





Re: [HACKERS] [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation

2014-02-04 Thread MauMau
I'm sorry, I'm replying to an older mail, because I lost your latest mail by 
mistake.



Ah. Sorry, I missed that part. As NTFS junctions and symbolic links are
different (although they behave similarly), there seems to be only a minor
inconvenience related to a misleading error message, i.e.


You are right.  Fixed.

Regards
MauMau


remove_tblspc_symlink_v4.patch
Description: Binary data



Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Andres Freund
On 2014-02-04 19:17:51 +0900, Michael Paquier wrote:
> On Tue, Feb 4, 2014 at 6:15 PM, Andres Freund  wrote:
> >
> >> + /*--
> >> +  *  Relational operators for LSNs
> >> +  *-*/
> >
> > Isn't it just operators? They aren't really relational...
> Operators for LSNs?

Fine with me.

> >> + #define DatumGetLogSeqNum(X) ((XLogRecPtr) GET_8_BYTES(X))
> >
> > I am not a fan of LogSeqNum. I think at this point fewer people
> > understand that than LSN. There's also no reason to invent a third term
> > for LSNs. We'd have LSN, XLogRecPtr, and LogSeqNum.
> So let's go with DatumGetLSN and LSNGetDatum instead...

Sup.
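
For reference, a minimal sketch of the agreed naming, mirroring the
GET_8_BYTES style used in the posted patch (this assumes 64-bit by-value
Datums; a build without USE_FLOAT8_BYVAL would presumably need to go through
Int64GetDatum/DatumGetInt64 instead):

#define DatumGetLSN(X)		((XLogRecPtr) GET_8_BYTES(X))
#define LSNGetDatum(X)		((Datum) SET_8_BYTES(X))

#define PG_GETARG_LSN(n)	DatumGetLSN(PG_GETARG_DATUM(n))
#define PG_RETURN_LSN(x)	return LSNGetDatum(x)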

> >> *** a/src/backend/replication/slotfuncs.c
> >> --- b/src/backend/replication/slotfuncs.c
> >> ***
> >> *** 141,148  pg_get_replication_slots(PG_FUNCTION_ARGS)
> >>   boolactive;
> >>   Oid database;
> >>   const char *slot_name;
> >> -
> >> - charrestart_lsn_s[MAXFNAMELEN];
> >>   int i;
> >>
> >>   SpinLockAcquire(&slot->mutex);
> >> --- 141,146 
> >
> > Unrelated change.
> Funnily, the patch attached in my previous mail did not include all
> the diffs; it is an error with filterdiff, which I use to generate
> context diff patches... My original branch includes the following

Ah, then it makes more sense.

> diffs as well in slotfuncs.c for the second patch:
> diff --git a/src/backend/replication/slotfuncs.c
> b/src/backend/replication/slotfuncs.c
> index 98a860e..68ecdcd 100644
> --- a/src/backend/replication/slotfuncs.c
> +++ b/src/backend/replication/slotfuncs.c
> @@ -141,8 +141,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
> boolactive;
> Oid database;
> const char *slot_name;
> -
> -   charrestart_lsn_s[MAXFNAMELEN];
> int i;
> 
> SpinLockAcquire(&slot->mutex);
> @@ -164,9 +162,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
> 
> memset(nulls, 0, sizeof(nulls));
> 
> -   snprintf(restart_lsn_s, sizeof(restart_lsn_s), "%X/%X",
> -(uint32) (restart_lsn >> 32),
> (uint32) restart_lsn);
> -
> i = 0;
> values[i++] = CStringGetTextDatum(slot_name);
> if (database == InvalidOid)
> @@ -180,7 +175,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
> else
> nulls[i++] = true;
> if (restart_lsn != InvalidTransactionId)
> -   values[i++] = CStringGetTextDatum(restart_lsn_s);
> +   values[i++] = restart_lsn;
> else
> nulls[i++] = true;

Isn't that missing a LSNGetDatum()? Also, isn't it lacking the
corresponding pg_proc change?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Michael Paquier
On Tue, Feb 4, 2014 at 6:15 PM, Andres Freund  wrote:
>
>> + /*--
>> +  *  Relational operators for LSNs
>> +  *-*/
>
> Isn't it just operators? They aren't really relational...
Operators for LSNs?

>> *** 302,307  extern struct varlena *pg_detoast_datum_packed(struct 
>> varlena * datum);
>> --- 303,309 
>>   #define PG_RETURN_CHAR(x)return CharGetDatum(x)
>>   #define PG_RETURN_BOOL(x)return BoolGetDatum(x)
>>   #define PG_RETURN_OID(x) return ObjectIdGetDatum(x)
>> + #define PG_RETURN_LSN(x) return LogSeqNumGetDatum(x)
>>   #define PG_RETURN_POINTER(x) return PointerGetDatum(x)
>>   #define PG_RETURN_CSTRING(x) return CStringGetDatum(x)
>>   #define PG_RETURN_NAME(x)return NameGetDatum(x)
>> *** a/src/include/postgres.h
>> --- b/src/include/postgres.h
>> ***
>> *** 484,489  typedef Datum *DatumPtr;
>> --- 484,503 
>>   #define ObjectIdGetDatum(X) ((Datum) SET_4_BYTES(X))
>>
>>   /*
>> +  * DatumGetLogSeqNum
>> +  *  Returns log sequence number of a datum.
>> +  */
>> +
>> + #define DatumGetLogSeqNum(X) ((XLogRecPtr) GET_8_BYTES(X))
>
> > I am not a fan of LogSeqNum. I think at this point fewer people
> understand that than LSN. There's also no reason to invent a third term
> for LSNs. We'd have LSN, XLogRecPtr, and LogSeqNum.
So let's go with DatumGetLSN and LSNGetDatum instead...

>> *** a/src/backend/replication/slotfuncs.c
>> --- b/src/backend/replication/slotfuncs.c
>> ***
>> *** 141,148  pg_get_replication_slots(PG_FUNCTION_ARGS)
>>   boolactive;
>>   Oid database;
>>   const char *slot_name;
>> -
>> - charrestart_lsn_s[MAXFNAMELEN];
>>   int i;
>>
>>   SpinLockAcquire(&slot->mutex);
>> --- 141,146 
>
> Unrelated change.
Funnily, the patch attached in my previous mail did not include all
the diffs; it is an error with filterdiff, which I use to generate
context diff patches... My original branch includes the following
diffs as well in slotfuncs.c for the second patch:
diff --git a/src/backend/replication/slotfuncs.c
b/src/backend/replication/slotfuncs.c
index 98a860e..68ecdcd 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -141,8 +141,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
boolactive;
Oid database;
const char *slot_name;
-
-   charrestart_lsn_s[MAXFNAMELEN];
int i;

SpinLockAcquire(&slot->mutex);
@@ -164,9 +162,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)

memset(nulls, 0, sizeof(nulls));

-   snprintf(restart_lsn_s, sizeof(restart_lsn_s), "%X/%X",
-(uint32) (restart_lsn >> 32),
(uint32) restart_lsn);
-
i = 0;
values[i++] = CStringGetTextDatum(slot_name);
if (database == InvalidOid)
@@ -180,7 +175,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
nulls[i++] = true;
if (restart_lsn != InvalidTransactionId)
-   values[i++] = CStringGetTextDatum(restart_lsn_s);
+   values[i++] = restart_lsn;
else
nulls[i++] = true;

Anything else?
-- 
Michael




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andres Freund
On 2014-02-04 02:10:47 -0500, Tom Lane wrote:
> Amit Kapila  writes:
> > In the function where it is used, it seems to me that it is setting DateStyle
> > as ISO if it is not ISO, and configure_remote_session() will set it to ISO
> > initially. So basically even if the value of DateStyle is junk, it will just
> > do what we wanted, i.e. setting it to ISO. Is the failure due to the fact
> > that it is not expecting DateStyle to be ISO, and due to the junk value of
> > this parameter it sets the same to ISO?
> 
> Meh.  It might be that the DateStyle usage in postgres_fdw would
> accidentally fail to malfunction if it saw a bogus value of the variable.
> But it's hard to believe that this would be true of MainLWLockArray.

There's not that much lwlock usage in contrib. It's just
pg_stat_statements and pg_buffercache. Neither has tests... So it very
well could be that breakage simply hasn't been observed.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Andres Freund
Hi,

On 2014-02-04 10:23:14 +0900, Michael Paquier wrote:
> On Tue, Feb 4, 2014 at 10:10 AM, Tom Lane  wrote:
> > Michael Paquier  writes:
> >> Please find attached a patch implementing lsn as a datatype, based on
> >> the one Robert wrote a couple of years ago.
> >
> >> Patch contains regression tests as well as a bit of documentation.
> >> Perhaps this is too late for 9.4, so if there are no objections I'll
> >> simply add this patch to the next commit fest in June for 9.5.
> >
> > I may have lost count, but aren't a bunch of the affected functions new
> > in 9.4?  If so, there's a good argument to be made that we should get
> > this in now, rather than waiting and having an API change for those
> > functions in 9.5.

Yes, that sounds sensible.

> + /*--
> +  *  Relational operators for LSNs
> +  *-*/

Isn't it just operators? They aren't really relational...


> *** 302,307  extern struct varlena *pg_detoast_datum_packed(struct 
> varlena * datum);
> --- 303,309 
>   #define PG_RETURN_CHAR(x)return CharGetDatum(x)
>   #define PG_RETURN_BOOL(x)return BoolGetDatum(x)
>   #define PG_RETURN_OID(x) return ObjectIdGetDatum(x)
> + #define PG_RETURN_LSN(x) return LogSeqNumGetDatum(x)
>   #define PG_RETURN_POINTER(x) return PointerGetDatum(x)
>   #define PG_RETURN_CSTRING(x) return CStringGetDatum(x)
>   #define PG_RETURN_NAME(x)return NameGetDatum(x)
> *** a/src/include/postgres.h
> --- b/src/include/postgres.h
> ***
> *** 484,489  typedef Datum *DatumPtr;
> --- 484,503 
>   #define ObjectIdGetDatum(X) ((Datum) SET_4_BYTES(X))
>   
>   /*
> +  * DatumGetLogSeqNum
> +  *  Returns log sequence number of a datum.
> +  */
> + 
> + #define DatumGetLogSeqNum(X) ((XLogRecPtr) GET_8_BYTES(X))

I am not a fan of LogSeqNum. I think at this point fewer people
understand that than LSN. There's also no reason to invent a third term
for LSNs. We'd have LSN, XLogRecPtr, and LogSeqNum.

> *** a/src/backend/replication/slotfuncs.c
> --- b/src/backend/replication/slotfuncs.c
> ***
> *** 141,148  pg_get_replication_slots(PG_FUNCTION_ARGS)
>   boolactive;
>   Oid database;
>   const char *slot_name;
> - 
> - charrestart_lsn_s[MAXFNAMELEN];
>   int i;
>   
>   SpinLockAcquire(&slot->mutex);
> --- 141,146 

Unrelated change.


Looks reasonable on a first look. Thanks!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2014-02-04 Thread Rajeev rastogi

On 4th February 2014, Christian Kruse wrote:
> On 04/02/14 12:38, Fujii Masao wrote:
> > ISTM that the phrase "Request queue" is not used much around the lock.
> > Using the phrase "wait queue" or Simon's suggestion sounds better to
> > at least me.
> > Thought?
> 
> Sounds reasonable to me. Attached patch changes messages to the
> following:
> 
> Process holding the lock: A. Wait queue: B.
> Processes holding the lock: A, B. Wait queue: C.

This looks good to me also.

Thanks and Regards,
Kumar Rajeev Rastogi






Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-02-04 Thread Andres Freund
On 2014-02-03 17:51:20 -0800, Peter Geoghegan wrote:
> On Sun, Feb 2, 2014 at 6:00 AM, Andres Freund  wrote:
> > On 2014-02-01 19:47:29 -0800, Peter Geoghegan wrote:
> >> Here are the results of a benchmark on Nathan Boley's 64-core, 4
> >> socket server: 
> >> http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/amd-4-socket-rwlocks/
> >
> > That's interesting. The maximum number of what you see here (~293125)
> > is markedly lower than what I can get.
> >
> > ... poke around ...
> >
> > Hm, that's partially because you're using pgbench without -M prepared if
> > I see that correctly. The bottleneck in that case is primarily memory
> > allocation. But even after that I am getting higher
> > numbers: ~342497.
> >
> > Trying to nail down the difference, it oddly seems to be your
> > max_connections=80 vs my 100. The profile in both cases is markedly
> > different, way much more spinlock contention with 80. All in
> > Pin/UnpinBuffer().
> 
> I updated this benchmark, with your BufferDescriptors alignment patch
> [1] applied on top of master (while still not using "-M prepared" in
> order to keep the numbers comparable). So once again, that's:
> 
> http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/amd-4-socket-rwlocks/
> 
> It made a bigger, fairly noticeable difference, but not so big a
> difference as you describe here. Are you sure that you saw this kind
> of difference with only 64 clients, as you mentioned elsewhere [1]
> (perhaps you fat-fingered [1] -- "-cj" is ambiguous)? Obviously
> max_connections is still 80 in the above. Should I have gone past 64
> clients to see the problem? The best numbers I see with the [1] patch
> applied on master is only ~327809 for -S 10 64 clients. Perhaps I've
> misunderstood.

That's likely -M prepared.  It was with -c 64 -j 64...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2014-02-04 Thread Christian Kruse
Hi,

On 04/02/14 12:38, Fujii Masao wrote:
> ISTM that the phrase "Request queue" is not used much around the lock.
> Using the phrase "wait queue" or Simon's suggestion sounds better to at least 
> me.
> Thought?

Sounds reasonable to me. Attached patch changes messages to the following:

Process holding the lock: A. Wait queue: B.
Processes holding the lock: A, B. Wait queue: C.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/doc/src/sgml/sources.sgml b/doc/src/sgml/sources.sgml
index 881b0c3..aa20807 100644
--- a/doc/src/sgml/sources.sgml
+++ b/doc/src/sgml/sources.sgml
@@ -251,6 +251,15 @@ ereport(ERROR,


 
+ errdetail_log_plural(const char *fmt_singular, const char
+ *fmt_plural, unsigned long n, ...) is like
+ errdetail_log, but with support for various plural forms of
+ the message.
+ For more information see .
+
+   
+   
+
  errhint(const char *msg, ...) supplies an optional
  hint message; this is to be used when offering suggestions
  about how to fix the problem, as opposed to factual details about
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index fb449a8..9620f6e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1209,13 +1209,23 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+		lock_waiters_sbuf,
+		lock_holders_sbuf;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first_holder = true,
+		first_waiter = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&lock_waiters_sbuf);
+			initStringInfo(&lock_holders_sbuf);
+
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 	   lockmode);
@@ -1225,10 +1235,67 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			/*
+			 * we loop over the lock's procLocks to gather a list of all
+			 * holders and waiters. Thus we will be able to provide more
+			 * detailed information for lock debugging purposes.
+			 *
+			 * lock->procLocks contains all processes which hold or wait for
+			 * this lock.
+			 */
+
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+			   offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+/*
+ * we are a waiter if myProc->waitProcLock == proclock; we are
+ * a holder if it is NULL or something different
+ */
+if (proclock->tag.myProc->waitProcLock == proclock)
+{
+	if (first_waiter)
+	{
+		appendStringInfo(&lock_waiters_sbuf, "%d",
+		 proclock->tag.myProc->pid);
+		first_waiter = false;
+	}
+	else
+		appendStringInfo(&lock_waiters_sbuf, ", %d",
+		 proclock->tag.myProc->pid);
+}
+else
+{
+	if (first_holder)
+	{
+		appendStringInfo(&lock_holders_sbuf, "%d",
+		 proclock->tag.myProc->pid);
+		first_holder = false;
+	}
+	else
+		appendStringInfo(&lock_holders_sbuf, ", %d",
+		 proclock->tag.myProc->pid);
+
+	lockHoldersNum++;
+}
+
+proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+			   offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 ereport(LOG,
 		(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-			  MyProcPid, modename, buf.data, msecs, usecs)));
+MyProcPid, modename, buf.data, msecs, usecs),
+		 (errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.",
+		   "Processes holding the lock: %s. Wait queue: %s.",
+			   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 /*
@@ -1240,13 +1307,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
  */
 ereport(LOG,
 		(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-			  MyProcPid, modename, buf.data, msecs, usecs)));
+MyProcPid, modename, buf.data, msecs, usecs),
+		 (errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.",
+		   "Processes holding the lock: %s. Wait queue: %s.",
+			   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 ereport(LOG,
 		(errmsg("process %d still waiting for %s on %s after %ld.%03d ms",
-			  MyProcPid, modename, buf.

Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Amit Kapila
On Tue, Feb 4, 2014 at 12:40 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> In the function where it is used, it seems to me that it is setting DateStyle
>> as ISO if it is not ISO, and configure_remote_session() will set it to ISO
>> initially. So basically even if the value of DateStyle is junk, it will just
>> do what we wanted, i.e. setting it to ISO. Is the failure due to the fact
>> that it is not expecting DateStyle to be ISO, and due to the junk value of
>> this parameter it sets the same to ISO?
>
> Meh.  It might be that the DateStyle usage in postgres_fdw would
> accidentally fail to malfunction if it saw a bogus value of the variable.
> But it's hard to believe that this would be true of MainLWLockArray.

That's true, but for me it's failing because MainLWLockArray contains a junk value.
Now the point to look at is why it's passing on some of the buildfarm machines.
I will study this more; if you have anything specific in mind, do let me know.

Can I get the details of the machine environment on which it is passing, like
which Windows version and MSVC version? It might give some clue.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

