Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
Hi Petr,

On 2016-01-02 09:17:02 +0100, Petr Jelinek wrote:
> so the commit which triggers this issue is
> 387da18874afa17156ee3af63766f17efb53c4b9 , not sure why yet (wanted to give
> heads up early since multiple people are looking at this). Note that the
> compilation around this commit is made harder by the fact that commit
> 91fa7b4719ac583420d9143132ba4ccddefbc5b2 breaks linking and fix is few
> commits later.

Thanks for bisecting! What exactly did you use to reproduce? Shay's
npgsql binary? Or a plain psql session where you killed the client?
Given that Amit couldn't reproduce with the latter that seems an
interesting difference.

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
On 2016-01-02 14:26:47 +0100, Andres Freund wrote:
> On 2016-01-02 18:40:38 +0530, Amit Kapila wrote:
> > If we
> > remember the closed socket event and then take appropriate action,
> > then this problem won't happen.  Attached patch which by no-means
> > a complete fix shows what I wanted to say and after this the problem
> > mentioned by Shay doesn't happen, although I get LOG message
> > which is due to the reason that proper handling for socket closure
> > needs to be done in this path.  This patch is based on the code
> > after commit 387da18874afa17156ee3af63766f17efb53c4b9.  I
> > will do testing and refine the fix based on HEAD later as I am done
> > for the today.
> 
> It's weird that this fixes the problem. As we were previously, according
> to Shay, not busy looping, this seems to indicate that FD_CLOSE is only
> reported once or somesuch?
> 
> It'd be very interesting to add a debug elog() into the
>   if (resEvents.lNetworkEvents & FD_CLOSE)
>   {
>   if (wakeEvents & WL_SOCKET_READABLE)
>   result |= WL_SOCKET_READABLE;
>   if (wakeEvents & WL_SOCKET_WRITEABLE)
>   result |= WL_SOCKET_WRITEABLE;
>   }
> 
> path in WaitLatchOrSocket. If it actually returns with the current code,
> we have a better idea where to look for problems.


I wonder if the following is the problem: The docs for WSAEventSelect()
says:
"Having successfully recorded the occurrence of the network event (by
setting the corresponding bit in the internal network event record) and
signaled the associated event object, no further actions are taken for
that network event until the application makes the function call that
implicitly reenables the setting of that network event and signaling of
the associated event object."
and also notes specifically for FD_CLOSE that there's no re-enabling
functions.

See
https://msdn.microsoft.com/en-us/library/windows/desktop/ms741576%28v=vs.85%29.aspx
which goes on to talk about some level triggered events (FD_READ, ...)
and others being edge triggered. It's not clear to me from that whether
FD_CLOSE is supposed to be edge or level triggered.

If FD_CLOSE is indeed edge and not level triggered - which imo would be
supremely insane - we'd be in trouble. It'd explain why some failures
are noticed and others not.

ISTM this should relatively easily be debuggable by adding a few debug
elogs.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Petr Jelinek

On 2016-01-02 12:05, Amit Kapila wrote:

On Sat, Jan 2, 2016 at 3:16 PM, Andres Freund > wrote:

Hi Petr,

On 2016-01-02 09:17:02 +0100, Petr Jelinek wrote:
> so the commit which triggers this issue is
> 387da18874afa17156ee3af63766f17efb53c4b9 , not sure why yet (wanted to 
give
> heads up early since multiple people are looking at this). Note that the
> compilation around this commit is made harder by the fact that commit
> 91fa7b4719ac583420d9143132ba4ccddefbc5b2 breaks linking and fix is few
> commits later.

Thanks for bisecting! What exactly did you use to reproduce? Shay's
npgsql binary? Or a plain psql session where you killed the client?
Given that Amit couldn't reproduce with the latter that seems an
interesting difference.


I am also able to reproduce now. The reason was that I didn't have
latest .Net framework and Visual Studio, which is must for the recent
version of Npgsql.

One probable reason of the problem seems to be that now for windows, we
are emulating non-blocking behaviour by setting pgwin32_noblock = true
which makes function pgwin32_recv() return EWOULDBLOCK and it would
wait using WaitLatchOrSocket() instead of pgwin32_waitforsinglesocket().
There are some differences in the way both the API's (WaitLatchOrSocket()
and pgwin32_waitforsinglesocket()) do wait, now may be that is the reason
for this behaviour.  One thing I have tried is that if I don't
set pgwin32_noblock
in secure_raw_read(), then this problem won't occur which lead to above
reasoning.  I am still investigating.



Well, without pgwin32_noblock = true we never enter the code block which 
calls WaitLatchOrSocket and hangs as in my testing this was always 
called because of EWOULDBLOCK.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
On 2016-01-02 18:40:38 +0530, Amit Kapila wrote:
> What I wanted to say is that the handling of socket closure is not
> same in WaitLatchOrSocket() and pgwin32_waitforsinglesocket()
> due to which this problem can arise and it seems that is the
> right line of direction to pursue.  I have found that
> in WaitLatchOrSocket(),
> even when the socket is closed, we remember the result as
> WL_SOCKET_READABLE and again tries to wait whereas the
> same is handled properly in pgwin32_waitforsinglesocket().

That's actually intentional, and part of the design:
 * When waiting on a socket, EOF and error conditions are reported by
 * returning the socket as readable/writable or both, depending on
 * WL_SOCKET_READABLE/WL_SOCKET_WRITEABLE being specified.

The way this is supposed to work, and does on unixoid systems, is that
WaitLatchOS returns, the recv is retried and signals an error.

> If we
> remember the closed socket event and then take appropriate action,
> then this problem won't happen.  Attached patch which by no-means
> a complete fix shows what I wanted to say and after this the problem
> mentioned by Shay doesn't happen, although I get LOG message
> which is due to the reason that proper handling for socket closure
> needs to be done in this path.  This patch is based on the code
> after commit 387da18874afa17156ee3af63766f17efb53c4b9.  I
> will do testing and refine the fix based on HEAD later as I am done
> for the today.

It's weird that this fixes the problem. As we were previously, according
to Shay, not busy looping, this seems to indicate that FD_CLOSE is only
reported once or somesuch?

It'd be very interesting to add a debug elog() into the
if (resEvents.lNetworkEvents & FD_CLOSE)
{
if (wakeEvents & WL_SOCKET_READABLE)
result |= WL_SOCKET_READABLE;
if (wakeEvents & WL_SOCKET_WRITEABLE)
result |= WL_SOCKET_WRITEABLE;
}

path in WaitLatchOrSocket. If it actually returns with the current code,
we have a better idea where to look for problems.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Petr Jelinek

Hi,

so the commit which triggers this issue is 
387da18874afa17156ee3af63766f17efb53c4b9 , not sure why yet (wanted to 
give heads up early since multiple people are looking at this). Note 
that the compilation around this commit is made harder by the fact that 
commit 91fa7b4719ac583420d9143132ba4ccddefbc5b2 breaks linking and fix 
is few commits later.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Amit Kapila
On Sat, Jan 2, 2016 at 3:16 PM, Andres Freund  wrote:

> Hi Petr,
>
> On 2016-01-02 09:17:02 +0100, Petr Jelinek wrote:
> > so the commit which triggers this issue is
> > 387da18874afa17156ee3af63766f17efb53c4b9 , not sure why yet (wanted to
> give
> > heads up early since multiple people are looking at this). Note that the
> > compilation around this commit is made harder by the fact that commit
> > 91fa7b4719ac583420d9143132ba4ccddefbc5b2 breaks linking and fix is few
> > commits later.
>
> Thanks for bisecting! What exactly did you use to reproduce? Shay's
> npgsql binary? Or a plain psql session where you killed the client?
> Given that Amit couldn't reproduce with the latter that seems an
> interesting difference.
>
>
I am also able to reproduce now. The reason was that I didn't have
latest .Net framework and Visual Studio, which is must for the recent
version of Npgsql.

One probable reason of the problem seems to be that now for windows, we
are emulating non-blocking behaviour by setting pgwin32_noblock = true
which makes function pgwin32_recv() return EWOULDBLOCK and it would
wait using WaitLatchOrSocket() instead of pgwin32_waitforsinglesocket().
There are some differences in the way both the API's (WaitLatchOrSocket()
and pgwin32_waitforsinglesocket()) do wait, now may be that is the reason
for this behaviour.  One thing I have tried is that if I don't
set pgwin32_noblock
in secure_raw_read(), then this problem won't occur which lead to above
reasoning.  I am still investigating.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] pg_dump LOCK TABLE ONLY question

2016-01-02 Thread Michael Paquier
On Sat, Jan 2, 2016 at 12:28 PM, Noah Misch  wrote:
> On Sat, Oct 31, 2015 at 10:14:18AM +0100, Simon Riggs wrote:
>> I agree with Filip that this is a bug. pg_dump clearly doesn't work
>> correctly with inheritance.
>>
>> If I run this command
>>
>>   pg_dump -t tab1
>>
>> then I get a dump of "tab1".  No data is included from tables that inherit
>> tab1 because COPY refers only to the target table.
>>
>> Why should that action cause a lock to be taken on another table that
>> inherits from tab1?
>>
>> It seems clear that the user is requesting an action ONLY on tab1, so we
>> should use LOCK TABLE tab1 ONLY;
>
> +1

Hm. Looking one extra time at that, the patch upthread should as well
do the same for the parallel mode introduced in 9.3 as data is always
bumped using FROM ONLY. See for example the attached..
-- 
Michael
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index ff823e5..25d18ba 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -834,8 +834,12 @@ lockTableNoWait(ArchiveHandle *AH, TocEntry *te)
 			PQgetvalue(res, 0, 0),
 			PQgetvalue(res, 0, 1));
 
-	appendPQExpBuffer(query, "LOCK TABLE %s IN ACCESS SHARE MODE NOWAIT",
-	  qualId);
+	if (AHX->remoteVersion >= 80400)
+		appendPQExpBuffer(query, "LOCK TABLE ONLY %s IN ACCESS SHARE MODE NOWAIT",
+		  qualId);
+	else
+		appendPQExpBuffer(query, "LOCK TABLE %s IN ACCESS SHARE MODE NOWAIT",
+		  qualId);
 	PQclear(res);
 
 	res = PQexec(AH->connection, query->data);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 36863df..755046f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -5169,8 +5169,8 @@ getTables(Archive *fout, DumpOptions *dopt, int *numTables)
 		 * Read-lock target tables to make sure they aren't DROPPED or altered
 		 * in schema before we get around to dumping them.
 		 *
-		 * Note that we don't explicitly lock parents of the target tables; we
-		 * assume our lock on the child is enough to prevent schema
+		 * Note that we don't explicitly lock parents or children of the target
+		 * tables; we assume our lock on the child is enough to prevent schema
 		 * alterations to parent tables.
 		 *
 		 * NOTE: it'd be kinda nice to lock other relations too, not only
@@ -5179,11 +5179,18 @@ getTables(Archive *fout, DumpOptions *dopt, int *numTables)
 		if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
 		{
 			resetPQExpBuffer(query);
-			appendPQExpBuffer(query,
-			  "LOCK TABLE %s IN ACCESS SHARE MODE",
-			  fmtQualifiedId(fout->remoteVersion,
-		tblinfo[i].dobj.namespace->dobj.name,
-			 tblinfo[i].dobj.name));
+			if (fout->remoteVersion >= 80400)
+appendPQExpBuffer(query,
+  "LOCK TABLE ONLY %s IN ACCESS SHARE MODE",
+  fmtQualifiedId(fout->remoteVersion,
+			tblinfo[i].dobj.namespace->dobj.name,
+ tblinfo[i].dobj.name));
+			else
+appendPQExpBuffer(query,
+  "LOCK TABLE %s IN ACCESS SHARE MODE",
+  fmtQualifiedId(fout->remoteVersion,
+			tblinfo[i].dobj.namespace->dobj.name,
+ tblinfo[i].dobj.name));
 			ExecuteSqlStatement(fout, query->data);
 		}
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Release notes of 9.0~9.3 mentioning recovery_min_apply_delay incorrectly

2016-01-02 Thread Michael Paquier
Hi all,

While doing a git grep on recovery_min_apply_delay I noticed the following:
$ git grep recovery_min_apply -- *release*.sgml
src/sgml/release-9.0.sgml:  that recovery_min_apply_delay
failed to delay application
src/sgml/release-9.1.sgml:  that recovery_min_apply_delay
failed to delay application
src/sgml/release-9.2.sgml:  that recovery_min_apply_delay
failed to delay application
src/sgml/release-9.3.sgml:  that recovery_min_apply_delay
failed to delay application
src/sgml/release-9.4.sgml:  Avoid busy-waiting with short
recovery_min_apply_delay

This is obviously incorrect because recovery_min_apply_delay has been only
introduced in 9.4. The culprit is visibly the commit message of 8049839 and
others that mentioned the parameter, though the patch applied does nothing
about it. Please see attached a patch to fix that.
Regards,
-- 
Michael
diff --git a/doc/src/sgml/release-9.0.sgml b/doc/src/sgml/release-9.0.sgml
index ef8eb1c..de6550f 100644
--- a/doc/src/sgml/release-9.0.sgml
+++ b/doc/src/sgml/release-9.0.sgml
@@ -1493,12 +1493,6 @@
   Fix several cases where recovery logic improperly ignored WAL records
   for COMMIT/ABORT PREPARED (Heikki Linnakangas)
  
-
- 
-  The most notable oversight was
-  that recovery_min_apply_delay failed to delay application
-  of a two-phase commit.
- 
 
 
 
diff --git a/doc/src/sgml/release-9.1.sgml b/doc/src/sgml/release-9.1.sgml
index fde6b61..ef2e849 100644
--- a/doc/src/sgml/release-9.1.sgml
+++ b/doc/src/sgml/release-9.1.sgml
@@ -1663,12 +1663,6 @@ Branch: REL9_0_STABLE [9d6af7367] 2015-08-15 11:02:34 -0400
   Fix several cases where recovery logic improperly ignored WAL records
   for COMMIT/ABORT PREPARED (Heikki Linnakangas)
  
-
- 
-  The most notable oversight was
-  that recovery_min_apply_delay failed to delay application
-  of a two-phase commit.
- 
 
 
 
diff --git a/doc/src/sgml/release-9.2.sgml b/doc/src/sgml/release-9.2.sgml
index 4bfede5..72203fb 100644
--- a/doc/src/sgml/release-9.2.sgml
+++ b/doc/src/sgml/release-9.2.sgml
@@ -1848,12 +1848,6 @@ Branch: REL9_2_STABLE [6b700301c] 2015-02-17 16:03:00 +0100
   Fix several cases where recovery logic improperly ignored WAL records
   for COMMIT/ABORT PREPARED (Heikki Linnakangas)
  
-
- 
-  The most notable oversight was
-  that recovery_min_apply_delay failed to delay application
-  of a two-phase commit.
- 
 
 
 
diff --git a/doc/src/sgml/release-9.3.sgml b/doc/src/sgml/release-9.3.sgml
index 1ac6abe..a1a7781 100644
--- a/doc/src/sgml/release-9.3.sgml
+++ b/doc/src/sgml/release-9.3.sgml
@@ -2413,12 +2413,6 @@ Branch: REL9_0_STABLE [804983961] 2014-07-29 11:58:17 +0300
   Fix several cases where recovery logic improperly ignored WAL records
   for COMMIT/ABORT PREPARED (Heikki Linnakangas)
  
-
- 
-  The most notable oversight was
-  that recovery_min_apply_delay failed to delay application
-  of a two-phase commit.
- 
 
 
 

Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-01-02 Thread Fabrízio de Royes Mello
On Wed, Dec 30, 2015 at 1:10 PM, Oleg Bartunov  wrote:
>
>
>
> On Wed, Dec 30, 2015 at 5:44 PM, Andres Freund  wrote:
>>
>> On 2015-12-30 11:37:19 -0300, Alvaro Herrera wrote:
>> > Aleksander Alekseev wrote:
>> >
>> > > Here is a funny thing - benchmark results I shared 22.12.2015 are
wrong
>> > > because I forgot to run `make clean` after changing lwlock.h
(autotools
>> > > doesn't rebuild project properly after changing .h files).
>> >
>> > Running configure with --enable-depend should avoid this problem.
>>
>> I still maintain that --enable-depend should be on by default. We're
>
>
> +1
>

+1, I always use it.

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Petr Jelinek

On 2016-01-02 10:46, Andres Freund wrote:

Hi Petr,

On 2016-01-02 09:17:02 +0100, Petr Jelinek wrote:

so the commit which triggers this issue is
387da18874afa17156ee3af63766f17efb53c4b9 , not sure why yet (wanted to give
heads up early since multiple people are looking at this). Note that the
compilation around this commit is made harder by the fact that commit
91fa7b4719ac583420d9143132ba4ccddefbc5b2 breaks linking and fix is few
commits later.


Thanks for bisecting! What exactly did you use to reproduce? Shay's
npgsql binary? Or a plain psql session where you killed the client?
Given that Amit couldn't reproduce with the latter that seems an
interesting difference.



I used the npgsql binary to reproduce, when I try killing psql from task 
manager it behaves correctly.



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Amit Kapila
On Sat, Jan 2, 2016 at 5:02 PM, Petr Jelinek  wrote:

> On 2016-01-02 12:05, Amit Kapila wrote:
>>
>> I am also able to reproduce now. The reason was that I didn't have
>> latest .Net framework and Visual Studio, which is must for the recent
>> version of Npgsql.
>>
>> One probable reason of the problem seems to be that now for windows, we
>> are emulating non-blocking behaviour by setting pgwin32_noblock = true
>> which makes function pgwin32_recv() return EWOULDBLOCK and it would
>> wait using WaitLatchOrSocket() instead of pgwin32_waitforsinglesocket().
>> There are some differences in the way both the API's (WaitLatchOrSocket()
>> and pgwin32_waitforsinglesocket()) do wait, now may be that is the reason
>> for this behaviour.  One thing I have tried is that if I don't
>> set pgwin32_noblock
>> in secure_raw_read(), then this problem won't occur which lead to above
>> reasoning.  I am still investigating.
>>
>>
> Well, without pgwin32_noblock = true we never enter the code block which
> calls WaitLatchOrSocket and hangs as in my testing this was always called
> because of EWOULDBLOCK.
>
>
What I wanted to say is that the handling of socket closure is not
same in WaitLatchOrSocket() and pgwin32_waitforsinglesocket()
due to which this problem can arise and it seems that is the
right line of direction to pursue.  I have found that
in WaitLatchOrSocket(),
even when the socket is closed, we remember the result as
WL_SOCKET_READABLE and again tries to wait whereas the
same is handled properly in pgwin32_waitforsinglesocket().  If we
remember the closed socket event and then take appropriate action,
then this problem won't happen.  Attached patch which by no-means
a complete fix shows what I wanted to say and after this the problem
mentioned by Shay doesn't happen, although I get LOG message
which is due to the reason that proper handling for socket closure
needs to be done in this path.  This patch is based on the code
after commit 387da18874afa17156ee3af63766f17efb53c4b9.  I
will do testing and refine the fix based on HEAD later as I am done
for the today.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


win_socket_wait_issue_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: Core dump with nested CREATE TEMP TABLE

2016-01-02 Thread Tom Lane
Noah Misch  writes:
> On Thu, Sep 03, 2015 at 11:04:11PM -0400, Tom Lane wrote:
>> *** AtSubAbort_Portals(SubTransactionId mySu

>> --- 909,966 
>> {
>> Portal   portal = hentry->portal;
>> 
>> +/* Was it created in this subtransaction? */
>> if (portal->createSubid != mySubid)
>> +{
>> +/* No, but maybe it was used in this subtransaction? */
>> +if (portal->activeSubid == mySubid)
>> +{
> ...
>> +if (portal->status == PORTAL_ACTIVE)
>> +MarkPortalFailed(portal);

> Do you have a test case that reaches this particular MarkPortalFailed() call?
> My attempts stumbled over the fact that, before we reach here, each of the
> three MarkPortalActive() callers will have already called MarkPortalFailed()
> in its PG_CATCH block.  ("make check" does not reach this call.)

Offhand I think that's just belt-and-suspenders-too coding.  As you say,
we'd typically have failed active portals already before getting here.
But the responsibility of this routine is to *guarantee* that no broken
portals remain active, so I'd not want to remove this check.

Do you have a particular reason for asking?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
On 2016-01-02 16:20:58 +0100, Andres Freund wrote:
> I really right now can see only two somewhat surgical fixes:
> 
> 1) We do a nonblocking or select() *after* registering our events. Both
>in WaitLatchOrSocket() and waitforsinglesocket. Since select/poll are
>explicitly level triggered, that should make us notice any events we
>might have missed. select() appears to have been available for a fair
>while.
> 
> 2) We explicitly shutdown(SD_BOTH) the socket whenever we get a FD_CLOSE
>object. I *think* this should trigger errors in WSArecv, WSAEventSelect
>et al. Doesn't solve the problem that we might miss important events
>though.
> 
> 
> Given 2) isn't a complete fix and I can't find reliable documentation
> since when shutdown() is supported I'm inclined to go with 1).


Attached is an attempt to blindly implement 1). It needs some further
polishing and thought, but I'd like to verify it actually fixes things
before investing more time in this.

Greetings,

Andres Freund
diff --git a/src/backend/port/win32_latch.c b/src/backend/port/win32_latch.c
index 0e3aaee..b4441d8 100644
--- a/src/backend/port/win32_latch.c
+++ b/src/backend/port/win32_latch.c
@@ -177,6 +177,65 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
 	/* Ensure that signals are serviced even if latch is already set */
 	pgwin32_dispatch_queued_signals();
 
+	/*
+	 * FD_WRITE and FD_CLOSE are edge and not level triggered
+	 * (c.f. WSAEventSelect's documentation), and at least FD_CLOSE is only
+	 * signalled when the state changes while the event is attached to the
+	 * socket. To avoid waiting, potentially indefinitely, do a non-blocking
+	 * select() on the socket diagnosing the current state. Check this after
+	 * the signal dispatching above, to avoid starving signal dispatch.
+	 *
+	 * Unfortunately, according to the documentation, windows doesn't signal
+	 * writeability when a socket is closed. Only WSAPoll() does so - but it's
+	 * only available in newer versions of windows. XXX: Is this an active
+	 * race condition?
+	 */
+	if (wakeEvents & (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE))
+	{
+		fd_set		input_mask;
+		fd_set	   *input_maskp = NULL;
+		fd_set		output_mask;
+		fd_set	   *output_maskp = NULL;
+		struct timeval tv;
+
+		FD_ZERO(_mask);
+		FD_ZERO(_mask);
+		tv.tv_sec = 0;
+		tv.tv_usec = 0;
+
+		if (wakeEvents & WL_SOCKET_READABLE)
+		{
+			FD_SET(sock, _mask);
+			input_maskp = _mask;
+		}
+		if (wakeEvents & WL_SOCKET_WRITEABLE)
+		{
+			FD_SET(sock, _mask);
+			output_maskp = _mask;
+		}
+
+		rc = select(1, input_maskp, output_maskp, NULL, );
+
+		if (rc < 0)
+		{
+			elog(ERROR, "select() failed: %lu", GetLastError());
+		}
+		else if (rc > 0)
+		{
+			if ((wakeEvents & WL_SOCKET_READABLE) && FD_ISSET(sock, _mask))
+			{
+/* data available in socket, or EOF */
+result |= WL_SOCKET_READABLE;
+			}
+			if ((wakeEvents & WL_SOCKET_WRITEABLE) && FD_ISSET(sock, _mask))
+			{
+/* socket is writable */
+result |= WL_SOCKET_WRITEABLE;
+			}
+			return result;
+		}
+	}
+
 	do
 	{
 		/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Tom Lane
Andres Freund  writes:
> A bit of searching around brought up that we saw issues around this
> before:
> http://www.postgresql.org/message-id/4351.1336927...@sss.pgh.pa.us

Indeed.  It doesn't look like any of the cleanup I suggested in that
thread has ever gotten done.  I suspect that we'll continue to see
problems until we get rid of the transient event object attachments.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
On 2016-01-02 13:00:09 -0500, Tom Lane wrote:
> : More generally, it seems clear to me that Microsoft's code is designed
> : around the assumption that an event object remains attached to a socket
> : for the lifetime of the socket.  This business of transiently associating
> : event objects with sockets looks quite inefficient and is evidently
> : triggering a lot of unpleasant corner-case behaviors.  I wonder whether we
> : should not try to make "pgsocket" encapsulate a socket and an associated
> : event object as a single entity on Windows.  (Such a struct would be a
> : good place to keep a per-socket noblock flag, too.)  I'm not going to
> : tackle that myself though.
> 
> which I think is the same as what you're suggesting.

Pretty similar, yes. I think we're going to additionally need a 'pending
events' flag or something to store edge triggered
status. E.g. 'writable' which'd only be cleared by EWOULDBLOCK being
returned by send, and 'closed' which is never cleared.

I guess it'd make sense to do this for all platforms; so most code can
just do pg_send/recv/... and the platform differences live in
pgsock_win/posix or somesuch.

I'm inclined to fix this for 9.5 with something like the select() hack I
posted upthread, and then revamp this in 9.6. I don't want to push a
relatively large API overhaul into 9.5 at this point.



Because of the recent thread about WL_POSTMASTER_DEATH scaling badly I'm
also wondering if we should change the latch API to make
OwnLatch/InitLatch have a fixed association with the socket that can be
waited on. My testing shows that e.g. linux' epoll is noticeably faster,
even measurably so in the single threaded case or without
WL_POSTMASTER_DEATH, because it requires less locking on the kernel side
than setting up poll/select datastructures in the kernel everytime we're
blocking.  Afaics there are no uses of latches where we'd want to use
different sockets.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
On 2016-01-02 15:40:03 +0100, Andres Freund wrote:
> I wonder if the following is the problem: The docs for WSAEventSelect()
> says:
> "Having successfully recorded the occurrence of the network event (by
> setting the corresponding bit in the internal network event record) and
> signaled the associated event object, no further actions are taken for
> that network event until the application makes the function call that
> implicitly reenables the setting of that network event and signaling of
> the associated event object."
> and also notes specifically for FD_CLOSE that there's no re-enabling
> functions.
> 
> See
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms741576%28v=vs.85%29.aspx
> which goes on to talk about some level triggered events (FD_READ, ...)
> and others being edge triggered. It's not clear to me from that whether
> FD_CLOSE is supposed to be edge or level triggered.
> 
> If FD_CLOSE is indeed edge and not level triggered - which imo would be
> supremely insane - we'd be in trouble. It'd explain why some failures
> are noticed and others not.

I found a few more resources confirming that FD_CLOSE is edge
triggered. Which probably doesn't just make our code buggy when waiting
twice on the same socket, but probably also makes it very timing
dependent: As the event is only triggered when the close actually occurs
it's possible that we don't have any event associated with that socket:
We only do so for shorts amount of time in WaitLatchOrSocket() and
pgwin32_waitforsinglesocket().

A bit of searching around brought up that we saw issues around this
before:
http://www.postgresql.org/message-id/4351.1336927...@sss.pgh.pa.us


I really right now can see only two somewhat surgical fixes:

1) We do a nonblocking or select() *after* registering our events. Both
   in WaitLatchOrSocket() and waitforsinglesocket. Since select/poll are
   explicitly level triggered, that should make us notice any events we
   might have missed. select() appears to have been available for a fair
   while.

2) We explicitly shutdown(SD_BOTH) the socket whenever we get a FD_CLOSE
   object. I *think* this should trigger errors in WSArecv, WSAEventSelect
   et al. Doesn't solve the problem that we might miss important events
   though.


Given 2) isn't a complete fix and I can't find reliable documentation
since when shutdown() is supported I'm inclined to go with 1).


Better ideas?


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Welcome to 2016, time to run src/tools/copyright.pl

2016-01-02 Thread Bruce Momjian
On Fri, Jan  1, 2016 at 04:43:07PM +0900, Michael Paquier wrote:
> Hi all,
> 
> Happy new year to all and best wishes!
> 
> I guess that the following command followed by a commit is the common
> happy-new-year thing to do:
> perl src/tools/copyright.pl
> This would update the copyright headers to 2016 :)

Yes, it is on my January 1 list of things to do.   I will do it today.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
On January 2, 2016 6:28:10 PM GMT+01:00, Tom Lane  wrote:
>Andres Freund  writes:
>> A bit of searching around brought up that we saw issues around this
>> before:
>> http://www.postgresql.org/message-id/4351.1336927...@sss.pgh.pa.us
>
>Indeed.  It doesn't look like any of the cleanup I suggested in that
>thread has ever gotten done.  I suspect that we'll continue to see
>problems until we get rid of the transient event object attachments.

That'd address some of the problem, but that'd not address the edge triggered 
behaviour of FD-CLOSE. I think we'll have to abstract away windows sockets, and 
store the event & state there.

Andres


--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2016-01-02 Thread Andreas Seltenreich
Tom Lane writes:

> Andreas Seltenreich  writes:
>> TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 424)
>> TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 564)
>
> Can you show us the definition of the index that's causing this,
> and some samples of the data you're putting in it?

Here's a recipe for triggering the former:

create table t(c text);
create index on t using spgist(c);
insert into t select '' from generate_series(1,1);
set enable_seqscan to off; select count(1) from t;

I still think it's just the assertions being too strict.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
On 2016-01-02 15:40:03 +0100, Andres Freund wrote:
> If FD_CLOSE is indeed edge and not level triggered - which imo would be
> supremely insane - we'd be in trouble. It'd explain why some failures
> are noticed and others not.

I wonder if the FD_CLOSE and FD_WRITE being edge-triggered is the actual
root cause for:
/*
 * Just a workaround of unknown locking problem with writing in UDP 
socket
 * under high load: Client's pgsql backend sleeps infinitely in
 * WaitForMultipleObjectsEx, pgstat process sleeps in pgwin32_select().
 * So, we will wait with small timeout(0.1 sec) and if socket is still
 * blocked, try WSASend (see comments in pgwin32_select) and wait again.
 */
if ((what & FD_WRITE) && isUDP)
in pgwin32_waitforsinglesocket().

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Tom Lane
Andres Freund  writes:
> On January 2, 2016 6:28:10 PM GMT+01:00, Tom Lane  wrote:
>> Indeed.  It doesn't look like any of the cleanup I suggested in that
>> thread has ever gotten done.  I suspect that we'll continue to see
>> problems until we get rid of the transient event object attachments.

> That'd address some of the problem, but that'd not address the edge triggered 
> behaviour of FD-CLOSE. I think we'll have to abstract away windows sockets, 
> and store the event & state there.

Right.  What I wrote in the 2012 thread was

: More generally, it seems clear to me that Microsoft's code is designed
: around the assumption that an event object remains attached to a socket
: for the lifetime of the socket.  This business of transiently associating
: event objects with sockets looks quite inefficient and is evidently
: triggering a lot of unpleasant corner-case behaviors.  I wonder whether we
: should not try to make "pgsocket" encapsulate a socket and an associated
: event object as a single entity on Windows.  (Such a struct would be a
: good place to keep a per-socket noblock flag, too.)  I'm not going to
: tackle that myself though.

which I think is the same as what you're suggesting.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Welcome to 2016, time to run src/tools/copyright.pl

2016-01-02 Thread Bruce Momjian
On Sat, Jan  2, 2016 at 12:00:36PM -0500, Bruce Momjian wrote:
> On Fri, Jan  1, 2016 at 04:43:07PM +0900, Michael Paquier wrote:
> > Hi all,
> > 
> > Happy new year to all and best wishes!
> > 
> > I guess that the following command followed by a commit is the common
> > happy-new-year thing to do:
> > perl src/tools/copyright.pl
> > This would update the copyright headers to 2016 :)
> 
> Yes, it is on my January 1 list of things to do.   I will do it today.

Done.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Tom Lane
Andres Freund  writes:
> I found a few more resources confirming that FD_CLOSE is edge
> triggered. Which probably doesn't just make our code buggy when waiting
> twice on the same socket, but probably also makes it very timing
> dependent: As the event is only triggered when the close actually occurs
> it's possible that we don't have any event associated with that socket:
> We only do so for shorts amount of time in WaitLatchOrSocket() and
> pgwin32_waitforsinglesocket().

Does the timing dependence explain why we've not been able to trigger this
by killing psql?

If the bug only occurs when the client connection drops when we're not
waiting for input, that would likely explain why nobody noticed it for
ten months.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Release notes of 9.0~9.3 mentioning recovery_min_apply_delay incorrectly

2016-01-02 Thread Tom Lane
Michael Paquier  writes:
> This is obviously incorrect because recovery_min_apply_delay has been only
> introduced in 9.4. The culprit is visibly the commit message of 8049839 and
> others that mentioned the parameter, though the patch applied does nothing
> about it. Please see attached a patch to fix that.

Good catch!  That's on me I guess for not checking what the patch had done
in the back branches.

I didn't like simply deleting any description of the patch's effects, though.
Instead I made it talk about recovery_target_xid, which does exist in all
those branches.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: Core dump with nested CREATE TEMP TABLE

2016-01-02 Thread Noah Misch
On Sat, Jan 02, 2016 at 01:46:07PM -0500, Tom Lane wrote:
> Noah Misch  writes:
> > On Thu, Sep 03, 2015 at 11:04:11PM -0400, Tom Lane wrote:
> >> *** AtSubAbort_Portals(SubTransactionId mySu
> 
> >> --- 909,966 
> >> {
> >> Portal portal = hentry->portal;
> >> 
> >> +  /* Was it created in this subtransaction? */
> >> if (portal->createSubid != mySubid)
> >> +  {
> >> +  /* No, but maybe it was used in this subtransaction? */
> >> +  if (portal->activeSubid == mySubid)
> >> +  {
> > ...
> >> +  if (portal->status == PORTAL_ACTIVE)
> >> +  MarkPortalFailed(portal);
> 
> > Do you have a test case that reaches this particular MarkPortalFailed() 
> > call?
> > My attempts stumbled over the fact that, before we reach here, each of the
> > three MarkPortalActive() callers will have already called MarkPortalFailed()
> > in its PG_CATCH block.  ("make check" does not reach this call.)
> 
> Offhand I think that's just belt-and-suspenders-too coding.  As you say,
> we'd typically have failed active portals already before getting here.
> But the responsibility of this routine is to *guarantee* that no broken
> portals remain active, so I'd not want to remove this check.
> 
> Do you have a particular reason for asking?

After reading the log message for and comments added in commit c5454f9, I
understood that the commit changed PostgreSQL to fail portals that did not
fail before.  That is to say, queries that formerly completed without error
would now elicit errors.  I was looking into what sorts of queries it affected
in this way.  If the new MarkPortalFailed() is indeed dead code, then the
commit affects no query that way.  The meat of the commit is then the
ResourceOwnerNewParent() call, which lets queries ERROR cleanly instead of
seeing assertion failures or undefined behavior.

I am inclined to add an Assert(portal->status != PORTAL_ACTIVE) to emphasize
that this is backup only.  MarkPortalActive() callers remain responsible for
updating the status to something else before relinquishing control.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: Core dump with nested CREATE TEMP TABLE

2016-01-02 Thread Tom Lane
Noah Misch  writes:
> I am inclined to add an Assert(portal->status != PORTAL_ACTIVE) to emphasize
> that this is backup only.  MarkPortalActive() callers remain responsible for
> updating the status to something else before relinquishing control.

No, I do not think that would be an improvement.  There is no contract
saying that this must be done earlier, IMO.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Petr Jelinek

On 2016-01-02 22:31, Andres Freund wrote:

On 2016-01-02 22:25:31 +0100, Brar Piening wrote:

Andres Freund wrote:

That seems like a pretty straight forward bug. But it hinges on the
client side calling shutdown() on the socket. I don't know enough about
.net's internals to judge wether it does so. I've traced things far
enough to find
"Disposing a Stream object flushes any buffered data, and essentially
calls the Flush method for you. Dispose also releases operating system
resources such as file handles, network connections, or memory used for
any internal buffering. The BufferedStream class provides the capability
of wrapping a buffered stream around another stream in order to improve
read and write performance."
https://msdn.microsoft.com/en-us/library/system.io.stream%28v=vs.110%29.aspx

which'd plausibly use shutdown().


In the new days of Microsoft you can confirm that even more...
http://referencesource.microsoft.com/#System/net/System/Net/Sockets/Socket.cs,6245


Thanks for digging thatup!

Indeed it does use shutdown(). If I read the npgsql code that'll even be
done in the exception handling path. So fixing the 0 byte case might
already do the trick.

Could any of the windows user try to check whether fixing
if (r != SOCKET_ERROR && b > 0)
/* Read succeeded right away */
return b;
to
if (r != SOCKET_ERROR && b >= 0)
/* Read succeeded right away */
return b;
already does the trick?



Yes it does indeed fix it.

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
On 2016-01-02 15:11:42 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > I found a few more resources confirming that FD_CLOSE is edge
> > triggered. Which probably doesn't just make our code buggy when waiting
> > twice on the same socket, but probably also makes it very timing
> > dependent: As the event is only triggered when the close actually occurs
> > it's possible that we don't have any event associated with that socket:
> > We only do so for shorts amount of time in WaitLatchOrSocket() and
> > pgwin32_waitforsinglesocket().
> 
> Does the timing dependence explain why we've not been able to trigger this
> by killing psql?

I think so. Possibly it'd be reproducible in psql by sending a
interrupting while executing pg_sleep(3000) or something involving the
copy protocol?

> If the bug only occurs when the client connection drops when we're not
> waiting for input, that would likely explain why nobody noticed it for
> ten months.

Yea. I think there also might also be another issue: Windows' recv() -
which we're not using, but I don't see differing documentation for
WSARecv() - returns 0 bytes if a socket was closed 'gracefully',
i.e. shutdown(SD_SEND) was called on the client side (similar to
unix' recv).

pgwin32_recv() converts a 0 byte return from WSARecv() into
if (pgwin32_noblock)
{
/*
 * No data received, and we are in "emulated non-blocking 
mode", so
 * return indicating that we'd block if we were to continue.
 */
errno = EWOULDBLOCK;
return -1;
}

which would explain why we'd eat the FD_CLOSE and then just continue
waiting...

That seems like a pretty straight forward bug. But it hinges on the
client side calling shutdown() on the socket. I don't know enough about
.net's internals to judge wether it does so. I've traced things far
enough to find
"Disposing a Stream object flushes any buffered data, and essentially
calls the Flush method for you. Dispose also releases operating system
resources such as file handles, network connections, or memory used for
any internal buffering. The BufferedStream class provides the capability
of wrapping a buffered stream around another stream in order to improve
read and write performance."
https://msdn.microsoft.com/en-us/library/system.io.stream%28v=vs.110%29.aspx

which'd plausibly use shutdown().

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improved error reporting in format()

2016-01-02 Thread Jim Nasby

On 1/2/16 5:57 PM, Jim Nasby wrote:

Attached patch clarifies that %-related error messages with hints as
well as (IMHO) improving the clarity of the message:


Sorry, forgot to update regression tests. New patch attached.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 8683188..9583ed0 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4647,7 +4647,8 @@ text_reverse(PG_FUNCTION_ARGS)
if (++(ptr) >= (end_ptr)) \
ereport(ERROR, \

(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
-errmsg("unterminated format 
specifier"))); \
+errmsg("unterminated format() type 
specifier"), \
+errhint("For a single \"%%\" use 
\"\"."))); \
} while (0)
 
 /*
@@ -4779,8 +4780,9 @@ text_format(PG_FUNCTION_ARGS)
if (strchr("sIL", *cp) == NULL)
ereport(ERROR,

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-errmsg("unrecognized conversion type 
specifier \"%c\"",
-   *cp)));
+errmsg("unrecognized format() type 
specifier \"%c\"",
+   *cp),
+errhint("For a single \"%%\" use 
\"\".")));
 
/* If indirect width was specified, get its value */
if (widthpos >= 0)
@@ -4791,7 +4793,7 @@ text_format(PG_FUNCTION_ARGS)
if (arg >= nargs)
ereport(ERROR,

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-errmsg("too few arguments for 
format")));
+errmsg("too few arguments for 
format()")));
 
/* Get the value and type of the selected argument */
if (!funcvariadic)
@@ -4899,8 +4901,9 @@ text_format(PG_FUNCTION_ARGS)
/* should not get here, because of previous 
check */
ereport(ERROR,

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized conversion type 
specifier \"%c\"",
-*cp)));
+ errmsg("unrecognized format() type 
specifier \"%c\"",
+*cp),
+ errhint("For a single \"%%\" use 
\"\".")));
break;
}
}
diff --git a/src/test/regress/expected/text.out 
b/src/test/regress/expected/text.out
index 1587046..a74b9ce 100644
--- a/src/test/regress/expected/text.out
+++ b/src/test/regress/expected/text.out
@@ -211,7 +211,8 @@ ERROR:  too few arguments for format
 select format('Hello %s');
 ERROR:  too few arguments for format
 select format('Hello %x', 20);
-ERROR:  unrecognized conversion type specifier "x"
+ERROR:  unrecognized format() type specifier "x"
+HINT:  For a single "%" use "%%".
 -- check literal and sql identifiers
 select format('INSERT INTO %I VALUES(%L,%L)', 'mytab', 10, 'Hello');
  format 
@@ -263,9 +264,11 @@ ERROR:  format specifies argument 0, but arguments are 
numbered from 1
 select format('%*0$s', 'Hello');
 ERROR:  format specifies argument 0, but arguments are numbered from 1
 select format('%1$', 1);
-ERROR:  unterminated format specifier
+ERROR:  unterminated format() type specifier
+HINT:  For a single "%" use "%%".
 select format('%1$1', 1);
-ERROR:  unterminated format specifier
+ERROR:  unterminated format() type specifier
+HINT:  For a single "%" use "%%".
 -- check mix of positional and ordered placeholders
 select format('Hello %s %1$s %s', 'World', 'Hello again');
 format 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Improved error reporting in format()

2016-01-02 Thread Jim Nasby
The current error message for an invalid format conversion type is 
extremely confusing except in the simplest of uses:


select format( '% moo');
ERROR:  unrecognized conversion type specifier " "

Obviously in that example you can figure out what's going on, but 
frequently format() is used in a complex context where it's not at all 
obvious that format is the problem. Even worse, "conversion" makes it 
sound like a localization issue.


Attached patch clarifies that %-related error messages with hints as 
well as (IMHO) improving the clarity of the message:


select format( '% moo');
ERROR:  unrecognized format() type specifier " "
HINT:  For a single "%" use "%%"

I also made the use of "format()" consistent in all the other error 
messages.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index a89f586..a4552ad 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4647,7 +4647,8 @@ text_reverse(PG_FUNCTION_ARGS)
if (++(ptr) >= (end_ptr)) \
ereport(ERROR, \

(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
-errmsg("unterminated format 
specifier"))); \
+errmsg("unterminated format() type 
specifier"), \
+errhint("For a single \"%%\" use 
\"\"."))); \
} while (0)
 
 /*
@@ -4779,8 +4780,9 @@ text_format(PG_FUNCTION_ARGS)
if (strchr("sIL", *cp) == NULL)
ereport(ERROR,

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-errmsg("unrecognized conversion type 
specifier \"%c\"",
-   *cp)));
+errmsg("unrecognized format() type 
specifier \"%c\"",
+   *cp),
+errhint("For a single \"%%\" use 
\"\".")));
 
/* If indirect width was specified, get its value */
if (widthpos >= 0)
@@ -4791,7 +4793,7 @@ text_format(PG_FUNCTION_ARGS)
if (arg >= nargs)
ereport(ERROR,

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-errmsg("too few arguments for 
format")));
+errmsg("too few arguments for 
format()")));
 
/* Get the value and type of the selected argument */
if (!funcvariadic)
@@ -4899,8 +4901,9 @@ text_format(PG_FUNCTION_ARGS)
/* should not get here, because of previous 
check */
ereport(ERROR,

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized conversion type 
specifier \"%c\"",
-*cp)));
+ errmsg("unrecognized format() type 
specifier \"%c\"",
+*cp),
+ errhint("For a single \"%%\" use 
\"\".")));
break;
}
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Brar Piening

Andres Freund wrote:

That seems like a pretty straight forward bug. But it hinges on the
client side calling shutdown() on the socket. I don't know enough about
.net's internals to judge wether it does so. I've traced things far
enough to find
"Disposing a Stream object flushes any buffered data, and essentially
calls the Flush method for you. Dispose also releases operating system
resources such as file handles, network connections, or memory used for
any internal buffering. The BufferedStream class provides the capability
of wrapping a buffered stream around another stream in order to improve
read and write performance."
https://msdn.microsoft.com/en-us/library/system.io.stream%28v=vs.110%29.aspx

which'd plausibly use shutdown().


In the new days of Microsoft you can confirm that even more...
http://referencesource.microsoft.com/#System/net/System/Net/Sockets/Socket.cs,6245

Regards,
Brar


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2016-01-02 Thread Tom Lane
Andreas Seltenreich  writes:
> Here's a recipe for triggering the former:

> create table t(c text);
> create index on t using spgist(c);
> insert into t select '' from generate_series(1,1);
> set enable_seqscan to off; select count(1) from t;

Ah-hah.  Thanks for the test case.

> I still think it's just the assertions being too strict.

Yes, now I agree.  Fix pushed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Andres Freund
On 2016-01-02 22:25:31 +0100, Brar Piening wrote:
> Andres Freund wrote:
> >That seems like a pretty straight forward bug. But it hinges on the
> >client side calling shutdown() on the socket. I don't know enough about
> >.net's internals to judge wether it does so. I've traced things far
> >enough to find
> >"Disposing a Stream object flushes any buffered data, and essentially
> >calls the Flush method for you. Dispose also releases operating system
> >resources such as file handles, network connections, or memory used for
> >any internal buffering. The BufferedStream class provides the capability
> >of wrapping a buffered stream around another stream in order to improve
> >read and write performance."
> >https://msdn.microsoft.com/en-us/library/system.io.stream%28v=vs.110%29.aspx
> >
> >which'd plausibly use shutdown().
> 
> In the new days of Microsoft you can confirm that even more...
> http://referencesource.microsoft.com/#System/net/System/Net/Sockets/Socket.cs,6245

Thanks for digging thatup!

Indeed it does use shutdown(). If I read the npgsql code that'll even be
done in the exception handling path. So fixing the 0 byte case might
already do the trick.

Could any of the windows user try to check whether fixing
if (r != SOCKET_ERROR && b > 0)
/* Read succeeded right away */
return b;
to
if (r != SOCKET_ERROR && b >= 0)
/* Read succeeded right away */
return b;
already does the trick?

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-02 Thread Amit Kapila
On Sun, Jan 3, 2016 at 3:01 AM, Andres Freund  wrote:
> On 2016-01-02 22:25:31 +0100, Brar Piening wrote:
> > Andres Freund wrote:
> > >That seems like a pretty straight forward bug. But it hinges on the
> > >client side calling shutdown() on the socket. I don't know enough about
> > >.net's internals to judge wether it does so. I've traced things far
> > >enough to find
> > >"Disposing a Stream object flushes any buffered data, and essentially
> > >calls the Flush method for you. Dispose also releases operating system
> > >resources such as file handles, network connections, or memory used for
> > >any internal buffering. The BufferedStream class provides the
capability
> > >of wrapping a buffered stream around another stream in order to improve
> > >read and write performance."
> > >
https://msdn.microsoft.com/en-us/library/system.io.stream%28v=vs.110%29.aspx
> > >
> > >which'd plausibly use shutdown().
> >
> > In the new days of Microsoft you can confirm that even more...
> >
http://referencesource.microsoft.com/#System/net/System/Net/Sockets/Socket.cs,6245
>
> Thanks for digging thatup!
>
> Indeed it does use shutdown(). If I read the npgsql code that'll even be
> done in the exception handling path. So fixing the 0 byte case might
> already do the trick.
>

I think this true for a TCP socket, but this code-path is used for UDP
(SOCK_DGRAM) sockets as well and there is a comment below in
that function which seems to be indicating why originally 0 byte case
has not been handled at the place suggested by you (now it seems to
be much less relevant).

pgwin32_recv()

{

..

/* No error, zero bytes (win2000+) or error+WSAEWOULDBLOCK (<=nt4) */


for (n = 0; n < 5; n++)

{
..
}


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Fwd: Core dump with nested CREATE TEMP TABLE

2016-01-02 Thread Noah Misch
On Sat, Jan 02, 2016 at 07:22:13PM -0500, Tom Lane wrote:
> Noah Misch  writes:
> > I am inclined to add an Assert(portal->status != PORTAL_ACTIVE) to emphasize
> > that this is backup only.  MarkPortalActive() callers remain responsible for
> > updating the status to something else before relinquishing control.
> 
> No, I do not think that would be an improvement.  There is no contract
> saying that this must be done earlier, IMO.

Indeed, nobody wrote a contract.  The assertion would record what has been the
sole standing practice for eleven years (since commit a393fbf9).  It would
prompt discussion if a proposed patch would depart from that practice, and
that is a good thing.  Also, every addition of dead code should label that
code to aid future readers.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers