Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-17 Thread Michael Paquier
On Mon, Jun 15, 2015 at 8:26 AM, Michael Paquier wrote:
> hamster is legendarily slow and has a slow disk, which improves the
> chances of catching race conditions, and it is the only slow buildfarm
> machine enabling the TAP tests (by comparison dangomushi has never
> failed with the TAP tests), hence I would prefer to think that the
> problem is not specific to ArchLinux ARM. In this case the failure
> seems to be related to the timing with which test servers stop and
> start, even if the -w switch is used with pg_ctl, particularly since
> PGPORT is set to the same value for all servers... Still, for the time
> being I don't mind disabling them, and I just did so now. I will try
> to investigate further on the machine itself.

So, I have been making some progress on this, and wrote the attached
patch, which improves the way TAP tests log their output using
IPC::Run::run. This simply creates a log file as regress_log/test_name
for each test run, capturing all the command output. For example on
current HEAD, some pg_ctl commands are simply silenced or have
their output redirected to /dev/null. With this patch, all their
output is redirected to the log file. This is also compatible with
Windows, and the logging of the pg_rewind tests is unified with the new
structure of TestLib.pm. I am going to apply that to hamster and see
what is causing the reported error.

I think that it would be useful as well to improve the buildfarm
output. Thoughts?
Regards
-- 
Michael
From b74108a2f3680930df2f9749743cb797d8dbb024 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 18 Jun 2015 15:44:42 +0900
Subject: [PATCH] Improve log capture of TAP tests

All tests have their logs stored as regress_log/$TEST_NAME, with content
captured from the many commands run during the tests.
---
 src/Makefile.global.in |  1 +
 src/bin/pg_basebackup/.gitignore   |  1 +
 src/bin/pg_basebackup/Makefile |  2 +-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl   |  4 +-
 src/bin/pg_controldata/.gitignore  |  1 +
 src/bin/pg_controldata/Makefile|  2 +-
 src/bin/pg_controldata/t/001_pg_controldata.pl |  2 +-
 src/bin/pg_ctl/.gitignore  |  1 +
 src/bin/pg_ctl/Makefile|  2 +-
 src/bin/pg_ctl/t/001_start_stop.pl |  2 +-
 src/bin/pg_ctl/t/002_status.pl |  6 +--
 src/bin/pg_rewind/RewindTest.pm| 58 --
 src/bin/scripts/.gitignore |  1 +
 src/bin/scripts/Makefile   |  2 +-
 src/test/perl/TestLib.pm   | 46 ++--
 src/test/ssl/.gitignore|  1 +
 src/test/ssl/Makefile  |  3 ++
 src/test/ssl/ServerSetup.pm|  8 ++--
 src/test/ssl/t/001_ssltests.pl |  2 +-
 19 files changed, 85 insertions(+), 60 deletions(-)

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index c583b44..a21aa19 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -336,6 +336,7 @@ cd $(srcdir) && TESTDIR='$(CURDIR)' PATH="$(bindir):$$PATH" PGPORT='6$(DEF_PGPOR
 endef
 
 define prove_check
+rm -rf $(srcdir)/regress_log
 cd $(srcdir) && TESTDIR='$(CURDIR)' $(with_temp_install) PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) t/*.pl
 endef
 
diff --git a/src/bin/pg_basebackup/.gitignore b/src/bin/pg_basebackup/.gitignore
index 36a2f12..746908b 100644
--- a/src/bin/pg_basebackup/.gitignore
+++ b/src/bin/pg_basebackup/.gitignore
@@ -2,4 +2,5 @@
 /pg_receivexlog
 /pg_recvlogical
 
+/regress_log/
 /tmp_check/
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index 0d8421a..66771dc 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -48,7 +48,7 @@ clean distclean maintainer-clean:
 	rm -f pg_basebackup$(X) pg_receivexlog$(X) pg_recvlogical$(X) \
 		pg_basebackup.o pg_receivexlog.o pg_recvlogical.o \
 		$(OBJS)
-	rm -rf tmp_check
+	rm -rf tmp_check regress_log
 
 check:
 	$(prove_check)
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 3476ea6..ecff372 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -22,7 +22,7 @@ print HBA "local replication all trust\n";
 print HBA "host replication all 127.0.0.1/32 trust\n";
 print HBA "host replication all ::1/128 trust\n";
 close HBA;
-system_or_bail 'pg_ctl', '-s', '-D', "$tempdir/pgdata", 'reload';
+run_or_bail(['pg_ctl', '-s', '-D', "$tempdir/pgdata", 'reload']);
 
 command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
@@ -51,7 +51,7 @@ ok(-f "$tempdir/tarbackup/base.tar", 'backup tar was created');
 
 my $superlongname = "superlongname_" . ("x" x 100);
 
-system_or_bail 'touch', "$tempdir/pgdata/$superlongname";
+run_or_bail(['touch', "$tempdir/pgdata/$superlo

Re: [HACKERS] 9.5 make world failing due to sgml tools missing

2015-06-17 Thread Peter Eisentraut
On 6/17/15 3:35 PM, Keith Fiske wrote:
> The current HEAD of postgres in the git repo is not building when using
> "make world". It's been like this for about a month or so that I've been
aware of. I didn't really need the world build so have been making do
> without it. At PGCon now, though, so asked Bruce and he said this error
> was due to my not having the sgml tools installed, which should not be a
> requirement.

make world has always required documentation build tools.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2015-06-17 Thread Michael Paquier
On Thu, Jun 18, 2015 at 9:47 AM, Noah Misch  wrote:
> On Wed, Jun 03, 2015 at 05:25:45PM +0900, Michael Paquier wrote:
>> On Tue, Jun 2, 2015 at 4:19 PM, Michael Paquier   
>> wrote:
>> > On Sun, May 24, 2015 at 2:43 AM, Noah Misch  wrote:
>> > > It would be good to purge the code of precisions on "s" conversion 
>> > > specifiers,
>> > > then Assert(!pointflag) in fmtstr() to catch new introductions.  I won't 
>> > > plan
>> > > to do it myself, but it would be a nice little defensive change.
>> >
>> > This sounds like a good protection idea, but as it impacts existing
>> backend code relying on sprintf's port version we should only do the
>> > assertion in HEAD in my opinion, and mention it in the release notes of the
>> > next major version at the time a patch in this area is applied. I guess
>
> Adding the assertion would be master-only.  We don't necessarily release-note
> C API changes.

Cool. So we are on the same page.

>> > that we had better backpatch the places using .*s though on back-branches.
>
> I would tend to back-patch only the ones that cause interesting bugs.  For
> example, we should not reach the read.c elog() calls anyway, so it's not a big
> deal if the GNU libc bug makes them a bit less helpful in back branches.
> (Thanks for the list of code sites; it was more complete than anything I had.)
> So far, only tar.c looks harmed enough to back-patch.
>
>> Attached is a patch purging a bunch of places of %.*s, which will
>> make the code more robust when facing non-ASCII strings. I left the
>> pg_upgrade and pg_basebackup code paths alone, as moving away from
>> this specifier reduces the code readability there. We may want to fix
>> them too if file path names have non-ASCII characters, but it seems
>> less critical.
>
> To add the assertion, we must of course fix all uses.

Sure.

> Having seen the patch I requested, I don't like the result as much as I
> expected to like it.  The patched code is definitely harder to read and write:
>
>> @@ -1534,7 +1541,10 @@ parseNodeString(void)
>>   return_value = _readDeclareCursorStmt();
>>   else
>>   {
>> - elog(ERROR, "badly formatted node string \"%.32s\"...", token);
>> + char buf[33];
>> + memcpy(buf, token, 32);
>> + buf[33] = '\0';
>> + elog(ERROR, "badly formatted node string \"%s\"...", buf);
>>   return_value = NULL;/* keep compiler quiet */
>>   }

We could spread what the first patch did in readfuncs.c by having some
more macros doing the duplicated work. Not that it would improve the
code readability of those macros..

> (Apropos, that terminator belongs in buf[32], not buf[33].)

Indeed.

> Perhaps we're better off setting aside the whole idea,

At least on OSX (10.8), I am seeing that no more than the number of
bytes defined by the precision is written. So it looks like we are
safe there. So yes, thinking long-term this looks like the better
alternative. And I am wondering about the potential breakage this
could actually have for third-party Postgres tools using src/port's
snprintf.

> or forcing use of snprintf.c on configurations having the bug?

I am less sure about that. It doesn't seem worth it, knowing that the
tendency is to evaluate the precision in terms of bytes and not
characters.
-- 
Michael




Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2015-06-17 Thread Noah Misch
On Wed, Jun 03, 2015 at 05:25:45PM +0900, Michael Paquier wrote:
> On Tue, Jun 2, 2015 at 4:19 PM, Michael Paquier   
> wrote:
> > On Sun, May 24, 2015 at 2:43 AM, Noah Misch  wrote:
> > > It would be good to purge the code of precisions on "s" conversion 
> > > specifiers,
> > > then Assert(!pointflag) in fmtstr() to catch new introductions.  I won't 
> > > plan
> > > to do it myself, but it would be a nice little defensive change.
> >
> > This sounds like a good protection idea, but as it impacts existing
> backend code relying on sprintf's port version we should only do the
> > assertion in HEAD in my opinion, and mention it in the release notes of the
> > next major version at the time a patch in this area is applied. I guess

Adding the assertion would be master-only.  We don't necessarily release-note
C API changes.

> > that we had better backpatch the places using .*s though on back-branches.

I would tend to back-patch only the ones that cause interesting bugs.  For
example, we should not reach the read.c elog() calls anyway, so it's not a big
deal if the GNU libc bug makes them a bit less helpful in back branches.
(Thanks for the list of code sites; it was more complete than anything I had.)
So far, only tar.c looks harmed enough to back-patch.

> Attached is a patch purging a bunch of places of %.*s, which will
> make the code more robust when facing non-ASCII strings. I left the
> pg_upgrade and pg_basebackup code paths alone, as moving away from
> this specifier reduces the code readability there. We may want to fix
> them too if file path names have non-ASCII characters, but it seems
> less critical.

To add the assertion, we must of course fix all uses.

Having seen the patch I requested, I don't like the result as much as I
expected to like it.  The patched code is definitely harder to read and write:

> @@ -1534,7 +1541,10 @@ parseNodeString(void)
>   return_value = _readDeclareCursorStmt();
>   else
>   {
> - elog(ERROR, "badly formatted node string \"%.32s\"...", token);
> + char buf[33];
> + memcpy(buf, token, 32);
> + buf[33] = '\0';
> + elog(ERROR, "badly formatted node string \"%s\"...", buf);
>   return_value = NULL;/* keep compiler quiet */
>   }

(Apropos, that terminator belongs in buf[32], not buf[33].)

Perhaps we're better off setting aside the whole idea, or forcing use of
snprintf.c on configurations having the bug?

Thanks,
nm




Re: [HACKERS] [PATCH] Function to get size of asynchronous notification queue

2015-06-17 Thread Brendan Jurd
Posting v2 of the patch, incorporating some helpful suggestions from Merlin.

Cheers,
BJ
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 14800,14805  SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
--- 14800,14811 

  

+pg_notification_queue_usage()
+double
+proportion of the asynchronous notification queue currently occupied (0-1)
+   
+ 
+   
 pg_my_temp_schema()
 oid
 OID of session's temporary schema, or 0 if none
***
*** 14939,14948  SET search_path TO schema , schema, ..
  pg_listening_channels
 
  
 
  pg_listening_channels returns a set of names of
! channels that the current session is listening to.  See  for more information.
 
  
 
--- 14945,14963 
  pg_listening_channels
 
  
+
+ pg_notification_queue_usage
+
+ 
 
  pg_listening_channels returns a set of names of
! asynchronous notification channels that the current session is listening
! to.  pg_notification_queue_usage returns the
! proportion of the total available space for notifications currently
! occupied by notifications that are waiting to be processed, as a
! double in the range 0-1.
! See  and 
! for more information.
 
  
 
*** a/doc/src/sgml/ref/notify.sgml
--- b/doc/src/sgml/ref/notify.sgml
***
*** 166,171  NOTIFY channel [ , 

+The function pg_notification_queue_usage returns the
+proportion of the queue that is currently occupied by pending notifications.
+See  for more information.
+   
+   
 A transaction that has executed NOTIFY cannot be
 prepared for two-phase commit.

*** a/src/backend/commands/async.c
--- b/src/backend/commands/async.c
***
*** 371,376  static bool asyncQueueIsFull(void);
--- 371,377 
  static bool asyncQueueAdvance(volatile QueuePosition *position, int entryLength);
  static void asyncQueueNotificationToEntry(Notification *n, AsyncQueueEntry *qe);
  static ListCell *asyncQueueAddEntries(ListCell *nextNotify);
+ static double asyncQueueUsage(void);
  static void asyncQueueFillWarning(void);
  static bool SignalBackends(void);
  static void asyncQueueReadAllNotifications(void);
***
*** 1362,1387  asyncQueueAddEntries(ListCell *nextNotify)
  }
  
  /*
!  * Check whether the queue is at least half full, and emit a warning if so.
!  *
!  * This is unlikely given the size of the queue, but possible.
!  * The warnings show up at most once every QUEUE_FULL_WARN_INTERVAL.
   *
!  * Caller must hold exclusive AsyncQueueLock.
   */
! static void
! asyncQueueFillWarning(void)
  {
! 	int			headPage = QUEUE_POS_PAGE(QUEUE_HEAD);
! 	int			tailPage = QUEUE_POS_PAGE(QUEUE_TAIL);
! 	int			occupied;
! 	double		fillDegree;
! 	TimestampTz t;
  
  	occupied = headPage - tailPage;
  
  	if (occupied == 0)
! 		return;	/* fast exit for common case */
  
  	if (occupied < 0)
  	{
--- 1363,1399 
  }
  
  /*
!  * SQL function to return the proportion of the notification queue currently
!  * occupied.
!  */
! Datum
! pg_notification_queue_usage(PG_FUNCTION_ARGS)
! {
! 	double usage;
! 
! 	LWLockAcquire(AsyncQueueLock, LW_SHARED);
! 	usage = asyncQueueUsage();
! 	LWLockRelease(AsyncQueueLock);
! 
! 	PG_RETURN_FLOAT8(usage);
! }
! 
! /*
!  * Return the proportion of the queue that is currently occupied.
   *
!  * The caller must hold (at least) shared AsyncQueueLock.
   */
! static double
! asyncQueueUsage(void)
  {
! 	int		headPage = QUEUE_POS_PAGE(QUEUE_HEAD);
! 	int		tailPage = QUEUE_POS_PAGE(QUEUE_TAIL);
! 	int		occupied;
  
  	occupied = headPage - tailPage;
  
  	if (occupied == 0)
! 		return (double) 0;		/* fast exit for common case */
  
  	if (occupied < 0)
  	{
***
*** 1389,1396  asyncQueueFillWarning(void)
  		occupied += QUEUE_MAX_PAGE + 1;
  	}
  
! 	fillDegree = (double) occupied / (double) ((QUEUE_MAX_PAGE + 1) / 2);
  
  	if (fillDegree < 0.5)
  		return;
  
--- 1401,1424 
  		occupied += QUEUE_MAX_PAGE + 1;
  	}
  
! 	return (double) occupied / (double) ((QUEUE_MAX_PAGE + 1) / 2);
! }
! 
! /*
!  * Check whether the queue is at least half full, and emit a warning if so.
!  *
!  * This is unlikely given the size of the queue, but possible.
!  * The warnings show up at most once every QUEUE_FULL_WARN_INTERVAL.
!  *
!  * Caller must hold exclusive AsyncQueueLock.
!  */
! static void
! asyncQueueFillWarning(void)
! {
! 	double		fillDegree;
! 	TimestampTz t;
  
+ 	fillDegree = asyncQueueUsage();
  	if (fillDegree < 0.5)
  		return;
  
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***
*** 4036,4045  DATA(insert OID = 2856 (  pg_timezone_names		PGNSP PGUID 12 1 1000 0 0 f f f f t
  DESCR("get the available time zone names");
  DATA(insert OID = 2730 (  pg_get_triggerdef		PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 25 "26 16" _null_ _null_ _null_ _null_ _n

Re: [HACKERS] 9.5 make world failing due to sgml tools missing

2015-06-17 Thread Joshua D. Drake


On 06/17/2015 01:07 PM, Tom Lane wrote:

Keith Fiske  writes:

The current HEAD of postgres in the git repo is not building when using
"make world". It's been like this for about a month or so that I've been
aware of. I didn't really need the world build so have been making do without
it. At PGCon now, though, so asked Bruce and he said this error was due to
my not having the sgml tools installed, which should not be a requirement.


Dunno about "about a month"; this seems to have come in with commit
5d93ce2d0c619ba1 from last October.  But yes, that commit moved the
goalposts in terms of what is required to build the documentation.
You now must have xmllint which is something supplied by libxml2.

I am not sure this is a good thing, especially since xmllint is only
doing checking, which is of little use to consumers of the documentation.
Maybe if xmllint is not found, we should just skip the checking steps
rather than hard failing?



If they are building from source, this does not seem to be an 
unrealistic package to require.


JD



regards, tom lane





--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.




Re: [HACKERS] [PATCH] Function to get size of asynchronous notification queue

2015-06-17 Thread Brendan Jurd
On Thu, 18 Jun 2015 at 08:19 Merlin Moncure  wrote:

>
> scratch that.  that note already exists in sql-notify.html.  Instead,
> I'd modify that section to note that you can check queue usage with
> your new function.
>
>
I have already done so.  Under the paragraph about the queue filling up, I
have added:

"The function pg_notify_queue_saturation returns the
proportion of the queue that is currently occupied by pending
notifications."

A link from here back to the section in System Information Functions might
be sensible?

I will rename the function with _usage as you suggest, and add the
explanation of the return value in the docs.

Cheers,
BJ


Re: [HACKERS] "could not adopt C locale" failure at startup on Windows

2015-06-17 Thread Noah Misch
On Wed, Jun 17, 2015 at 01:43:55PM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > On Mon, Jun 15, 2015 at 12:37:43PM -0400, Tom Lane wrote:
> >> It's mere chance that the order of calls to pg_perm_setlocale() is
> >> such that the code works now; and it's not hard to imagine future
> >> changes in gettext, or reordering of our own code, that would break it.
> 
> > pg_bind_textdomain_codeset() looks at one piece of the locale environment,
> > namely setlocale(LC_CTYPE, NULL), so the order of pg_perm_setlocale() calls
> > does not affect it.
> 
> Well, my point is that that is a larger assumption about the behavior of
> pg_bind_textdomain_codeset() than I think we ought to make, even if it
> happens to be true today.

Perhaps it's just me, but I can envision changes of similar plausibility that
break under each approach and not the other.  Without a way to distinguish on
that basis, I'm left shrugging about a proposal to switch.  For that matter,
if pg_bind_textdomain_codeset() starts to act on other facets of the locale,
that's liable to be a bug independent of our choice here.  However locale
facets conflict, we expect LC_CTYPE to control the message codeset.

> > There's nothing acutely bad about the alternative you
> > identify here, but it's no better equipped to forestall mistakes.  Moving 
> > the
> > call later would slightly expand the window during which translated messages
> > are garbled.
> 
> I'm not exactly sure that they wouldn't be garbled anyway during the
> interval where we're setting these things up.  Until DatabaseEncoding,
> ClientEncoding, and gettext's internal notions are all in sync, we
> are at risk of that type of issue no matter what.

True; the window exists and is small enough both ways.  This is merely one
more reason to fix the bug without fixing what ain't broke.




Re: [HACKERS] [PATCH] Function to get size of asynchronous notification queue

2015-06-17 Thread Merlin Moncure
On Wed, Jun 17, 2015 at 5:15 PM, Merlin Moncure  wrote:
> *) A note regarding the 50% (0.5) threshold raising warnings in the
> log might be appropriate here

scratch that.  that note already exists in sql-notify.html.  Instead,
I'd modify that section to note that you can check queue usage with
your new function.

merlin




Re: [HACKERS] [PATCH] Function to get size of asynchronous notification queue

2015-06-17 Thread Merlin Moncure
On Wed, Jun 17, 2015 at 5:31 AM, Brendan Jurd  wrote:
> Hello hackers,
>
> I present a patch to add a new built-in function
> pg_notify_queue_saturation().
>
> The purpose of the function is to allow users to monitor the health of their
> notification queue.  In certain cases, a client connection listening for
> notifications might get stuck inside a transaction, and this would cause the
> queue to keep filling up, until finally it reaches capacity and further
> attempts to NOTIFY error out.
>
> The current documentation under LISTEN explains this possible gotcha, but
> doesn't really suggest a useful way to address it, except to mention that
> warnings will show up in the log once you get to 50% saturation of the
> queue.  Unless you happen to be eyeballing the logs when it happens, that's
> not a huge help.  The choice of 50% as a threshold is also very much
> arbitrary, and by the time you hit 50% the problem has likely been going on
> for quite a while.  If you want your nagios (or whatever) to say, alert you
> when the queue goes over 5% or 1%, your options are limited and awkward.
>
> The patch has almost no new code.  It makes use of the existing logic for
> the 50% warning.  I simply refactored that logic into a separate function
> asyncQueueSaturation, and then added pg_notify_queue_saturation to make that
> available in SQL.
>
> I am not convinced that pg_notify_queue_saturation is the best possible name
> for this function, and am very much open to other suggestions.
>
> The patch includes documentation, a regression test and an isolation test.

*) The documentation should indicate what the range of values mean --
looks like value is returned on 0-1 scale.

*) A note regarding the 50% (0.5) threshold raising warnings in the
log might be appropriate here

*) As you suspect, the name seems a little off to me.  'usage' seems
preferable to 'saturation', I think.  Perhaps,
pg_notification_queue_usage()?

merlin




Re: [HACKERS] WAL replay bugs

2015-06-17 Thread Michael Paquier
On Thu, Jun 18, 2015 at 3:39 AM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>
>> From 077d675795b4907904fa4e85abed8c4528f4666f Mon Sep 17 00:00:00 2001
>> From: Michael Paquier 
>> Date: Sat, 19 Jul 2014 10:40:20 +0900
>> Subject: [PATCH 3/3] Buffer capture facility: check WAL replay consistency
>
> Is there a newer version of this tech?

Not yet.
-- 
Michael




Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-17 Thread Tom Lane
Alvaro Herrera  writes:
> Yeah, the case is pretty weird and I'm not really sure that the server
> ought to be expected to behave.  But if this is actually the only part
> of the server that misbehaves because of sudden gigantic time jumps, I
> think it's fair to patch it.  Here's a proposed patch.

I think the parenthetical bit in the comment should read "plus any
fractional second, for simplicity".  Also, maybe replace "300" by
a #define constant MAX_AUTOVAC_SLEEPTIME for readability?

regards, tom lane




Re: [HACKERS] last_analyze/last_vacuum not being updated

2015-06-17 Thread Peter Eisentraut
On 6/15/15 4:45 PM, Peter Eisentraut wrote:
> On 6/8/15 3:16 PM, Peter Eisentraut wrote:
>> I'm looking at a case on 9.4.1 where the last_analyze and last_vacuum
>> stats for a handful of tables seem stuck.  They don't update after
>> running an ANALYZE or VACUUM command, and they don't react to
>> pg_stat_reset_single_table_counters().  All of the affected tables are
>> system catalogs, some shared, some not.  Other system catalogs and other
>> tables have their statistics updated normally.  Any ideas (before I try
>> to blow it away)?
> 
> This issue somehow went away before I had time to analyze it further,
> which is weird in itself.  But now I have seen a segfault on a
completely different 9.4 instance while querying pg_stat_database.

This was probably bug #12918.





Re: [HACKERS] GIN function of pageinspect has inconsistency data type.

2015-06-17 Thread Jim Nasby

On 6/16/15 8:26 AM, Sawada Masahiko wrote:

I noticed while using the gin functions of pageinspect that there are some
inconsistent data types.
For example, the data type of GinMetaPageData.head and tail is
BlockNumber, i.e., uint32.
But in ginfunc.c, we get that value as int64.


You can't put a uint32 into a signed int32 (which is what the SQL int 
is). That's why it's returning a bigint.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] 9.5 make world failing due to sgml tools missing

2015-06-17 Thread Tom Lane
Keith Fiske  writes:
> The current HEAD of postgres in the git repo is not building when using
> "make world". It's been like this for about a month or so that I've been
aware of. I didn't really need the world build so have been making do without
> it. At PGCon now, though, so asked Bruce and he said this error was due to
> my not having the sgml tools installed, which should not be a requirement.

Dunno about "about a month"; this seems to have come in with commit
5d93ce2d0c619ba1 from last October.  But yes, that commit moved the
goalposts in terms of what is required to build the documentation.
You now must have xmllint which is something supplied by libxml2.

I am not sure this is a good thing, especially since xmllint is only
doing checking, which is of little use to consumers of the documentation.
Maybe if xmllint is not found, we should just skip the checking steps
rather than hard failing?

regards, tom lane




Re: [HACKERS] On columnar storage

2015-06-17 Thread Jim Nasby

On 6/17/15 2:04 PM, Alvaro Herrera wrote:

Jim Nasby wrote:


Related to idea of an 'auto join', I do wish we had the ability to access
columns in a referenced FK table from a referring key; something like SELECT
customer_id.first_name FROM invoice (which would be translated to SELECT
first_name FROM invoice JOIN customer USING( customer_id )).


We already have this feature -- you just need to set the
add_missing_from GUC.

(... actually we removed that feature because it was considered
dangerous or something.)


That was removed in 9.0. Even so, I don't believe it added the JOIN, so 
you'd suddenly get a Cartesian product.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] WAL replay bugs

2015-06-17 Thread Alvaro Herrera
Michael Paquier wrote:

> From 077d675795b4907904fa4e85abed8c4528f4666f Mon Sep 17 00:00:00 2001
> From: Michael Paquier 
> Date: Sat, 19 Jul 2014 10:40:20 +0900
> Subject: [PATCH 3/3] Buffer capture facility: check WAL replay consistency

Is there a newer version of this tech?


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Function to get size of asynchronous notification queue

2015-06-17 Thread Brendan Jurd
On Thu, 18 Jun 2015 at 03:06 Gurjeet Singh  wrote:

> I don't see this in the CF app; can you please add it there?
>

Done.  I did try to add it when I posted the email, but for some reason I
couldn't connect to commitfest.postgresql.org at all.  Seems fine now,
though.

Cheers,
BJ


Re: [HACKERS] On columnar storage

2015-06-17 Thread Alvaro Herrera
Jim Nasby wrote:

> Related to idea of an 'auto join', I do wish we had the ability to access
> columns in a referenced FK table from a referring key; something like SELECT
> customer_id.first_name FROM invoice (which would be translated to SELECT
> first_name FROM invoice JOIN customer USING( customer_id )).

We already have this feature -- you just need to set the
add_missing_from GUC.

(... actually we removed that feature because it was considered
dangerous or something.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H

2015-06-17 Thread Tomas Vondra

Hi,

I'm currently running some tests on a 3TB TPC-H data set, and I tripped 
over a pretty bad n_distinct underestimate, causing OOM in HashAgg 
(which somehow illustrates the importance of the memory-bounded hashagg 
patch Jeff Davis is working on).


The problem is Q18, particularly this simple subquery:

select l_orderkey
from lineitem
group by l_orderkey
having sum(l_quantity) > 313;

which is planned like this:

                                 QUERY PLAN
----------------------------------------------------------------------------
 HashAggregate  (cost=598510163.92..598515393.93 rows=418401 width=12)
   Group Key: l_orderkey
   Filter: (sum(l_quantity) > '313'::double precision)
   ->  Seq Scan on lineitem  (cost=0.00..508509923.28 rows=1848128 width=12)
(4 rows)

but sadly, in reality the l_orderkey cardinality looks like this:

tpch=# select count(distinct l_orderkey) from lineitem;
   count
------------
 4500000000
(1 row)

That's a helluva difference - not the usual one or two orders of 
magnitude, but a 10000x underestimate.


The usual thing to do in this case is increasing statistics target, and 
while this improves the estimate, the improvement is rather small:


   statistics target    estimate     difference
   ---------------------------------------------
   100                  429491       10000x
   1000                 4240418      1000x
   10000                42913759     100x

I find the pattern rather strange - every time the statistics target 
increases 10x, the difference decreases 10x - maybe that's natural, but 
the perfect proportionality is suspicious IMHO.


Also, this is a quite large dataset - the table has ~18 billion rows, 
and even with target=10000 we're sampling only 3M rows, which is ~0.02%. 
That's a tiny sample, so inaccuracy is naturally expected, but OTOH the 
TPC-H dataset is damn uniform - there's pretty much no skew in the 
distributions AFAIK. So I'd expect a slightly better result.


With target=10000 the plan switches to GroupAggregate, because the 
estimate gets large enough to exceed work_mem (2GB). But it's still way 
off, and it's mostly just a lucky coincidence.


So I'm wondering if there's some bug because of the dataset size (an 
integer overflow or something like), so I added a bunch of logging into 
the estimator, logging all the parameters computed:


target=100 (samplerows=30000)
-----------------------------
WARNING:  attnum=1 attname=l_orderkey f1=27976 ndistinct=28977 
nmultiple=1001 toowide_cnt=0 d=28977 numer=86931.00 
denom=2024.046627 stadistinct=429491.094029
WARNING:  ndistinct estimate attnum=1 attname=l_orderkey 
current=429491.09 adaptive=443730.00


target=1000 (samplerows=300000)
-------------------------------
WARNING:  attnum=1 attname=l_orderkey f1=279513 ndistinct=289644 
nmultiple=10131 toowide_cnt=0 d=289644 numer=8689320.00 
denom=20491.658538 stadistinct=4240418.111618
WARNING:  ndistinct estimate attnum=1 attname=l_orderkey 
current=4240418.11 adaptive=4375171.00


target=10000 (samplerows=3000000)
---------------------------------
WARNING:  attnum=1 attname=l_orderkey f1=2797888 ndistinct=2897799 
nmultiple=99911 toowide_cnt=0 d=2897799 numer=869339700.00 
denom=202578.313396 stadistinct=42913759.396282
WARNING:  ndistinct estimate attnum=1 attname=l_orderkey 
current=42913759.40 adaptive=9882.00


It's totalrows=1849031 in all cases. The logs also show estimate 
produced by the adaptive estimate (discussed in a separate thread), but 
apparently that does not change the estimates much :-(


Any ideas?

--
Tomas Vondra   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] "could not adopt C locale" failure at startup on Windows

2015-06-17 Thread Tom Lane
Noah Misch  writes:
> On Mon, Jun 15, 2015 at 12:37:43PM -0400, Tom Lane wrote:
>> It's mere chance that the order of calls to pg_perm_setlocale() is
>> such that the code works now; and it's not hard to imagine future
>> changes in gettext, or reordering of our own code, that would break it.

> pg_bind_textdomain_codeset() looks at one piece of the locale environment,
> namely setlocale(LC_CTYPE, NULL), so the order of pg_perm_setlocale() calls
> does not affect it.

Well, my point is that that is a larger assumption about the behavior of
pg_bind_textdomain_codeset() than I think we ought to make, even if it
happens to be true today.

> There's nothing acutely bad about the alternative you
> identify here, but it's no better equipped to forestall mistakes.  Moving the
> call later would slightly expand the window during which translated messages
> are garbled.

I'm not exactly sure that they wouldn't be garbled anyway during the
interval where we're setting these things up.  Until DatabaseEncoding,
ClientEncoding, and gettext's internal notions are all in sync, we
are at risk of that type of issue no matter what.

regards, tom lane




Re: [HACKERS] [PATCH] Function to get size of asynchronous notification queue

2015-06-17 Thread Gurjeet Singh
I don't see this in the CF app; can you please add it there?

Best regards,

On Wed, Jun 17, 2015 at 3:31 AM, Brendan Jurd  wrote:

> Hello hackers,
>
> I present a patch to add a new built-in function
> pg_notify_queue_saturation().
>
> The purpose of the function is to allow users to monitor the health of
> their notification queue.  In certain cases, a client connection listening
> for notifications might get stuck inside a transaction, and this would
> cause the queue to keep filling up, until finally it reaches capacity and
> further attempts to NOTIFY error out.
>
> The current documentation under LISTEN explains this possible gotcha, but
> doesn't really suggest a useful way to address it, except to mention that
> warnings will show up in the log once you get to 50% saturation of the
> queue.  Unless you happen to be eyeballing the logs when it happens, that's
> not a huge help.  The choice of 50% as a threshold is also very much
> arbitrary, and by the time you hit 50% the problem has likely been going on
> for quite a while.  If you want your nagios (or whatever) to say, alert you
> when the queue goes over 5% or 1%, your options are limited and awkward.
>
> The patch has almost no new code.  It makes use of the existing logic for
> the 50% warning.  I simply refactored that logic into a separate function
> asyncQueueSaturation, and then added pg_notify_queue_saturation to make
> that available in SQL.
>
> I am not convinced that pg_notify_queue_saturation is the best possible
> name for this function, and am very much open to other suggestions.
>
> The patch includes documentation, a regression test and an isolation test.
>
> Cheers,
> BJ
>
>
>
>


-- 
Gurjeet Singh http://gurjeet.singh.im/


Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-17 Thread Tom Lane
Haribabu Kommi  writes:
> I can think of a case where the "launcher_determine_sleep" function
> returns a big sleep value because of system time change.
> Because of that it is possible that the launcher is not generating
> workers to do the vacuum. May be I am wrong.

I talked with Alvaro about this and we agreed that's most likely what
happened.  The launcher tracks future times-to-wake-up as absolute times,
so shortly after the system clock went backwards, it could have computed
that the next time to wake up was 20 years in the future, and issued a
sleep() call for 20 years.  Fixing the system clock after that would not
have caused it to wake up again.

It looks like a SIGHUP (pg_ctl reload) ought to be enough to wake it up,
or of course you could restart the server.

In HEAD this doesn't seem like it could cause an indefinite sleep because
if nothing else, sinval queue overrun would eventually wake the launcher
even without any manual action from the DBA.  But the loop logic is
different in 9.1.

launcher_determine_sleep() does have a minimum sleep time, and it seems
like we could fairly cheaply guard against this kind of scenario by also
enforcing a maximum sleep time (of say 5 or 10 minutes).  Not quite
convinced whether it's worth the trouble though.

regards, tom lane




Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-17 Thread Alvaro Herrera
Prakash Itnal wrote:

> Currently the issue is easily reproducible. Steps to reproduce:
> * Set some aggressive values for auto-vacuuming.
> * Run a heavy database update/delete/insert queries. This leads to invoking
> auto-vacuuming in quick successions.
> * Change the system time to older for eg. 1995-01-01
> 
> Suddenly auto-vacuuming stops working. Even after changing system time back
> to current time, the auto-vacuuming did not resume.
> 
> So the question is, "does postgres support system time changes?".

So Tom Lane just pinged me about this.  As far as I can tell, the
problem is that the clock goes backwards 20 years, then autovacuum
figures that it needs to sleep for 20 years until the next vacuum is
scheduled.  Then regardless of the clock moving forwards again, the
sleep is not affected by this.  (I didn't actually test this.)

A simple workaround would be to stop autovacuum then restart it, that
is, set autovacuum=off in postgresql.conf, send SIGHUP to postmaster
which should stop the autovacuum launcher, then set autovacuum=on and
SIGHUP again, which would start a new launcher.

As a proposed code fix, we could just clamp the sleep time to, say, 5
minutes.  In that case the launcher would wake up even if the next
database check is still a ways off, but that should be pretty harmless
-- we just compute a new sleep period and continue sleeping.  (That is,
notwithstanding the fact that the sleep time should never really go 
above a minute in most configurations.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Sequence Access Method WIP

2015-06-17 Thread Petr Jelinek

On 2015-06-15 11:32, Vik Fearing wrote:

I've been looking at these patches a bit and here are some comments:



Thanks for looking at this.


Patch 1: seqam

I would like to see an example in the docs for CREATE SEQUENCE.  That's
perhaps not possible (or desirable) with only the "local" seqam?  Not sure.



It is possible to have an example with the local seqam; it might be confusing 
though, given it produces the same results as not putting USING in the query.



In the docs for pg_class, there is no mention that relam refers to
pg_seqam for sequences but pg_am for all other types.

There are errant half-sentences in the documentation, for example "to
the options in the CREATE SEQUENCE or ALTER SEQUENCE statement." in
Sequence Access Method Functions.


I think that's the side effect of all the rebases and rewrites over the 
2y(!) that this has been going forward and back. It can be easily fixed 
by proofreading before final submission. I didn't pay too much 
attention yet because it's not clear what the docs should look like if 
there is no real agreement on the API. (This applies to other comments 
about docs as well)




I'd prefer a README instead of the long comment at the start of seqam.c.
  The other ams do that.



OK, since things have been moved to a separate directory, a README is 
doable. I personally prefer the docs in the main .c file usually, but I 
know the project sometimes uses a README for this.



As mentioned upthread, this patch isn't a seamless replacement for
what's already there because of the amdata field.  I wasn't part of the
conversation of FOSDEM unfortunately, and there's not enough information
in this thread to know why this solution is preferred over each seqam
having its own table type with all the columns it needs.  I see that
Heikki is waffling a bit between the two, and I have a fairly strong
opinion that amdata should be split into separate columns.  The patch
already destroys and recreates what it needs when changing access method
via ALTER SEQUENCE, so I don't really see what the problem is.



FOSDEM was just about agreeing that amdata is simpler after we discussed 
it with Heikki. Nothing too important you missed there I guess.


I can try to summarize what are the differences:
- amdata is somewhat simpler in terms of code for init, alter and 
DDL, since with custom columns you have to specify them somehow and deal 
with them in the catalog; also ALTER SEQUENCE USING means that there are 
going to be columns marked as deleted, which produces needless waste, etc
- amdata makes it easier to change the storage model, as the tuple 
descriptor is the same for all sequences

- the separate columns are much nicer from the user's point of view
- my opinion is that separate columns also more nicely separate state 
from options, and I think that if we move to a separate storage model, 
there can be a state table per AM, which solves the tuple descriptor issue
- there is probably some slight performance benefit to amdata, but I 
don't think it's big enough to be important


I personally have slight preference to separate columns design, but I am 
ok with both ways honestly.




There is no \d command for sequence access methods.  Without querying
pg_seqam directly, how does one discover what's available?



Good point.



Patch 2: seqam ddl

When defining a new access method for sequences, it is possible to list
the arguments multiple times (last one wins).  Other defel loops raise
an error if the argument is specified more than once.  I haven't looked
at all of such loops to see if this is the only odd man out or not, but
I prefer the error behavior.



Hmm yeah, there should be an error. I think only tsearch doesn't enforce 
errors among the existing stuff; that should probably be fixed as well 
(separately of course).




Patch 3: gapless_seq

I really like the idea of having a gapless sequence in contrib.
Although it has big potential to be abused, doing it manually when it's
needed (like for invoices, at least in France) is a major pain.  So big
+1 for including this.



Yeah, people make gapless sequences regardless, so it's better to provide 
one that behaves correctly; it's also quite a good test for the API.




It would be nice to be able to specify an access method when declaring a
serial or bigserial column.  This could be a separate patch later on,
though.



The patch originally had a GUC for this, but Heikki didn't like it, so it's 
left for future development.




On the whole, I think this is a pretty good patchset.  Aside from the
design decision of whether amdata is a single opaque column or a set of
columns, there are only a few things that need to be changed before it's
ready for committer, and those things are mostly documentation.



Unfortunately the amdata being opaque vs set of columns is the main 
issue here.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] PATCH: adaptive ndistinct estimator v4

2015-06-17 Thread Tomas Vondra

Hi,

On 05/13/15 23:07, Jeff Janes wrote:

With the warning it is very hard to correlate the discrepancy you do
see with which column is causing it, as the warnings don't include
table or column names (Assuming of course that you run it on a
substantial database--if you just run it on a few toy cases then the
warning works well).


That's true. I've added attnum/attname to the warning in the attached 
version of the patch.



If we want to have an explicitly experimental patch which we want
people with interesting real-world databases to report back on, what
kind of patch would it have to be to encourage that to happen? Or are
we never going to get such feedback no matter how friendly we make
it? Another problem is that you really need to have the gold standard
to compare them to, and getting that is expensive (which is why we
resort to sampling in the first place). I don't think there is much
to be done on that front other than bite the bullet and just do
it--perhaps only for the tables which have discrepancies.


Not sure. The "experimental" part of the patch was not really aimed at 
users outside the development community - it was meant to be used by 
members of the community, possibly testing it on customer databases. 
I don't think adding the GUC into the final release is a good idea; it's 
just noise in the config that no-one would actually use.



Some of the regressions I've seen are at least partly a bug:

+   /* find the 'm' value minimizing the difference */
+   for (m = 1; m <= total_rows; m += step)
+   {
+   double q = k / (sample_rows * m);

sample_rows and m are both integers, and their product overflows
vigorously. A simple cast to double before the multiplication fixes
the first example I produced. The estimate goes from 137,177 to
1,108,076. The reality is 1,062,223.

Perhaps m should be just be declared a double, as it is frequently
used in double arithmetic.


Yeah, I just discovered this bug independently. There's another bug that 
the adaptive_estimator takes total_rows as integer, so it breaks for 
tables with more than INT_MAX rows. Both are fixed in the v5.




Therefore, I think that:

1. This should be committed near the beginning of a release cycle,
not near the end. That way, if there are problem cases, we'll have a
year or so of developer test to shake them out.


It can't hurt, but how effective will it be? Will developers know or
care whether ndistinct happened to get better or worse while they
are working on other things? I would think that problems will be
found by focused testing, or during beta, and probably not by
accidental discovery during the development cycle. It can't hurt, but
I don't know how much it will help.


I agree with that - it's unlikely the regressions will get discovered 
randomly. OTOH I'd expect a non-trivial number of people on this list to 
have a few examples of ndistinct failures, and testing those would be 
more useful I guess. But that's unlikely to find the cases that worked 
OK before and got broken by the new estimator :-(



I agree with the "experimental GUC".  That way if hackers do happen to
see something suspicious, they can just turn it off and see what
difference it makes.  If they have to reverse out a patch from 6 months
ago in an area of the code they aren't particularly interested in and
then recompile their code and then juggle two different sets of
binaries, they will likely just shrug it off without investigation.


+1


--
Tomas Vondra   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 861048f..f030702 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -16,6 +16,7 @@
 
 #include 
 
+#include "access/hash.h"
 #include "access/multixact.h"
 #include "access/transam.h"
 #include "access/tupconvert.h"
@@ -97,6 +98,9 @@ static void update_attstats(Oid relid, bool inh,
 static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
 static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
 
+static double adaptive_estimator(double total_rows, int sample_rows,
+int *f, int f_max);
+static int hash_comparator(const void *a, const void *b);
 
 /*
  *	analyze_rel() -- analyze one relation
@@ -1698,6 +1702,7 @@ static void compute_scalar_stats(VacAttrStatsP stats,
 	 int samplerows,
 	 double totalrows);
 static int	compare_scalars(const void *a, const void *b, void *arg);
+static int	compare_scalars_simple(const void *a, const void *b, void *arg);
 static int	compare_mcvs(const void *a, const void *b);
 
 
@@ -1816,6 +1821,23 @@ compute_minimal_stats(VacAttrStatsP stats,
 	StdAnalyzeData *mystats = (StdAnalyzeData *) stats->extra_data;
 
 	/*
+	 * The adaptive ndistinct estimator requires counts for all the
+	 * repetition counts - we can't do the sort-based count directly
+	 

Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-17 Thread Robert Haas
On Wed, Jun 17, 2015 at 9:32 AM, Thomas Munro
 wrote:
> We saw a rather extreme performance problem in a cluster upgraded from
> 9.1 to 9.3.  It uses a largish number of child tables (partitions) and
> many columns.  Planning a simple UPDATE via the base table started
> using a very large amount of memory and CPU time.
>
> My colleague Rushabh Lathia tracked the performance change down to
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c03ad5602f529787968fa3201b35c119bbc6d782
> .
>
> The call to copyObject in the loop introduced here seems to be
> problematic (copying append_rel_list for every element in
> append_rel_list unconditionally), though we're still trying to figure
> it out.  Attached is a simple repro script, with variables to tweak.
>
> Quite a few others have posted about this sort of thing and been
> politely reminded of the 100 table caveat [1][2] which is fair enough,
> but the situation seems to have gotten dramatically worse for some users
> after an upgrade.

Yes.  The copyObject() call introduced by this commit seems to have
complexity O(T^2*C) where T is the number of child tables and C is the
number of columns per child.  That's because the List that is being
copied is a list of AppendRelInfo nodes, each of which has a
translated_vars member that is a list of every Var in one table, and
we copy that list once per child.

It appears that in a lot of cases this work is unnecessary.  The
second (long) for loop in inheritance_planner copies root->rowMarks
and root->append_rel_list so as to be able to apply ChangeVarNodes()
to the result, but we only actually do that if the rte is of type
RTE_SUBQUERY or if it has security quals.  In the common case where we
reach inheritance_planner not because of UNION ALL but just because
the table has a bunch of inheritance children (none of which have RLS
policies applied), we copy everything and then modify none of it,
using up startlingly large amounts of memory in ways that pre-9.2 
versions did not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-17 Thread Thomas Munro
Hi

We saw a rather extreme performance problem in a cluster upgraded from
9.1 to 9.3.  It uses a largish number of child tables (partitions) and
many columns.  Planning a simple UPDATE via the base table started
using a very large amount of memory and CPU time.

My colleague Rushabh Lathia tracked the performance change down to
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c03ad5602f529787968fa3201b35c119bbc6d782
.

The call to copyObject in the loop introduced here seems to be
problematic (copying append_rel_list for every element in
append_rel_list unconditionally), though we're still trying to figure
it out.  Attached is a simple repro script, with variables to tweak.

Quite a few others have posted about this sort of thing and been
politely reminded of the 100 table caveat [1][2] which is fair enough,
but the situation seems to have gotten dramatically worse for some users
after an upgrade.

[1] 
http://www.postgresql.org/message-id/8c9acaa.1f453.14c0da0402f.coremail.chjis...@163.com
[2] 
http://www.postgresql.org/message-id/flat/20141107185824.2513.53...@wrigleys.postgresql.org#20141107185824.2513.53...@wrigleys.postgresql.org

-- 
Thomas Munro
http://www.enterprisedb.com


repro-planner-explosion.sh
Description: Bourne shell script



Re: [HACKERS] "could not adopt C locale" failure at startup on Windows

2015-06-17 Thread Noah Misch
On Mon, Jun 15, 2015 at 12:37:43PM -0400, Tom Lane wrote:
> After further thought, ISTM that this bug is evidence that 5f538ad
> was badly designed, and the proposed fix has blinkers on.  If
> pg_bind_textdomain_codeset() is looking at the locale environment,
> we should not be calling it from inside pg_perm_setlocale() at all,
> but from higher level code *after* we're done setting up the whole libc
> locale environment --- thus, after the unsetenv("LC_ALL") call in main.c,
> and somewhere near the bottom of CheckMyDatabase() in postinit.c.
> It's mere chance that the order of calls to pg_perm_setlocale() is
> such that the code works now; and it's not hard to imagine future
> changes in gettext, or reordering of our own code, that would break it.

pg_bind_textdomain_codeset() looks at one piece of the locale environment,
namely setlocale(LC_CTYPE, NULL), so the order of pg_perm_setlocale() calls
does not affect it.  There's nothing acutely bad about the alternative you
identify here, but it's no better equipped to forestall mistakes.  Moving the
call later would slightly expand the window during which translated messages
are garbled.




Re: [HACKERS] pg_rewind and xlogtemp files

2015-06-17 Thread Michael Paquier
On Wed, Jun 17, 2015 at 9:07 PM, Fujii Masao  wrote:
> On Wed, Jun 17, 2015 at 4:57 PM, Vladimir Borodin  wrote:
>>
>> 17 июня 2015 г., в 9:48, Michael Paquier 
>> написал(а):
>>
>> On Wed, Jun 17, 2015 at 3:17 PM, Michael Paquier
>>  wrote:
>>
>> As pointed by dev1ant on the original bug report, process_remote_file
>> should ignore files named as pg_xlog/xlogtemp.*, and I think that this
>> is the right thing to do. Any objections for a patch that at the same
>> time makes "xlogtemp." a define declaration in xlog_internal.h?
>>
>>
>> Declaration seems to be the right thing.
>>
>> Another problem I’ve caught twice already in the same test:
>>
>> error reading xlog record: record with zero length at 0/7890
>> unexpected result while fetching remote files: ERROR:  could not open file
>> "base/13003/t6_2424967" for reading: No such file or directory
>> The servers diverged at WAL position 0/76BADD50 on timeline 303.
>> Rewinding from Last common checkpoint at 0/7651F870 on timeline 303
>>
>> I don’t know if this problem could be solved the same way (by skipping such
>> files)… Should I start a new thread for that?
>
> That's the file of the temporary table, so there is no need to copy it
> from the source server. pg_rewind can safely skip such file, I think.

Yes. It is actually recommended to copy them manually if needed from
the archive (as per the docs).

> But even if we make pg_rewind skip such files, we would still get a
> similar problem. You can see the problem that I reported in another thread.
> In order to address this type of problem completely, we would need
> to apply the fix that is being discussed in that thread.
> http://www.postgresql.org/message-id/CAHGQGwEdsNgeNZo+GyrzZtjW_TkC=XC6XxrjuAZ7=x_cj1a...@mail.gmail.com

There are two things to take into account here in my opinion:
1) Ignoring files that should not be added into the filemap, like
postmaster.pid, xlogtemp, etc.
2) Bypassing the files that can be added to the filemap, for example a
relation file or an fsm file, and preventing an error if they are
missing.

> BTW, even pg_basebackup doesn't skip the file of temporary table.
> But maybe we should change this, too.
>
> Also pg_rewind doesn't skip the files that pg_basebackup does. ISTM
> that basically pg_rewind can safely skip any files that pg_basebackup does.
> So probably we need to reconsider which file to make pg_rewind skip.

pg_rewind and basebackup.c are beginning to share many things in this
area; perhaps we should consider a common routine in, say,
libpqcommon to decide whether a file can be safely skipped depending on
its path name in PGDATA.
-- 
Michael




Re: [HACKERS] pg_rewind and xlogtemp files

2015-06-17 Thread Fujii Masao
On Wed, Jun 17, 2015 at 4:57 PM, Vladimir Borodin  wrote:
>
> 17 июня 2015 г., в 9:48, Michael Paquier 
> написал(а):
>
> On Wed, Jun 17, 2015 at 3:17 PM, Michael Paquier
>  wrote:
>
> As pointed by dev1ant on the original bug report, process_remote_file
> should ignore files named as pg_xlog/xlogtemp.*, and I think that this
> is the right thing to do. Any objections for a patch that at the same
> time makes "xlogtemp." a define declaration in xlog_internal.h?
>
>
> Declaration seems to be the right thing.
>
> Another problem I’ve caught twice already in the same test:
>
> error reading xlog record: record with zero length at 0/7890
> unexpected result while fetching remote files: ERROR:  could not open file
> "base/13003/t6_2424967" for reading: No such file or directory
> The servers diverged at WAL position 0/76BADD50 on timeline 303.
> Rewinding from Last common checkpoint at 0/7651F870 on timeline 303
>
> I don’t know if this problem could be solved the same way (by skipping such
> files)… Should I start a new thread for that?

That's the file of the temporary table, so there is no need to copy it
from the source server. pg_rewind can safely skip such file, I think.

But even if we make pg_rewind skip such files, we would still get a
similar problem. You can see the problem that I reported in another thread.
In order to address this type of problem completely, we would need
to apply the fix that is being discussed in that thread.
http://www.postgresql.org/message-id/CAHGQGwEdsNgeNZo+GyrzZtjW_TkC=XC6XxrjuAZ7=x_cj1a...@mail.gmail.com

BTW, even pg_basebackup doesn't skip the file of temporary table.
But maybe we should change this, too.

Also pg_rewind doesn't skip the files that pg_basebackup does. ISTM
that basically pg_rewind can safely skip any files that pg_basebackup does.
So probably we need to reconsider which file to make pg_rewind skip.

Regards,

-- 
Fujii Masao




[HACKERS] [PATCH] Function to get size of asynchronous notification queue

2015-06-17 Thread Brendan Jurd
Hello hackers,

I present a patch to add a new built-in function
pg_notify_queue_saturation().

The purpose of the function is to allow users to monitor the health of
their notification queue.  In certain cases, a client connection listening
for notifications might get stuck inside a transaction, and this would
cause the queue to keep filling up, until finally it reaches capacity and
further attempts to NOTIFY error out.

The current documentation under LISTEN explains this possible gotcha, but
doesn't really suggest a useful way to address it, except to mention that
warnings will show up in the log once you get to 50% saturation of the
queue.  Unless you happen to be eyeballing the logs when it happens, that's
not a huge help.  The choice of 50% as a threshold is also very much
arbitrary, and by the time you hit 50% the problem has likely been going on
for quite a while.  If you want your nagios (or whatever) to say, alert you
when the queue goes over 5% or 1%, your options are limited and awkward.

The patch has almost no new code.  It makes use of the existing logic for
the 50% warning.  I simply refactored that logic into a separate function
asyncQueueSaturation, and then added pg_notify_queue_saturation to make
that available in SQL.

I am not convinced that pg_notify_queue_saturation is the best possible
name for this function, and am very much open to other suggestions.

The patch includes documentation, a regression test and an isolation test.

Cheers,
BJ
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 14800,14805  SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
--- 14800,14811 

  

+pg_notify_queue_saturation()
+double
+proportion of the asynchronous notification queue currently occupied
+   
+ 
+   
 pg_my_temp_schema()
 oid
 OID of session's temporary schema, or 0 if none
***
*** 14939,14948  SET search_path TO schema , schema, ..
  pg_listening_channels
 
  
 
  pg_listening_channels returns a set of names of
! channels that the current session is listening to.  See  for more information.
 
  
 
--- 14945,14962 
  pg_listening_channels
 
  
+
+ pg_notify_queue_saturation
+
+ 
 
  pg_listening_channels returns a set of names of
! asynchronous notification channels that the current session is listening
! to.  pg_notify_queue_saturation returns the proportion
! of the total available space for notifications currently occupied by
! notifications that are waiting to be processed.  See
!  and 
! for more information.
 
  
 
*** a/doc/src/sgml/ref/notify.sgml
--- b/doc/src/sgml/ref/notify.sgml
***
*** 166,171  NOTIFY channel [ , 

+The function pg_notify_queue_saturation returns the
+proportion of the queue that is currently occupied by pending notifications.
+   
+   
 A transaction that has executed NOTIFY cannot be
 prepared for two-phase commit.

*** a/src/backend/commands/async.c
--- b/src/backend/commands/async.c
***
*** 371,376  static bool asyncQueueIsFull(void);
--- 371,377 
  static bool asyncQueueAdvance(volatile QueuePosition *position, int entryLength);
  static void asyncQueueNotificationToEntry(Notification *n, AsyncQueueEntry *qe);
  static ListCell *asyncQueueAddEntries(ListCell *nextNotify);
+ static double asyncQueueSaturation(void);
  static void asyncQueueFillWarning(void);
  static bool SignalBackends(void);
  static void asyncQueueReadAllNotifications(void);
***
*** 1362,1387  asyncQueueAddEntries(ListCell *nextNotify)
  }
  
  /*
!  * Check whether the queue is at least half full, and emit a warning if so.
!  *
!  * This is unlikely given the size of the queue, but possible.
!  * The warnings show up at most once every QUEUE_FULL_WARN_INTERVAL.
   *
!  * Caller must hold exclusive AsyncQueueLock.
   */
! static void
! asyncQueueFillWarning(void)
  {
! 	int			headPage = QUEUE_POS_PAGE(QUEUE_HEAD);
! 	int			tailPage = QUEUE_POS_PAGE(QUEUE_TAIL);
! 	int			occupied;
! 	double		fillDegree;
! 	TimestampTz t;
  
  	occupied = headPage - tailPage;
  
  	if (occupied == 0)
! 		return;	/* fast exit for common case */
  
  	if (occupied < 0)
  	{
--- 1363,1399 
  }
  
  /*
!  * SQL function to return the proportion of the notification queue currently
!  * occupied.
!  */
! Datum
! pg_notify_queue_saturation(PG_FUNCTION_ARGS)
! {
! 	double saturation;
! 
! 	LWLockAcquire(AsyncQueueLock, LW_SHARED);
! 	saturation = asyncQueueSaturation();
! 	LWLockRelease(AsyncQueueLock);
! 
! 	PG_RETURN_FLOAT8(saturation);
! }
! 
! /*
!  * Return the proportion of the queue that is currently occupied.
   *
!  * The caller must hold (at least) shared AsyncQueueLock.
   */
! static double
! asyncQueueSaturation(void)
  {
! 	int		headPage = QUEUE_POS_PAGE(QUEUE_HEAD);
! 	int		tailPage = QUEUE_POS_PAGE(QUEUE_TAIL);

Re: [HACKERS] pg_rewind and xlogtemp files

2015-06-17 Thread Vladimir Borodin

> On 17 June 2015, at 9:48, Michael Paquier wrote:
> 
> On Wed, Jun 17, 2015 at 3:17 PM, Michael Paquier
>  wrote:
>> As pointed by dev1ant on the original bug report, process_remote_file
>> should ignore files named as pg_xlog/xlogtemp.*, and I think that this
>> is the right thing to do. Any objections for a patch that at the same
>> time makes "xlogtemp." a define declaration in xlog_internal.h?

Declaration seems to be the right thing.

Another problem I’ve caught twice already in the same test:

error reading xlog record: record with zero length at 0/7890
unexpected result while fetching remote files: ERROR:  could not open file 
"base/13003/t6_2424967" for reading: No such file or directory
The servers diverged at WAL position 0/76BADD50 on timeline 303.
Rewinding from Last common checkpoint at 0/7651F870 on timeline 303

I don’t know if this problem could be solved the same way (by skipping such 
files)… Should I start a new thread for that?

> 
> And attached is a patch following those lines.
> -- 
> Michael
> <20150617_rewind_xlogtemp.patch>
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


--
May the force be with you…
https://simply.name



Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-17 Thread Haribabu Kommi
On Wed, Jun 17, 2015 at 2:17 PM, Prakash Itnal  wrote:
> Hi,
>
> Currently the issue is easily reproducible. Steps to reproduce:
> * Set some aggressive values for auto-vacuuming.
> * Run a heavy database update/delete/insert queries. This leads to invoking
> auto-vacuuming in quick successions.
> * Change the system time to older for eg. 1995-01-01
>
> Suddenly auto-vacuuming stops working. Even after changing system time back
> to current time, the auto-vacuuming did not resume.
>
> So the question is, "does postgres support system time changes?".

PostgreSQL uses the system time to check whether it has reached the
configured nap time before triggering an autovacuum worker.

Is it possible for you to check what the autovacuum launcher is doing even
after the system time is reset to the current time?

I can think of a case where the "launcher_determine_sleep" function
returns a very large sleep value because of the system time change.
If that happens, the launcher would stop generating workers to do the
vacuum. Maybe I am wrong.


Regards,
Hari Babu
Fujitsu Australia




[HACKERS] Is Postgres database server works fine if there is a change in system time?

2015-06-17 Thread Prakash Itnal
Hi,

We have observed that certain postgres child processes, e.g. the
autovacuum worker, do not work as expected after a system time change.
So I wanted to know whether postgres already supports system time
changes or not.

Please confirm if postgres already handles system time changes or not.

-- 
Cheers,
Prakash