Re: New GUC to sample log queries

2018-11-25 Thread Adrien Nayrat
On 11/26/18 12:40 AM, Thomas Munro wrote:
> On Wed, Nov 21, 2018 at 9:06 PM Adrien Nayrat
>  wrote:
>> Thanks for your comments. Here is the updated patch. I fixed a warning for
>> missing parentheses in this expression:
>> if ((exceeded && in_sample) || log_duration)
> 
> Hi Adrien,
> 
> GCC 4.8 says:
> 
> guc.c:3328:3: error: initialization makes integer from pointer without
> a cast [-Werror]
> },
> ^
> guc.c:3328:3: error: (near initialization for
> ‘ConfigureNamesReal[19].gen.flags’) [-Werror]
> 

Sorry, I missed this warning. There was one extra NULL in guc.c.

Here is a new patch.
Thanks
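
For anyone wanting to try it out, here is a minimal sketch of how the two
settings combine (the values are only illustrative; the parameter is
superuser-settable, PGC_SUSET, per the patch):

    SET log_min_duration_statement = '250ms';
    SET log_statement_sample_rate = 0.1;  -- log roughly 10% of statements exceeding 250ms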
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index db1a2d4e74..9de01e6977 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5752,6 +5752,26 @@ local0.*/var/log/postgresql

   
 
+ 
+  log_statement_sample_rate (real)
+  
+   log_statement_sample_rate configuration parameter
+  
+  
+   
+
+ Causes logging only a fraction of the statements when log_min_duration_statement is used. The default is
+ 1, meaning log all statements longer than
+ log_min_duration_statement. Setting this to zero
+ disables logging, same as setting
+ log_min_duration_statement to
+ -1.  Using log_statement_sample_rate is
+ helpful when the traffic is too high to log all queries.
+
+   
+  
+
  
 
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a3b9757565..6777eebde1 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2201,7 +2201,8 @@ check_log_statement(List *stmt_list)
 
 /*
  * check_log_duration
- *		Determine whether current command's duration should be logged
+ *		Determine whether current command's duration should be logged.
+ *		If log_statement_sample_rate < 1.0, log only a sample.
  *
  * Returns:
  *		0 if no logging is needed
@@ -2223,6 +2224,7 @@ check_log_duration(char *msec_str, bool was_logged)
 		int			usecs;
 		int			msecs;
 		bool		exceeded;
+		bool		in_sample;
 
 		TimestampDifference(GetCurrentStatementStartTimestamp(),
 			GetCurrentTimestamp(),
@@ -2239,7 +2241,16 @@ check_log_duration(char *msec_str, bool was_logged)
 	 (secs > log_min_duration_statement / 1000 ||
 	  secs * 1000 + msecs >= log_min_duration_statement)));
 
-		if (exceeded || log_duration)
+		/*
+		 * Do not log if log_statement_sample_rate = 0.
+		 * Log a sample if log_statement_sample_rate <= 1 and avoid unnecessary
+		 * random() call if log_statement_sample_rate = 1.
+		 */
+		in_sample = log_statement_sample_rate != 0 &&
+			(log_statement_sample_rate == 1 ||
+			 random() <= log_statement_sample_rate * MAX_RANDOM_VALUE);
+
+		if ((exceeded && in_sample) || log_duration)
 		{
 			snprintf(msec_str, 32, "%ld.%03d",
 	 secs * 1000 + msecs, usecs % 1000);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 6497393c03..3bc251afbb 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -486,6 +486,7 @@ int			log_min_messages = WARNING;
 int			client_min_messages = NOTICE;
 int			log_min_duration_statement = -1;
 int			log_temp_files = -1;
+double 		log_statement_sample_rate = 1.0;
 int			trace_recovery_messages = LOG;
 
 int			temp_file_limit = -1;
@@ -3319,6 +3320,16 @@ static struct config_real ConfigureNamesReal[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"log_statement_sample_rate", PGC_SUSET, LOGGING_WHEN,
+			gettext_noop("Fraction of statements to log."),
+			gettext_noop("1.0 logs all statements.")
+		},
+		&log_statement_sample_rate,
+		1.0, 0.0, 1.0,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0.0, 0.0, 0.0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index ee9ec6a120..8892e71104 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -488,6 +488,8 @@
 	# statements running at least this number
 	# of milliseconds
 
+#log_statement_sample_rate = 1	# Fraction of logged statements. 1 means log
+	# all statements.
 
 # - What to Log -
 
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index df2e556b02..4c5cabcc5f 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -251,6 +251,7 @@ extern PGDLLIMPORT int log_min_messages;
 extern PGDLLIMPORT int client_min_messages;
 extern int	log_min_duration_statement;
 extern int	log_temp_files;
+extern double	log_statement_sample_rate;
 
 extern int	temp_file_limit;
 


Re: Add extension options to control TAP and isolation tests

2018-11-25 Thread Nikolay Shaplov
In a message of 26 November 2018 08:53:20, you wrote:

> > I've carefully read this documentation. And did not get clear explanation
> > of what is the true purpose of PGXS environment variable. Only
> > 
> > "The last three lines should always be the same. Earlier in the file, you
> > assign variables or add custom make rules."
> 
> The definition of PGXS is here:
> https://www.postgresql.org/docs/11/extend-pgxs.html
Can you please provide the quotation? I see PGXS mentioned there several times
as "a build infrastructure for extensions", and as an environment variable it
is mentioned only in a code sample:

 
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

So for me there is no explanation, or it is hard to find (which is also bad).

If we are done with your patch, can we still finish this line of discussion in
order to create another small patch concerning the PGXS-env-var comment?



signature.asc
Description: This is a digitally signed message part.


Re: Handling of REGRESS_OPTS in MSVC for regression tests

2018-11-25 Thread Michael Paquier
On Mon, Nov 26, 2018 at 02:43:02PM +0900, Michael Paquier wrote:
> Another option could also be to switch contribcheck and modulescheck
> so as they use a temporary installation, but that's a can of worms
> waiting to explode as MSVC makes more complicated the search for
> initdb and such (see the way upgradecheck happens for example), and
> this makes the tests waaay longer to run.

This has been itching me, and it actually proves not to be that
complicated to achieve, per the attached.  This makes all the tests from
contrib/ and src/test/modules pass with temporary installations, though the
test runs are much slower.  This would not blow up the buildfarm
visibly, but its code assumes that installcheck should be used, so we
could as well just introduce new options for vcregress.pl.  I am not
sure what would be the best way; still, using temporary installations has
the merit of not causing any tests to run unconfigured in the future.
--
Michael
diff --git a/src/tools/msvc/vcregress.pl b/src/tools/msvc/vcregress.pl
index 599b521014..5d3910b5ef 100644
--- a/src/tools/msvc/vcregress.pl
+++ b/src/tools/msvc/vcregress.pl
@@ -380,8 +380,10 @@ sub plcheck
 
 sub subdircheck
 {
+	my $subpath = shift;
 	my $module = shift;
 
+	chdir "${topdir}/${subpath}";
 	if (   !-d "$module/sql"
 		|| !-d "$module/expected"
 		|| (!-f "$module/GNUmakefile" && !-f "$module/Makefile"))
@@ -389,7 +391,7 @@ sub subdircheck
 		return;
 	}
 
-	chdir $module;
+	chdir "${topdir}/${subpath}/${module}";
 	my @tests = fetchTests();
 	my @opts  = fetchRegressOpts();
 
@@ -419,17 +421,23 @@ sub subdircheck
 	print "Checking $module\n";
 	my @args = (
 		"$topdir/$Config/pg_regress/pg_regress",
-		"--bindir=${topdir}/${Config}/psql",
+		"--dlpath=${topdir}/src/test/regress",
+		"--bindir=",
+		"--encoding=SQL_ASCII",
+		"--no-locale",
+		"--temp-instance=./tmp_check",
+		"--inputdir=.",
 		"--dbname=contrib_regression", @opts, @tests);
 	print join(' ', @args), "\n";
 	system(@args);
-	chdir "..";
 	return;
 }
 
 sub contribcheck
 {
-	chdir "../../../contrib";
+	InstallTemp();
+	my $subpath = "contrib";
+	chdir "${topdir}/${subpath}";
 	my $mstat = 0;
 	foreach my $module (glob("*"))
 	{
@@ -441,7 +449,7 @@ sub contribcheck
 		next if ($module =~ /_plpython$/ && !defined($config->{python}));
 		next if ($module eq "sepgsql");
 
-		subdircheck($module);
+		subdircheck($subpath, $module);
 		my $status = $? >> 8;
 		$mstat ||= $status;
 	}
@@ -451,11 +459,13 @@ sub contribcheck
 
 sub modulescheck
 {
-	chdir "../../../src/test/modules";
+	InstallTemp();
+	my $subpath = "src/test/modules";
+	chdir "${topdir}/${subpath}";
 	my $mstat = 0;
 	foreach my $module (glob("*"))
 	{
-		subdircheck($module);
+		subdircheck($subpath, $module);
 		my $status = $? >> 8;
 		$mstat ||= $status;
 	}
@@ -619,13 +629,18 @@ sub fetchRegressOpts
 	$m =~ s{\\\r?\n}{}g;
 	if ($m =~ /^\s*REGRESS_OPTS\s*\+?=(.*)/m)
 	{
+		my @split_opts = split(/\s+/, $1);
 
-		# Substitute known Makefile variables, then ignore options that retain
-		# an unhandled variable reference.  Ignore anything that isn't an
-		# option starting with "--".
-		@opts = grep { !/\$\(/ && /^--/ }
-		  map { (my $x = $_) =~ s/\Q$(top_builddir)\E/\"$topdir\"/; $x; }
-		  split(/\s+/, $1);
+		# Substitute known Makefile variables for each option by
+		# something which is supported in this context.
+		foreach my $opt (@split_opts)
+		{
+			# ignore empty strings
+			next if ($opt =~ /^\s*$/);
+			$opt =~ s#\$\(top_builddir\)#"$topdir"#gs;
+			$opt =~ s#\$\(top_srcdir\)#"$topdir"#gs;
+			push @opts, $opt;
+		}
 	}
 	if ($m =~ /^\s*ENCODING\s*=\s*(\S+)/m)
 	{


signature.asc
Description: PGP signature


Re: tab-completion debug print

2018-11-25 Thread David Fetter
On Mon, Nov 26, 2018 at 12:23:16PM +0900, Kyotaro HORIGUCHI wrote:
> Thank you for the comments.
> 
> At Sun, 25 Nov 2018 01:17:27 +0100, David Fetter  wrote in 
> <20181125001727.gm...@fetter.org>
> > On Fri, Nov 23, 2018 at 04:32:31PM -0500, Tom Lane wrote:
> > > Hm.  I can see the value of instrumenting tab-complete when you're trying
> > > to debug why it did something, but this output format seems pretty terse
> > > and unreadable.  Can we get it to print the completion text as well?
> > > I'm imagining something more like
> > > 
> > > 1414: "er "
> > > 1435: ""
> > > 1435: "ab"
> > > 1435: ""
> > > 1431: ""
> > > 
> > > Perhaps there's room as well to print the context that the match looked
> > > at:
> > > 
> > > 1414: "alt" -> "er "
> > > 1435: "alter " -> ""
> > > 1435: "alter t" -> "ab"
> > > 
> > > etc.
> > > 
> > >   regards, tom lane
> > 
> > Is this something along the lines of what you had in mind?
> 
> The reason for the minimal output was that it won't disturb the console so
> much when stderr is not redirected. It could be richer if we
> assume redirection. Since every debug line is followed by a
> line-feed, we don't need the brackets. Since the result is a set,
> David's proposal to enclose the set in parentheses looks nice.
> Also, the result can contain matches containing word breaks (or even
> an empty string), so quoting makes every candidate string more
> distinctive.
> 
> Finally, the attached shows the following debugging output.
> 
> 
> INPUT: alter u(ser) 
> OUTPUT:
> 1445: "u" -> ("user", "user", "user mapping for")
> 1445: "user" -> ("user", "user", "user mapping for")
> 3630: "" -> ("", "horiguti", "MAPPING FOR", "pg_execute_server_program", 
> "pg_monitor", "pg_read_all_settings", "pg_read_all_stats", 
> "pg_read_server_files", "pg_signal_backend", "pg_stat_scan_tables", 
> "pg_write_server_files", "usr1", "usr2")
> 
> INPUT: create function 
> OUTPUT:
> 3635: "" -> ("", "pg_ndistinct_out", "path_recv", "int4eq", 
> "spg_quad_picksplit", "tsvectorsend", "float8_var_pop", "opaque_in", 
> "interval_le", "macaddr_cmp", "range_gt..(17272 chars)..nge_in")
> 
> The long lines shouldn't be a problem since this is for
> debugging purposes. Of course we can limit the length easily (by
> number of elements).

We could wrap just about anywhere if needed. I found it relatively
easy to puzzle out what was going on in my brief tests just by looking
at the potentially-long lines.

> >+for (int i = 0; list && list[i]; ++i)
> 
> # Are we using C99 for-loop-variable-declarations?

I thought we were using all of it, but of course I can move it
outside.

> An arguable change in this version is enclosing empty result
> lists with brackets, since it is just an intermediate state for
> us.
> 
> [1431: "sael" -> ()]
> 3655: "sael" -> ("")

That's certainly fixable.

> Showing completion context makes things a bit bothersome, since
> complete_from_variables needs to take two additional parameters.
> It is already shown in the console conversation, so I'm not sure it's
> worth the trouble.
> 
> 1872: (...table t1 alter [COn]) -> ("CONSTRAINT")
> 
> To be friendly with the CI, the second patch is attached as the
> difference from the first patch.
> 
> 
> Note: This is not registered in this CF.

Should it be?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: tab-completion debug print

2018-11-25 Thread David Fetter
On Sun, Nov 25, 2018 at 11:21:51PM -0500, Tom Lane wrote:
> Kyotaro HORIGUCHI  writes:
> >> On Fri, Nov 23, 2018 at 04:32:31PM -0500, Tom Lane wrote:
> >>> Hm.  I can see the value of instrumenting tab-complete when you're trying
> >>> to debug why it did something, but this output format seems pretty terse
> >>> and unreadable.
> 
> > The reason for the minimal output was it won't disturb console so
> > much when stderr is not redirected. It could be richer if we
> > premise redirection.
> 
> It seems to me this behavior would be completely unusable if it's not
> redirected; I don't understand why you're even considering that as an
> interesting option.  libreadline is necessarily active when we're doing
> this, and it will get very confused (or at least produce a very confused
> display) if anything else is emitting text onto the active console line.

Integrating with libreadline would be a *much* bigger project for
reasons starting with the ones you state.

> Maybe somebody who never makes any typing errors at all and doesn't
> really need to see what they typed could use it like that, but I for
> one would find it quite useless.
> 
> In fact, I was thinking of proposing that the output shouldn't go to
> stderr in the first place.  If you do it like that, then you're also
> confusing this info with query error output, which does little for
> usability either.
> 
> Speaking of query error output, it wouldn't be a bad thing if this
> mode could show any errors from the tab-completion queries.  I
> suppose people look into the postmaster log for that right now; but
> if this were all going to some separate log file, I'd vote for
> showing the constructed queries and their results or errors.

I briefly thought about logging this to the postmaster log, but psql
is not the server, and doesn't, as a rule, have access to the same
kinds of things the server does, so it's not really the right thing.
On a busy server, because we don't yet have ways to carve off streams
of logs, these could get lost in the noise.

This brings up an interesting idea, though. It seems like we're
backing into a logging facility for psql with this feature. How about
just sending stderr to some file(s) in /var/log/psql/, or if we're
getting super fancy, to the syslog family?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Inadequate executor locking of indexes

2018-11-25 Thread Amit Langote
On 2018/11/26 14:25, David Rowley wrote:
> On Mon, 26 Nov 2018 at 17:37, Amit Langote
>  wrote:
>> On 2018/11/24 1:25, Tom Lane wrote:
>>> I'm beginning to think that postponing target-index locking to
>>> ExecInitModifyTable was a damfool idea and we should undo it.
>>
>> +1
>>
>> Also as already proposed, InitPlan should lock result relation indexes
>> even for DELETEs.
> 
> I'd rather see it fixed another way.  The reason being, if we get [1],
> then that opens the door to run-time partition pruning for
> UPDATE/DELETE, which is currently blocked due to lack of knowledge
> about baserestrictinfos for the base partitioned relation because
> inheritance_planner() does not plan for the partitioned table, only
> its partitions.  Doing the index opening work during InitPlan() means
> we do that for all partitions, even the ones that will later be
> run-time pruned. If we can do it during init of a ModifyTable node,
> then we can likely do it after the run-time pruning takes place.
> Since Amit and I are both working towards making partitioning faster,
> I imagined he would also not want to do something that could slow it
> down significantly, if there was some alternative way to fix it, at
> least.
> 
> I'm making efforts to delay per-partition work even further in the
> executor; for example, delaying locking of the per-partition result
> relations until after run-time pruning would be a significant win for
> partitioned tables with many partitions when generic plans are in use.
> Moving things back to InitPlan() flies in the face of that work.
> 
> [1] https://commitfest.postgresql.org/20/1778/

That's an interesting point.  Although, considering the concerns that Tom
raised about the same index relation being locked such that a lock-strength
upgrade occurs (his example containing two CTEs), we'll have to find a way
to do the ModifyTable run-time pruning such that result relations and
their indexes (possibly under multiple ModifyTable nodes) are locked with
RowExclusiveLock before they're locked with AccessShareLock as scan
relations.  For example, we might be able to find a way to do the
ModifyTable run-time pruning in InitPlan before initializing plan trees.

That said, I don't quite understand how the lock-strength upgrade
occurring in the way being discussed here (AccessShareLock for scanning to
RowExclusiveLock for modifying) leads to a deadlock hazard.

Thanks,
Amit




RE: [Todo item] Add entry creation timestamp column to pg_stat_replication

2018-11-25 Thread myungkyu.lim
On Thu, Nov 15, 2018 at 5:27 PM Michael Paquier  wrote:
>> Good point. 'last_reply_send_time' is better.
>> How about just 'reply_time'?
>
>Please note that the original thread has mentioned reply_timestamp as a
>consensus:

So I selected 'reply_time', because 'timestamp' is not used in the existing stat fields.

>On Thu, Nov 15, 2018 at 05:33:26PM +0900, Masahiko Sawada wrote:
>> Yeah, I also agree with 'reply_time'. But please also note that we had
>> that discussion when there were no similar system catalogs and
>> fields. Now that we have them, it might be worth considering following
>> the existing names for consistency.
>
>The fields in pg_stat_wal_receiver are named after their respective fields.
>Now if you look at the fields from pg_stat_replication, you have those from
>the standby:
>sent_lsn => Last write-ahead log location sent on this connection
>write_lsn => Last write-ahead log location written to disk by this standby server
>flush_lsn => Last write-ahead log location flushed to disk by this standby server
>replay_lsn => Last write-ahead log location replayed into the database on this standby server
>
>So to keep the brand consistent, reply_time looks like the best choice as
>Sawada-san suggests?

I agree.
The field naming in pg_stat_replication differs somewhat from that of other
views' fields; the 'last_' or 'latest_' prefixes are not used.
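
With the proposed patch applied, the new column would then read like any
other pg_stat_replication field, for example:

    SELECT application_name, reply_time FROM pg_stat_replication;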

Best regards,
Myungkyu, Lim




Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
On Sun, 2018-11-25 at 22:01 +0100, Magnus Hagander wrote:
[about managing backups from pre- and post-file-system-backup scripts]
> I agree with your point that it's not an uncommon thing to need. If a good 
> solution
> for it can be proposed that requires the exclusive backup interface, then I 
> wouldn't
> be against un-deprecating that.  But that's going to require a lot more than 
> just a
> documentation change, IMHO. What could perhaps be handled with a 
> documentation change,
> however, is to document a good way for this type of setup to use the new 
> interfaces.

Good point, and it puts the ball in my court :^)

> > I'm arguing on behalf of users that run a few databases, want a simple 
> > backup
> > solution and are ready to deal with the shortcomings.
> 
> Those that want a simple backup solution have one -- pg_basebackup.
> 
> The exclusive backup API is *not* simple. It is convenient, but it is not 
> simple.
> 
> Actually having a simple API that worked with the pre/post backup scripts, 
> that
> would be an improvement that we should definitely want!

Right; unfortunately it is not simple to come up with one that doesn't exhibit
the problems of the existing exclusive backup.

Perhaps it's theoretically impossible.  The goal is to disambiguate what a file
system backup sees in backup mode and what the startup process sees after a 
crash
in backup mode, and I can't see how that could be done.

Yours,
Laurenz Albe




Handling of REGRESS_OPTS in MSVC for regression tests

2018-11-25 Thread Michael Paquier
Hi all,

As mentioned on this thread, I have been fighting a bit with the
buildfarm when trying to introduce new PGXS options for isolation and
TAP tests:
https://www.postgresql.org/message-id/e1gr4d0-0002yj...@gemulon.postgresql.org

However, it happens that we have more issues than the buildfarm failures
to begin with.  One of them is related to the way REGRESS_OPTS is
defined and handled across the whole tree for MSVC.

There are now a couple of paths in the tree using top_builddir or
top_srcdir:
$ git grep REGRESS_OPTS | grep top
contrib/dblink/Makefile:REGRESS_OPTS =
--dlpath=$(top_builddir)/src/test/regress
contrib/pg_stat_statements/Makefile:REGRESS_OPTS = --temp-config
$(top_srcdir)/contrib/pg_stat_statements/pg_stat_statements.conf 
src/test/modules/commit_ts/Makefile:REGRESS_OPTS =
--temp-config=$(top_srcdir)/src/test/modules/commit_ts/commit_ts.conf
src/test/modules/test_rls_hooks/Makefile:REGRESS_OPTS =
--temp-config=$(top_srcdir)/src/test/modules/test_rls_hooks/rls_hooks.conf

Using top_builddir for dblink is for example fine, as this needs to point
to the place where builds of pg_regress are present.  However, when it
comes to configuration files we should not use top_builddir, and ought to
use top_srcdir instead, as VPATH builds would complain about things gone
missing.  So the definition that we make out of it is correct.  However,
there is an issue with MSVC, which thinks that REGRESS_OPTS should only
include top_builddir.  This is actually fine per se, as all MSVC tests
run with make-installcheck and not make-check; however, I think that we
should allow top_srcdir to be switched on-the-fly as much as
top_builddir, as top_srcdir could be used for something else.

Another, way worse, issue is that the MSVC scripts ignore some
configuration values because of the way the values of REGRESS_OPTS are
parsed.  This causes some sets of regression tests to never run on
Windows.  For example, here is the command generated for
pg_stat_statements, and what happens with it:
c:/pgbuildfarm/pgbuildroot/HEAD/pgsql.build/Release/pg_regress/pg_regress \
  --bindir=c:/pgbuildfarm/pgbuildroot/HEAD/pgsql.build/Release/psql \
  --dbname=contrib_regression --temp-config pg_stat_statements
[...]
== running regression test queries==

=
 All 0 tests passed.
=

This causes the tests to pass, but to run nothing, as the test list is
eaten up as an option value for --temp-config.  In this case, what happens
is that --temp-config is set to "pg_stat_statements", leaving the test
list empty.  The same cannot happen with test_decoding, as the buildfarm
uses a custom module to run its tests (still, this module could go away
with PGXS options to control isolation tests).

Attached is a patch which makes the MSVC scripts more intelligent, by
being able to replace top_srcdir, as well as top_builddir, by topdir when
fetching options.

Thanks to that, MSVC is able to run the tests correctly, by building a
proper command.  I think that this is a bug that should be back-patched,
because the tests just don't run, and we likely would miss regressions
because of that.

Unfortunately, if I were to commit that directly, the buildfarm would
immediately turn red for all Windows members, because when checking
module and contrib tests an "installcheck" happens, and
shared_preload_libraries is not set in this case.  When working on the
switch to add isolation and TAP test support to PGXS, test_decoding was
failing because the installed server did not have wal_level = logical
set up, which is one instance of that.  The buildfarm code installing
the Postgres instance on which the tests are run should configure that
properly, I think, and one tweak that we could use for the time being is
to bypass the tests of modules which cannot work yet, so that the
buildfarm stays green.  This way, this would not be a blocker for the
integration of the new PGXS infra for TAP and isolation tests, and those
could be enabled back once the buildfarm code is able to set up a proper
configuration.  The following modules require some extra
configuration:
- commit_ts
- test_rls_hooks
- pg_stat_statements
- test_decoding (if its Makefile is rewritten)
- snapshot_too_old (if its Makefile is rewritten)

The buildfarm part requires Andrew Dunstan's input, and there is a bit
to sort out for the rest, so input from all is welcome.  Please note
that I would prefer if the tests which just cannot work yet are
disabled until the needed buildfarm infrastructure is in place.  Another
option could also be to switch contribcheck and modulescheck so that they
use a temporary installation, but that's a can of worms waiting to
explode, as MSVC makes the search for initdb and such more complicated
(see the way upgradecheck happens for example), and this makes the tests
waaay longer to run.

Thoughts?
--
Michael
diff --git a/src/tools/msvc/vcregress.pl b/src/tools/msvc/vcregress.pl
index 599b521014..f3394905f0 100644
--- a/src/tools/msvc/vcregress.pl
+++ b/src/t

Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
On Sun, 2018-11-25 at 16:04 -0500, Stephen Frost wrote:
> > > There isn’t any need to write the backup label before you restore the 
> > > database,
> > > just as you write recovery.conf then.
> > 
> > Granted.
> > But it is pretty convenient, and writing it to the data directory right away
> > is a good thing on top, because it reduces the danger of inadvertently
> > starting the backup without recovery.
> 
> Writing it into the data directory is *not* a good thing because as soon as 
> you do
> that you risk there being an issue if there’s a crash.  Writing into the 
> backup
> isn’t a bad idea but if you’re managing your backups then writing it 
> somewhere else
> (such as where you write your WAL) and associating it with the backup 
> (presumably
> it has a label) should make it easy to pull back when you restore. 

If there is a crash during the backup procedure, the backup is bad.
It doesn't matter during which part of the backup procedure it happens.

> > Yes, you can come up with a post-backup script that somehow communicates
> > with your pre-backup script to get the information, but it sure is
> > inconvenient.  Simplicity is good in backup solutions, because complicated
> > things tend to break more easily.
> 
> Not sure what communication is necessary here..?   The data needed for the 
> backup
> label file comes from pg_stop_backup in a non-exclusive backup.

Right, and pg_stop_backup has to be run from the "pre-backup" script.
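
For context, a minimal sketch of the non-exclusive sequence; both calls have
to be issued from the same session, which must stay connected for the whole
file system backup, and that is exactly what forces the pre- and post-backup
scripts to communicate:

    SELECT pg_start_backup('nightly', false, false);  -- exclusive => false
    -- ... the file system backup runs while this session stays open ...
    SELECT labelfile FROM pg_stop_backup(false);      -- returns the backup_label contents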

Yours,
Laurenz Albe





Re: Inadequate executor locking of indexes

2018-11-25 Thread David Rowley
On Mon, 26 Nov 2018 at 17:37, Amit Langote
 wrote:
> On 2018/11/24 1:25, Tom Lane wrote:
> > I'm beginning to think that postponing target-index locking to
> > ExecInitModifyTable was a damfool idea and we should undo it.
>
> +1
>
> Also as already proposed, InitPlan should lock result relation indexes
> even for DELETEs.

I'd rather see it fixed another way.  The reason being, if we get [1],
then that opens the door to run-time partition pruning for
UPDATE/DELETE, which is currently blocked due to lack of knowledge
about baserestrictinfos for the base partitioned relation because
inheritance_planner() does not plan for the partitioned table, only
its partitions.  Doing the index opening work during InitPlan() means
we do that for all partitions, even the ones that will later be
run-time pruned. If we can do it during init of a ModifyTable node,
then we can likely do it after the run-time pruning takes place.
Since Amit and I are both working towards making partitioning faster,
I imagined he would also not want to do something that could slow it
down significantly, if there was some alternative way to fix it, at
least.

I'm making efforts to delay per-partition work even further in the
executor; for example, delaying locking of the per-partition result
relations until after run-time pruning would be a significant win for
partitioned tables with many partitions when generic plans are in use.
Moving things back to InitPlan() flies in the face of that work.

[1] https://commitfest.postgresql.org/20/1778/

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



RE: Too many logs are written on Windows (LOG: could not reserve shared memory region (addr=%p) for child %p:)

2018-11-25 Thread Takahashi, Ryohei
Hi,


Thank you for replying.


> You might be right, but maybe we should first try to understand why
> this is happening so frequently.  Maybe it's not that normal.

I found past threads [1] and [2].

In thread [1], the error is mentioned as an ASLR problem.
In thread [2], the patch is provided which retries 
pgwin32_ReserveSharedMemoryRegion() to resolve the ASLR problem.

Therefore, I think our community regards the error as an ASLR problem
that is already resolved by the patch.


[1] - 
https://www.postgresql.org/message-id/flat/CACmJi2JyDLMtxE3i_Krp%3DhrK4KKZ%3DD83s6eMw9q%3DHeM_srQdXg%40mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/flat/CAA4eK1%2BR6hSx6t_yvwtx%2BNRzneVp%2BMRqXAdGJZChcau8Uij-8g%40mail.gmail.com



Regards,
Ryohei Takahashi


Re: using expression syntax for partition bounds

2018-11-25 Thread Amit Langote
On 2018/11/09 14:38, Amit Langote wrote:
> Rebased due to change in addRangeTableEntryForRelation's API.

Rebased again due to changes in psql's describe output for partitioned tables.

Thanks,
Amit
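
As a quick illustration of what the patch enables (the table names are made
up), a range bound can now be any variable-free expression of the partition
key column's type:

    CREATE TABLE measurements (logdate date) PARTITION BY RANGE (logdate);
    CREATE TABLE measurements_y2018 PARTITION OF measurements
        FOR VALUES FROM ('2018-01-01'::date) TO ('2018-01-01'::date + 365);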
>From 999aa53b459a6fc0fe899e613406f0e0035aca94 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 6 Jul 2018 14:05:22 +0900
Subject: [PATCH v7] Allow generalized expression syntax for partition bounds

Authors: Kyotaro Horiguchi, Tom Lane, Amit Langote
---
 doc/src/sgml/ref/alter_table.sgml  |   6 +-
 doc/src/sgml/ref/create_table.sgml |  16 +--
 src/backend/commands/tablecmds.c   |   9 ++
 src/backend/optimizer/util/clauses.c   |   4 +-
 src/backend/parser/gram.y  |  60 +
 src/backend/parser/parse_agg.c |  10 ++
 src/backend/parser/parse_expr.c|   5 +
 src/backend/parser/parse_func.c|   3 +
 src/backend/parser/parse_utilcmd.c | 201 ++---
 src/include/optimizer/clauses.h|   2 +
 src/include/parser/parse_node.h|   1 +
 src/include/utils/partcache.h  |   6 +
 src/test/regress/expected/create_table.out |  49 ---
 src/test/regress/sql/create_table.sql  |  16 ++-
 14 files changed, 222 insertions(+), 166 deletions(-)

diff --git a/doc/src/sgml/ref/alter_table.sgml 
b/doc/src/sgml/ref/alter_table.sgml
index be1647937d..b9c5f7c384 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -86,9 +86,9 @@ ALTER TABLE [ IF EXISTS ] name
 
 and partition_bound_spec 
is:
 
-IN ( { numeric_literal | 
string_literal | TRUE | FALSE | 
NULL } [, ...] ) |
-FROM ( { numeric_literal | 
string_literal | TRUE | FALSE | 
MINVALUE | MAXVALUE } [, ...] )
-  TO ( { numeric_literal | 
string_literal | TRUE | FALSE | 
MINVALUE | MAXVALUE } [, ...] ) |
+IN ( partition_bound_expr [, ...] 
) |
+FROM ( { partition_bound_expr | 
MINVALUE | MAXVALUE } [, ...] )
+  TO ( { partition_bound_expr | 
MINVALUE | MAXVALUE } [, ...] ) |
 WITH ( MODULUS numeric_literal, 
REMAINDER numeric_literal )
 
 and column_constraint 
is:
diff --git a/doc/src/sgml/ref/create_table.sgml 
b/doc/src/sgml/ref/create_table.sgml
index 50d5597002..7297f751b7 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -86,9 +86,9 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] 
TABLE [ IF NOT EXI
 
 and partition_bound_spec 
is:
 
-IN ( { numeric_literal | 
string_literal | TRUE | FALSE | 
NULL } [, ...] ) |
-FROM ( { numeric_literal | 
string_literal | TRUE | FALSE | 
MINVALUE | MAXVALUE } [, ...] )
-  TO ( { numeric_literal | 
string_literal | TRUE | FALSE | 
MINVALUE | MAXVALUE } [, ...] ) |
+IN ( partition_bound_expr [, ...] 
) |
+FROM ( { partition_bound_expr | 
MINVALUE | MAXVALUE } [, ...] )
+  TO ( { partition_bound_expr | 
MINVALUE | MAXVALUE } [, ...] ) |
 WITH ( MODULUS numeric_literal, 
REMAINDER numeric_literal )
 
 index_parameters in 
UNIQUE, PRIMARY KEY, and 
EXCLUDE constraints are:
@@ -412,12 +412,10 @@ WITH ( MODULUS numeric_literal, REM
  
 
  
-  Each of the values specified in
-  the partition_bound_spec is
-  a literal, NULL, MINVALUE, or
-  MAXVALUE.  Each literal value must be either a
-  numeric constant that is coercible to the corresponding partition key
-  column's type, or a string literal that is valid input for that type.
+  partition_bound_expr is
+  any variable-free expression (subqueries, window functions, aggregate
+  functions, and set-returning functions are not allowed).  Its data type
+  must match the data type of the corresponding partition key column.
  
 
  
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 843ed48aa7..32a094dcf6 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -767,6 +767,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
defaultPartOid;
Relationparent,
defaultRel = NULL;
+   RangeTblEntry *rte;
 
/* Already have strong enough lock on the parent */
parent = heap_open(parentId, NoLock);
@@ -809,6 +810,14 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
pstate = make_parsestate(NULL);
pstate->p_sourcetext = queryString;
 
+   /*
+* Add an RTE containing this relation, so that transformExpr 
called
+* on partition bound expressions is able to report errors 
using a
+* proper context.
+*/
+   rte = addRangeTableEntryForRelation(pstate, rel, 
AccessShareLock,
+   
NULL, false, false);
+   addRTEtoQuery(pstate, rte, false, true, true);
b

Re: csv format for psql

2018-11-25 Thread Corey Huinker
On Sun, Nov 25, 2018 at 11:23 PM Tom Lane  wrote:

> Corey Huinker  writes:
> > Could we have another output type called "separated" that uses the
> existing
> > --fieldsep / --recordsep?
>
> Uh, what's the difference from the existing unaligned format?
>

No footer, and I guess we'd want to escape instances of fieldsep and
recordsep found in the data; with such an escaping option, unaligned
would work fine.


Re: csv format for psql

2018-11-25 Thread Pavel Stehule
po 26. 11. 2018 v 5:36 odesílatel Andrew Gierth 
napsal:

> > "Tom" == Tom Lane  writes:
>
>  Tom> Or we could kill both issues by hard-wiring the separator as ','.
>
> Using ';' for the delimiter isn't all that rare.
>

';' is the default for CSV produced by MS Excel in the Czech localization - so
for some countries, like the Czech Republic, it is common.

Regards

Pavel

>
> But I don't see any reason to allow multibyte or non-ascii characters or
> arbitrary strings.
>
> --
> Andrew (irc:RhodiumToad)
>
>


Re: 64-bit hash function for hstore and citext data type

2018-11-25 Thread amul sul
Thanks Tom for enhancing & committing these patches.

Regards,
Amul
On Sat, Nov 24, 2018 at 12:15 AM Tom Lane  wrote:
>
> Andrew Gierth  writes:
> > "Tomas" == Tomas Vondra  writes:
> >  Tomas> The important question is - can there be two different encodings
> >  Tomas> for the same hstore value?
>
> > I was going to say "no", but in fact on closer examination there is an
> > edge case caused by the fact that hstoreUpgrade allows an _empty_ hstore
> > from pg_upgraded 8.4 data through without modifying it. (There's also a
> > vanishingly unlikely case involving the pgfoundry release of hstore-new.)
>
> Ugh.  Still, that's a pre-existing problem in hstore_hash, and so I don't
> think it's a blocker for this patch.
>
> > I'm inclined to fix this in hstoreUpgrade rather than complicate
> > hstore_hash with historical trivia. Also there have been no field
> > complaints - I guess it's unlikely that there is much pg 8.4 hstore data
> > in the wild that anyone wants to hash.
>
> Changing hstoreUpgrade at this point seems like wasted/misguided effort.
> I don't doubt that there was a lot of 8.4 hstore data out there, but how
> much remains unmigrated?  If we're going to take this seriously at all,
> my inclination would be to change hstore_hash[_extended] to test for
> the empty-hstore case and force the same value it gets for such an
> hstore made today.
>
> In the meantime, I went ahead and pushed these patches.  The only
> non-cosmetic changes I made were to remove the changes in
> citext--unpackaged--1.0.sql/hstore--unpackaged--1.0.sql; those
> were wrong, because the point of those files is to migrate pre-9.1
> databases into the extension system.  Such a database would not
> contain an extended hash function, and so adding an ALTER EXTENSION
> command for that function would cause the script to fail.
>
> regards, tom lane
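
A quick smoke test of the committed functions - assuming the SQL-level names
hstore_hash_extended and citext_hash_extended from the patches - looks like:

    SELECT hstore_hash_extended('a=>1'::hstore, 0);
    SELECT citext_hash_extended('PostgreSQL'::citext, 0);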



Re: Inadequate executor locking of indexes

2018-11-25 Thread Amit Langote
Sorry for jumping in late on this.

On 2018/11/24 1:25, Tom Lane wrote:
> David Rowley  writes:
> Thinking more about this, the problem I noted previously about two of
> these solutions not working if the index scan node is not physically
> underneath the ModifyTable node actually applies to all three :-(.
> It's a slightly different issue for #2, namely that what we risk is
> first taking AccessShareLock and then upgrading to RowExclusiveLock.
> Since there are places (not many) that take ShareLock on indexes,
> this would pose a deadlock risk.
> 
> Now, after more thought, I believe that it's probably impossible
> to have a plan tree in which ExecRelationIsTargetRelation would
> return true at an indexscan node that's not underneath the ModifyTable
> node.  What *is* possible is that we have such an indexscan on a
> different RTE for the same table.  I constructed this admittedly
> artificial example in the regression database:
> 
> # explain with x1 as (select * from tenk1 t1 where unique1 = 42),
> x2 as (update tenk1 t2 set two = 2 where unique2 = 11 returning *)
> select * from x1,x2 where x1.ten = x2.ten;
>   QUERY PLAN  
> 
> --
>  Nested Loop  (cost=16.61..16.66 rows=1 width=488)
>Join Filter: (x1.ten = x2.ten)
>CTE x1
>  ->  Index Scan using tenk1_unique1 on tenk1 t1  (cost=0.29..8.30 rows=1 
> width=244)
>Index Cond: (unique1 = 42)
>CTE x2
>  ->  Update on tenk1 t2  (cost=0.29..8.30 rows=1 width=250)
>->  Index Scan using tenk1_unique2 on tenk1 t2  (cost=0.29..8.30 
> rows=1 width=250)
>  Index Cond: (unique2 = 11)
>->  CTE Scan on x1  (cost=0.00..0.02 rows=1 width=244)
>->  CTE Scan on x2  (cost=0.00..0.02 rows=1 width=244)
> (11 rows)
> 
> Because the CTEs will be initialized in order, this presents a case
> where the lock-upgrade hazard exists today: the x1 indexscan will
> open tenk1_unique1 with AccessShareLock and then x2's ModifyTable
> node will upgrade that to RowExclusiveLock.  None of the proposed
> fixes improve this.

Provided we want to keep ExecRelationIsTargetRelation going forward, how
about modifying it such that we compare the scan relation's OID with that
of the result relations, not the RT index?  Like in the attached.

> I'm beginning to think that postponing target-index locking to
> ExecInitModifyTable was a damfool idea and we should undo it.

+1

Also as already proposed, InitPlan should lock result relation indexes
even for DELETEs.

>> For partitioned
>> table updates, there may be many result relations which can cause
>> ExecRelationIsTargetRelation() to become very slow, in such a case any
>> additional redundant lock would be cheap by comparison.
> 
> Yeah, it'd be nice to get rid of the need for that.

As David mentioned elsewhere, can we add the ResultRelInfos to a hash
table if there are at least a certain number of result relations?
3f2393edefa did that for UPDATE tuple routing efficiency.

Thanks,
Amit
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 2a47abc02e..b9ee875267 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -664,12 +664,17 @@ bool
 ExecRelationIsTargetRelation(EState *estate, Index scanrelid)
 {
ResultRelInfo *resultRelInfos;
+   Oid scanreloid;
int i;
 
+   Assert(estate->es_range_table_array != NULL &&
+  estate->es_range_table_array[scanrelid - 1] != NULL);
+   scanreloid = estate->es_range_table_array[scanrelid - 1]->relid;
+
resultRelInfos = estate->es_result_relations;
for (i = 0; i < estate->es_num_result_relations; i++)
{
-   if (resultRelInfos[i].ri_RangeTableIndex == scanrelid)
+   if (scanreloid == resultRelInfos[i].ri_RelationDesc->rd_id)
return true;
}
return false;


Re: csv format for psql

2018-11-25 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> Or we could kill both issues by hard-wiring the separator as ','.

Using ';' for the delimiter isn't all that rare.

But I don't see any reason to allow multibyte or non-ascii characters or
arbitrary strings.
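
For comparison, COPY's CSV mode already takes that position: it accepts an
alternative delimiter, but only a single one-byte character, e.g.:

    COPY tab TO STDOUT WITH (FORMAT csv, DELIMITER ';');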

-- 
Andrew (irc:RhodiumToad)



Re: csv format for psql

2018-11-25 Thread Tom Lane
Corey Huinker  writes:
> Could we have another output type called "separated" that uses the existing
> --fieldsep / --recordsep?

Uh, what's the difference from the existing unaligned format?

regards, tom lane



Re: tab-completion debug print

2018-11-25 Thread Tom Lane
Kyotaro HORIGUCHI  writes:
>> On Fri, Nov 23, 2018 at 04:32:31PM -0500, Tom Lane wrote:
>>> Hm.  I can see the value of instrumenting tab-complete when you're trying
>>> to debug why it did something, but this output format seems pretty terse
>>> and unreadable.

> The reason for the minimal output was it won't disturb console so
> much when stderr is not redirected. It could be richer if we
> premise redirection.

It seems to me this behavior would be completely unusable if it's not
redirected; I don't understand why you're even considering that as an
interesting option.  libreadline is necessarily active when we're doing
this, and it will get very confused (or at least produce a very confused
display) if anything else is emitting text onto the active console line.
Maybe somebody who never makes any typing errors at all and doesn't really
need to see what they typed could use it like that, but I for one would
find it quite useless.

In fact, I was thinking of proposing that the output shouldn't go
to stderr in the first place.  If you do it like that, then you're
also confusing this info with query error output, which does little
for usability either.

Speaking of query error output, it wouldn't be a bad thing if this
mode could show any errors from the tab-completion queries.
I suppose people look into the postmaster log for that right now;
but if this were all going to some separate log file, I'd vote for
showing the constructed queries and their results or errors.

regards, tom lane



Re: csv format for psql

2018-11-25 Thread Corey Huinker
>
>
> Or we could kill both issues by hard-wiring the separator as ','.


+1

I've never encountered a situation where a customer wanted a custom
delimiter AND quoted strings. So either they wanted pure CSV or a custom
TSV.

Could we have another output type called "separated" that uses the existing
--fieldsep / --recordsep? Word will get out that csv is faster, but we'd
still have the flexibility if somebody really wanted it.


Re: tab-completion debug print

2018-11-25 Thread Kyotaro HORIGUCHI
Thank you for the comments.

At Sun, 25 Nov 2018 01:17:27 +0100, David Fetter  wrote in 
<20181125001727.gm...@fetter.org>
> On Fri, Nov 23, 2018 at 04:32:31PM -0500, Tom Lane wrote:
> > Hm.  I can see the value of instrumenting tab-complete when you're trying
> > to debug why it did something, but this output format seems pretty terse
> > and unreadable.  Can we get it to print the completion text as well?
> > I'm imagining something more like
> > 
> > 1414: "er "
> > 1435: ""
> > 1435: "ab"
> > 1435: ""
> > 1431: ""
> > 
> > Perhaps there's room as well to print the context that the match looked
> > at:
> > 
> > 1414: "alt" -> "er "
> > 1435: "alter " -> ""
> > 1435: "alter t" -> "ab"
> > 
> > etc.
> > 
> > regards, tom lane
> 
> Is this something along the lines of what you had in mind?

The reason for the minimal output was that it won't disturb the console so
much when stderr is not redirected. It could be richer if we
assume redirection. Since every debug line is followed by a
line-feed, we don't need the brackets. Since the result is a set,
David's proposal to enclose the set in parentheses looks nice.
Also, the result can contain matches containing word breaks (or even
an empty string), so quoting makes every candidate string more
distinctive.

Finally, the attached shows the following debugging output.


INPUT: alter u(ser) 
OUTPUT:
1445: "u" -> ("user", "user", "user mapping for")
1445: "user" -> ("user", "user", "user mapping for")
3630: "" -> ("", "horiguti", "MAPPING FOR", "pg_execute_server_program", 
"pg_monitor", "pg_read_all_settings", "pg_read_all_stats", 
"pg_read_server_files", "pg_signal_backend", "pg_stat_scan_tables", 
"pg_write_server_files", "usr1", "usr2")

INPUT: create function 
OUTPUT:
3635: "" -> ("", "pg_ndistinct_out", "path_recv", "int4eq", 
"spg_quad_picksplit", "tsvectorsend", "float8_var_pop", "opaque_in", 
"interval_le", "macaddr_cmp", "range_gt..(17272 chars)..nge_in")

The long lines shouldn't be a problem since this is for
debugging purposes. Of course we can limit the length easily (by
number of elements).

>+  for (int i = 0; list && list[i]; ++i)

# Are we using C99 for-loop-variable-declarations?

An arguable change in this version is enclosing empty result
lists with brackets, since it is just an intermediate state for
us.

[1431: "sael" -> ()]
3655: "sael" -> ("")

Showing completion context makes things a bit bothersome, since
complete_from_variables needs to take two additional parameters.
It is already shown in the console conversation, so I'm not sure it's
worth the trouble.

1872: (...table t1 alter [COn]) -> ("CONSTRAINT")

To be friendly with the CI, the second patch is attached as the
difference from the first patch.


Note: This is not registered in this CF.

- v3-0001-Psql-completion-debug-print-feature.patch
  W/O context version

- v3-0002-Add-context-information-in-tab-completion-debug-prin.patch
  W context version

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 573d8352331d0fd2ee0b5f3147cfcdcf7f1d4afc Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Mon, 26 Nov 2018 12:05:35 +0900
Subject: [PATCH 1/2] Psql completion debug print feature

Sometimes we find it bothersome to identify the code that actually
made a tab-completion. With this patch, psql will show the line number
and completion result on stderr when TABCOMPLETION_DEBUG is defined at
compile time.
---
 src/bin/psql/tab-complete.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 9dbd555166..00aa730050 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -60,6 +60,40 @@ extern char *filename_completion_function();
 #define completion_matches rl_completion_matches
 #endif
 
+/*
+ * By enabling the following definition every completion attempt emits to
+ * stderr the details of every completion prefixed by the source line number
+ * where it is made. You can isolate them from console interaction by
+ * redirecting stderr into a file.
+ */
+#ifdef TABCOMPLETION_DEBUG
+#ifdef HAVE_RL_COMPLETION_MATCHES
+#define org_completion_matches rl_completion_matches
+#else
+#define org_completion_matches completion_matches
+#endif
+
+#undef completion_matches
+#define completion_matches(t, f) completion_debug(__LINE__, (t), (f))
+
+static char **completion_debug(int line,
+			   const char *text, rl_compentry_func_t *func)
+{
+	char **list = org_completion_matches(text, func);
+
+	/*
+	 * enclose empty list with brackets since it is an intermediate state
+	 * which is immediately followed by a non-empty list.
+	 */
+	fprintf(stderr, "%s%d: \"%s\" -> (", list ? "" : "[", line, text);
+	for (int i = 0; list && list[i]; ++i)
+		fprintf(stderr, "%s\"%s\"", i ? ", " : "", list[i]);			
+	fprintf(stderr, ")%s\n", list ? "": "]");
+
+	return list;
+}
+#endif
+
 /* word break characters */
 #define WORD_BREAKS		"\t\n@$><=;|&{() "
 
-- 
2.16.3

>From 4

Broken and rotten TAP tests in contrib/bloom/

2018-11-25 Thread Michael Paquier
Hi Alexander & Teodor,

It happens that the buildfarm machines do not much like the set of TAP
tests present in contrib/bloom/:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2018-11-25%2023%3A59%3A03

Here is an extract of the failure:
#   Failed test 'initial: query result matches'
#   at t/001_wal.pl line 38.
#  got: '0|d
# 0|9
# 0|3
# 0|d

These currently cannot run within the buildfarm as a custom Makefile
target is used.  And likely many developers don't run them either.

Could you look at what is wrong in them?  Thanks,
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: Add PGXS options to control TAP and isolation tests

2018-11-25 Thread Michael Paquier
On Mon, Nov 26, 2018 at 09:33:51AM +0900, Michael Paquier wrote:
> Several buildfarm members complain about this commit, including dory and
> longfin:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dory&dt=2018-11-26%2000%3A00%3A26
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2018-11-25%2023%3A59%3A03
> 
> The error comes from the TAP tests of contrib/bloom/, which we have
> never run across the buildfarm until now:
> #   Failed test 'initial: query result matches'
> #   at t/001_wal.pl line 38.
> #  got: '0|d
> # 0|9
> # 0|3
> 
> I am not sure what to think of that yet, so as a short-term fix and to
> keep the buildfarm green, I am going to disable the TAP tests for bloom,
> but something is busted there, either in the code or in the tests.

I have been fighting with the buildfarm a bit this morning, but this is
proving to have a larger subset of issues than I anticipated first.
Here is the collection of my notes:
- MSVC scripts assume that REGRESS_OPTS can only use top_builddir.  Some
test suites actually finish by using top_srcdir, like pg_stat_statements,
which causes the regression tests to never run on Windows!
- Trying to enforce top_builddir does not work either when using VPATH,
as this is not recognized properly.
- TAP tests of bloom are unstable on various platforms, causing various
failures.

As a result, I have reverted all the recent changes which introduced
those new PGXS options.  I am going to spawn a set of threads to address
all the underlying issues first, as those deserve more discussion.  There
is a collection of tests not actually running on Windows.
--
Michael


signature.asc
Description: PGP signature


Tab completion for ALTER INDEX|TABLE ALTER COLUMN SET STATISTICS

2018-11-25 Thread Tatsuro Yamada

Hi,

The attached patches are as follows:

* tab_completion_alter_index_set_statistics.patch
- Add column name completion after ALTER COLUMN
- Avoid schema completion after SET STATISTICS

* fix_manual_of_alter_index.patch
- The ALTER INDEX ... ALTER COLUMN syntax accepts not only a column
  number but also a column name, so I fixed the manual (see the
  example after this list).

* tab_completion_alter_table_set_statistics.patch
- Avoid schema completion after SET STATISTICS
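
As a quick illustration of the syntax the manual fix covers (per the patch
description, both the column number and the expression column's generated
name are accepted; here the generated name would be "lower"):

    CREATE INDEX t_lower_idx ON t (lower(name));
    ALTER INDEX t_lower_idx ALTER COLUMN 1 SET STATISTICS 1000;
    ALTER INDEX t_lower_idx ALTER COLUMN lower SET STATISTICS 1000;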


Regards,
Tatsuro Yamada

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 9dbd555166..31f4b7d862 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1554,9 +1554,18 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("PARTITION");
 	else if (Matches("ALTER", "INDEX", MatchAny, "ATTACH", "PARTITION"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
-	/* ALTER INDEX  ALTER COLUMN  */
-	else if (Matches("ALTER", "INDEX", MatchAny, "ALTER", "COLUMN", MatchAny))
+	/* ALTER INDEX  ALTER [COLUMN] */
+	else if (Matches("ALTER", "INDEX", MatchAny, "ALTER", "COLUMN") ||
+			 Matches("ALTER", "INDEX", MatchAny, "ALTER"))
+		COMPLETE_WITH_ATTR(prev3_wd, "");
+	/* ALTER INDEX  ALTER [COLUMN]  */
+	else if (Matches("ALTER", "INDEX", MatchAny, "ALTER", "COLUMN", MatchAny) ||
+			 Matches("ALTER", "INDEX", MatchAny, "ALTER", MatchAny))
 		COMPLETE_WITH("SET STATISTICS");
+	/* ALTER INDEX  ALTER COLUMN  SET STATISTICS */
+	else if (HeadMatches("ALTER", "INDEX") && TailMatches("SET", "STATISTICS")){
+		/* We don't complete after "SET STATISTICS" */
+	}
 	/* ALTER INDEX  SET */
 	else if (Matches("ALTER", "INDEX", MatchAny, "SET"))
 		COMPLETE_WITH("(", "TABLESPACE");
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 9dbd555166..2f041350fa 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1858,6 +1858,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny, "SET", "(") ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny, "SET", "("))
 		COMPLETE_WITH("n_distinct", "n_distinct_inherited");
+	/* ALTER TABLE ALTER [COLUMN]  SET STATISTICS */
+	else if (HeadMatches("ALTER", "TABLE") && TailMatches("SET", "STATISTICS")){
+		/* We don't complete after "SET STATISTICS" */
+	}
 	/* ALTER TABLE ALTER [COLUMN]  SET STORAGE */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny, "SET", "STORAGE") ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny, "SET", "STORAGE"))
diff --git a/doc/src/sgml/ref/alter_index.sgml b/doc/src/sgml/ref/alter_index.sgml
index 6d34dbb74e..1b5b1f7ef1 100644
--- a/doc/src/sgml/ref/alter_index.sgml
+++ b/doc/src/sgml/ref/alter_index.sgml
@@ -27,7 +27,7 @@ ALTER INDEX name ATTACH PARTITION <
 ALTER INDEX name DEPENDS ON EXTENSION extension_name
 ALTER INDEX [ IF EXISTS ] name SET ( storage_parameter = value [, ... ] )
 ALTER INDEX [ IF EXISTS ] name RESET ( storage_parameter [, ... ] )
-ALTER INDEX [ IF EXISTS ] name ALTER [ COLUMN ] column_number
+ALTER INDEX [ IF EXISTS ] name ALTER [ COLUMN ] column_number | column_name
 SET STATISTICS integer
 ALTER INDEX ALL IN TABLESPACE name [ OWNED BY role_name [, ... ] ]
 SET TABLESPACE new_tablespace [ NOWAIT ]
@@ -137,14 +137,12 @@ ALTER INDEX ALL IN TABLESPACE name

 

-ALTER [ COLUMN ] column_number SET STATISTICS integer
+ALTER [ COLUMN ] column_number | column_name SET STATISTICS integer
 
  
   This form sets the per-column statistics-gathering target for
   subsequent  operations, though can
   be used only on index columns that are defined as an expression.
-  Since expressions lack a unique name, we refer to them using the
-  ordinal number of the index column.
   The target can be set in the range 0 to 1; alternatively, set it
   to -1 to revert to using the system default statistics
   target ().
@@ -175,6 +173,15 @@ ALTER INDEX ALL IN TABLESPACE name
   
  
 
+ 
+  column_name
+  
+   
+The name of an existing index column.
+   
+  
+ 
+
  
   column_number
   


Re: shared-memory based stats collector

2018-11-25 Thread Tomas Vondra
Hi,

Unfortunately, the patch does not apply anymore - it seems it got broken
by the changes to signal handling and/or removal of magic OIDs :-(

I've done a review and testing when applied on top of 10074651e335.

Firstly, the testing - I was wondering if the patch has some performance
impact, so I've done some testing with a read-only workload on a large
number of tables (1k, 10k and 100k) while concurrently selecting data
from the pg_stat_* catalogs.

In one case both workloads were running against the same database; in
another there were two separate databases (and the selects from the stat
catalogs were running against an "empty" database with no user tables).

In both cases there were 8 clients doing selects from the user tables,
and 4 clients accessing the pg_stat_* catalogs.

For the "single database" case the results looks like this (this is just
patched / master throughput):

# of tables xact stats
--
1000  97.71%98.76%
1100.38%97.97%
10   100.10%98.50%

xact is the throughput of the user workload (selects from the large number
of tables) and stats is the throughput of the selects from system catalogs.

So pretty much no difference - 2% is within noise on this machine.

On two separate databases the results are a bit more interesting:

# of tables      xact       stats
---------------------------------
1000           100.49%     80.38%
10000          103.18%     80.28%
100000         100.85%     81.95%

For the main workload there's pretty much no difference, but for selects
from the stats catalogs there's ~20% drop in throughput. In absolute
numbers this means drop from ~670tps to ~550tps. I haven't investigated
this, but I suppose this is due to dshash seqscan being more expensive
than reading the data from file.

I don't think any of this is an issue in practice, though. The important
thing is that there's no measurable impact on the regular workload.

Now, a couple of comments regarding individual parts of the patch.


0001-0003
-

I do think 0001 - 0003 are ready, with some minor cosmetic issues:

1) I'd rephrase the last part of dshash_seq_init comment more like this:

* If consistent is set for dshash_seq_init, all the hash table
* partitions are locked in the requested mode (as determined by the
* exclusive flag), and the locks are held until the end of the scan.
* Otherwise the partition locks are acquired and released as needed
* during the scan (up to two partitions may be locked at the same time).

Maybe it should briefly explain what the consistency guarantees are (and
aren't), but considering we're not materially changing the existing
behavior that's probably not really necessary.

2) I think the dshash_find_extended() signature should be more like
dshash_find(), i.e. just appending parameters instead of moving them
around unnecessarily. Perhaps we should add

Assert(nowait || !lock_acquired);

Using nowait=false with lock_acquired!=NULL does not seem sensible.
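
For illustration, a sketch of the suggested shape (parameter order modeled on
dshash_find(); names here are assumptions, not the patch's actual code):

    /* dshash_find()'s parameters stay in place; the new ones are appended */
    extern void *dshash_find_extended(dshash_table *hash_table, const void *key,
                                      bool exclusive, bool nowait,
                                      bool *lock_acquired);

    /* and inside the implementation, guard the insensible combination */
    Assert(nowait || !lock_acquired);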

3) I suppose this comment in postmaster.c is just copy-paste:

-#define BACKEND_TYPE_ARCHIVER  0x0010  /* bgworker process */
+#define BACKEND_TYPE_ARCHIVER  0x0010  /* archiver process */

I wonder why the archiver wasn't a regular auxiliary process already? It
seems like a fairly natural thing, so why not?


0004 (+0005 and 0007)
-

This seems fine, but I have my doubts about two changes - the removal of
stats_temp_directory and the IDLE_STATS_UPDATE_TIMEOUT thingy.

There's a couple of issues with the stats_temp_directory. Firstly, I
don't understand why it's spread over multiple parts of the patch. The
GUC is removed in 0004, the underlying variable is removed in 0005 and
then the docs are updated in 0007. If we really want to do this, it
should happen in a single patch.

But the main question is - do we really want to do that? I understand
this directory was meant for the stats data we're moving to shared
memory, so removing it seems natural. But clearly it's used by
pg_stat_statements - 0005 fixes that, of course, but I wonder if there
are other extensions using it to store files?

It's not just about how intensive I/O to those files is, but this also
means the files will now be included in backups / pg_rewind, and maybe
that's not really desirable?

Maybe it's fine but I'm not quite convinced about it ...

I'm not sure I understand what IDLE_STATS_UPDATE_TIMEOUT does. You've
described it as

   This adds a new timeout IDLE_STATS_UPDATE_TIMEOUT. This works
   similarly to IDLE_IN_TRANSACTION_SESSION_TIMEOUT. It fires in
   at most PGSTAT_STAT_MIN_INTERVAL(500)ms to clean up pending
   statistics update.

but I'm not sure what pending updates you mean? Aren't we updating
the stats at the end of each transaction? At least that's what we've
been doing before, so maybe this patch changes that?


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Regarding performance regression on specific query

2018-11-25 Thread Jung, Jinho

Thanks for the comment.


We also have several performance regression cases that we found from the TPC-C 
benchmark. Since those queries were not executed on empty relations, they will 
be more interesting.


We will report to the pgsql-performance mailing list next time.


Jinho Jung



From: Tom Lane 
Sent: Saturday, November 24, 2018 3:32:41 PM
To: Amit Langote
Cc: Jung, Jinho; pgsql-hack...@postgresql.org
Subject: Re: Regarding performance regression on specific query

Amit Langote  writes:
> On 2018/11/20 2:49, Jung, Jinho wrote:
>>  [ assorted queries ]

> I noticed that these two are fixed by running ANALYZE in the database in
> which these queries are run.

That didn't help much for me.  What did help was increasing
join_collapse_limit and from_collapse_limit to not limit the
join search space --- on queries with as many input relations
as these, you're really at the mercy of whether the given query
structure represents a good join order if you don't.
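
For example, something like the following before running the queries; 32 is
arbitrary, it just needs to exceed the number of relations being joined:

    SET from_collapse_limit = 32;
    SET join_collapse_limit = 32;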

In general I can't get even a little bit excited about the quality of the
plans selected for these examples, as they all involve made-up restriction
and join clauses that the planner isn't going to have the slightest clue
about.  The observations boil down to "9.4 made one set of arbitrary plan
choices, while v10 made a different set of arbitrary plan choices, and on
these particular examples 9.4 got lucky and 10 didn't".

Possibly also worth noting is that running these in an empty database
is in itself kind of a worst case, because many of the tables are empty
to start with (or the restriction/join clauses pass no rows), and so
the fastest runtime tends to go to plans of the form "nestloop with
empty relation on the outside and all the expensive stuff on the
inside".  (Observe all the "(never executed)" notations in the EXPLAIN
output.)  This kind of plan wins only when the outer rel is actually
empty, otherwise it can easily lose big, and therefore PG's planner is
intentionally designed to discount the case entirely.  We never believe
that a relation is empty, unless we can mathematically prove that, and
our cost estimates are never made with an eye to exploiting such cases.
This contributes a lot to the random-chance nature of which plan is
actually fastest; the planner isn't expecting "(never executed)" to
happen and doesn't prefer plans that will win if it does happen.

regards, tom lane


Re: csv format for psql

2018-11-25 Thread Tom Lane
Fabien COELHO  writes:
> Basically the proposed patch addresses a simple and convenient use case 
> which is neither addressed by \copy nor COPY. The fact that more options 
> are available with these commands does not preclude its usefulness as is.

Yeah, I agree that this option is useful independently of whether COPY
provides something similar.  I think the killer argument is that right
now, psql-ref.sgml repeatedly suggests that unaligned mode with fieldsep
',' is a reasonable way to produce comma-separated output for consumption
by other programs.  That's like handing our users a loaded foot-gun.
And, in fact, right now *none* of psql's table output formats is both
unambiguous and reasonably simple/popular to use.  So the astonishing
thing about this patch, IMO, is that we didn't do it a decade ago.

I went through the documentation to improve that point, and did some other
cosmetic cleanup including rebasing up to HEAD, and got the attached.

I think there are two remaining points to settle:

1. Are we limiting the separator to be a single-byte character or not?
If we are, the code could be made simpler and faster by working with a
"char" value instead of a string.  If we're not, then Michael's change
needs to be undone (but I didn't do that here).

I feel that if we allow multi-byte characters here, we might as well
take the training wheels off and just say you can use any separator
string you want, as long as it doesn't contain double quote, \r, or \n.
Most programs reading a file are not going to perceive a difference
between separating fields with a single multibyte character and multiple
single-byte characters; either they can cope or they can't.  A fair
number of them are going to be in the latter category.  So we can either
say "do what you wish, it's your problem whether anything can read the
result" or "we're going to restrict you to something that (perhaps) is
more widely readable".  I'm a bit inclined to the former viewpoint.
If we were in the business of being restrictive, why would we allow
the field separator to be changed at all?  The name of the format is
*comma* separated values, not something else.

2. Speaking of the field separator, I'm pretty desperately unhappy
with the choice of "fieldsep_csv" as the parameter name.  The trouble
with that is that it encourages sticking "fieldsep_csv" in between
"fieldsep" and "fieldsep_zero", because alphabet.  But "fieldsep" and
"fieldsep_zero" are *intimately* tied together, in fact it's only a
dubious implementation choice that made them separate parameters at all.
It does not make any semantic sense to stick other vaguely-related
parameters in between, neither in the documentation nor in \pset output.

We could avoid this self-inflicted confusion by choosing a different
parameter name.  I'd be good with "csv_fieldsep" or "csvfieldsep".

Or we could kill both issues by hard-wiring the separator as ','.

Thoughts?

regards, tom lane

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6e6d0f4..d53451b 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** PostgreSQL documentation
*** 68,75 


Switches to unaligned output mode. (The default output mode is
!   otherwise aligned.)  This is equivalent to \pset format
!   unaligned.


  
--- 68,75 


Switches to unaligned output mode. (The default output mode is
!   aligned.)  This is equivalent to
!   \pset format unaligned.


  
*** EOF
*** 152,157 
--- 152,167 
  
  
  
+   --csv
+   
+   
+   Switches to CSV (Comma-Separated Values) output
+   mode.  This is equivalent to \pset format csv.
+   
+   
+ 
+ 
+ 
-d dbname
--dbname=dbname

*** EOF
*** 270,277 
--html


!   Turn on HTML tabular output. This is
!   equivalent to \pset format html or the
\H command.


--- 280,287 
--html


!   Switches to HTML output mode.  This is
!   equivalent to \pset format html or the
\H command.


*** lo_import 152801
*** 2547,2554 


Specifies the field separator to be used in unaligned output
!   format. That way one can create, for example, tab- or
!   comma-separated output, which other programs might prefer. To
set a tab as field separator, type \pset fieldsep
'\t'. The default field separator is
'|' (a vertical bar).
--- 2557,2564 


Specifies the field separator to be used in unaligned output
!   format. That way one can create, for example, tab-separated
!   output, which other programs might prefer. To
set a tab as field separator, type \pset fieldsep
'\t'. The default field separator is
'|' (a vertical bar).

Re: SSL tests failing with "ee key too small" error on Debian SID

2018-11-25 Thread Michael Paquier
On Mon, Nov 26, 2018 at 01:17:24PM +1300, Thomas Munro wrote:
> On Wed, Oct 3, 2018 at 1:32 PM Michael Paquier  wrote:
>> I find your suggestion quite tempting at the end instead of having to
>> tweak the global system's configuration.  That should normally work with
>> any configuration.  This would require regenerating the certs in the
>> tree.  Any thoughts from others?
> 
> I don't really have opinion here, but I wanted to point out that
> src/test/ldap/t/001_auth.pl creates new certs on the fly, which is a
> bit inconsistent with the SSL test's approach of certs-in-the-tree.
> Which is better?

When going up to 2k, it takes longer to generate the keys than to run
the tests, so keeping them in the tree looks like a pretty good gain to
me.
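
For reference, regenerating would be a one-off along these lines (the file
names are illustrative, not the actual ones in src/test/ssl):

    openssl req -new -newkey rsa:2048 -nodes -subj "/CN=test-server" \
      -keyout server.key -out server.csr
    openssl x509 -req -in server.csr -CA root_ca.crt -CAkey root_ca.key \
      -CAcreateserial -sha256 -days 10000 -out server.crt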
--
Michael


signature.asc
Description: PGP signature


Re: Copy function for logical replication slots

2018-11-25 Thread Masahiko Sawada
On Sun, Nov 25, 2018 at 12:27 AM Petr Jelinek
 wrote:
>
> Hi,
>
> On 31/08/2018 07:03, Masahiko Sawada wrote:
> >
> > Attached a new version patch incorporated the all comments I got.
> >
>
> This looks pretty reasonable.

Thank you for looking at this patch.

>
> I am personally not big fan of the C wrappers for overloaded functions,
> but that's what we need to do for opr_sanity to pass so I guess we'll
> have to use them.
>
> The more serious thing is:
>
> > + if (MyReplicationSlot)
> > + ReplicationSlotRelease();
> > +
> > + /* Release the saved slot if exist while preventing double releasing 
> > */
> > + if (savedslot && savedslot != MyReplicationSlot)
>
> This won't work as intended as the ReplicationSlotRelease() will set
> MyReplicationSlot to NULL, you might need to set aside MyReplicationSlot
> to yet another temp variable inside this function prior to releasing it.
>

You're right. I've fixed it by checking if we need to release the
saved slot before releasing the origin slot. Is that right?
Attached an updated patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


v7-0001-Copy-function-for-logical-and-physical-replicatio.patch
Description: Binary data


Re: pgsql: Add PGXS options to control TAP and isolation tests

2018-11-25 Thread Michael Paquier
On Sun, Nov 25, 2018 at 11:53:02PM +, Michael Paquier wrote:
> Add PGXS options to control TAP and isolation tests
> 
> The following options are added for extensions:
> - TAP_TESTS, to allow an extension to run TAP tests which are the ones
> present in t/*.pl.  A subset of tests can always be run with the
> existing PROVE_TESTS for developers.
> - ISOLATION, to define a list of isolation tests.
> - ISOLATION_OPTS, to pass custom options to isolation_tester.
> 
> A couple of custom Makefile targets have been accumulated across the
> tree to cover the lack of facility in PGXS for a couple of releases when
> using those test suites, which are all now replaced with the new flags,
> without reducing the test coverage.  This also fixes an issue with
> contrib/bloom/, which had a custom target to trigger its TAP tests of
> its own not part of the main check runs.

Several buildfarm members complain about this commit, including dory and
longfin:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dory&dt=2018-11-26%2000%3A00%3A26
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2018-11-25%2023%3A59%3A03

The error comes from the TAP tests of contrib/bloom/, which we have
never run across the buildfarm until now:
#   Failed test 'initial: query result matches'
#   at t/001_wal.pl line 38.
#  got: '0|d
# 0|9
# 0|3

I am not sure what to think of that yet, so as a short-term fix and to
keep the buildfarm green, I am going to disable the TAP tests for bloom,
but something is busted there, either in the code or in the tests.
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: Integrate recovery.conf into postgresql.conf

2018-11-25 Thread Michael Paquier
On Sun, Nov 25, 2018 at 03:49:23PM +, Peter Eisentraut wrote:
> Integrate recovery.conf into postgresql.conf
> 
> recovery.conf settings are now set in postgresql.conf (or other GUC
> sources).  Currently, all the affected settings are PGC_POSTMASTER;
> this could be refined in the future case by case.

The buildfarm is unhappy after this one:
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=culicidae&br=HEAD

If I use -DEXEC_BACKEND when compiling, the tests complain about a
timeout in 003_recovery_targets.  Here is what the buildfarm reports, I
can see the failure by myself as well:
# Postmaster PID for node "standby_6" is 26668
# poll_query_until timed out executing this query:
# SELECT '0/3022690'::pg_lsn <= pg_last_wal_replay_lsn()
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
Timed out while waiting for standby to catch up at
t/003_recovery_targets.pl line 34.

Peter, could you look at that?
--
Michael


signature.asc
Description: PGP signature


Re: SSL tests failing with "ee key too small" error on Debian SID

2018-11-25 Thread Thomas Munro
On Wed, Oct 3, 2018 at 1:32 PM Michael Paquier  wrote:
> On Mon, Oct 01, 2018 at 09:18:01PM +0900, Kyotaro HORIGUCHI wrote:
> > The attached second patch just changes key size to 2048 bits and
> > "ee key too small" are eliminated in 001_ssltests_master, but
> > instead I got "ca md too weak" error. This is eliminated by using
> > sha256 instead of sha1 in cas.config. (third attached)
>
> I find your suggestion quite tempting at the end instead of having to
> tweak the global system's configuration.  That should normally work with
> any configuration.  This would require regenerating the certs in the
> tree.  Any thoughts from others?

I don't really have opinion here, but I wanted to point out that
src/test/ldap/t/001_auth.pl creates new certs on the fly, which is a
bit inconsistent with the SSL test's approach of certs-in-the-tree.
Which is better?

-- 
Thomas Munro
http://www.enterprisedb.com



Re: allow online change primary_conninfo

2018-11-25 Thread Michael Paquier
On Sun, Nov 25, 2018 at 01:43:13PM -0800, Andres Freund wrote:
> On 2018-11-26 00:25:43 +0300, Sergei Kornilov wrote:
>> I think we have no further reason to have a copy of the conninfo and slot
>> name in WalRcvData, so I propose removing these fields from
>> pg_stat_get_wal_receiver() (and the pg_stat_wal_receiver view). This data
>> is accessible now via regular GUC commands.
> 
> Is that wise? It seems possible that wal receivers run for a while with
> the old connection information, and without this information that seems
> hard to debug.

No, that's unwise.  One of the reasons why conninfo is around is that we
know to which server a receiver is connected when specifying multiple
host and/or port targets in the configuration.  Please see 9a895462 for
the details.

Removing the slot does not look like an improvement to me either,
following Andres' argument.
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

2018-11-25 Thread Michael Paquier
On Mon, Nov 26, 2018 at 09:53:19AM +1300, Thomas Munro wrote:
> I see now that Michael already wrote about this recently[1], but that
> thread hasn't yet reached a conclusion.
> 
> [1] 
> https://www.postgresql.org/message-id/flat/20180917131340.GE31460%40paquier.xyz

Yes, I heard nothing but crickets on this one.  So what I have been
doing is just to update my SSL configuration when running the tests.
That's annoying...  Still not impossible to solve.  If there are extra
opinions to move on with a key replacement, I could always do so.
--
Michael


signature.asc
Description: PGP signature


Re: Add extension options to control TAP and isolation tests

2018-11-25 Thread Michael Paquier
On Sun, Nov 25, 2018 at 05:49:42PM +0300, Nikolay Shaplov wrote:
> I've carefully read this documentation. And did not get a clear explanation of 
> what the true purpose of the PGXS variable is. Only:  
> 
> "The last three lines should always be the same. Earlier in the file, you 
> assign variables or add custom make rules."

The definition of PGXS is here:
https://www.postgresql.org/docs/11/extend-pgxs.html

> Maybe it should not be discussed in the doc, but it should be mentioned in 
> pgxs.mk. I guess both PGXS and NO_PGXS should be explicitly described, in 
> order to make the rest of the code more readable. (As for me, I now have some 
> intuitive understanding of its nature, but it would be better to have a strict 
> explanation.)

I am not sure that documenting NO_PGXS makes much sense for extensions,
which have normally no need of the surrounding code.  If you have
suggestions of improvements for the existing docs, please feel free to
think about a patch and then post it.  Docs improvements are a
never-ending task.

> When I read these lines, an idea about ISOLATION_CONF and REGRESS_CONF 
> variables came into my mind, ones that would be transformed into the proper 
> _OPTS in pgxs.mk. Since there is only one use case, maybe it is not worth it. 
> So I just speak this thought aloud without any intention to make it reality.

ISOLATION_OPTS and REGRESS_OPTS already allow passing down custom
options, and are more extensible than the _CONF variants proposed here.
Keeping the number of options low is not a bad idea either.
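
For what it's worth, a minimal extension Makefile using the new flags could
look like this (the extension and test names are made up):

    EXTENSION = my_ext
    DATA = my_ext--1.0.sql

    TAP_TESTS = 1                       # run the TAP scripts in t/*.pl
    ISOLATION = basic                   # run specs/basic.spec via isolation_tester
    ISOLATION_OPTS = --load-extension=my_ext

    PG_CONFIG = pg_config
    PGXS := $(shell $(PG_CONFIG) --pgxs)
    include $(PGXS)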

> I've also grepped for "pg_isolation_regress_check" and found that it is also 
> used in src/test/modules/snapshot_too_old, which also uses PGXS (at least as 
> an option). Should we not include it in your patch too?

Good point.  This can be cleaned up too, so done.

> So I think that is it. Since Artur said he successfully used your TAP patch 
> in other extensions, I will not do it, considering it is already checked. If 
> you think it is good to recheck, I can do it too :-)

Thanks, I have re-checked, and the thing is working as I would expect.
So committed.
--
Michael


signature.asc
Description: PGP signature


Re: Constraint documentation

2018-11-25 Thread David Fetter
On Thu, Nov 22, 2018 at 03:16:11PM +0100, Patrick Francelle wrote:
> On 11/15/18 00:02, Tom Lane wrote:
> > I think this could be improved some more.  Perhaps something like this
> > (I've not bothered with markup...)
> > 
> > This is a little verbose maybe, but as the text stands, it sounds like
> > using a trigger is enough to solve all the consistency problems that
> > a cross-row CHECK has.  Which it's not of course.
> 
> Thank you for the rewriting, this is much more clear and explicit that way.
> 
> > I'm also wondering whether it's better to put this in the CREATE TABLE
> > reference page instead of here.  While there are certainly benefits in
> > having the caveat here, I'm a bit troubled by the number of forward
> > references to concepts that are described later.  OTOH, a lot of people
> > who need the warning might never see it if it's buried in the reference
> > material.
> 
> To address your remark, I added a small message in the CREATE TABLE
> reference page to be more explicit about the topic, so that it would be
> a warning for the users reading the section. And then a reference to the
> CHECK constraint page where the full explanation is to be located.
> 
> That way, the caveat is mentioned in both pages, but the full
> explanation is located only on a single page.
> 
> Please, let me know if this is good enough or maybe if I missed
> something.
> 
> Patrick Francelle

I believe that features F671 (subqueries in CHECK constraints) and
possibly F673 (reads SQL-data routine invocations in CHECK
constraints) from the standard should be referred to here.

We haven't implemented either one of them, but we might some day.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: New GUC to sample log queries

2018-11-25 Thread Thomas Munro
On Wed, Nov 21, 2018 at 9:06 PM Adrien Nayrat
 wrote:
> Thanks for your comments. Here is the updated patch. I fixed a warning for
> missing parentheses in this expression:
> if ((exceeded && in_sample) || log_duration)

Hi Adrien,

GCC 4.8 says:

guc.c:3328:3: error: initialization makes integer from pointer without
a cast [-Werror]
},
^
guc.c:3328:3: error: (near initialization for
‘ConfigureNamesReal[19].gen.flags’) [-Werror]

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Time to change pg_regress diffs to unified by default?

2018-11-25 Thread David Fetter
On Thu, Nov 22, 2018 at 02:10:59PM +0100, Christoph Berg wrote:
> Re: Noah Misch 2017-04-07 <20170407021431.gb2658...@tornado.leadboat.com>
> > > > I personally, and I know of a bunch of other regular contributors, find
> > > > context diffs very hard to read.  Besides general dislike, for things
> > > > like regression test output context diffs are just not well suited.
> > > 
> > > Personally, I disagree completely.  Unified diffs are utterly unreadable
> > > for anything beyond trivial cases of small well-separated changes.
> > > 
> > > It's possible that regression failure diffs will usually fall into that
> > > category, but I'm not convinced.
> > 
> > For reading patches, I frequently use both formats.  Overall, I perhaps read
> > unified 3/4 of the time and context 1/4 of the time.
> > 
> > For regression diffs, I use PG_REGRESS_DIFF_OPTS=-u and have never 
> > converted a
> > regression diff to context form.  Hence, +1 for the proposed change.
> 
> I've just had another case of horrible context diff from pg_regress.
> I'd claim that regression diffs are particularly bad for context diffs
> because one error will often trigger a whole chain of failures, which
> will essentially make the whole file appear twice in the output,
> whereas unified diffs would just put the original output and the error
> right next to each other at the top. Having to scroll through a whole
> .out file just to find "create extension; file not found" is very
> inefficient.
> 
> It's nice that PG_REGRESS_DIFF_OPTS exists, but the diffs are often
> coming from automated build logs where setting the variable after the
> fact doesn't help.
> 
> Please consider the attached patch, extension packagers will thank
> you.
> 
> Christoph

+1 for pushing this. Regression diffs can get pretty noisy even with
it in place.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Doc patch on psql output formats

2018-11-25 Thread Tom Lane
"Daniel Verite"  writes:
> Tom Lane wrote:
>> Pushed.  (I simplified the code a bit by using just one state variable,
>> and also made the error message more verbose.)

> Thanks!

I noticed while poking at the csv patch that we'd outsmarted ourselves
with this one.  As of HEAD, it's impossible to select latex output format
at all:

regression=# \pset format latex
\pset: ambiguous abbreviation "latex" matches both "latex" and "latex-longtable"

We could fix that by adding a special case to accept an exact match
immediately.  However, that would still leave us in a situation where
"latex" can never be abbreviated at all, which does not seem very nice
(not to mention not being backwards-compatible).  Instead I propose
treating "latex-longtable" as a special case, as attached.  With this
code, "l", "la", up through "latex" are all accepted as "latex", while
"latex-" through "latex-longtable" are accepted as "latex-longtable".
This has less ideological purity than one could wish, but it's backwards
compatible and arguably a lot more user-friendly than what we'd have
if we insist on an exact match for "latex".
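
For instance, an illustrative session with the attached patch:

    regression=# \pset format la
    Output format is latex.
    regression=# \pset format latex-long
    Output format is latex-longtable.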

In future, let's reject any proposal to invent switch or option names
such that one is a prefix of another.

regards, tom lane

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index ee88e1c..13d4c57 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3698,7 +3698,6 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			{"asciidoc", PRINT_ASCIIDOC},
 			{"html", PRINT_HTML},
 			{"latex", PRINT_LATEX},
-			{"latex-longtable", PRINT_LATEX_LONGTABLE},
 			{"troff-ms", PRINT_TROFF_MS},
 			{"unaligned", PRINT_UNALIGNED},
 			{"wrapped", PRINT_WRAPPED}
@@ -3725,13 +3724,22 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 	}
 }
 			}
-			if (match_pos < 0)
+			if (match_pos >= 0)
+				popt->topt.format = formats[match_pos].number;
+			else if (pg_strncasecmp("latex-longtable", value, vallen) == 0)
+			{
+				/*
+				 * We must treat latex-longtable specially because latex is a
+				 * prefix of it; if both were in the table above, we'd think
+				 * "latex" is ambiguous.
+				 */
+				popt->topt.format = PRINT_LATEX_LONGTABLE;
+			}
+			else
 			{
 				psql_error("\\pset: allowed formats are aligned, asciidoc, html, latex, latex-longtable, troff-ms, unaligned, wrapped\n");
 				return false;
 			}
-			else
-				popt->topt.format = formats[match_pos].number;
 		}
 	}
 


Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Stephen Frost
Greetings,

* Sergei Kornilov (s...@zsrv.org) wrote:
> > Why don't we put recovery entries into postgresql.recovery.conf or such?
> > And overwrite rather than append?
> Appending to postgressql.auto.conf instead of writing new hardcoded file was 
> proposed here: 
> https://www.postgresql.org/message-id/CAB7nPqQ-Grduc-0WSsK4-VXo%2B1CB%3DyDS9NMU0NEp%2Bc%2Bffnz0UQ%40mail.gmail.com

Thanks, yes, I saw that but I don't think it really contemplated the
issues that arise from that approach, as I outlined.

Thanks again!

Stephen


signature.asc
Description: PGP signature


Python versions (was Re: RHEL 8.0 build)

2018-11-25 Thread Tom Lane
Andres Freund  writes:
> On 2018-11-24 15:49:25 -0500, Tom Lane wrote:
>> There's been some preliminary discussion about starting to default to
>> python3, but given this project's inherent conservatism, I don't expect
>> that to happen for some years yet.  In any case, whenever we do pull
>> that trigger we'd surely do so only in HEAD not released branches, so
>> buildfarm owners will need to deal with the case for years more.

> Why don't we probe for python2 in addition to python by default? That
> ought to make RHEL 8 work, without making the switch just yet.

I'm unexcited about that because that *would* be expressing a version
preference --- and one that's on the wrong side of history.

Also, I noticed on a fresh FreeBSD 12.0 installation that what
I've got is

$ ls /usr/bin/pyth*
ls: /usr/bin/pyth*: No such file or directory
$ ls /usr/local/bin/pyth*
/usr/local/bin/python2.7
/usr/local/bin/python2.7-config
/usr/local/bin/python3.6
/usr/local/bin/python3.6-config
/usr/local/bin/python3.6m
/usr/local/bin/python3.6m-config

So there are modern platforms on which "python2" isn't going to make
it work automatically either.

At some point I think what we're going to want to do is to probe for,
in order, $PYTHON, python, python3, python3.7, python3.6, ..., python3.0,
python2, python2.7, python2.6, ..., python2.4 (or whatever our minimum
supported version is).  However, I don't think we are ready to put in a
preference towards python3 yet, and it's hard to see how to do something
like this while still being version agnostic.

[ thinks for awhile ... ]  Maybe we could do something like this:

1. If $PYTHON exists, use that.

2. If "python" exists, use that.

3. Probe all the numbered python versions suggested above.  If
*exactly one* of them exists, use that.

4. Otherwise complain and tell user to resolve uncertainty by
setting $PYTHON.
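
A rough shell rendering of that logic, purely illustrative (the real thing
would be autoconf, and the version list here is abbreviated):

    if test -n "$PYTHON"; then
      : # use $PYTHON as given
    elif command -v python >/dev/null; then
      PYTHON=python
    else
      candidates=""
      for v in python3 python3.7 python3.6 python2 python2.7 python2.6; do
        command -v "$v" >/dev/null && candidates="$candidates $v"
      done
      set -- $candidates
      if test $# -eq 1; then
        PYTHON=$1
      else
        echo "could not choose a Python; please set PYTHON" >&2
      fi
    fi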

That's still not quite right, because it'd get confused by
something like symlinking python3 to python3.6, which you'd
kind of wish didn't confuse it.  But you could imagine sweating
over this for an hour or two and getting something that generally
did the right thing, and maybe with another hour or two on docs
you could explain it reasonably.  I'm unconvinced that it's worth
the trouble though; I think we're better off waiting until we can
go with a straight prefer-the-newest-version rule.  With where the
world is right now, it seems to me that making the user specify
which Python to use is arguably a feature not a bug.

regards, tom lane



Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Petr Jelinek
Hi,

On 25/11/2018 21:48, Stephen Frost wrote:
> Greetings,
> 
> On Sun, Nov 25, 2018 at 15:39 Andres Freund  > wrote:
> 
> Hi,
> 
> On 2018-11-25 13:24:15 -0500, Stephen Frost wrote:
> > - User performs a backup with pg_basebackup -R
> > - Replica is then promoted to be a primary
> > - User performs a backup with pg_basebackup -R on the new primary
> > - Duplicate entries end up in postgresql.auto.conf.  Ew.
> 
> Why don't we put recovery entries into postgresql.recovery.conf or such?
> And overwrite rather than append?
> 
> 
> Sure, I think that’s more or less the same thing..?   Another included
> file, but one specifically for recovery bits.

I think the important part there is overwrite rather than append. Given
that postgresql.auto.conf is used by ALTER SYSTEM, doing overwrites
there is not as straightforward a proposition (behaviorally) as doing it
in another included file.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: RHEL 8.0 build

2018-11-25 Thread Andres Freund
Hi,

On 2018-11-24 15:49:25 -0500, Tom Lane wrote:
> Jeremy Harris  writes:
> >  Trying to set up a buildfarm animal on a RHEL 8.0 (beta) system,
> > the build fails early:
> > ...
> > It appears to be a "configure" script looking for python; and there is
> > no such.  You can have python3 or python2 - but neither package install
> > provides a symlink of just "python".
> 
> Yeah, some distros are starting to act that way, and I suspect it's
> causing pain for a lot of people.
> 
> Currently we are agnostic about which python version to use, so if you
> don't have anything simply named "python", you have to tell configure
> what to use by setting the PYTHON environment variable.
> 
> In a buildfarm configuration file this would look something like
> 
> # env settings to pass to configure. These settings will only be seen by
> # configure.
> config_env => {
> +   PYTHON => "/usr/local/bin/python3",
> 
> There's been some preliminary discussion about starting to default to
> python3, but given this project's inherent conservatism, I don't expect
> that to happen for some years yet.  In any case, whenever we do pull
> that trigger we'd surely do so only in HEAD not released branches, so
> buildfarm owners will need to deal with the case for years more.

Why don't we probe for python2 in addition to python by default? That
ought to make RHEL 8 work, without making the switch just yet.

Greetings,

Andres Freund



Re: allow online change primary_conninfo

2018-11-25 Thread Andres Freund
Hi,

On 2018-11-26 00:25:43 +0300, Sergei Kornilov wrote:
> Now we integrate recovery.conf into the GUC system. So I would like to start 
> a discussion to make primary_conninfo and primary_slot_name PGC_SIGHUP.
> My work-in-progress patch is attached.

Cool!


> I think we have no further reason to have a copy of the conninfo and slot name 
> in WalRcvData, so I propose removing these fields from 
> pg_stat_get_wal_receiver() (and the pg_stat_wal_receiver view). This data is 
> accessible now via regular GUC commands.

Is that wise? It seems possible that wal receivers run for a while with
the old connection information, and without this information that seems
hard to debug.


> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index db1a2d4..faa8e17 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -3797,7 +3797,6 @@ ANY  class="parameter">num_sync ( postgresql.conf
file or on the server command line."


> @@ -435,9 +420,33 @@ WalReceiverMain(void)
>  
>  				if (got_SIGHUP)
>  				{
> +					char	   *conninfo = pstrdup(PrimaryConnInfo);
> +					char	   *slotname = pstrdup(PrimarySlotName);
> +
>  					got_SIGHUP = false;
>  					ProcessConfigFile(PGC_SIGHUP);
>  					XLogWalRcvSendHSFeedback(true);
> +
> +					/*
> +					 * If primary_conninfo has been changed while walreceiver is running,
> +					 * shut down walreceiver so that a new walreceiver is started and
> +					 * initiates replication with the new connection information.
> +					 */
> +					if (strcmp(conninfo, PrimaryConnInfo) != 0)
> +						ereport(FATAL,
> +								(errcode(ERRCODE_ADMIN_SHUTDOWN),
> +								 errmsg("closing replication connection because primary_conninfo was changed")));
> +
> +					/*
> +					 * And the same for primary_slot_name.
> +					 */
> +					if (strcmp(slotname, PrimarySlotName) != 0)
> +						ereport(FATAL,
> +								(errcode(ERRCODE_ADMIN_SHUTDOWN),
> +								 errmsg("closing replication connection because primary_slot_name was changed")));
> +
> +					pfree(conninfo);
> +					pfree(slotname);

I'd strongly advocate moving this to a separate function.
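
Something along these lines, perhaps (a sketch only; the function name and
placement are made up):

    /*
     * Sketch: exit if a connection-defining GUC changed under us, so the
     * startup process can launch a fresh walreceiver with the new settings.
     */
    static void
    XLogWalRcvCheckConnConfig(const char *conninfo, const char *slotname)
    {
        if (strcmp(conninfo, PrimaryConnInfo) != 0)
            ereport(FATAL,
                    (errcode(ERRCODE_ADMIN_SHUTDOWN),
                     errmsg("closing replication connection because primary_conninfo was changed")));

        if (strcmp(slotname, PrimarySlotName) != 0)
            ereport(FATAL,
                    (errcode(ERRCODE_ADMIN_SHUTDOWN),
                     errmsg("closing replication connection because primary_slot_name was changed")));
    }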


Greetings,

Andres Freund



allow online change primary_conninfo

2018-11-25 Thread Sergei Kornilov
Hello all
Now we integrate recovery.conf into the GUC system. So I would like to start 
a discussion to make primary_conninfo and primary_slot_name PGC_SIGHUP.
My work-in-progress patch is attached.

I think we have no further reason to have a copy of the conninfo and slot name in 
WalRcvData, so I propose removing these fields from pg_stat_get_wal_receiver() 
(and the pg_stat_wal_receiver view). This data is accessible now via regular GUC 
commands.

Thank you in advance!

regards, Sergei

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index db1a2d4..faa8e17 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3797,7 +3797,6 @@ ANY num_sync ( mutex);
-	memset(walrcv->conninfo, 0, MAXCONNINFO);
-	if (tmp_conninfo)
-		strlcpy((char *) walrcv->conninfo, tmp_conninfo, MAXCONNINFO);
 
 	memset(walrcv->sender_host, 0, NI_MAXHOST);
 	if (sender_host)
 		strlcpy((char *) walrcv->sender_host, sender_host, NI_MAXHOST);
 
 	walrcv->sender_port = sender_port;
-	walrcv->ready_to_display = true;
 	SpinLockRelease(&walrcv->mutex);
 
-	if (tmp_conninfo)
-		pfree(tmp_conninfo);
-
 	if (sender_host)
 		pfree(sender_host);
 
@@ -387,7 +372,7 @@ WalReceiverMain(void)
 		 */
 		options.logical = false;
 		options.startpoint = startpoint;
-		options.slotname = slotname[0] != '\0' ? slotname : NULL;
+		options.slotname = PrimarySlotName[0] != '\0' ? PrimarySlotName : NULL;
 		options.proto.physical.startpointTLI = startpointTLI;
 		ThisTimeLineID = startpointTLI;
 		if (walrcv_startstreaming(wrconn, &options))
@@ -435,9 +420,33 @@ WalReceiverMain(void)
 
 if (got_SIGHUP)
 {
+	char	*conninfo = pstrdup(PrimaryConnInfo);
+	char	*slotname = pstrdup(PrimarySlotName);
+
 	got_SIGHUP = false;
 	ProcessConfigFile(PGC_SIGHUP);
 	XLogWalRcvSendHSFeedback(true);
+
+	/*
+	 * If primary_conninfo has been changed while walreceiver is running,
+	 * shut down walreceiver so that a new walreceiver is started and
+	 * initiates replication with the new connection information.
+	 */
+	if (strcmp(conninfo, PrimaryConnInfo) != 0)
+		ereport(FATAL,
+(errcode(ERRCODE_ADMIN_SHUTDOWN),
+ errmsg("closing replication connection because primary_conninfo was changed")));
+
+	/*
+	 * And the same for primary_slot_name.
+	 */
+	if (strcmp(slotname, PrimarySlotName) != 0)
+		ereport(FATAL,
+(errcode(ERRCODE_ADMIN_SHUTDOWN),
+ errmsg("closing replication connection because primary_slot_name was changed")));
+
+	pfree(conninfo);
+	pfree(slotname);
 }
 
 /* See if we can read data immediately */
@@ -672,7 +681,6 @@ WalRcvWaitForStartPosition(XLogRecPtr *startpoint, TimeLineID *startpointTLI)
 			   walrcv->walRcvState == WALRCV_STOPPING);
 		if (walrcv->walRcvState == WALRCV_RESTARTING)
 		{
-			/* we don't expect primary_conninfo to change */
 			*startpoint = walrcv->receiveStart;
 			*startpointTLI = walrcv->receiveStartTLI;
 			walrcv->walRcvState = WALRCV_STREAMING;
@@ -776,7 +784,6 @@ WalRcvDie(int code, Datum arg)
 	Assert(walrcv->pid == MyProcPid);
 	walrcv->walRcvState = WALRCV_STOPPED;
 	walrcv->pid = 0;
-	walrcv->ready_to_display = false;
 	walrcv->latch = NULL;
 	SpinLockRelease(&walrcv->mutex);
 
@@ -1374,7 +1381,6 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
 	Datum	   *values;
 	bool	   *nulls;
 	int			pid;
-	bool		ready_to_display;
 	WalRcvState state;
 	XLogRecPtr	receive_start_lsn;
 	TimeLineID	receive_start_tli;
@@ -1386,13 +1392,10 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
 	TimestampTz latest_end_time;
 	char		sender_host[NI_MAXHOST];
 	int			sender_port = 0;
-	char		slotname[NAMEDATALEN];
-	char		conninfo[MAXCONNINFO];
 
 	/* Take a lock to ensure value consistency */
 	SpinLockAcquire(&WalRcv->mutex);
 	pid = (int) WalRcv->pid;
-	ready_to_display = WalRcv->ready_to_display;
 	state = WalRcv->walRcvState;
 	receive_start_lsn = WalRcv->receiveStart;
 	receive_start_tli = WalRcv->receiveStartTLI;
@@ -1402,17 +1405,15 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
 	last_receipt_time = WalRcv->lastMsgReceiptTime;
 	latest_end_lsn = WalRcv->latestWalEnd;
 	latest_end_time = WalRcv->latestWalEndTime;
-	strlcpy(slotname, (char *) WalRcv->slotname, sizeof(slotname));
 	strlcpy(sender_host, (char *) WalRcv->sender_host, sizeof(sender_host));
 	sender_port = WalRcv->sender_port;
-	strlcpy(conninfo, (char *) WalRcv->conninfo, sizeof(conninfo));
 	SpinLockRelease(&WalRcv->mutex);
 
 	/*
 	 * No WAL receiver (or not ready yet), just return a tuple with NULL
 	 * values
 	 */
-	if (pid == 0 || !ready_to_display)
+	if (pid == 0)
 		PG_RETURN_NULL();
 
 	/* determine result type */
@@ -1464,22 +1465,14 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
 			nulls[9] = true;
 		else
 			values[9] = TimestampTzGetDatum(latest_end_time);
-		if (*slotname == '\0')
-			nulls[10] = true;
-		else
-			values[10] = CStringGetTextDatum(slotname);
 		if (*sender_host == '\0')
-			nulls[11] = true;
+			nulls[10] = true;
 		else

Re: [HACKERS] advanced partition matching algorithm for partition-wise join

2018-11-25 Thread Thomas Munro
On Mon, Nov 26, 2018 at 9:03 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> I've noticed, that this patch set is outdated, so here is the rebased version.

This turned red on cfbot because I turned on -Werror:

partbounds.c: In function ‘partition_bounds_merge’:
partbounds.c:3619:43: error: ‘outer_part’ may be used uninitialized in
this function [-Werror=maybe-uninitialized]
merged_index = map_and_merge_partitions(outer_pmap, outer_mmap,
^
partbounds.c:3475:6: note: ‘outer_part’ was declared here
int outer_part;
^
cc1: all warnings being treated as errors

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Stephen Frost
Greetings,

On Sun, Nov 25, 2018 at 15:45 Laurenz Albe  wrote:

> Stephen Frost wrote:
> > > > On restore, you're
> > > > going to need to create a recovery.conf (at least in released versions)
> > > > which provides a restore command (needed even in HEAD today) to get the
> > > > old WAL, so having to also create the backup_label file shouldn't be
> > > > that difficult.
> > >
> > > You write "recovery.conf" upon recovery, when you have the restored
> > > backup, so you have it on a file system.  No problem adding a file then.
> > >
> > > This is entirely different from adding a "backup_label" file to
> > > a backup that has been taken by a backup software in some arbitrary
> > > format in some arbitrary location (think snapshot).
> >
> > There isn’t any need to write the backup label before you restore the database,
> > just as you write recovery.conf then.
>
> Granted.
> But it is pretty convenient, and writing it to the data directory right away
> is a good thing on top, because it reduces the danger of inadvertently
> starting the backup without recovery.


Writing it into the data directory is *not* a good thing because as soon as
you do that you risk there being an issue if there’s a crash.  Writing into
the backup isn’t a bad idea but if you’re managing your backups then
writing it somewhere else (such as where you write your WAL) and
associating it with the backup (presumably it has a label) should make it
easy to pull back when you restore.

> > > > Lastly, if you really want, you can extract out the data from
> > > > pg_stop_backup in whatever your post-backup script is.
> > >
> > > Come on, now.
> > > You usually use backup techniques like that because you can't get
> > > your large database backed up in the available time window otherwise.
> >
> > I’m not following what you’re trying to get at here, why can’t you extract
> > the data for the backup label from pg_stop_backup..?  Certainly other tools
> > do, even ones that do extremely fast parallel backups..  the two are
> > completely independent.
> >
> > Did you think I meant pg_basebackup..?  I certainly didn’t.
>
> Oh yes, I misunderstood.  Sorry.
>
> Yes, you can come up with a post-backup script that somehow communicates
> with your pre-backup script to get the information, but it sure is
> inconvenient.  Simplicity is good in backup solutions, because complicated
> things tend to break more easily.


Not sure what communication is necessary here..?   The data needed for the
backup label file comes from pg_stop_backup in a non-exclusive backup.  You
*should* be grabbing the starting WAL from the start backup as well, to
confirm that you have all of the WAL for the backup, but you don’t actually
need to in order to write out the backup label.
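
For reference, a minimal sketch of that flow with the non-exclusive API (the
label text is arbitrary):

    -- in a session that must stay connected for the whole copy:
    SELECT pg_start_backup('nightly', false, false);  -- third argument: exclusive => false

    -- ... file-level copy of the cluster happens here ...

    -- exclusive => false, wait_for_archive => true; store labelfile with the
    -- backup as backup_label (and spcmapfile as tablespace_map, if non-empty):
    SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false, true);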

> > I thought our goal is to provide convenient backup methods...
> >
> > Correctness would be first and having a broken system because of a crash
> > during a backup isn’t correct.
>
> "Not starting up without manual intervention" is not actually broken...


Of course it is.  Having the behavior of the system be completely different
depending on if a backup happened to be running or not is just plain bad
and is broken.  It’s not a feature. Had this issue been realized when the
exclusive backup mode was developed I suspect it never would have been
implemented that way to begin with...

Thanks!

Stephen


Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Magnus Hagander
On Sun, Nov 25, 2018 at 9:45 PM Laurenz Albe 
wrote:

> Stephen Frost wrote:
> > > > Lastly, if you really want, you can extract out the data from
> > > > pg_stop_backup in whatever your post-backup script is.
> > >
> > > Come on, now.
> > > You usually use backup techniques like that because you can't get
> > > your large database backed up in the available time window otherwise.
> >
> > I’m not following what you’re trying to get at here, why can’t you extract
> > the data for the backup label from pg_stop_backup..?  Certainly other tools
> > do, even ones that do extremely fast parallel backups..  the two are
> > completely independent.
> >
> > Did you think I meant pg_basebackup..?  I certainly didn’t.
>
> Oh yes, I misunderstood.  Sorry.
>
> Yes, you can come up with a post-backup script that somehow communicates
> with your pre-backup script to get the information, but it sure is
> inconvenient.  Simplicity is good in backup solutions, because complicated
> things tend to break more easily.
>

Yes, simplicity is good.

The problem with the previous interface is that it made things *look*
simple, but they were not. Thus, people got into all sorts of trouble
because they got things wrong.

The new interface is simple, and much harder to get wrong. What it isn't,
is that it's not as convenient as the old one. But correctness is a lot
more important than convenience.

That said, I agree with your point that it's not an uncommon thing to need.
If a good solution for it can be proposed that requires the exclusive
backup interface, then I wouldn't be against un-deprecating that.  But
that's going to require a lot more than just a documentation change, IMHO.
What could perhaps be handled with a documentation change, however, is to
document a good way for this type of setup to use the new interfaces.


> > I thought our goal is to provide convenient backup methods...
> >
> > Correctness would be first and having a broken system because of a crash
> > during a backup isn’t correct.
>
> "Not starting up without manual intervention" is not actually broken...
>

Correct. But that's not the worst case scenario here. Yes, some of the bad
ones are the ones amplified by said manual intervention, but user
interaction at the restore point is going to be even harder to get right.


> I'm arguing on behalf of users that run a few databases, want a simple
> backup solution and are ready to deal with the shortcomings.
>

Those that want a simple backup solution have one -- pg_basebackup.

The exclusive backup API is *not* simple. It is convenient, but it is not
simple.

Actually having a simple API that worked with the pre/post backup scripts,
that would be an improvement that we should definitely want!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

2018-11-25 Thread Thomas Munro
On Mon, Nov 26, 2018 at 6:56 AM Tom Lane  wrote:
> Thomas Munro  writes:
> > Fix pushed.
> > By way of penance, I have now configured PG_TEST_EXTRA="ssl ldap
> > kerberos" for my build farm animals elver and eelpout.  elver should
> > pass at the next build, as I just tested it with --nosend, but eelpout
> > is so slow I'll just take my chances see if that works.
>
> Nope :-(.  Looks like something about key length ... probably just
> misconfiguration?

It seems that we have keys in our tree that are unacceptable to
OpenSSL 1.1.1 as shipped in Debian buster:

2018-11-25 20:32:22.519 UTC [26882] FATAL:  could not load server
certificate file "server-cn-only.crt": ee key too small

That's what you get if you use the libssl-dev package (1.1.1a-1), but
you can still install libssl1.0-dev (which uninstalls 1.1's dev
package).  I've done that and the ssl test passes on that machine,
so fingers crossed for the next build farm run.

I see now that Michael already wrote about this recently[1], but that
thread hasn't yet reached a conclusion.

[1] 
https://www.postgresql.org/message-id/flat/20180917131340.GE31460%40paquier.xyz

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Sergei Kornilov
Hi

> Why don't we put recovery entries into postgresql.recovery.conf or such?
> And overwrite rather than append?
Appending to postgressql.auto.conf instead of writing new hardcoded file was 
proposed here: 
https://www.postgresql.org/message-id/CAB7nPqQ-Grduc-0WSsK4-VXo%2B1CB%3DyDS9NMU0NEp%2Bc%2Bffnz0UQ%40mail.gmail.com

regards, Sergei



Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Stephen Frost
Greetings,

On Sun, Nov 25, 2018 at 15:39 Andres Freund  wrote:

> Hi,
>
> On 2018-11-25 13:24:15 -0500, Stephen Frost wrote:
> > - User performs a backup with pg_basebackup -R
> > - Replica is then promoted to be a primary
> > - User performs a backup with pg_basebackup -R on the new primary
> > - Duplicate entries end up in postgresql.auto.conf.  Ew.
>
> Why don't we put recovery entries into postgresql.recovery.conf or such?
> And overwrite rather than append?


Sure, I think that’s more or less the same thing..?   Another included
file, but one specifically for recovery bits.

> In the end, I'm not entirely convinced that eliminating recovery.conf is
> > actually an improvement; I think I would have rather seen the original
> > approach of having an 'auto' recovery.conf, but perhaps we can improve
> > on this since others seemed anxious to get rid of recovery.conf (though
> > I'm not sure why- seems like we'll likely end up with more code to
> > handle things cleanly with a merged recovery.auto/postgresql.auto than
> > if we had kept them independent).
>
> If we ever want to have more dynamic reconfiguration around recovery
> (say changing the primary and other settings at runtime, and a lot of
> more advanced features), we're going to need infrastructure to deal with
> that. Without the merge we'd have to duplicate the guc logic.


The recovery auto conf before was also just included into the config,
avoiding the need to have specialized handling.  I do see the advantage of
having it utilize the regular GUC system, but I don’t like losing the
separation we had which allowed us to just move the whole recovery.conf to
recovery.done when we finish recovery.  Seems like we could have both
though if we have the recovery options in a separately included file
instead of in postgresql.auto.conf.

Thanks!

Stephen

>


Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
Stephen Frost wrote: 
> > > Seeing it often doesn't make it a good solution.  Running just
> > > pre-backup and post-backup scripts and copying the filesystem isn't
> > > enough to perform an online PostgreSQL backup- the WAL needs to be
> > > collected as well, and you need to make sure that you have all of the
> > > WAL before the backup can be considered complete.
> > 
> > Yes, that's why "pg_stop_backup" has the "wait_for_archive" parameter.
> > So this is not a problem.
> 
> That doesn’t actually make sure you have all of the WAL reliably saved
> across the backup, it just cares what archive command returns, which is
> sadly often a bad thing to depend on.  I certainly wouldn’t rely on only
> that for any system I cared about. 

If you write a bad archive_command, you have a problem.
But that is quite unrelated to the problem at hand, if I am not mistaken.

> > > On restore, you're
> > > going to need to create a recovery.conf (at least in released versions)
> > > which provides a restore command (needed even in HEAD today) to get the
> > > old WAL, so having to also create the backup_label file shouldn't be
> > > that difficult.
> > 
> > You write "recovery.conf" upon recovery, when you have the restored
> > backup, so you have it on a file system.  No problem adding a file then.
> > 
> > This is entirely different from adding a "backup_label" file to
> > a backup that has been taken by a backup software in some arbitrary
> > format in some arbitrary location (think snapshot).
> 
> There isn’t any need to write the backup label before you restore the 
> database,
> just as you write recovery.conf then.

Granted.
But it is pretty convenient, and writing it to the data directory right away
is a good thing on top, because it reduces the danger of inadvertently
starting the backup without recovery.

> > > Lastly, if you really want, you can extract out the data from
> > > pg_stop_backup in whatever your post-backup script is.
> > 
> > Come on, now.
> > You usually use backup techniques like that because you can't get
> > your large database backed up in the available time window otherwise.
> 
> I’m not following what you’re trying to get at here, why can’t you extract
> the data for the backup label from pg_stop_backup..?  Certainly other tools
> do, even ones that do extremely fast parallel backups..  the two are
> completely independent.
> 
> Did you think I meant pg_basebackup..?  I certainly didn’t.

Oh yes, I misunderstood.  Sorry.

Yes, you can come up with a post-backup script that somehow communicates
with your pre-backup script to get the information, but it sure is
inconvenient.  Simplicity is good in backup solutions, because complicated
things tend to break more easily.

> > I thought our goal is to provide convenient backup methods...
> 
> Correctness would be first and having a broken system because of a crash 
> during a backup isn’t correct.

"Not starting up without manual intervention" is not actually broken...

> > But what's wrong with retaining the exclusive backup method and just
> > sticking a big "Warning: this may cause a restart to fail after a crash"
> > on it?  That sure wouldn't be unsafe.
> 
> I haven’t seen anyone pushing for it to be removed immediately, but users
> should not use it and newcomers would be much better served by using the
> non-exclusive API. There is a reason it was deprecated and it’s because it
> simply isn’t a good API. Coming along a couple years later and saying that
> it’s a good API while ignoring the issues that it has doesn’t change that.

I don't think I'm ignoring the issues, I just think there is a valid use case
for the exclusive backup API, with all its caveats.

Of course I'm not arguing on behalf of organizations running lots of databases
for whom manual intervention after a crash is unacceptable.

I'm arguing on behalf of users that run a few databases, want a simple backup
solution and are ready to deal with the shortcomings.


But I will gladly accept defeat in this matter, I just needed to vent my 
opinion.

Yours,
Laurenz Albe




Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Andres Freund
On 2018-11-25 16:56:13 +0100, Peter Eisentraut wrote:
> On 21/11/2018 07:00, Michael Paquier wrote:
> >> Author: Simon Riggs 
> >> Author: Abhijit Menon-Sen 
> >> Author: Sergei Kornilov 
> > I think that Fujii Masao should also be listed as an author, or at least
> > get a mention.  He was one of the first to work on this stuff.
> 
> Committed with that addition.

Wee!

Greetings,

Andres Freund



Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Andres Freund
Hi,

On 2018-11-25 13:24:15 -0500, Stephen Frost wrote:
> - User performs a backup with pg_basebackup -R
> - Replica is then promoted to be a primary
> - User performs a backup with pg_basebackup -R on the new primary
> - Duplicate entries end up in postgresql.auto.conf.  Ew.

Why don't we put recovery entries into postgresql.recovery.conf or such?
And overwrite rather than append?


> In the end, I'm not entirely convinced that eliminating recovery.conf is
> actually an improvement; I think I would have rather seen the original
> approach of having an 'auto' recovery.conf, but perhaps we can improve
> on this since others seemed anxious to get rid of recovery.conf (though
> I'm not sure why- seems like we'll likely end up with more code to
> handle things cleanly with a merged recovery.auto/postgresql.auto than
> if we had kept them independent).

If we ever want to have more dynamic reconfiguration around recovery
(say changing the primary and other settings at runtime, and a lot of
more advanced features), we're going to need infrastructure to deal with
that. Without the merge we'd have to duplicate the guc logic.

Greetings,

Andres Freund



Re: [HACKERS] [PATCH] Generic type subscripting

2018-11-25 Thread Thomas Munro
On Mon, Nov 26, 2018 at 6:07 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > On Fri, Nov 9, 2018 at 1:55 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> >
> > I've noticed that the patch has some conflicts, so here is the rebased
> > version. Also, since people are concerned about the performance impact
> > for arrays, I've done some tests similar to [1], but against the current
> > master - the results are similar so far; I've got quite an insignificant
> > difference between the master and the patched version.
> >
> > [1]: 
> > https://www.postgresql.org/message-id/CA%2Bq6zcV8YCKcMHkUKiiUM3eOsq-ubb%3DT1D%2Bki4YbE%3DBYbt1PxQ%40mail.gmail.com
>
> One more rebased version. This time I also decided to use this opportunity
> to write more descriptive commit messages.

Hi Dmitry,

Noticed on cfbot.cputube.org:

pg_type.c:167:10: error: passing argument 3 of
‘GenerateTypeDependencies’ makes integer from pointer without a cast
[-Werror]
false);
^
In file included from pg_type.c:28:0:
../../../src/include/catalog/pg_type.h:335:13: note: expected ‘Oid’
but argument is of type ‘void *’
extern void GenerateTypeDependencies(Oid typeObjectId,
^

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Stephen Frost
Greetings,

On Sun, Nov 25, 2018 at 14:17 Laurenz Albe  wrote:

> On Sun, 2018-11-25 at 13:50 -0500, Stephen Frost wrote:
> > I don't see any compelling argument for trying to do something half-way
> > any more today than I did two years ago when this was being discussed.
>
> That may well be so.  It may be better to make users unhappy than to
> make them very unhappy...
>
> But I find the following points unconvincing:
>
> > > I would say the typical use case for the exclusive backup method is
> > > the following (and I have seen it often):
> > >
> > > You have some kind of backup software that does file system backups
> > > and is able to run a "pre-backup" and "post-backup" script.
> > > The backup is triggered by the backup software.
> >
> > Seeing it often doesn't make it a good solution.  Running just
> > pre-backup and post-backup scripts and copying the filesystem isn't
> > enough to perform an online PostgreSQL backup- the WAL needs to be
> > collected as well, and you need to make sure that you have all of the
> > WAL before the backup can be considered complete.
>
> Yes, that's why "pg_stop_backup" has the "wait_for_archive" parameter.
> So this is not a problem.


That doesn’t actually make sure you have all of the WAL reliably saved
across the backup; it only checks what archive_command returned, which is
sadly often a bad thing to depend on.  I certainly wouldn’t rely on only
that for any system I cared about.

> > On restore, you're
> > going to need to create a recovery.conf (at least in released versions)
> > which provides a restore command (needed even in HEAD today) to get the
> > old WAL, so having to also create the backup_label file shouldn't be
> > that difficult.
>
> You write "recovery.conf" upon recovery, when you have the restored
> backup, so you have it on a file system.  No problem adding a file then.
>
> This is entirely different from adding a "backup_label" file to
> a backup that has been taken by a backup software in some arbitrary
> format in some arbitrary location (think snapshot).


There isn’t any need to write the backup label before you restore the
database; you write it at restore time, just as you write recovery.conf then.

> > Lastly, if you really want, you can extract out the data from
> > pg_stop_backup in whatever your post-backup script is.
>
> Come on, now.
> You usually use backup techniques like that because you can't get
> your large database backed up in the available time window otherwise.


I’m not following what you’re trying to get at here; why can’t you extract
the data for the backup label from pg_stop_backup..?  Certainly other tools
do, even ones that do extremely fast parallel backups..  the two are
completely independent.

Did you think I meant pg_basebackup..?  I certainly didn’t.

> > > Another thing that is problematic with non-exclusive backups is that
> > > you have to write the backup_label file into the backup after the
> > > backup has been taken.  With a technique like the above, you cannot
> > > easily do that.
> >
> > ... why not?  You need to create the recovery.conf anyway, and you need
> > to be archiving WAL somewhere, so it certainly seems like you could put
> > the backup_label there too.
>
> As I said above, you don't add "recovery.conf" to the backup right away,
> so these two cases don't compare.


There’s no requirement that you add the backup label contents immediately
either; you just need to keep track of it and restore it when you restore
the database and create the recovery.conf file.

> > > I expect that that will make a lot of users unhappy.
> >
> > If it means that they implement a better backup strategy, then it's
> > probably a good thing, which is the goal of this.
>
> I thought our goal was to provide convenient backup methods...


Correctness comes first, and having a broken system because of a crash
during a backup isn’t correct.

> Ignoring "backup_label" on restart, as I suggested in my previous message,
> probably isn't such a hot idea.


Agreed.

> But what's wrong with retaining the exclusive backup method and just
> sticking a big "Warning: this may cause a restart to fail after a crash"
> on it?  That sure wouldn't be unsafe.


I haven’t seen anyone pushing for it to be removed immediately, but users
should not use it, and newcomers would be much better served by using the
non-exclusive API.  There is a reason it was deprecated, and it’s because it
simply isn’t a good API.  Coming along a couple of years later and saying
that it’s a good API while ignoring the issues that it has doesn’t change
that.

Thanks!

Stephen


Re: Support custom socket directory in pg_upgrade

2018-11-25 Thread Tom Lane
Daniel Gustafsson  writes:
> [ pg_upgrade_sockdir-v3.patch ]

BTW, I notice that cfbot doesn't like this now that Thomas is running it
with -Werror:

option.c: In function ‘parseCommandLine’:
option.c:265:8: error: ignoring return value of ‘getcwd’, declared with 
attribute warn_unused_result [-Werror=unused-result]
  getcwd(default_sockdir, MAXPGPATH);
^
cc1: all warnings being treated as errors

Failing to check a syscall's result isn't per project style even if
the tools would let you get away with it.  Other places in that same
file do

    if (!getcwd(cwd, MAXPGPATH))
        pg_fatal("could not determine current directory\n");

regards, tom lane



Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
On Sun, 2018-11-25 at 13:50 -0500, Stephen Frost wrote:
> I don't see any compelling argument for trying to do something half-way
> any more today than I did two years ago when this was being discussed.

That may well be so.  It may be better to make users unhappy than to
make them very unhappy...

But I find the following points unconvincing:

> > I would say the typical use case for the exclusive backup method is
> > the following (and I have seen it often):
> > 
> > You have some kind of backup software that does file system backups
> > and is able to run a "pre-backup" and "post-backup" script.
> > The backup is triggered by the backup software.
> 
> Seeing it often doesn't make it a good solution.  Running just
> pre-backup and post-backup scripts and copying the filesystem isn't
> enough to perform an online PostgreSQL backup- the WAL needs to be
> collected as well, and you need to make sure that you have all of the
> WAL before the backup can be considered complete.

Yes, that's why "pg_stop_backup" has the "wait_for_archive" parameter.
So this is not a problem.
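
For reference, a minimal sketch of the sequence the pre- and post-backup
scripts have to drive for a non-exclusive backup, assuming one session is
kept open between the two:

    -- pre-backup script, in a session that must stay open:
    SELECT pg_start_backup('nightly', false, false);
    -- ... the backup software copies the data directory here ...
    -- post-backup script, same session; the second argument waits for
    -- all required WAL to be archived:
    SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false, true);

Keeping that session alive across two separate scripts is exactly the
inconvenience with this kind of backup software.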

> On restore, you're
> going to need to create a recovery.conf (at least in released versions)
> which provides a restore command (needed even in HEAD today) to get the
> old WAL, so having to also create the backup_label file shouldn't be
> that difficult.

You write "recovery.conf" upon recovery, when you have the restored
backup, so you have it on a file system.  No problem adding a file then.

This is entirely different from adding a "backup_label" file to
a backup that has been taken by a backup software in some arbitrary
format in some arbitrary location (think snapshot).

So these two cases don't compare.

> Lastly, if you really want, you can extract out the data from
> pg_stop_backup in whatever your post-backup script is.

Come on, now.
You usually use backup techniques like that because you can't get
your large database backed up in the available time window otherwise.

> > Another thing that is problematic with non-exclusive backups is that
> > you have to write the backup_label file into the backup after the
> > backup has been taken.  With a technique like the above, you cannot
> > easily do that.
> 
> ... why not?  You need to create the recovery.conf anyway, and you need
> to be archiving WAL somewhere, so it certainly seems like you could put
> the backup_label there too.

As I said above, you don't add "recovery.conf" to the backup right away,
so these two cases don't compare.

> > I expect that that will make a lot of users unhappy.
> 
> If it means that they implement a better backup strategy, then it's
> probably a good thing, which is the goal of this.

I thought our goal was to provide convenient backup methods...


Ignoring "backup_label" on restart, as I suggested in my previous message,
probably isn't such a hot idea.

But what's wrong with retaining the exclusive backup method and just
sticking a big "Warning: this may cause a restart to fail after a crash"
on it?  That sure wouldn't be unsafe.

If somebody prefers a simpler backup method at the price of having to
manually restart the server after a crash, what's wrong with that?
Why not give them the choice?

I'd say that one could write a server start script that just removes
"backup_label", but I am sure someone will argue that then somebody
will restore a backup and then restart the server without first
recovering the database...

Yours,
Laurenz Albe




Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Tom Lane
Peter Eisentraut  writes:
> Committed with that addition.

Some of the buildfarm doesn't like this.  I realized that the common
denominator was EXEC_BACKEND: if you build with -DEXEC_BACKEND the
failure is completely reproducible.  Presumably something is thinking
that it will inherit some variable from the postmaster via fork.
I don't know if that's a new bug or was just exposed by a new test.

regards, tom lane



Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Stephen Frost
Greetings,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> On Sun, 2016-04-24 at 11:49 -0400, Stephen Frost wrote:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> > > On Sun, Apr 24, 2016 at 5:37 AM, Bruce Momjian  wrote:
> > > > On Fri, Apr 22, 2016 at 11:53:46AM -0400, Robert Haas wrote:
> > > > > 
> > > > > Frankly, I think that's right.  It is one thing to say that the new
> > > > > method is preferred - +1 for that.  But the old method is going to
> > > > > continue to be used by many people for a long time, and in some cases
> > > > > will be superior.  That's not something we can deprecate, unless I'm
> > > > > misunderstanding the situation.
> > > > 
> > > > I agree with Robert.  On the one hand we are saying pg_stop_backup()
> > > > doesn't work well in psql because you get those two file contents output
> > > > that you have to write, and on the other hand we are saying we are going
> > > > to deprecate the method that does work well in psql?  I must be missing
> > > > something too, as that makes no sense.
> > > 
> > > I don't agree. I don't see how "making a backup using psql" is more
> > > important than "making a backup without potentially dangerous side
> > > effects".
> > > But if others don't agree, could one of you at least provide an example of
> > > how you'd like the docs to read in a way that doesn't deprecate the unsafe
> > > way but still informs the user properly?
> > 
> > I'm with Magnus on this, primarily because I've come to understand just
> > how dangerous the old backup method is.  That method *should* be
> > deprecated and discouraged.  A backup method which means your database
> > doesn't restart properly if the system crashes during the backup is
> > *bad*.
> > 
> > Fixing that means using something more complicated than the old method
> > and that's a bit of a pain in psql, but that doesn't mean we should tell
> > people that the old method is an acceptable approach.
> > 
> > Perhaps we can look at improving psql to make it easier to use it for
> > the new backup method but, honestly, all these hackish scripts to do
> > backups aren't doing a lot of things that a real backup solution needs
> > to do.  Improving psql for this is a bit like new paint on the Titanic.
> 
> I guess that the danger mentioned above is that the database may not
> restart if it crashes while in exclusive backup mode, because the
> WAL containing the starting checkpoint has already been purged.

The database would think it's trying to do recovery from a backup, not
doing crash recovery like it should be doing, so, yes, that's pretty
ugly.

> I would say the typical use case for the exclusive backup method is
> the following (and I have seen it often):
> 
> You have some kind of backup software that does file system backups
> and is able to run a "pre-backup" and "post-backup" script.
> The backup is triggered by the backup software.

Seeing it often doesn't make it a good solution.  Running just
pre-backup and post-backup scripts and copying the filesystem isn't
enough to perform an online PostgreSQL backup- the WAL needs to be
collected as well, and you need to make sure that you have all of the
WAL before the backup can be considered complete.  On restore, you're
going to need to create a recovery.conf (at least in released versions)
which provides a restore command (needed even in HEAD today) to get the
old WAL, so having to also create the backup_label file shouldn't be
that difficult.

Lastly, if you really want, you can extract out the data from
pg_stop_backup in whatever your post-backup script is.
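
As a sketch of that, assuming the post-backup script still holds the
session that called pg_start_backup (the output path is made up), psql can
dump the label contents straight to a file:

    \pset tuples_only on
    \pset format unaligned
    \o /backups/nightly/backup_label
    SELECT labelfile FROM pg_stop_backup(false, true);
    \o

That file is then put back as backup_label when the backup is restored,
along with the recovery.conf.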

> Another thing that is problematic with non-exclusive backups is that
> you have to write the backup_label file into the backup after the
> backup has been taken.  With a technique like the above, you cannot
> easily do that.

... why not?  You need to create the recovery.conf anyway, and you need
to be archiving WAL somewhere, so it certainly seems like you could put
the backup_label there too.

> So in essence, all backup methods that work like the above won't be
> possible any more once the exclusive backup is gone.

That's why it's been deprecated and not removed...  but hopefully we
will be able to remove it in the future as it really isn't a good API.

> I expect that that will make a lot of users unhappy.

If it means that they implement a better backup strategy, then it's
probably a good thing, which is the goal of this.

> Would it be an option to change the semantics so that "backup_label"
> is ignored if there is no "recovery.conf"?  Of course that opens
> another way to corrupt your database (by starting it from a backup
> without first creating "recovery.conf"), but we could add another
> big warning.

My recollection is that other approaches were considered to deal with
this but none of them were all that good, though that was two years ago
at this point.  Certainly the issue with the above where a restored
backup without a recovery.conf could unwittingly be treated as if

Re: FETCH FIRST clause PERCENT option

2018-11-25 Thread Tom Lane
Tomas Vondra  writes:
> On 11/25/18 1:05 PM, Vik Fearing wrote:
>> I don't see how this behavior is justified by reading the SQL standard.
>> Obviously only an integer number of rows is going to be returned, but
>> the percentage should be calculated correctly.

> Right. My draft of SQL standard says this:
> 31) The declared type of the <simple value specification> simply
> contained in <fetch first percentage> shall be numeric.

> So the standard pretty much requires treating the value as numeric.

Note that this should not be read as "it must be of the type that PG
calls numeric".  I'd read it as requiring a value that's in the numeric
type hierarchy.  float8 would be a reasonable implementation choice
if we're going to coerce to a specific type.
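
(A float8 computation is also exact for the cases discussed in this
thread; a quick sanity check of the arithmetic:

    SELECT ceil(20.8::float8 * 1000 / 100.0E0);  -- 208
    SELECT ceil(20.3::float8 * 1000 / 100.0E0);  -- 203

so the wrong row counts reported upthread are not float8 rounding
artifacts.)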

regards, tom lane



Re: FETCH FIRST clause PERCENT option

2018-11-25 Thread Tomas Vondra



On 11/25/18 1:05 PM, Vik Fearing wrote:
> On 25/11/2018 12:49, Surafel Temesgen wrote:
>>
>>
>> On Sun, Nov 25, 2018 at 1:24 PM Vik Fearing wrote:
>>
>>
>> Also, this query returns 210 rows instead of the expected 208:
>>
>>     select *
>>     from generate_series(1, 1000)
>>     fetch first 20.8 percent rows only
>>
>> this is because fetch first values work with integers and it changes a
>> fractional number to the nearest integer; for example, select * from
>> generate_series(1, 1000) fetch first 20.3 rows only; is not an error,
>> rather it returns 20 rows.
> 
> I don't see how this behavior is justified by reading the SQL standard.
>  Obviously only an integer number of rows is going to be returned, but
> the percentage should be calculated correctly.
> 

Right. My draft of SQL standard says this:

<fetch first clause> ::=
  FETCH { FIRST | NEXT } [ <fetch first quantity> ] { ROW | ROWS }
  { ONLY | WITH TIES }

<fetch first quantity> ::= <fetch first row count>
   | <fetch first percentage>

<fetch first percentage> ::= <simple value specification> PERCENT

and then

30) The declared type of <fetch first row count> shall be an exact
numeric with scale 0 (zero).

31) The declared type of the <simple value specification> simply
contained in <fetch first percentage> shall
be numeric.

So the standard pretty much requires treating the value as numeric.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> Committed with that addition.

I'm concerned that this hasn't been entirely thought through regarding
pretty common use-cases; consider a typical pg_basebackup -R usage:

- User performs a backup with pg_basebackup -R
- Replica is then promoted to be a primary
- User performs a backup with pg_basebackup -R on the new primary
- Duplicate entries end up in postgresql.auto.conf.  Ew.

Now thinking about it from the perspective of... more complete backup
solutions, which give the user the option to fill out more of what used
to go into recovery.conf, it seems like there's going to have to be a
lot more hacking around with postgresql.auto.conf, such as adding in
recovery target time and other parameters, which I'm not terribly
thrilled with.

For example, when a backup is done, typically postgresql.auto.conf comes
along with it and gets restored with it, but if that's done now without
mucking with the contents, we'll run into the same issue that
pg_basebackup has as outlined above- recovery target time and other
parameters being included in the postgresql.auto.conf, possibly multiple
times if care isn't taken to avoid that.  While it used to be that you
could end up with recovery.done files ending up in backups, they were
just a nuisance and didn't impact anything.

One possible approach for dealing with at least some of these concerns
would be to remove the recovery settings from postgresql.auto.conf once
recovery is done, which is similar to how we moved recovery.conf to
recovery.done, at least.  We could even create a recovery.done file if
we wanted to.

Unfortunately, this also means that users are going to have their own
issues when just using pg_basebackup and trying to perform PITR since
they'll be modifying postgresql.conf and will have to remember to remove
those settings since we aren't going to modify that.  Maybe we should
have a warning or notice or something saying "hey, now that this has
been promoted, maybe you should check your config file and clear out
the recovery settings?".

Thinking about how this would work with something like pgbackrest
dropping settings into postgresql.auto.conf, users would need to be
educated to use ALTER SYSTEM if they want to move the recovery target
time to be later while doing PITR (to avoid having them hacking around
in postgresql.auto.conf), which is perhaps not terrible.  What would
make that experience nicer would be a way to restart PG remotely after
making that change (well, or just making it so that users could tell PG
to continue to play forward, but as I understand it this patch doesn't
give us that).

These are my thoughts on this, at least at the moment.  I've also asked
David Steele to comment, of course, since this will impact pgbackrest.

In the end, I'm not entirely convinced that eliminating recovery.conf is
actually an improvement; I think I would have rather seen the original
approach of having an 'auto' recovery.conf, but perhaps we can improve
on this since others seemed anxious to get rid of recovery.conf (though
I'm not sure why- seems like we'll likely end up with more code to
handle things cleanly with a merged recovery.auto/postgresql.auto than
if we had kept them independent).

Thanks!

Stephen




Re: Centralize use of PG_INTXX_MIN/MAX for integer limits

2018-11-25 Thread Tom Lane
Peter Eisentraut  writes:
> On 24/11/2018 13:15, Michael Paquier wrote:
>> A couple of years ago, 62e2a8dc has introduced in c.h a set of limits
>> (to fix some portability issues from 83ff1618) to make the code more
>> system-independent.  Those are for example PG_INT32_MIN, etc.  The core
>> code now mixes the internal PG_ limits with the system ones.  Would we
>> want to unify a bit the whole thing and replace all the SHRT_MIN/MAX,
>> LONG_MIN/MAX and such with the internal limit definitions?

> Since we now require C99, we could also just use the system-provided
> definitions in <stdint.h>.

We require a C99 *compiler*.  That's a different thing from assuming
that the contents of /usr/include are C99-compliant.

Admittedly, the days of user-installed copies of gcc being used with
crufty vendor-supplied system headers might be over for most people.
But my animal gaur still has such a configuration, and it hasn't
got stdint.h at all, never mind any particular contents thereof.

regards, tom lane



Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
I'm a bit late to the party, but I only recently noticed that the exclusive
backup API is deprecated.

On Sun, 2016-04-24 at 11:49 -0400, Stephen Frost wrote:
> * Magnus Hagander (mag...@hagander.net) wrote:
> > On Sun, Apr 24, 2016 at 5:37 AM, Bruce Momjian  wrote:
> > > On Fri, Apr 22, 2016 at 11:53:46AM -0400, Robert Haas wrote:
> > > > 
> > > > Frankly, I think that's right.  It is one thing to say that the new
> > > > method is preferred - +1 for that.  But the old method is going to
> > > > continue to be used by many people for a long time, and in some cases
> > > > will be superior.  That's not something we can deprecate, unless I'm
> > > > misunderstanding the situation.
> > > 
> > > I agree with Robert.  On the one hand we are saying pg_stop_backup()
> > > doesn't work well in psql because you get those two file contents output
> > > that you have to write, and on the other hand we are saying we are going
> > > to deprecate the method that does work well in psql?  I must be missing
> > > something too, as that makes no sense.
> > 
> > I don't agree. I don't see how "making a backup using psql" is more
> > important than "making a backup without potentially dangerous side effects".
> > But if others don't agree, could one of you at least provide an example of
> > how you'd like the docs to read in a way that doesn't deprecate the unsafe
> > way but still informs the user properly?
> 
> I'm with Magnus on this, primarily because I've come to understand just
> how dangerous the old backup method is.  That method *should* be
> deprecated and discouraged.  A backup method which means your database
> doesn't restart properly if the system crashes during the backup is
> *bad*.
> 
> Fixing that means using something more complicated than the old method
> and that's a bit of a pain in psql, but that doesn't mean we should tell
> people that the old method is an acceptable approach.
> 
> Perhaps we can look at improving psql to make it easier to use it for
> the new backup method but, honestly, all these hackish scripts to do
> backups aren't doing a lot of things that a real backup solution needs
> to do.  Improving psql for this is a bit like new paint on the Titanic.

I guess that the danger mentioned above is that the database may not
restart if it crashes while in exclusive backup mode, because the
WAL containing the starting checkpoint has already been purged.
True, that is a problem.

I would say the typical use case for the exclusive backup method is
the following (and I have seen it often):

You have some kind of backup software that does file system backups
and is able to run a "pre-backup" and "post-backup" script.
The backup is triggered by the backup software.

Another thing that is problematic with non-exclusive backups is that
you have to write the backup_label file into the backup after the
backup has been taken.  With a technique like the above, you cannot
easily do that.

So in essence, all backup methods that work like the above won't be
possible any more once the exclusive backup is gone.


I expect that that will make a lot of users unhappy.

Would it be an option to change the semantics so that "backup_label"
is ignored if there is no "recovery.conf"?  Of course that opens
another way to corrupt your database (by starting it from a backup
without first creating "recovery.conf"), but we could add another
big warning.

I'd say that the people who ignore such a warning are the same
people that happily remove "backup_label" when they see the following
message upon starting a cluster from a backup without recovery:

  If you are not restoring from a backup, try removing the file "backup_label".

Yours,
Laurenz Albe




Re: Don't wake up to check trigger file if none is configured

2018-11-25 Thread Jeff Janes
On Sat, Nov 24, 2018 at 11:29 AM Jeff Janes  wrote:

> A streaming replica waiting on WAL from the master will wake up every 5
> seconds to check for a trigger file.  This is pointless if no trigger file
> has been configured.
>
> The attached patch suppresses the timeout when there is no trigger file
> configured.
>

Rebased over the removal of recovery.conf and renaming of TriggerFile.

If the promote_trigger_file param is someday made responsive to SIGHUP,  I
think this will just continue to do the right thing without further
modification.

Cheers,

Jeff


WAL_sleep_not_triggered_v2.patch
Description: Binary data


Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

2018-11-25 Thread Tom Lane
Thomas Munro  writes:
> Fix pushed.
> By way of penance, I have now configured PG_TEST_EXTRA="ssl ldap
> kerberos" for my build farm animals elver and eelpout.  elver should
> pass at the next build, as I just tested it with --nosend, but eelpout
> is so slow I'll just take my chances and see if that works.

Nope :-(.  Looks like something about key length ... probably just
misconfiguration?

> I'll also
> review the firewall config on those VMs, since apparently everyone is
> too chicken to run those tests, perhaps for those sorts of reasons.

I think in many cases the answer is just "it's not in the default
buildfarm configuration".  I couldn't think of a strong reason not
to run the ssl check on longfin, so I've just updated that to do so.

regards, tom lane



Re: Desirability of client-side expressions in psql?

2018-11-25 Thread Pavel Stehule
On Sat, Nov 24, 2018 at 22:03 Corey Huinker wrote:

>
>
>> >>psql> \if :i >= 5
>> >>
>> > I think we're ok with that so long as none of the operators or values
>> has a
>> > \ in it.
>> > What barriers do you see to re-using the pgbench grammar?
>>
>> The pgbench expression grammar mimics SQL expression grammar,
>> on integers, floats, booleans & NULL.
>>
>> I'm unsure about some special cases in psql (`shell command`,
>> 'text' "identifier"). They can be forbidden on a new command (\let),
>> but what happens on "\if ..." which I am afraid allows them is unclear.
>>
>> --
>> Fabien.
>>
>
> (raising this thread from hibernation now that I have the bandwidth)
>

thank you for it :)


> It seems like the big barriers to just using pgbench syntax are:
>   - the ability to indicate that the next thing to follow will be a
> pgbench expression
>   - a way to coax pgbench truth-y values into psql truthy values (t/f,
> y/n, 1/0)
>
> For that, I see a few ways forward:
>
> 1. A suffix on \if, \elif, -exp suffix (or even just -x) to existing
> commands to indicate that a pgbench expression would follow
> This would look something like
> \ifx \elifx \setx
> \if$ \elif$ \set$
>
> 2. A command-line-esque switch or other sigil to indicate that what
> follows is a pgbench expression with psql vars to interpolate
> Example:
> \set foo -x 1 + 4
> \set foo \expr 1 + 4
> \if -x :limit > 10
> \if \expr :limit > 10
>
> 3. A global toggle to indicate which mode should be used by \if, \elif,
> and \set
> Example:
>  \pset expressions [on | off]
>
> 4. A combination of #2 and #3 with a corresponding switch/sigil to
> indicate "do not evaluate pgbench-style
>This is particularly appealing to me because it would allow code
> snippets from pgbench to be used without modification, while still allowing
> the user to mix-in old/new style to an existing script.
>
> 5. A special variant of `command` where variables are interpolated before
> being sent to the OS, and allow that on \if, \elif
> \set foo ``expr :y + :z``
> \set foo $( expr :y + :z )
> \if ``expr :limit > 10``
> \if $( expr :limit > 10 )
>
> This also has some appeal because it allows for a great amount of
> flexibility, but obviously constrains us with OS-dependencies. The user
> might have a hard time sending commands with ')' in them if we go the $( )
> route
>
> 6. Option #5, but we add an additional executable (suggested name: pgexpr)
> to the client libs, which encapsulates the pgbench expression library as a
> way around OS-dependent code.
>
> 7. I believe someone suggested introducing the :{! pgbench-command} or :{{
> pgbench-command }} var-mode
> \set foo :{! :y + :z }
> \set foo :{{ :y + :z }}
> \if :{! :limit > 10 }
> \if :{{ :limit > 10 }}
>
> This has some appeal as well, though I prefer the {{...}}  syntax
> because "!" looks like negation, and {{ resembles the [[ x + y ]] syntax in
> bash
>

I think your proposed {{ }} syntax would be great.

\if {{ :SERVER_NUM > 10 }}

looks perfect.

I am not sure how difficult it is to implement this syntax. Another good
possibility could be `` expr ``.
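
For comparison, a sketch of what already works today without any new
syntax: evaluate the expression server-side and branch on the boolean
(assuming SERVER_NUM is set):

    SELECT :SERVER_NUM > 10 AS over_limit \gset
    \if :over_limit
      \echo server number exceeds 10
    \endif

Any of the proposals above would mostly save that server round trip.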

Regards

Pavel


> One nice thing is that most of these options are not mutually exclusive.
>
> Thoughts?
>


Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Sergei Kornilov
Hello

Thank you!

Regards, Sergei



Re: Continue work on changes to recovery.conf API

2018-11-25 Thread Peter Eisentraut
On 21/11/2018 07:00, Michael Paquier wrote:
>> Author: Simon Riggs 
>> Author: Abhijit Menon-Sen 
>> Author: Sergei Kornilov 
> I think that Fujii Masao should also be listed as an author, or at least
> get a mention.  He was one of the first to work on this stuff.

Committed with that addition.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Centralize use of PG_INTXX_MIN/MAX for integer limits

2018-11-25 Thread Peter Eisentraut
On 24/11/2018 13:15, Michael Paquier wrote:
> A couple of years ago, 62e2a8dc has introduced in c.h a set of limits
> (to fix some portability issues from 83ff1618) to make the code more
> system-independent.  Those are for example PG_INT32_MIN, etc.  The core
> code now mixes the internal PG_ limits with the system ones.  Would we
> want to unify a bit the whole thing and replace all the SHRT_MIN/MAX,
> LONG_MIN/MAX and such with the internal limit definitions?

Since we now require C99, we could also just use the system-provided
definitions in <stdint.h>.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add extension options to control TAP and isolation tests

2018-11-25 Thread Nikolay Shaplov
In a message of 23 November 2018 09:43:45, Michael Paquier wrote:

> > So, I continued exploring your patch. First I carefully read changes in
> > pgxs.mk. The only part of it I did not understand is
> > 
> >  .PHONY: submake
> > 
> > -submake:
> >  ifndef PGXS
> > 
> > +submake:
> > +ifdef REGRESS
> > 
> > $(MAKE) -C $(top_builddir)/src/test/regress pg_regress$(X)
> >  
> >  endif
> > 
> > +ifdef ISOLATION
> > +   $(MAKE) -C $(top_builddir)/src/test/isolation all
> > +endif
> > +endif # PGXS
> 
> This is to make sure that the necessary tools are compiled before
> running the related tests.  "submake:" needs to be moved out actually.
> Thinking about it a bit more, we should also remove the "ifdef REGRESS"
> and "ifdef ISOLATION" because there are some dependencies here.  For
> example TAP tests call pg_regress to initialize connection policy.  TAP
> tests may also use isolation_tester or such.

Can you add some comments in this part of pgxs.mk that explain this
pre-building of the needed tools? That would make it easier to understand...

> 
> > I suppose it is because the purpose of PGXS is never explained, not in the
> > documentation, not in the code comments, and PGXS := $(shell $(PG_CONFIG)
> > --pgxs) sounds like some strange magic spell that explains nothing. In
> > what cases is it defined? In what cases is it not defined? What exactly
> > does it store? Can we make things easier to understand here?
> 
> That's part of a larger scope which is here:
> https://www.postgresql.org/docs/11/extend-pgxs.html

I've carefully read this documentation and did not get a clear explanation
of what the true purpose of the PGXS environment variable is. Only:

"The last three lines should always be the same. Earlier in the file, you
assign variables or add custom make rules."

Maybe it should not be discussed in the doc, but it should be mentioned in
pgxs.mk. I guess both PGXS and NO_PGXS should be explicitly described, in
order to make the rest of the code more readable. (As for me, I now have
some intuitive understanding of its nature, but it would be better to have
a strict explanation.)


So about test_decoding 

contrib/test_decoding/Makefile: 

EXTRA_INSTALL=contrib/test_decoding

This sounds a bit strange to me. Why, in the Makefile for this module,
should we explicitly specify that the module is needed for tests? It is
obviously needed! Man, we are testing it!! ;-)

I would suspect that is should be added somewhere in pgxs.mk code, when it is 
needed. Or this is not as obvious and trivial as I see it?

I guess it came from
-submake-test_decoding:
-	$(MAKE) -C $(top_builddir)/contrib/test_decoding

but still there is no logic in it.


+REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
+ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf

When I read these lines, an idea about ISOLATION_CONF and REGRESS_CONF
variables came to my mind, to be transformed into the proper _OPTS in
pgxs.mk. Since there is only one use case, maybe it is not worth it, so I
just speak this thought aloud without any intention to make it reality.



-$(MAKE) -C $(top_builddir)/src/test/regress all
is replaced with
$(MAKE) -C $(top_builddir)/src/test/regress pg_regress$(X)
in pgxs.mk. I hope changing "all" to "pg_regress$(X)" is OK, because I do
not fully understand that part.



I've also grepped for "pg_isolation_regress_check" and found that it is also
used in src/test/modules/snapshot_too_old, which also uses PGXS (at least as
an option). Shouldn't we include it in your patch too?


So I think that is it. Since Artur said he successfully used your TAP patch
in other extensions, I will not repeat that, considering it already checked.
If you think it is good to recheck, I can do it too :-)





Re: FETCH FIRST clause PERCENT option

2018-11-25 Thread Andrew Gierth
> "Surafel" == Surafel Temesgen  writes:

 Surafel> this is because fetch first values work with integers and it
 Surafel> changes a fractional number to the nearest integer; for example,
 Surafel> select * from generate_series(1, 1000) fetch first 20.3 rows
 Surafel> only; is not an error, rather it returns 20 rows.

 31) The declared type of the <simple value specification> simply
 contained in <fetch first percentage> shall be numeric.

Note that unlike <fetch first row count> there is no requirement for the
type to be _exact_ numeric or to have scale 0.

later:

ii) If <fetch first percentage> is specified, then let FFP be the
<simple value specification> simply contained in <fetch first percentage>,
and let LOCT be a <literal> whose value is OCT. Let
FFRC be the value of

  CEILING ( FFP * LOCT / 100.0E0 )

NOTE 333 -- The percentage is computed using the number
of rows before removing the rows specified by <result offset clause>.

ceil(20.3 * 1000 / 100.0E0) is definitely 203, not 200.
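
The formula is easy to check from SQL, for the record:

    SELECT ceiling(20.3 * 1000 / 100.0E0);  -- returns 203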

-- 
Andrew.



Re: RHEL 8.0 build

2018-11-25 Thread Andrew Dunstan



On 11/24/18 3:49 PM, Tom Lane wrote:

Jeremy Harris  writes:

  Trying to set up a buildfarm animal on a RHEL 8.0 (beta) system,
the build fails early:
...
It appears to be a "configure" script looking for python; and there is
no such.  You can have python3 or python2 - but neither package install
provides a symlink of just "python".

Yeah, some distros are starting to act that way, and I suspect it's
causing pain for a lot of people.

Currently we are agnostic about which python version to use, so if you
don't have anything simply named "python", you have to tell configure
what to use by setting the PYTHON environment variable.

In a buildfarm configuration file this would look something like

 # env settings to pass to configure. These settings will only be seen by
 # configure.
 config_env => {
+   PYTHON => "/usr/local/bin/python3",




or just


    PYTHON => 'python3',


It's installed by default on RH8, although you will also need 
python3-devel installed.





There's been some preliminary discussion about starting to default to
python3, but given this project's inherent conservatism, I don't expect
that to happen for some years yet.  In any case, whenever we do pull
that trigger we'd surely do so only in HEAD not released branches, so
buildfarm owners will need to deal with the case for years more.





Right.


cheers


andrew



--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: FETCH FIRST clause PERCENT option

2018-11-25 Thread Vik Fearing
On 25/11/2018 12:49, Surafel Temesgen wrote:
> 
> 
> On Sun, Nov 25, 2018 at 1:24 PM Vik Fearing wrote:
> 
> 
> Also, this query returns 210 rows instead of the expected 208:
> 
>     select *
>     from generate_series(1, 1000)
>     fetch first 20.8 percent rows only
> 
> this is because fetch first values work with integers and it changes a
> fractional number to the nearest integer; for example, select * from
> generate_series(1, 1000) fetch first 20.3 rows only; is not an error,
> rather it returns 20 rows.

I don't see how this behavior is justified by reading the SQL standard.
 Obviously only an integer number of rows is going to be returned, but
the percentage should be calculated correctly.

I assume you meant 200 rows there, but the correct number of rows to
return is 203 for that query.  Please fix it.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: FETCH FIRST clause PERCENT option

2018-11-25 Thread Surafel Temesgen
On Sun, Nov 25, 2018 at 1:24 PM Vik Fearing wrote:

>
> Also, this query returns 210 rows instead of the expected 208:
>
> select *
> from generate_series(1, 1000)
> fetch first 20.8 percent rows only
>
>

this is because fetch first values work with integers and it changes a
fractional number to the nearest integer; for example, select * from
generate_series(1, 1000) fetch first 20.3 rows only; is not an error,
rather it returns 20 rows.


> As for the design, I think you should put some serious consideration
> into Andrew Gierth's approach.  I would rather see something like that.
>
>

I didn't go in that direction because I don't understand why this approach
is inferior to using a window function, and I don't fully understand how to
implement it.
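
For what it's worth, the PERCENT semantics can be spelled out with a
window function in plain SQL, which is roughly the comparison being made
here (just a sketch, not the proposed executor change):

    SELECT x
    FROM (SELECT g AS x,
                 row_number() OVER () AS rn,
                 count(*) OVER () AS total
          FROM generate_series(1, 1000) g) s
    WHERE rn <= ceil(total * 20.8 / 100.0E0);

which returns 208 rows.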

regards

Surafel


Re: FETCH FIRST clause PERCENT option

2018-11-25 Thread Vik Fearing
On 25/11/2018 10:00, Surafel Temesgen wrote:
> 
> 
> 
> 
> I messed around with your changes to the grammar and it seems you don't
> need to add PERCENT as a reserved keyword.  Moving this to the unreserved
> keyword section does not cause any shift/reduce errors, and the regression
> tests still pass.  Relative to your patch v4, these changes help:
> 
> 
> In the SQL standard PERCENT is listed as a reserved word, so I don't think
> it is something I can change.

Yes it absolutely is.  In PostgreSQL we only make words as reserved as
they need to be, and PERCENT doesn't need to be reserved at all.

Also, this query returns 210 rows instead of the expected 208:

select *
from generate_series(1, 1000)
fetch first 20.8 percent rows only

As for the design, I think you should put some serious consideration
into Andrew Gierth's approach.  I would rather see something like that.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: FETCH FIRST clause PERCENT option

2018-11-25 Thread Surafel Temesgen
> I messed around with your changes to the grammar and it seems you don't
> need to add PERCENT as a reserved keyword.  Moving this to the unreserved
> keyword section does not cause any shift/reduce errors, and the regression
> tests still pass.  Relative to your patch v4, these changes help:
>
>
In the SQL standard PERCENT is listed as a reserved word, so I don't think
it is something I can change.
I think the main reason create_table fails in test_ddl_deparse is because I
didn't surround the PERCENT keyword in
src/test/regress/expected/create_table.out too.

regards
Surafel
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 4db8142afa..8491b7831a 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -44,7 +44,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionexpression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
 [ LIMIT { count | ALL } ]
 [ OFFSET start [ ROW | ROWS ] ]
-[ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY ]
+[ FETCH { FIRST | NEXT } [ count ] [ PERCENT ] { ROW | ROWS } ONLY ]
 [ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF table_name [, ...] ] [ NOWAIT | SKIP LOCKED ] [...] ]
 
 where from_item can be one of:
@@ -1397,7 +1397,7 @@ OFFSET start
 which PostgreSQL also supports.  It is:
 
 OFFSET start { ROW | ROWS }
-FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY
+FETCH { FIRST | NEXT } [ count ] [ PERCENT ] { ROW | ROWS } ONLY
 
 In this syntax, the start
 or count value is required by
@@ -1407,7 +1407,8 @@ FETCH { FIRST | NEXT } [ count ] {
 ambiguity.
 If count is
 omitted in a FETCH clause, it defaults to 1.
-ROW
+with PERCENT count specifies the maximum number of rows to return 
+in percentage.ROW
 and ROWS as well as FIRST
 and NEXT are noise words that don't influence
 the effects of these clauses.
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index 6792f9e86c..762024be61 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -43,6 +43,7 @@ ExecLimit(PlanState *pstate)
 	LimitState *node = castNode(LimitState, pstate);
 	ScanDirection direction;
 	TupleTableSlot *slot;
+	TupleDesc   tupleDescriptor;
 	PlanState  *outerPlan;
 
 	CHECK_FOR_INTERRUPTS();
@@ -52,6 +53,8 @@ ExecLimit(PlanState *pstate)
 	 */
 	direction = node->ps.state->es_direction;
 	outerPlan = outerPlanState(node);
+	slot = node->subSlot;
+	tupleDescriptor = node->ps.ps_ResultTupleDesc;
 
 	/*
 	 * The main logic is a simple state machine.
@@ -60,6 +63,23 @@ ExecLimit(PlanState *pstate)
 	{
 		case LIMIT_INITIAL:
 
+			if (node->limitOption == PERCENTAGE)
+			{
+				/*
+				 * Find all rows the plan will return.
+				 */
+				for (;;)
+				{
+					slot = ExecProcNode(outerPlan);
+					if (TupIsNull(slot))
+					{
+						break;
+					}
+					tuplestore_puttupleslot(node->totalTuple, slot);
+				}
+			}
+
 			/*
 			 * First call for this node, so compute limit/offset. (We can't do
 			 * this any earlier, because parameters from upper nodes will not
@@ -87,24 +107,46 @@ ExecLimit(PlanState *pstate)
 return NULL;
 			}
 
-			/*
-			 * Fetch rows from subplan until we reach position > offset.
-			 */
-			for (;;)
+			if (node->limitOption == PERCENTAGE)
 			{
-				slot = ExecProcNode(outerPlan);
-				if (TupIsNull(slot))
+				for (;;)
 				{
-					/*
-					 * The subplan returns too few tuples for us to produce
-					 * any output at all.
-					 */
-					node->lstate = LIMIT_EMPTY;
-					return NULL;
+					slot = MakeSingleTupleTableSlot(tupleDescriptor, &TTSOpsMinimalTuple);
+					if (!tuplestore_gettupleslot(node->totalTuple, true, true, slot))
+					{
+						node->lstate = LIMIT_EMPTY;
+						return NULL;
+					}
+					else
+					{
+						node->subSlot = slot;
+						if (++node->position > node->offset)
+							break;
+					}
+				}
+			}
+			else if (node->limitOption == EXACT_NUMBER)
+			{
+				/*
+				 * Fetch rows from subplan until we reach position > offset.
+				 */
+				for (;;)
+				{
+					slot = ExecProcNode(outerPlan);
+					if (TupIsNull(slot))
+					{
+						/*
+						 * The subplan returns too few tuples for us to produce
+						 * any output at all.
+						 */
+						node->lstate = LIMIT_EMPTY;
+						return NULL;
+					}
+					node->subSlot = slot;
+					if (++node->position > node->offset)
+						break;
 				}
-				node->subSlot = slot;
-				if (++node->position > node->offset)
-					break;
 			}
 
 			/*
@@ -144,18 +186,34 @@ ExecLimit(PlanState *pstate)
 
 					return NULL;
 				}
+				if (node->limitOption == PERCENTAGE)
+				{
+					if (tuplestore_gettupleslot(node->totalTuple, true, false, slot))
+					{
+						node->subSlot = slot;
+						node->position++;
+					}
+					else
+					{
+						node->lstate = LIMIT_SUBPLANEOF;
+						return NULL;
+					}
+				}
+				else if (node->limitOption == EXACT_NUMBER)
+				{
 
-				/*
-				 * Get next tuple from subplan, if an

Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

2018-11-25 Thread Thomas Munro
On Sun, Nov 25, 2018 at 12:59 PM Thomas Munro
 wrote:
> On Sun, Nov 25, 2018 at 3:38 AM Christoph Berg  wrote:
> > TRAP: FailedAssertion(»!(!IsUnderPostmaster || (wakeEvents & (1 << 5)) || 
> > (wakeEvents & (1 << 4)))«, Datei: 
> > »/build/postgresql-12-JElZNq/postgresql-12-12~~devel~20181124.1158/build/../src/backend/storage/ipc/latch.c«,
> >  Zeile: 389)
> > 2018-11-24 15:20:43.193 CET [17834] LOG:  Serverprozess (PID 18425) wurde 
> > von Signal 6 beendet: Aborted

Fix pushed.

By way of penance, I have now configured PG_TEST_EXTRA="ssl ldap
kerberos" for my build farm animals elver and eelpout.  elver should
pass at the next build, as I just tested it with --nosend, but eelpout
is so slow I'll just take my chances and see if that works.  I'll also
review the firewall config on those VMs, since apparently everyone is
too chicken to run those tests, perhaps for those sorts of reasons.
I've also set those tests up for cfbot, which would have caught this
when draft patches were posted, and also enabled -Werror on cfbot
which would have caught a GCC warning I missed because I usually
develop/test with clang.

-- 
Thomas Munro
http://www.enterprisedb.com