Re: Fix for configure error in 9.5/9.6 on macOS 11.0 Big Sur

2020-09-03 Thread Tom Lane
Peter Eisentraut  writes:
> I suppose backpatching the patch that fixed this would be appropriate.

[ confused ... ]  Back-patching what patch?

regards, tom lane




Re: Fix for configure error in 9.5/9.6 on macOS 11.0 Big Sur

2020-09-03 Thread Peter Eisentraut

On 2020-09-03 19:36, Tom Lane wrote:

At least, that's what I got when I reinstalled Xcode just now on
my Catalina machine.  It does not exhibit this behavior.  I see

$ clang -c c.c
c.c:1:14: warning: implicitly declaring library function 'exit' with type 'void
   (int) __attribute__((noreturn))' [-Wimplicit-function-declaration]
int main() { exit(0); }
  ^
c.c:1:14: note: include the header <stdlib.h> or explicitly provide a
   declaration for 'exit'
1 warning generated.

and PG configure and build goes through just fine.

Smells like an Apple bug from here.  Surely they're not expecting
that anyone will appreciate -Werror suddenly being the default.


IIRC, calling an undeclared function is (or may be?) an error in C99. 
So perhaps the implicit -Werror only applies to this particular warning 
class.


I suppose backpatching the patch that fixed this would be appropriate.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: history file on replica and double switchover

2020-09-03 Thread Fujii Masao




On 2020/09/04 8:29, Anastasia Lubennikova wrote:

On 27.08.2020 16:02, Grigory Smolkin wrote:

Hello!

I've noticed that when running switchover replica to master and back to
replica, new history file is streamed to replica, but not archived,
which is not great, because it breaks PITR if archiving is running on replica. 
The fix looks trivial.
Bash script to reproduce the problem and patch are attached.


Thanks for the report. I agree that it looks like a bug.


+1

+   /* Mark history file as ready for archiving */
+   if (XLogArchiveMode != ARCHIVE_MODE_OFF)
+   XLogArchiveNotify(fname);

I agree that the history file should be archived on the standby when
archive_mode=always. But why do we need to do that when archive_mode=on?
I'm just concerned about the case where the primary and standby
have the shared archive area, and archive_mode is on.
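
For illustration, a minimal sketch of the narrower guard being discussed here,
assuming the existing XLogArchiveMode GUC, the ARCHIVE_MODE_ALWAYS constant,
and the XLogArchiveNotify() helper; this is only a sketch of the alternative,
not the submitted patch:

	/*
	 * Sketch: notify the archiver about the new history file only when the
	 * standby archives on its own (archive_mode = always), so that a shared
	 * archive area with archive_mode = on does not receive duplicates.
	 */
	if (XLogArchiveMode == ARCHIVE_MODE_ALWAYS)
		XLogArchiveNotify(fname);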



For some reason, the patch failed to apply on current master, even though I
don't see any difference in the code.
I'll attach this thread to the next commitfest, so it doesn't get lost.


Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: New statistics for tuning WAL buffer size

2020-09-03 Thread Fujii Masao




On 2020/09/04 11:50, tsunakawa.ta...@fujitsu.com wrote:

From: Fujii Masao 

I changed the view name from pg_stat_walwrites to pg_stat_walwriter.
I think it is better to match the naming scheme with other views like
pg_stat_bgwriter, which is for bgwriter statistics but also has
statistics related to backends.


I prefer the view name pg_stat_walwriter for the consistency with
other view names. But we also have pg_stat_wal_receiver. Which
makes me think that maybe pg_stat_wal_writer is better for
the consistency. Thought? IMO either of them works for me.
I'd like to hear more opinions about this.


I think pg_stat_bgwriter is now a misnomer, because it contains the backends' 
activity.  Likewise, pg_stat_walwriter leads to misunderstanding because its 
information is not limited to WAL writer.

How about simply pg_stat_wal?  In the future, we may want to include WAL reads 
in this view, e.g. reading undo logs in zheap.


Sounds reasonable.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: proposal: possibility to read dumped table's name from file

2020-09-03 Thread Pavel Stehule
On Fri, Sep 4, 2020 at 2:15 AM, Tom Lane wrote:

> Alvaro Herrera  writes:
> > So, Tom added a coding pattern for doing this in commit 8f8154a503c7,
> > which is ostensibly also to be used in pg_regress [1] -- maybe it'd be
> > useful to have this in src/common?
>
> Done, see pg_get_line() added by 67a472d71.
>

Here is an updated patch for pg_dump.

Regards

Pavel



>
> regards, tom lane
>
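The patch below documents a one-line filter format, (+|-)[tnfd] objectname.
As a reading aid, here is a minimal, self-contained sketch of parsing one such
line; parse_filter_line() is a hypothetical helper written for illustration,
not the patch's actual read_patterns_from_file() code:

#include <stdbool.h>
#include <string.h>

/* Parsed form of one filter line: (+|-)[tnfd] objectname */
typedef struct FilterLine
{
	bool		include;	/* '+' = include, '-' = exclude */
	char		objtype;	/* 't', 'n', 'f', or 'd' */
	const char *name;		/* points into the caller's buffer */
} FilterLine;

/* Returns true on success; comment (#), blank, and malformed lines yield false. */
static bool
parse_filter_line(char *line, FilterLine *out)
{
	if (line[0] == '#' || line[0] == '\n' || line[0] == '\0')
		return false;			/* comment or blank line: ignore */
	if (line[0] != '+' && line[0] != '-')
		return false;			/* must start with + or - */
	if (line[1] == '\0' || strchr("tnfd", line[1]) == NULL || line[2] != ' ')
		return false;			/* unknown object type or missing separator */

	line[strcspn(line, "\n")] = '\0';	/* strip the trailing newline */
	out->include = (line[0] == '+');
	out->objtype = line[1];
	out->name = line + 3;
	return true;
}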
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 0b2e2de87b..48dd75bfb1 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -755,6 +755,56 @@ PostgreSQL documentation
   
  
 
+ 
+  --filter=filename
+  
+   
+Read object filters from the specified file.
+If you use "-" as a filename, the filters are read from stdin.
+The lines of this file must have the following format:
+
+(+|-)[tnfd] objectname
+
+   
+
+   
+The first character specifies whether the object is to be included
+(+) or excluded (-), and the
+second character specifies the type of object to be filtered:
+t (table),
+n (schema),
+f (foreign table),
+d (table data).
+   
+
+   
+With the following filter file, the dump would include table
+mytable1 and data from foreign table
+some_foreign_table, but exclude data
+from table mytable2.
+
++t mytable1
++f some_foreign_table
+-d mytable2
+
+   
+
+   
+Lines starting with the symbol # are ignored;
+no preceding white space (spaces or tabs) is allowed.  Such
+lines can be used for comments and notes.
+   
+
+   
+The --filter option works just like the other
+options to include or exclude tables, schemas, table data, or foreign
+tables, and both forms may be combined.  Note that there are no options
+to exclude a specific foreign table or to include a specific table's
+data.
+   
+  
+ 
+
  
   --if-exists
   
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index d3ca54e4dc..1d3c97ad97 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -53,6 +53,7 @@
 #include "catalog/pg_trigger_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/connect.h"
+#include "common/string.h"
 #include "dumputils.h"
 #include "fe_utils/string_utils.h"
 #include "getopt_long.h"
@@ -291,6 +292,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 static char *get_synchronized_snapshot(Archive *fout);
 static void setupDumpWorker(Archive *AHX);
 static TableInfo *getRootTableInfo(TableInfo *tbinfo);
+static void read_patterns_from_file(char *filename, DumpOptions *dopt);
 
 
 int
@@ -365,6 +367,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, _row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"extra-float-digits", required_argument, NULL, 8},
+		{"filter", required_argument, NULL, 12},
 		{"if-exists", no_argument, _exists, 1},
 		{"inserts", no_argument, NULL, 9},
 		{"lock-wait-timeout", required_argument, NULL, 2},
@@ -604,6 +607,10 @@ main(int argc, char **argv)
 		  optarg);
 break;
 
+			case 12:			/* filter implementation */
+read_patterns_from_file(optarg, &dopt);
+break;
+
 			default:
 fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 exit_nicely(1);
@@ -1023,6 +1030,8 @@ help(const char *progname)
 			 "   access to)\n"));
 	printf(_("  --exclude-table-data=PATTERN do NOT dump data for the specified table(s)\n"));
 	printf(_("  --extra-float-digits=NUM override default setting for extra_float_digits\n"));
+	printf(_("  --filter=FILENAMEdump objects and data based on the filter expressions\n"
+			 "   from the filter file\n"));
 	printf(_("  --if-exists  use IF EXISTS when dropping objects\n"));
 	printf(_("  --include-foreign-data=PATTERN\n"
 			 "   include data of foreign tables on foreign\n"
@@ -18473,3 +18482,157 @@ appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 	if (!res)
 		pg_log_warning("could not parse reloptions array");
 }
+
+/*
+ * Print error message and exit.
+ */
+static void
+exit_invalid_filter_format(FILE *fp, char *filename, char *message, char *line, int lineno)
+{
+	pg_log_error("invalid format of filter file \"%s\": %s",
+ filename,
+ message);
+
+	fprintf(stderr, "%d: %s\n", lineno, line);
+
+	if (fp != stdin)
+		fclose(fp);
+
+	exit_nicely(-1);
+}
+
+/*
+ * Read dumped object specification from file
+ */
+static void
+read_patterns_from_file(char *filename, DumpOptions *dopt)
+{
+	FILE	   *fp;
+	char	   *line;
+	int			lineno = 0;
+
+	/* use "-" as symbol for stdin */
+	if (strcmp(filename, "-") != 0)
+	{
+		fp = fopen(filename, "r");
+		if (!fp)
+			

Re: Clang UndefinedBehaviorSanitize (Postgres14) Detected undefined-behavior

2020-09-03 Thread Noah Misch
On Thu, Sep 03, 2020 at 10:53:37PM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > We do assume dereferencing NULL would crash, but we also assume this
> > optimization doesn't happen:
> 
> > #ifndef REMOVE_MEMCPY
> > memcpy(dest, src, n);
> > #endif
> > if (src)
> > pause();
> 
> > [ gcc believes the if-test is unnecessary ]
> 
> > So yes, it would be reasonable to adopt -fno-delete-null-pointer-checks 
> > and/or
> > remove -fno-sanitize=nonnull-attribute from buildfarm member thorntail.

> If there actually are places where this is a problem, I think we
> need to fix it by doing
> 
>   if (n > 0)
>   memcpy(dest, src, n);
> 
> so that the compiler can no longer assume that src!=NULL even
> when n is zero.  I'd still leave -fdelete-null-pointer-checks
> enabled, because it can make valid and useful optimizations in
> other cases.  (Besides that, it's far from clear that disabling
> that flag would suppress all bad consequences of the assumption
> that memcpy's arguments aren't null.)

Your proposal is what I had in mind when I wrote "remove
-fno-sanitize=nonnull-attribute from buildfarm member thorntail", and I agree
it's attractive.  In particular, gcc is not likely to be the last compiler to
attempt such an optimization, and other compilers may not offer
-fno-delete-null-pointer-checks or equivalent.

One might argue for -fno-delete-null-pointer-checks in addition, because it
would defend against cases that sanitizers miss.  I tend to think that's
overkill, but maybe not.  I suppose one could do an audit, diffing the
generated code with and without the option.




Re: Clang UndefinedBehaviorSanitize (Postgres14) Detected undefined-behavior

2020-09-03 Thread Tom Lane
Peter Geoghegan  writes:
> On Thu, Sep 3, 2020 at 7:53 PM Tom Lane  wrote:
>> I'd still leave -fdelete-null-pointer-checks
>> enabled, because it can make valid and useful optimizations in
>> other cases.

> Is there any evidence that that's true? I wouldn't assume that the gcc
> people exercised good judgement here.

I have not actually dug for examples, but the sort of situation where
I think it would help us is that macros or static inlines could contain
null tests that can be proven useless at particular call sites due to
surrounding code.
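
For concreteness, a tiny example of the kind of call-site proof being
described, assuming GCC at -O2 (where -fdelete-null-pointer-checks is enabled
by default):

static inline int
first_elem(int *arr)
{
	if (arr == NULL)		/* defensive check inside the inline helper */
		return -1;
	return arr[0];
}

int
caller(int *arr)
{
	int		head = arr[0];	/* this dereference lets the compiler assume
							 * arr != NULL from here on... */

	/* ...so the NULL test inside the inlined helper can be deleted. */
	return head + first_elem(arr);
}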

regards, tom lane




Re: Clang UndefinedBehaviorSanitize (Postgres14) Detected undefined-behavior

2020-09-03 Thread Peter Geoghegan
On Thu, Sep 3, 2020 at 7:53 PM Tom Lane  wrote:
> Hm.  I would not blame that on -fdelete-null-pointer-checks per se.
> Rather the problem is what we were touching on before: the dubious
> but standard-approved assumption that memcpy's arguments cannot be
> null.

Isn't it both, together? That is, it's the combination of that
assumption alongside -fdelete-null-pointer-checks's actual willingness
to propagate the assumption.

> I'd still leave -fdelete-null-pointer-checks
> enabled, because it can make valid and useful optimizations in
> other cases.

Is there any evidence that that's true? I wouldn't assume that the gcc
people exercised good judgement here.

-- 
Peter Geoghegan




Re: [PATCH] Redudant initilization

2020-09-03 Thread Bruce Momjian
On Wed, Apr  1, 2020 at 08:57:18AM -0300, Ranier Vilela wrote:
> Hi,
> New patch with yours suggestions.

Patch applied to head, thanks.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: default partition and concurrent attach partition

2020-09-03 Thread Amit Langote
Hi Alvaro,

On Fri, Sep 4, 2020 at 6:28 AM Alvaro Herrera  wrote:
>
> Also, I should have pointed out that ExecInsert doesn't actually check
> the partition constraint except in very specific cases; what it does is
> expect that the partition routing code got it right.  So the comment
> you're adding about that is wrong, and it did mislead me into changing
> your code in a way that actually caused a bug -- hence my proposed
> rewording.

Thanks for taking a look.

/*
 * If this partition is the default one, we must check its partition
-* constraint, because it may have changed due to partitions being
-* added to the parent concurrently.  We do the check here instead of
-* in ExecInsert(), because we would not want to miss checking the
-* constraint of any nonleaf partitions as they never make it to
-* ExecInsert().
+* constraint now, which may have changed due to partitions being
+* added to the parent concurrently.

I am fine with your rewording but let me explain how I ended up
writing what I wrote:

At first I had pondered tweaking the following code in ExecInsert() to
fix this bug:

/*
 * Also check the tuple against the partition constraint, if there is
 * one; except that if we got here via tuple-routing, we don't need to
 * if there's no BR trigger defined on the partition.
 */
if (resultRelInfo->ri_PartitionCheck &&
(resultRelInfo->ri_PartitionRoot == NULL ||
 (resultRelInfo->ri_TrigDesc &&
  resultRelInfo->ri_TrigDesc->trig_insert_before_row)))
ExecPartitionCheck(resultRelInfo, slot, estate, true);

I gave up because I concluded that there isn't enough information at
this place in code to determine if the partition is a default
partition, so I moved my attention to ExecFindPartition() where we
have access to the parent's PartitionDesc which is enough to do so.
In the process of modifying ExecFindPartition() to fix the bug I
realized that default partitions can be partitioned and
sub-partitioned partitions never reach ExecInsert().  So, beside the
earlier mentioned inconvenience of fixing this bug in ExecInsert(),
there is also the problem that such a fix would have missed addressing
sub-partitioned default partitions.  I thought someone beside me would
also wonder why ExecInsert() is not touched in this fix, hence the
comment.

FWIW, I still prefer "minimally valid ResultRelInfo" to "fake
ResultRelInfo", because those aren't really fake, are they?  They are
as valid as any other ResultRelInfo as far I can see.  I said
"minimally valid" because a fully-valid partition ResultRelInfo is one
made by ExecInitPartitionInfo(), which I avoided to use here thinking
that would be overkill.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com




Re: Clang UndefinedBehaviorSanitize (Postgres14) Detected undefined-behavior

2020-09-03 Thread Tom Lane
Noah Misch  writes:
> We do assume dereferencing NULL would crash, but we also assume this
> optimization doesn't happen:

> #ifndef REMOVE_MEMCPY
> memcpy(dest, src, n);
> #endif
> if (src)
>   pause();

> [ gcc believes the if-test is unnecessary ]

Hm.  I would not blame that on -fdelete-null-pointer-checks per se.
Rather the problem is what we were touching on before: the dubious
but standard-approved assumption that memcpy's arguments cannot be
null.

If there actually are places where this is a problem, I think we
need to fix it by doing

if (n > 0)
memcpy(dest, src, n);

so that the compiler can no longer assume that src!=NULL even
when n is zero.  I'd still leave -fdelete-null-pointer-checks
enabled, because it can make valid and useful optimizations in
other cases.  (Besides that, it's far from clear that disabling
that flag would suppress all bad consequences of the assumption
that memcpy's arguments aren't null.)

regards, tom lane




RE: New statistics for tuning WAL buffer size

2020-09-03 Thread tsunakawa.ta...@fujitsu.com
From: Fujii Masao 
> > I changed the view name from pg_stat_walwrites to pg_stat_walwriter.
> > I think it is better to match the naming scheme with other views like
> > pg_stat_bgwriter, which is for bgwriter statistics but also has
> > statistics related to backends.
> 
> I prefer the view name pg_stat_walwriter for the consistency with
> other view names. But we also have pg_stat_wal_receiver. Which
> makes me think that maybe pg_stat_wal_writer is better for
> the consistency. Thought? IMO either of them works for me.
> I'd like to hear more opinions about this.

I think pg_stat_bgwriter is now a misnomer, because it contains the backends' 
activity.  Likewise, pg_stat_walwriter leads to misunderstanding because its 
information is not limited to WAL writer.

How about simply pg_stat_wal?  In the future, we may want to include WAL reads 
in this view, e.g. reading undo logs in zheap.


Regards
Takayuki Tsunakawa







Re: Get memory contexts of an arbitrary backend process

2020-09-03 Thread Kasahara Tatsuhito
On Fri, Sep 4, 2020 at 2:40 AM Tom Lane  wrote:
> Kasahara Tatsuhito  writes:
> > Yes, but it's not only for future expansion, but also for the
> > usability and the stability of this feature.
> > For example, if you want to read one dumped file multiple times and analyze 
> > it,
> > you will want the ability to just read the dump.
>
> If we design it to make that possible, how are we going to prevent disk
> space leaks from never-cleaned-up dump files?
My thought is that, alongside a view that allows us to see a list of
dumped files, it would be better to have a function that simply deletes
the dump files associated with a specific PID, or deletes all dump files.
Some files may be dumped with unexpected delays, so I think the
cleanup feature will be necessary.
(Also, as with pgsql_tmp files, it might be better to delete dump files
when PostgreSQL starts.)

Or should we try to delete the dump file as soon as we can read it?
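
For illustration, a minimal sketch of the kind of per-PID cleanup function
being discussed; the pg_memusage directory name and the "<pid>.dump" file
naming are assumptions made up for this example:

#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Sketch: remove the dump file belonging to one backend PID, assuming dump
 * files are named "<pid>.dump" under a hypothetical pg_memusage directory.
 */
static void
remove_memctx_dump(const char *dir, int pid)
{
	DIR		   *d = opendir(dir);
	struct dirent *de;
	char		expected[32];
	char		path[1024];

	if (d == NULL)
		return;					/* nothing to clean up */
	snprintf(expected, sizeof(expected), "%d.dump", pid);
	while ((de = readdir(d)) != NULL)
	{
		if (strcmp(de->d_name, expected) != 0)
			continue;
		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		unlink(path);			/* best effort; errors are ignored here */
	}
	closedir(d);
}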

Best regards,

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com




Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-03 Thread Justin Pryzby
On Wed, Sep 02, 2020 at 06:07:06PM -0500, Justin Pryzby wrote:
> On my side, I've also rearranged function parameters to make the diff more
> readable.  And squishes your changes into the respective patches.

This resolves a breakage I failed to notice from a last-minute edit.
And squishes two commits.
And rebased on Michael's commit removing ReindexStmt->concurrent.

-- 
Justin
>From 7e04caad0d010b5fd3eeca8d9bd436e89b657e4a Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Mar 2020 17:50:46 -0500
Subject: [PATCH v26 1/5] Change REINDEX/CLUSTER to accept an option list..

..like EXPLAIN (..), VACUUM (..), and ANALYZE (..).

Change docs in the style of VACUUM.  See also: 52dcfda48778d16683c64ca4372299a099a15b96
---
 doc/src/sgml/ref/cluster.sgml  | 27 ++---
 doc/src/sgml/ref/reindex.sgml  | 43 ++-
 src/backend/commands/cluster.c | 28 --
 src/backend/nodes/copyfuncs.c  |  4 +--
 src/backend/nodes/equalfuncs.c |  4 +--
 src/backend/parser/gram.y  | 54 +++---
 src/backend/tcop/utility.c | 42 ++
 src/bin/psql/tab-complete.c| 22 ++
 src/include/commands/cluster.h |  3 +-
 src/include/commands/defrem.h  |  2 +-
 src/include/nodes/parsenodes.h |  4 +--
 11 files changed, 170 insertions(+), 63 deletions(-)

diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 4da60d8d56..e6ebce27e6 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -21,8 +21,13 @@ PostgreSQL documentation
 
  
 
-CLUSTER [VERBOSE] table_name [ USING index_name ]
-CLUSTER [VERBOSE]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ] table_name [ USING index_name ]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ]
+
+where option can be one of:
+
+VERBOSE [ boolean ]
+
 
  
 
@@ -81,6 +86,15 @@ CLUSTER [VERBOSE]
   Parameters
 
   
+   
+VERBOSE
+
+ 
+  Prints a progress report as each table is clustered.
+ 
+
+   
+

 table_name
 
@@ -100,10 +114,15 @@ CLUSTER [VERBOSE]

 

-VERBOSE
+boolean
 
  
-  Prints a progress report as each table is clustered.
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
  
 

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index c16f223e4e..a32e192a87 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -25,7 +25,7 @@ REINDEX [ ( option [, ...] ) ] { IN
 
 where option can be one of:
 
-VERBOSE
+VERBOSE [ boolean ]
 
  
 
@@ -141,19 +141,6 @@ REINDEX [ ( option [, ...] ) ] { IN
 

 
-   
-name
-
- 
-  The name of the specific index, table, or database to be
-  reindexed.  Index and table names can be schema-qualified.
-  Presently, REINDEX DATABASE and REINDEX SYSTEM
-  can only reindex the current database, so their parameter must match
-  the current database's name.
- 
-
-   
-

 CONCURRENTLY
 
@@ -181,6 +168,34 @@ REINDEX [ ( option [, ...] ) ] { IN
  
 

+
+   
+name
+
+ 
+  The name of the specific index, table, or database to be
+  reindexed.  Index and table names can be schema-qualified.
+  Presently, REINDEX DATABASE and REINDEX SYSTEM
+  can only reindex the current database, so their parameter must match
+  the current database's name.
+ 
+
+   
+
+   
+boolean
+
+ 
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
+ 
+
+   
+
   
  
 
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 0d647e912c..4a3678831d 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -35,6 +35,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/defrem.h"
 #include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
@@ -102,8 +103,29 @@ static List *get_tables_to_cluster(MemoryContext cluster_context);
  *---
  */
 void
-cluster(ClusterStmt *stmt, bool isTopLevel)
+cluster(ParseState *pstate, ClusterStmt *stmt, bool isTopLevel)
 {
+	ListCell	*lc;
+	int			options = 0;
+
+	/* Parse list of generic parameters not handled by the parser */
+	foreach(lc, stmt->params)
+	{
+		DefElem	*opt = (DefElem *) lfirst(lc);
+
+		if (strcmp(opt->defname, "verbose") == 0)
+			if (defGetBoolean(opt))
+options |= CLUOPT_VERBOSE;
+			else
+options 

Re: Clang UndefinedBehaviorSanitize (Postgres14) Detected undefined-behavior

2020-09-03 Thread Noah Misch
On Sat, Aug 29, 2020 at 12:36:42PM -0400, Tom Lane wrote:
> Peter Geoghegan  writes:
> > I wonder if we should start using -fno-delete-null-pointer-checks:
> > https://lkml.org/lkml/2018/4/4/601
> > This may not be strictly relevant to the discussion, but I was
> > reminded of it just now and thought I'd mention it.
> 
> Hmm.  gcc 8.3 defines this as:
> 
>  Assume that programs cannot safely dereference null pointers, and
>  that no code or data element resides at address zero.  This option
>  enables simple constant folding optimizations at all optimization
>  levels.  In addition, other optimization passes in GCC use this
>  flag to control global dataflow analyses that eliminate useless
>  checks for null pointers; these assume that a memory access to
>  address zero always results in a trap, so that if a pointer is
>  checked after it has already been dereferenced, it cannot be null.
> 
> AFAICS, that's a perfectly valid assumption for our usage.  I can see why
> the kernel might not want it, but we set things up whenever possible to
> ensure that dereferencing NULL would crash.

We do assume dereferencing NULL would crash, but we also assume this
optimization doesn't happen:

=== opt-null.c
#include 
#include 

int my_memcpy(void *dest, const void *src, size_t n)
{
#ifndef REMOVE_MEMCPY
memcpy(dest, src, n);
#endif
if (src)
pause();
return 0;
}
===

$ gcc --version
gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ diff -sU1 <(gcc -O2 -fno-delete-null-pointer-checks -S -o- opt-null.c) <(gcc -O2 -S -o- opt-null.c)
--- /dev/fd/63  2020-09-03 19:23:53.206864378 -0700
+++ /dev/fd/62  2020-09-03 19:23:53.206864378 -0700
@@ -8,13 +8,8 @@
.cfi_startproc
-   pushq   %rbx
+   subq$8, %rsp
.cfi_def_cfa_offset 16
-   .cfi_offset 3, -16
-   movq%rsi, %rbx
callmemcpy@PLT
-   testq   %rbx, %rbx
-   je  .L2
callpause@PLT
-.L2:
xorl%eax, %eax
-   popq%rbx
+   addq$8, %rsp
.cfi_def_cfa_offset 8
$ diff -sU1 <(gcc -DREMOVE_MEMCPY -O2 -fno-delete-null-pointer-checks -S -o- opt-null.c) <(gcc -DREMOVE_MEMCPY -O2 -S -o- opt-null.c)
Files /dev/fd/63 and /dev/fd/62 are identical


So yes, it would be reasonable to adopt -fno-delete-null-pointer-checks and/or
remove -fno-sanitize=nonnull-attribute from buildfarm member thorntail.

> However, while grepping the manual for that I also found
> 
> '-Wnull-dereference'
>  Warn if the compiler detects paths that trigger erroneous or
>  undefined behavior due to dereferencing a null pointer.  This
>  option is only active when '-fdelete-null-pointer-checks' is
>  active, which is enabled by optimizations in most targets.  The
>  precision of the warnings depends on the optimization options used.
> 
> I wonder whether turning that on would find anything interesting.

Promising.  Sadly, it doesn't warn for the above test case.




Re: [PATCH] - Provide robust alternatives for replace_string

2020-09-03 Thread Alvaro Herrera
Note that starting with commit 67a472d71c98 you can use pg_get_line and
not worry about the hard part of this anymore :-)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Rare deadlock failure in create_am test

2020-09-03 Thread Tom Lane
conchuela just showed an unusual failure:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=conchuela&dt=2020-09-03%2023%3A00%3A36

The core of it is a deadlock failure in create_am.sql; there's then
some follow-on noise from not having successfully dropped the test AM.
The deadlock looks like:

2020-09-04 01:05:06.904 CEST [609175:34] pg_regress/create_am LOG:  process 609175 detected deadlock while waiting for AccessExclusiveLock on relation 17347 of database 16384 after 4616.873 ms
2020-09-04 01:05:06.904 CEST [609175:35] pg_regress/create_am DETAIL:  Process holding the lock: 609183. Wait queue: .
2020-09-04 01:05:06.904 CEST [609175:36] pg_regress/create_am STATEMENT:  DROP ACCESS METHOD gist2 CASCADE;
2020-09-04 01:05:06.904 CEST [609175:37] pg_regress/create_am ERROR:  deadlock detected
2020-09-04 01:05:06.904 CEST [609175:38] pg_regress/create_am DETAIL:  Process 609175 waits for AccessExclusiveLock on relation 17347 of database 16384; blocked by process 609183.
	Process 609183 waits for RowExclusiveLock on relation 20095 of database 16384; blocked by process 609175.
	Process 609175: DROP ACCESS METHOD gist2 CASCADE;
	Process 609183: autovacuum: VACUUM ANALYZE public.fast_emp4000
2020-09-04 01:05:06.904 CEST [609175:39] pg_regress/create_am HINT:  See server log for query details.
2020-09-04 01:05:06.904 CEST [609175:40] pg_regress/create_am STATEMENT:  DROP ACCESS METHOD gist2 CASCADE;
2020-09-04 01:05:06.904 CEST [609183:11] LOG:  process 609183 acquired RowExclusiveLock on relation 20095 of database 16384 after 13377.776 ms
2020-09-04 01:04:52.895 CEST [609183:6] LOG:  automatic analyze of table "regression.public.tenk2" system usage: CPU: user: 0.03 s, system: 0.00 s, elapsed: 0.59 s

So it's not hard to understand the problem: DROP of an index AM, cascading
to an index, will need to acquire lock on the index and then lock on the
index's table.  Any other operation on the table, like say autovacuum,
is going to acquire locks in the other direction.

This is pretty rare, but not unheard of:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2020-03-24%2022%3A00%3A23

(There might be more such failures, but I only looked back six months,
and these two are all I found in that window.)

I'm inclined to think that the best fix is to put

begin;
lock table tenk2;
...
commit;

around the DROP CASCADE.  We could alternatively disable autovacuum on
tenk2, but that could have undesirable side-effects.

regards, tom lane




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-09-03 Thread Amit Kapila
On Fri, Sep 4, 2020 at 3:10 AM Bossart, Nathan  wrote:
>
> I noticed a small compiler warning for this.
>
> diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
> index 812aca8011..88d3444c39 100644
> --- a/src/backend/replication/logical/worker.c
> +++ b/src/backend/replication/logical/worker.c
> @@ -199,7 +199,7 @@ typedef struct ApplySubXactData
>  static ApplySubXactData subxact_data = {0, 0, InvalidTransactionId, NULL};
>
>  static void subxact_filename(char *path, Oid subid, TransactionId xid);
> -static void changes_filename(char *path, Oid subid, TransactionId xid);
> +static inline void changes_filename(char *path, Oid subid, TransactionId xid);
>

Thanks for the report, I'll take care of this. I think the nearby
similar function subxact_filename() should also be inline.

-- 
With Regards,
Amit Kapila.




Concurrent failure in create_am with buildfarm member conchuela

2020-09-03 Thread Michael Paquier
Hi all,

conchuela has just reported the following error:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=conchuela&dt=2020-09-03%2023%3A00%3A36

-- Drop access method cascade
 DROP ACCESS METHOD gist2 CASCADE;
 NOTICE:  drop cascades to index grect2ind2
+ERROR:  deadlock detected
+DETAIL:  Process 609175 waits for AccessExclusiveLock on relation 17347 of database 16384; blocked by process 609183.
+Process 609183 waits for RowExclusiveLock on relation 20095 of database 16384; blocked by process 609175.
+HINT:  See server log for query details.

Looking at the logs, 609183 is running an autovacuum analyze:
2020-09-04 01:04:52.895 CEST [609183:6] LOG:  automatic analyze of table "regression.public.tenk2" system usage: CPU: user: 0.03 s, system: 0.00 s, elapsed: 0.59 s

I have not looked at that stuff much yet.

Thanks,
--
Michael




Re: Refactor ReindexStmt and its "concurrent" boolean

2020-09-03 Thread Michael Paquier
On Thu, Sep 03, 2020 at 11:14:38AM +0900, Michael Paquier wrote:
> It should, thanks for looking at it.  Let's wait a couple of days and
> see if others have any comments.  If there are no objections, I'll try
> to commit this one.

And applied.
--
Michael




Re: [NBTREE] Possible NULL pointer dereference (backend/access/nbtree/nbutils.c)

2020-09-03 Thread Michael Paquier
On Thu, Sep 03, 2020 at 11:39:57AM -0300, Ranier Vilela wrote:
> I'm using about 4 static analysis tools.

It seems to me that this is the root of the problem here.
--
Michael




Re: Support for NSS as a libpq TLS backend

2020-09-03 Thread Michael Paquier
On Thu, Sep 03, 2020 at 03:26:03PM -0400, Andrew Dunstan wrote:
>> The 0001 patch isn't strictly necessary, but it seems reasonable to address the
>> various ways OpenSSL was spelled out in the docs while also updating the SSL
>> portions.  It essentially ensures that markup around OpenSSL and SSL is used
>> consistently.  I didn't address the line lengths being too long in this patch,
>> to make review easier.
> 
> I'll take a look.

Adding a <productname> markup around OpenSSL in the docs makes things
consistent.  +1.
--
Michael




Re: BUG #16419: wrong parsing BC year in to_date() function

2020-09-03 Thread Bruce Momjian
On Wed, Jul 15, 2020 at 09:26:53AM -0700, David G. Johnston wrote:
> On Tue, May 12, 2020 at 8:56 PM Laurenz Albe  wrote:
> 
> On Tue, 2020-05-12 at 18:09 -0700, David G. Johnston wrote:
> > Redirecting to -hackers for visibility.  I feel there needs to be
> something done here, even if just documentation (a bullet in the usage
> notes section - and a code comment update for the macro)
> > pointing this out and not changing any behavior.
> 
> Since "to_date" is an Oracle compatibility function, here is what Oracle
> 18.4 has to say to that:
> 
> SQL> SELECT to_date('', '') FROM dual;
> SELECT to_date('', '') FROM dual
>                *
> ERROR at line 1:
> ORA-01841: (full) year must be between -4713 and +9999, and not be 0
> 
> 
> 
> Attached is a concrete patch (back-patchable hopefully) documenting the 
> current
> reality.
> 
> As noted in the patch commit message (commentary really):
> 
> make_timestamp not agreeing with make_date on how to handle negative years
> should probably just be fixed - but that is for someone else to handle.
> 
> Whether to actually change the behavior of to_date is up for debate though I
> would presume it would not be back-patched.

OK, so, looking at this thread, we have to_date() treating -1 as -2 BC,
make_date() treating -1 as 1 BC, and we have Oracle, which to_date() is
supposed to match, treating -1 as 1 BC.

Because we already have the to_date/make_date inconsistency, and the -1
to -2 BC mapping is confusing, and doesn't match Oracle, I think the
clean solution is to change PG 14 to treat -1 as 1 BC, and document the
incompatibility in the release notes.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Switch to multi-inserts for pg_depend

2020-09-03 Thread Michael Paquier
On Thu, Sep 03, 2020 at 10:50:49AM -0400, Alvaro Herrera wrote:
> I'm not sure you need the second sentence in this comment; keeping the
> "delay initialization until ..." part seems sufficient.  If you really
> want to highlight that initialization is costly, maybe just say "delay
> costly initialization".

Thanks for the review.

This extra comment was to answer Daniel's suggestion upthread, and
the simple wording you are suggesting is much better than what I did,
so I have just added "costly initialization" in those two places.

>> +/*
>> + * Record the Dependency.  Note we don't bother to check for duplicate
>> + * dependencies; there's no harm in them.
>> + */
> 
> No need to uppercase "dependency".  (I know this is carried forward from
> prior comment, but it was equally unnecessary there.)

Thanks, fixed.
--
Michael




Re: Disk-based hash aggregate's cost model

2020-09-03 Thread Jeff Davis
On Tue, 2020-09-01 at 23:19 +0200, Tomas Vondra wrote:
> FWIW any thoughts about the difference in temp size compared to
> CP_SMALL_TLIST?

Are you referring to results from a while ago? In this thread I don't
see what you're referring to.

I tried in a simple case on REL_13_STABLE, with and without the
CP_SMALL_TLIST change, and I saw only a tiny difference. Do you have a
current case that shows a larger difference?

The only thing I can think of that might change is the size of the null
bitmap or how fields are aligned.

Regards,
Jeff Davis






Re: Reply: how to create index concurrently on partitioned table

2020-09-03 Thread Michael Paquier
On Thu, Sep 03, 2020 at 10:02:58PM +0300, Anastasia Lubennikova wrote:
> First of all, this patch fails at cfbot:
> 
> indexcmds.c:2848:7: error: variable ‘parentoid’ set but not used [-Werror=unused-but-set-variable]
>   Oid parentoid;
>       ^

Missed this warning, thanks for pointing it out.  This is an incorrect
rebase: this variable was used as a workaround to take a session lock
on the parent table to be reindexed, session lock logic existing to
prevent cache lookup errors when dropping some portions of the tree
concurrently (1d65416 as you already know).  But we don't need that
anymore.

> If this guessed fix is correct, I see the problem in the patch logic. In
> reindex_partitions() we collect parent relations to pass them to
> reindex_multiple_internal(). It implicitly changes the logic from REINDEX
> INDEX to REINDEX RELATION, which is not the same if the table has more than one
> index.

Incorrect guess here.  parentoid refers to the parent table of an
index, so by saving its OID in the list of things to reindex, you
would finish by reindexing all the indexes of a partition.  We need to
use partoid, as returned by find_all_inheritors() for all the members
of the partition tree (relid can be a partitioned index or partitioned
table in reindex_partitions).

> 1) in create_index.sql
> 
> Are these two lines intentional checks that must fail? If so, I propose to
> add a comment.
> 
> REINDEX TABLE concur_reindex_part_index;
> REINDEX TABLE CONCURRENTLY concur_reindex_part_index;
> 
> A few lines around also look like they were copy-pasted and need a second
> look.

You can note some slight differences though.  These are test cases for
REINDEX INDEX with tables, and REINDEX TABLE with indexes.  What you
are quoting here is the part for indexes with REINDEX TABLE.  And
there are already comments, see:
"-- REINDEX INDEX fails for partitioned tables"
"-- REINDEX TABLE fails for partitioned indexes"

Perhaps that's not enough, so I have added some more comments to
outline that these commands fail (8 commands in total).

> 2)  This part of ReindexRelationConcurrently() is misleading.
> 
> Maybe an assertion is enough. It seems that we should never reach this code
> because both ReindexTable and ReindexIndex handle those relkinds
> separately.  Which leads me to the next question.

Yes, we could use an assert, but I did not feel any strong need to
change that either for this patch.

> 3) Is there a reason why reindex_partitions() is not inside
> ReindexRelationConcurrently() ? I think that logically it belongs there.

Yes, it simplifies the build of the index list indexIds, as there is
no need to loop back into a different routine if working on a table or
a matview.

> 4) I haven't tested multi-level partitions yet. In any case, it would be
> good to document this behavior explicitly.

Not sure what addition we could do here.  The patch states that each
partition of the partitioned relation defined gets reindexed, which
implies that this handles multiple layers automatically.

> I need a bit more time to review this patch more thoroughly. Please wait
> for it before committing.

Glad to hear that, please take the time you need.
--
Michael
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index c26a102b17..22db3f660d 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -34,8 +34,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
  bool check_not_in_use,
  bool skip_build,
  bool quiet);
-extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
-extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
+extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent,
+		 bool isTopLevel);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent,
+		 bool isTopLevel);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
   int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 1d662e9af4..1764739ff5 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -77,6 +77,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
+#include "utils/rel.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
 #include "utils/tuplesort.h"
@@ -3473,11 +3474,12 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 	 iRel->rd_rel->relam);
 
 	/*
-	 * The case of reindexing partitioned tables and indexes is handled
-	 * differently by upper layers, so this case shouldn't arise.
+	 * Partitioned indexes should never get processed here, as they have no
+	 * physical storage.
 	 */
 	if (iRel->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)
-		elog(ERROR, "unsupported relation kind for index \"%s\"",
+		elog(ERROR, "cannot 

Re: proposal: possibility to read dumped table's name from file

2020-09-03 Thread Tom Lane
Alvaro Herrera  writes:
> So, Tom added a coding pattern for doing this in commit 8f8154a503c7,
> which is ostensibly also to be used in pg_regress [1] -- maybe it'd be
> useful to have this in src/common?

Done, see pg_get_line() added by 67a472d71.

regards, tom lane




RE: On login trigger: take three

2020-09-03 Thread tsunakawa.ta...@fujitsu.com
From: Konstantin Knizhnik 
> Recently I was asked once again by one of our customers about login triggers in
> Postgres. People are migrating to Postgres from Oracle and looking for a
> Postgres analog of this Oracle feature.
> This topic is not new:

> I attached my prototype implementation of this feature.
> I just want to be sure first that this feature will be of interest to the community.
> If so, I will continue working on it and prepare a new version of the patch for the
> commitfest.

Thanks a lot for taking on this! +10

> It may be considered as an argument against handling session start using a trigger.
> But it seems to be the most natural mechanism for users.

Yeah, it's natural, just like the Unix shells run some shell scripts in the 
home directory.


Regards
Takayuki Tsunakawa



Re: A micro-optimisation for walkdir()

2020-09-03 Thread Thomas Munro
On Fri, Sep 4, 2020 at 3:31 AM Tom Lane  wrote:
> Thomas Munro  writes:
> > Hmm.  Well I had it like that in an earlier version, but then I
> > couldn't figure out the right way to write code that would work in
> > both frontend and backend code, without writing two copies in two
> > translation units, or putting the whole thing in a header.  What
> > approach do you prefer?
>
> Well, there are plenty of places in src/port/ where we do things like
>
> #ifndef FRONTEND
> ereport(ERROR,
> (errcode_for_file_access(),
>  errmsg("could not get junction for \"%s\": %s",
> path, msg)));
> #else
> fprintf(stderr, _("could not get junction for \"%s\": %s\n"),
> path, msg);
> #endif
>
> I don't see a compelling reason why this function couldn't report
> stat() failures similarly, especially if we're just going to have
> the callers do exactly the same thing as that anyway.

Ok, so the main weird thing is that you finish up having to pass in an
elevel, but that has different meanings in FE and BE code.  Note that
you still need a PGFILE_ERROR return value, because we don't log
messages at a level that exits non-locally (and that concept doesn't
even exist for FE logging).
From 6f937ccef54afdcebaa52a036591523c67ac9d41 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 3 Sep 2020 13:58:17 +1200
Subject: [PATCH v3 1/3] Skip unnecessary stat() calls in walkdir().
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Some kernels can tell us the type of a "dirent", so we can avoid a call
to stat() or lstat() in many cases.  In order to be able to apply this
change to both frontend and backend versions of walkdir(), define a new
function get_dirent_type() in a new translation unit file_utils_febe.c.

Reviewed-by: Tom Lane 
Reviewed-by: Juan José Santamaría Flecha 
Discussion: https://postgr.es/m/CA%2BhUKG%2BFzxupGGN4GpUdbzZN%2Btn6FQPHo8w0Q%2BAPH5Wz8RG%2Bww%40mail.gmail.com
---
 src/backend/storage/file/fd.c| 33 ++-
 src/common/Makefile  |  1 +
 src/common/file_utils.c  | 32 +--
 src/common/file_utils_febe.c | 99 
 src/include/common/file_utils.h  | 22 ++-
 src/tools/msvc/Mkvcbuild.pm  |  2 +-
 src/tools/pgindent/typedefs.list |  1 +
 7 files changed, 154 insertions(+), 36 deletions(-)
 create mode 100644 src/common/file_utils_febe.c

diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index f376a97ed6..bd72a87ee3 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -89,6 +89,7 @@
 #include "access/xlog.h"
 #include "catalog/pg_tablespace.h"
 #include "common/file_perm.h"
+#include "common/file_utils.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "portability/mem.h"
@@ -3340,8 +3341,6 @@ walkdir(const char *path,
 	while ((de = ReadDirExtended(dir, path, elevel)) != NULL)
 	{
 		char		subpath[MAXPGPATH * 2];
-		struct stat fst;
-		int			sret;
 
 		CHECK_FOR_INTERRUPTS();
 
@@ -3351,23 +3350,23 @@ walkdir(const char *path,
 
 		snprintf(subpath, sizeof(subpath), "%s/%s", path, de->d_name);
 
-		if (process_symlinks)
-			sret = stat(subpath, &fst);
-		else
-			sret = lstat(subpath, &fst);
-
-		if (sret < 0)
+		switch (get_dirent_type(subpath, de, process_symlinks, elevel))
 		{
-			ereport(elevel,
-	(errcode_for_file_access(),
-	 errmsg("could not stat file \"%s\": %m", subpath)));
-			continue;
-		}
+			case PGFILETYPE_REG:
+(*action) (subpath, false, elevel);
+break;
+			case PGFILETYPE_DIR:
+walkdir(subpath, action, false, elevel);
+break;
+			default:
 
-		if (S_ISREG(fst.st_mode))
-			(*action) (subpath, false, elevel);
-		else if (S_ISDIR(fst.st_mode))
-			walkdir(subpath, action, false, elevel);
+/*
+ * Errors are already reported directly by get_dirent_type(),
+ * and any remaining symlinks and unknown file types are
+ * ignored.
+ */
+break;
+		}
 	}
 
 	FreeDir(dir);/* we ignore any error here */
diff --git a/src/common/Makefile b/src/common/Makefile
index 16619e4ba8..aac92aabe1 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -56,6 +56,7 @@ OBJS_COMMON = \
 	exec.o \
 	f2s.o \
 	file_perm.o \
+	file_utils_febe.o \
 	hashfn.o \
 	ip.o \
 	jsonapi.o \
diff --git a/src/common/file_utils.c b/src/common/file_utils.c
index a2faafdf13..e24f31dd3b 100644
--- a/src/common/file_utils.c
+++ b/src/common/file_utils.c
@@ -2,7 +2,7 @@
  *
  * File-processing utility routines.
  *
- * Assorted utility functions to work on files.
+ * Assorted utility functions to work on files, frontend only.
  *
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
@@ -167,8 +167,6 @@ walkdir(const char *path,
 	while (errno = 0, (de = readdir(dir)) != NULL)
 	{
 		char		subpath[MAXPGPATH * 2];
-		struct stat fst;
-		int			sret;
 
 		if (strcmp(de->d_name, ".") == 0 ||
 			strcmp(de->d_name, 

Re: history file on replica and double switchover

2020-09-03 Thread Anastasia Lubennikova

On 27.08.2020 16:02, Grigory Smolkin wrote:

Hello!

I've noticed that when running switchover replica to master and back
to replica, new history file is streamed to replica, but not archived,
which is not great, because it breaks PITR if archiving is running on 
replica. The fix looks trivial.

Bash script to reproduce the problem and patch are attached.


Thanks for the report. I agree that it looks like a bug.

For some reason, the patch failed to apply on current master, even though I
don't see any difference in the code.

I'll attach this thread to the next commitfest, so it doesn't get lost.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Maximum password length

2020-09-03 Thread Tom Lane
I wrote:
> Alvaro proposes nearby that we ought to have a src/common/ function
> to slurp an indefinitely long line from a file [1].  If we do that,
> it'd be entirely reasonable to make this code use that.  So maybe
> the right comment is "XXX FIXME later".

Actually, on further thought, the obviously right thing to do here
is to refactor simple_prompt into two functions: the inner one is
basically like fgets except it returns a malloc'd, variable-size
string, and then the outer one does the other stuff simple_prompt needs
such as prompting and opening /dev/tty.  The inner function would
serve initdb's need directly, and it would also have uses elsewhere,
as per the other thread.
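
A minimal sketch of such an inner function, in plain C with malloc/realloc
(illustrative only, not the actual src/common implementation):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd line without the trailing newline, or NULL at EOF. */
static char *
read_whole_line(FILE *stream)
{
	size_t		cap = 128;
	size_t		len = 0;
	char	   *buf = malloc(cap);

	if (buf == NULL)
		return NULL;
	while (fgets(buf + len, (int) (cap - len), stream) != NULL)
	{
		len += strlen(buf + len);
		if (len > 0 && buf[len - 1] == '\n')
		{
			buf[len - 1] = '\0';	/* full line read: strip the newline */
			return buf;
		}
		/* partial line: double the buffer and keep reading */
		cap *= 2;
		{
			char	   *tmp = realloc(buf, cap);

			if (tmp == NULL)
			{
				free(buf);
				return NULL;
			}
			buf = tmp;
		}
	}
	if (len > 0)
		return buf;				/* final line had no terminating newline */
	free(buf);
	return NULL;				/* EOF with nothing read */
}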

I'll go make that happen.

regards, tom lane




Re: report expected contrecord size

2020-09-03 Thread Tom Lane
Alvaro Herrera  writes:
> On 2020-Sep-03, Tom Lane wrote:
>> Uh ... is it really possible for gotlen to be more than total_len?
>> (I've not looked at the surrounding code here, but that seems weird.)

> Well, as I understand, total_len comes from one page, and gotlen comes
> from the continuation record(s) in the next page(s) of WAL.  So if
> things are messed up, it could happen.  (This *is* the code that
> validates the record, so it can't make too many assumptions.)

Got it.  No further objections.

regards, tom lane




Re: report expected contrecord size

2020-09-03 Thread Alvaro Herrera
On 2020-Sep-03, Tom Lane wrote:

> Alvaro Herrera  writes:
> > Well, the intention there is to cast the first operand (which is uint32)
> > so that it turns into signed 64-bits; the subtraction then occurs in 64
> > bit arithmetic normally.  If I let the subtraction occur in 32-bit width
> > unsigned, the result might overflow 32 bits.
> 
> Uh ... is it really possible for gotlen to be more than total_len?
> (I've not looked at the surrounding code here, but that seems weird.)

Well, as I understand, total_len comes from one page, and gotlen comes
from the continuation record(s) in the next page(s) of WAL.  So if
things are messed up, it could happen.  (This *is* the code that
validates the record, so it can't make too many assumptions.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: report expected contrecord size

2020-09-03 Thread Tom Lane
Alvaro Herrera  writes:
> Well, the intention there is to cast the first operand (which is uint32)
> so that it turns into signed 64-bits; the subtraction then occurs in 64
> bit arithmetic normally.  If I let the subtraction occur in 32-bit width
> unsigned, the result might overflow 32 bits.

Uh ... is it really possible for gotlen to be more than total_len?
(I've not looked at the surrounding code here, but that seems weird.)

regards, tom lane




Re: report expected contrecord size

2020-09-03 Thread Alvaro Herrera
On 2020-Sep-03, Tom Lane wrote:

> Alvaro Herrera  writes:
> > A pretty minor issue: when reporting that WAL appears invalid because
> > contrecord length doesn't match, we may as well print to the server log
> > the value that we're expecting.  Patch attached.
> 
> ITYW
> 
> +  (long long) (total_len - gotlen),
> 
> just to be sure about what's getting casted to what.

Well, the intention there is to cast the first operand (which is uint32)
so that it turns into signed 64-bits; the subtraction then occurs in 64
bit arithmetic normally.  If I let the subtraction occur in 32-bit width
unsigned, the result might overflow 32 bits.  I'm thinking of
1 - UINT32_MAX or some such.

Maybe to make that more explicit, it should be

+  ((long long) total_len) - gotlen,

(If I understand the precedence correctly, it's the same thing I wrote).
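
(A quick demonstration of the difference, as a standalone C program:)

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	uint32_t	total_len = 1;
	uint32_t	gotlen = UINT32_MAX;

	/* subtraction wraps in 32 bits first, then widens: prints 2 */
	printf("%lld\n", (long long) (total_len - gotlen));

	/*
	 * The cast binds tighter than '-': widens first, then subtracts in
	 * 64-bit signed arithmetic: prints -4294967294.
	 */
	printf("%lld\n", (long long) total_len - gotlen);

	return 0;
}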

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-09-03 Thread Bossart, Nathan
I noticed a small compiler warning for this.

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 812aca8011..88d3444c39 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -199,7 +199,7 @@ typedef struct ApplySubXactData
 static ApplySubXactData subxact_data = {0, 0, InvalidTransactionId, NULL};

 static void subxact_filename(char *path, Oid subid, TransactionId xid);
-static void changes_filename(char *path, Oid subid, TransactionId xid);
+static inline void changes_filename(char *path, Oid subid, TransactionId xid);

 /*
  * Information about subtransactions of a given toplevel transaction.

Nathan



Re: Maximum password length

2020-09-03 Thread Tom Lane
"Bossart, Nathan"  writes:
> On 9/3/20, 2:14 PM, "Tom Lane"  wrote:
>> If you insist, I'll change it, but it seems even less likely to ever
>> matter to anybody than the changes to make simple_prompt accept
>> indefinitely long passwords.  (Perhaps a reasonable compromise
>> is to extend this comment to note that we're also not bothering
>> to support indefinitely long passwords.)

> I don't feel strongly about this.  A comment in the code seems
> reasonable.

Alvaro proposes nearby that we ought to have a src/common/ function
to slurp an indefinitely long line from a file [1].  If we do that,
it'd be entirely reasonable to make this code use that.  So maybe
the right comment is "XXX FIXME later".

regards, tom lane

[1] 
https://www.postgresql.org/message-id/20200903200842.GA11952%40alvherre.pgsql




Re: report expected contrecord size

2020-09-03 Thread Tom Lane
Alvaro Herrera  writes:
> A pretty minor issue: when reporting that WAL appears invalid because
> contrecord length doesn't match, we may as well print to the server log
> the value that we're expecting.  Patch attached.

ITYW

+  (long long) (total_len - gotlen),

just to be sure about what's getting casted to what.  Otherwise +1.

regards, tom lane




Re: Parallel worker hangs while handling errors.

2020-09-03 Thread Tom Lane
I wrote:
> As for the question of SIGQUIT handling, I see that postgres.c
> does "PG_SETMASK()" immediately after applying the sigdelset
> change, so there probably isn't any harm in having the background
> processes do likewise.

Concretely, something about like this (I just did the bgwriter, but
we'd want the same in all the background processes).  I tried to
respond to Robert's complaint about the inaccurate comment just above
sigsetjmp, too.

This passes check-world, for what little that's worth.

regards, tom lane

diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 069e27e427..3ae7901bf6 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -117,6 +117,7 @@ BackgroundWriterMain(void)
 
 	/* We allow SIGQUIT (quickdie) at all times */
 	sigdelset(&BlockSig, SIGQUIT);
+	PG_SETMASK(&BlockSig);
 
 	/*
 	 * We just started, assume there has been either a shutdown or
@@ -140,7 +141,19 @@ BackgroundWriterMain(void)
 	/*
 	 * If an exception is encountered, processing resumes here.
 	 *
-	 * See notes in postgres.c about the design of this coding.
+	 * You might wonder why this isn't coded as an infinite loop around a
+	 * PG_TRY construct.  The reason is that this is the bottom of the
+	 * exception stack, and so with PG_TRY there would be no exception handler
+	 * in force at all during the CATCH part.  By leaving the outermost setjmp
+	 * always active, we have at least some chance of recovering from an error
+	 * during error recovery.  (If we get into an infinite loop thereby, it
+	 * will soon be stopped by overflow of elog.c's internal state stack.)
+	 *
+	 * Note that we use sigsetjmp(..., 1), so that the prevailing signal mask
+	 * (to wit, BlockSig) will be restored when longjmp'ing to here.  Thus,
+	 * signals will be blocked until we complete error recovery.  It might
+	 * seem that this policy makes the HOLD_INTERRUPTS() call redundant, but
+	 * it is not since InterruptPending might be set already.
 	 */
 	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
 	{


Re: Maximum password length

2020-09-03 Thread Bossart, Nathan
On 9/3/20, 2:14 PM, "Tom Lane"  wrote:
> If you insist, I'll change it, but it seems even less likely to ever
> matter to anybody than the changes to make simple_prompt accept
> indefinitely long passwords.  (Perhaps a reasonable compromise
> is to extend this comment to note that we're also not bothering
> to support indefinitely long passwords.)

I don't feel strongly about this.  A comment in the code seems
reasonable.

Nathan



Re: default partition and concurrent attach partition

2020-09-03 Thread Alvaro Herrera
Also, I should have pointed out that ExecInsert doesn't actually check
the partition constraint except in very specific cases; what it does is
expect that the partition routing code got it right.  So the comment
you're adding about that is wrong, and it did mislead me into changing
your code in a way that actually caused a bug -- hence my proposed
rewording.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




report expected contrecord size

2020-09-03 Thread Alvaro Herrera
A pretty minor issue: when reporting that WAL appears invalid because
contrecord length doesn't match, we may as well print to the server log
the value that we're expecting.  Patch attached.

-- 
Álvaro Herrera http://www.flickr.com/photos/alvherre/
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 67996018da..b69cf80b59 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -464,8 +464,9 @@ XLogReadRecord(XLogReaderState *state, char **errormsg)
 total_len != (pageHeader->xlp_rem_len + gotlen))
 			{
 report_invalid_record(state,
-	  "invalid contrecord length %u at %X/%X",
+	  "invalid contrecord length %u (expected %lld) at %X/%X",
 	  pageHeader->xlp_rem_len,
+	  (long long) total_len - gotlen,
 	  (uint32) (RecPtr >> 32), (uint32) RecPtr);
 goto err;
 			}


Re: Maximum password length

2020-09-03 Thread Tom Lane
"Bossart, Nathan"  writes:
> On 9/3/20, 10:19 AM, "Tom Lane"  wrote:
> +   char        pwdbuf[8192];

> If I am reading correctly, this would be the only defined password
> length limit once this patch is applied.  While it's probably unlikely
> that this will cause problems for anybody anytime soon, is there any
> reason not to give this the same treatment as the .pgpass code and
> remove the line length limit altogether?

Yeah, it just didn't quite seem worthwhile there, given the adjacent
comment that clearly says that this is second-class-citizen code:

 * Ideally this should insist that the file not be world-readable.
 * However, this option is mainly intended for use on Windows where
 * file permissions may not exist at all, so we'll skip the paranoia
 * for now.

If you insist, I'll change it, but it seems even less likely to ever
matter to anybody than the changes to make simple_prompt accept
indefinitely long passwords.  (Perhaps a reasonable compromise
is to extend this comment to note that we're also not bothering
to support indefinitely long passwords.)

regards, tom lane




Re: Parallel worker hangs while handling errors.

2020-09-03 Thread Alvaro Herrera
On 2020-Sep-03, Tom Lane wrote:

> As for the question of SIGQUIT handling, I see that postgres.c
> does "PG_SETMASK()" immediately after applying the sigdelset
> change, so there probably isn't any harm in having the background
> processes do likewise.  I wonder though why bgworkers are not
> applying the same policy.

It's quite likely that it's the way it is more by accident than because
I was thinking extremely carefully about signal handling when originally
writing that code.  Some parts of that code I was copying from others'
patches, and I could easily have missed a detail like this.  (I didn't
"git blame" to verify that these parts are mine, though).

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Parallel worker hangs while handling errors.

2020-09-03 Thread Tom Lane
vignesh C  writes:
> The Solution Robert & Tom are suggesting by Calling
> BackgroundWorkerUnblockSignals fixes the actual problem.

I've gone ahead and pushed the bgworker fix, since everyone seems
to agree that that's okay, and it is provably fixing a problem.

As for the question of SIGQUIT handling, I see that postgres.c
does "PG_SETMASK()" immediately after applying the sigdelset
change, so there probably isn't any harm in having the background
processes do likewise.  I wonder though why bgworkers are not
applying the same policy.  (I remain of the opinion that any
changes in this area should not be back-patched without evidence
of a concrete problem; it's at least as likely that we'll introduce
a problem as fix one.)

regards, tom lane




Re: default partition and concurrent attach partition

2020-09-03 Thread Alvaro Herrera
On 2020-Sep-03, Alvaro Herrera wrote:

> + /*
> +  * If setting up a PartitionDispatch for a sub-partitioned table, we may
> +  * also need a fake ResultRelInfo for checking the partition constraint
> +  * later; set that up now.
> +  */
> + if (parent_pd)
> + {
> + ResultRelInfo *rri = makeNode(ResultRelInfo);
> +
> + InitResultRelInfo(rri, rel, 1, proute->partition_root, 0);
> + proute->nonleaf_partitions[dispatchidx] = rri;
> + }
> +

Heh, I had intended to remove the attachment before sending, because I
thought I was seeing a problem with this proposed coding of mine.  But
since I sent it by mistake, I'll explain: I think this will fail if we
have a partitioned default partition, and we direct the insert to it
directly while attaching a partition to its parent.  I think that kind
of scenario deserves its own test case.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: default partition and concurrent attach partition

2020-09-03 Thread Alvaro Herrera
Thanks for this fix!  Looking into it now.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/executor/execPartition.c 
b/src/backend/executor/execPartition.c
index 4d34734a45..fe42670e0a 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -52,11 +52,9 @@
  * indexed.
  *
  * nonleaf_partitions
- * Array of 'max_dispatch' elements containing pointers to
- * ResultRelInfos of nonleaf partitions touched by tuple routing.  
These
- * are currently only used for checking the partition's constraint 
if
- * the nonleaf partition happens to be a default partition of its
- * parent
+ * Array of 'max_dispatch' elements containing pointers to fake
+ * ResultRelInfo objects for nonleaf partitions, useful for 
checking
+ * the partition constraint.
  *
  * num_dispatch
  * The current number of items stored in the 
'partition_dispatch_info'
@@ -305,7 +303,7 @@ ExecFindPartition(ModifyTableState *mtstate,
 
/* start with the root partitioned table */
dispatch = pd[0];
-   while (true)
+   do
{
AttrNumber *map = dispatch->tupmap;
int partidx = -1;
@@ -455,22 +453,12 @@ ExecFindPartition(ModifyTableState *mtstate,
 
/*
 * If this partition is the default one, we must check its 
partition
-* constraint, because it may have changed due to partitions 
being
-* added to the parent concurrently.  We do the check here 
instead of
-* in ExecInsert(), because we would not want to miss checking 
the
-* constraint of any nonleaf partitions as they never make it to
-* ExecInsert().
+* constraint now, which may have changed due to partitions 
being
+* added to the parent concurrently.
 */
if (partidx == partdesc->boundinfo->default_index)
-   {
-   Assert(rri != NULL);
ExecPartitionCheck(rri, slot, estate, true);
-   }
-
-   /* Break out and return if we've found the leaf partition. */
-   if (dispatch == NULL)
-   break;
-   }
+   } while (dispatch != NULL);
 
MemoryContextSwitchTo(oldcxt);
ecxt->ecxt_scantuple = ecxt_scantuple_old;
@@ -1037,7 +1025,6 @@ ExecInitPartitionDispatchInfo(EState *estate,
Relationrel;
PartitionDesc partdesc;
PartitionDispatch pd;
-   ResultRelInfo *rri = NULL;
int dispatchidx;
MemoryContext oldcxt;
 
@@ -1123,6 +1110,19 @@ ExecInitPartitionDispatchInfo(EState *estate,
}
proute->partition_dispatch_info[dispatchidx] = pd;
 
+   /*
+* If setting up a PartitionDispatch for a sub-partitioned table, we may
+* also need a fake ResultRelInfo for checking the partition constraint
+* later; set that up now.
+*/
+   if (parent_pd)
+   {
+   ResultRelInfo *rri = makeNode(ResultRelInfo);
+
+   InitResultRelInfo(rri, rel, 1, proute->partition_root, 0);
+   proute->nonleaf_partitions[dispatchidx] = rri;
+   }
+
/*
 * Finally, if setting up a PartitionDispatch for a sub-partitioned 
table,
 * install a downlink in the parent to allow quick descent.
@@ -1133,17 +1133,6 @@ ExecInitPartitionDispatchInfo(EState *estate,
parent_pd->indexes[partidx] = dispatchidx;
}
 
-   /*
-* Also create a minimally valid ResultRelInfo to check the partition's
-* partition's constraint.
-*/
-   if (rel->rd_rel->relispartition)
-   {
-   rri = makeNode(ResultRelInfo);
-   InitResultRelInfo(rri, rel, 1, proute->partition_root, 0);
-   }
-   proute->nonleaf_partitions[dispatchidx] = rri;
-
MemoryContextSwitchTo(oldcxt);
 
return pd;


Re: proposal: possibility to read dumped table's name from file

2020-09-03 Thread Alvaro Herrera
On 2020-Jul-27, Pavel Stehule wrote:

> +/*
> + * getline was originally a GNU function and may still not be available
> + * everywhere.  Use our own reduced implementation.
> + */
> +static size_t
> +pg_getline(char **lineptr, size_t *n, FILE *fp)
> +{

So, Tom added a coding pattern for doing this in commit 8f8154a503c7,
which is ostensibly also to be used in pg_regress [1] -- maybe it'd be
useful to have this in src/common?

[1] 
https://postgr.es/m/m_1NfbowTqSJnrC6rq1a9cQK7E-CHQE7B6Kz9w6fNH-OiV-4mcsdMw7UP2oA2_6dZmXvAMjbSPZjW9U7FD2R52D3d9DtaJxcBprsqJqZNBc=@protonmail.com

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Maximum password length

2020-09-03 Thread Bossart, Nathan
On 9/3/20, 10:19 AM, "Tom Lane"  wrote:
> Hearing no objections to this general plan, I went ahead and did that
> cleanup.  This version seems committable to me.

FILE   *pwf = fopen(pwfilename, "r");
-   int i;
+   char        pwdbuf[8192];

If I am reading correctly, this would be the only defined password
length limit once this patch is applied.  While it's probably unlikely
that this will cause problems for anybody anytime soon, is there any
reason not to give this the same treatment as the .pgpass code and
remove the line length limit altogether?

Otherwise, the patch looks good to me.

Nathan



Re: INSERT ON CONFLICT and RETURNING

2020-09-03 Thread Andreas Karlsson

On 9/3/20 6:52 PM, Konstantin Knizhnik wrote:
But frankly speaking, I still didn't find an answer to my question in
this thread: what are the dangerous scenarios with ON CONFLICT DO
NOTHING/SELECT?
Yes, the record is not exclusively locked. But I just want to obtain the
value of some column which is not a source of the conflict. I do not
understand what can go wrong if some other transaction changed this
column.

And I certainly can't agree with Peter's statement:
 > Whereas here, with ON CONFLICT DO SELECT,
 > I see a somewhat greater risk, and a much, much smaller benefit. A
 > benefit that might actually be indistinguishable from zero.

From my point of view it is quite a common use case when we need to
convert some long key to a small autogenerated record identifier.
Without UPSERT we have to perform two queries instead of just one. And
even with the current implementation of INSERT ON CONFLICT...
we have to either perform an extra lookup or produce a new (useless)
tuple version.


I have no idea about the potential risks here since I am not very 
familiar with the ON CONFLICT code, but I will chime in and agree that 
this is indeed a common use case. Selecting and taking a SHARE lock 
would also be a nice feature.


Andreas





Re: proposal: possibility to read dumped table's name from file

2020-09-03 Thread Justin Pryzby
On Sun, Jul 05, 2020 at 10:08:09PM +0200, Pavel Stehule wrote:
> On Wed, 1 Jul 2020 at 23:24, Justin Pryzby wrote:
> 
> > On Thu, Jun 11, 2020 at 09:36:18AM +0200, Pavel Stehule wrote:
> > > On Wed, 10 Jun 2020 at 00:30, Justin Pryzby wrote:
> > Also, your getline is dynamically re-allocating lines of arbitrary length.
> > Possibly that's not needed.  We'll typically read "+t schema.relname",
> > which is at most 132 chars.  Maybe it's sufficient to do
> > char buf[1024];
> > if (fgets(buf, sizeof(buf), fp) == NULL || strchr(buf, '\n') == NULL)
> >     error();
> > ret = pstrdup(buf);
> 
> 63 bytes is the max effective identifier size, but it is not the max size
> of identifiers.  A 1024-byte buffer would very probably be enough for
> everyone, but I do not want to introduce any new magic limit, especially
> when a dynamic implementation is not too hard.

Maybe you'd want to use a StrInfo like recent patches (8f8154a50).

> A table name can be very long - sometimes data names (table names) are
> stored in external storage at full length, and it would not be practical
> to require truncating them in the filter file.
> 
> For this case the implementation is very effective, because the resized
> (increased) buffer is reused for the following rows, so reallocs should
> be rare.  So when I have to choose between two implementations of similar
> complexity, I prefer the more dynamic code without hardcoded limits.
> This dynamic approach has no real overhead.
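
For concreteness, a StringInfo-based reader could look about like this
(a sketch assuming the frontend-usable StringInfo from lib/stringinfo.h;
the function name is made up):

#include "postgres_fe.h"

#include "lib/stringinfo.h"

/*
 * Read one line of arbitrary length from fp into buf (previous contents
 * are discarded).  Returns false on EOF with nothing read.
 */
static bool
read_filter_line(StringInfo buf, FILE *fp)
{
    int     c;

    resetStringInfo(buf);
    while ((c = fgetc(fp)) != EOF && c != '\n')
        appendStringInfoChar(buf, (char) c);

    return (c != EOF || buf->len > 0);
}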

-- 
Justin




Re: Support for NSS as a libpq TLS backend

2020-09-03 Thread Andrew Dunstan
>
>  OK, this version contains pre-generated nss files, and passes a full
>  buildfarm run including the ssl test module, with both openssl and NSS.
>  That should keep the cfbot happy :-)
>
> Turns out the CFBot doesn't like the binary diffs.  They are included in this
> version too but we should probably drop them again it seems.
>

I did ask Thomas about this, he was going to try to fix it. In
principle we should want it to accept binary diffs exactly for this
sort of thing.


> The attached v9 contains mostly a first stab at getting some documentation
> going, it's far from completed but I'd rather share more frequently to not 
> have
> local trees deviate too much in case you've had time to hack as well.  I had a
> few documentation tweaks in the code too, but no real functionality change for
> now.
>
> The 0001 patch isn't strictly necessary but it seems reasonable to address the
> various ways OpenSSL was spelled out in the docs while at updating the SSL
> portions.  It essentially ensures that markup around OpenSSL and SSL is used
> consistently.  I didn't address the linelengths being too long in this patch 
> to
> make review easier instead.
>


I'll take a look.

cheers

andrew



-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: 回复:how to create index concurrently on partitioned table

2020-09-03 Thread Anastasia Lubennikova

On 02.09.2020 04:39, Michael Paquier wrote:


The problem with dropped relations in REINDEX has been addressed by
1d65416, so I have gone through this patch again and simplified the
use of session locks, these being taken only when doing a REINDEX
CONCURRENTLY for a given partition.  This part is in a rather
committable shape IMO, so I would like to get it done first, before
looking more at the other cases with CIC and CLUSTER.  I am still
planning to go through it once again.
--
Michael


Thank you for advancing this work.

I was reviewing the previous version, but the review became outdated 
before I sent it. Overall design is fine, but I see a bunch of things 
that need to be fixed before commit.


First of all, this patch fails at cfbot:

indexcmds.c:2848:7: error: variable ‘parentoid’ set but not used
[-Werror=unused-but-set-variable]

    Oid parentoid;
        ^


It seems to be just a typo. With this minor fix the patch compiles and 
passes tests.


diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 75008eebde..f5b3c53a83 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -2864,7 +2864,7 @@ reindex_partitions(Oid relid, int options, bool 
concurrent,
 
/* Save partition OID */

old_context = MemoryContextSwitchTo(reindex_context);
-   partitions = lappend_oid(partitions, partoid);
+   partitions = lappend_oid(partitions, parentoid);
MemoryContextSwitchTo(old_context);
}


If this guessed fix is correct, I see a problem in the patch logic. In 
reindex_partitions() we collect parent relations to pass them to 
reindex_multiple_internal(). It implicitly changes the logic from 
REINDEX INDEX to REINDEX RELATION, which is not the same thing if the 
table has more than one index. For example, if I add one more index to a 
partitioned table from the create_index.sql test:


create index idx on concur_reindex_part_0_2 using btree (c2);

and call

REINDEX INDEX CONCURRENTLY concur_reindex_part_index;

idx will be reindexed as well. I doubt that it is the desired behavior.


A few more random issues, that I noticed at first glance:

1) in create_index.sql

Are these two lines intentional checks that must fail? If so, I propose 
to add a comment.


REINDEX TABLE concur_reindex_part_index;
REINDEX TABLE CONCURRENTLY concur_reindex_part_index;

A few lines around also look like they were copy-pasted and need a 
second look.


2)  This part of ReindexRelationConcurrently() is misleading.

        case RELKIND_PARTITIONED_TABLE:
        case RELKIND_PARTITIONED_INDEX:
        default:
            /* Return error if type of relation is not supported */
            ereport(ERROR,
                    (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                     errmsg("cannot reindex this type of relation 
concurrently")));


Maybe an assertion is enough. It seems that we should never reach this 
code because both ReindexTable and ReindexIndex handle those relkinds 
separately.  Which leads me to the next question.


3) Is there a reason, why reindex_partitions() is not inside 
ReindexRelationConcurrently() ? I think that logically it belongs there.


4) I haven't tested multi-level partitions yet. In any case, it would be 
good to document this behavior explicitly.



I need a bit more time to review this patch more thoroughly. Please 
wait for it before committing.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Yet another fast GiST build (typo)

2020-09-03 Thread Heikki Linnakangas

On 30/08/2020 15:04, Andrey M. Borodin wrote:

On 23 Aug 2020, at 14:39, Andrey M. Borodin wrote:

Thanks for reviewing and benchmarking, Pavel!


Pavel sent me a few typos offlist. PFA v12 fixing these typos.


In gist_indexsortbuild(), you first build all the leaf pages. Then, you 
read through all the index pages you just built, to form the tuples for 
the next level, and repeat for all the upper levels. That seems 
inefficient; it would be better to form the tuples for the 
downlinks as you go, when you build the leaf pages in the first place. 
That's how nbtsort.c works. Also, you could WAL-log the pages as you go.


In gist_indexsortbuild_flush(), can't you just memcpy() the page from
memory to the buffer?

- Heikki
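
(For concreteness, the memcpy() suggestion amounts to something like the
sketch below.  WAL-logging and the patch's bookkeeping are omitted, and
the function name is made up; this is not the patch's actual code.)

#include "postgres.h"

#include "storage/bufmgr.h"

/*
 * Copy a fully built in-memory page into a newly extended buffer.
 */
static void
flush_built_page(Relation index, Page srcpage)
{
    Buffer      buf = ReadBuffer(index, P_NEW);

    LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
    memcpy(BufferGetPage(buf), srcpage, BLCKSZ);
    MarkBufferDirty(buf);
    UnlockReleaseBuffer(buf);
}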




Re: Creating foreign key on partitioned table is too slow

2020-09-03 Thread Tom Lane
Amit Langote  writes:
>> Fwiw, I am fine with applying the memory-leak fix in all branches down
>> to v12 if we are satisfied with the implementation.

> I have revised the above patch slightly to introduce a variable for
> the condition whether to use a temporary memory context.

This CF entry has been marked "ready for committer", which I find
inappropriate since there doesn't seem to be any consensus about
whether we need it.

I tried running the original test case under HEAD.  I do not see
any visible memory leak, which I think indicates that 5b9312378 or
some other fix has taken care of the leak since the original report.
However, after waiting awhile and noting that the ADD FOREIGN KEY
wasn't finishing, I poked into its progress with a debugger and
observed that each iteration of RI_Initial_Check() was taking about
15 seconds.  I presume we have to do that for each partition,
which leads to the estimate that it'll take 34 hours to finish this
... and that's with no data in the partitions, god help me if
there'd been a lot.

Some quick "perf" work says that most of the time seems to be
getting spent in the planner, in get_eclass_for_sort_expr().
So this is likely just a variant of performance issues we've
seen before.  (I do wonder why we're not able to prune the
join to just the matching PK partition, though.)

Anyway, the long and the short of it is that this test case is far
larger than anything anyone could practically use in HEAD, let alone
in released branches.  I can't get excited about back-patching a fix
to a memory leak if that's just going to allow people to hit other
performance-killing issues.

In short, I don't see a reason why we need this patch in any branch,
so I recommend rejecting it.  If we did think we need a leak fix in
the back branches, back-porting 5b9312378 would likely be a saner
way to proceed.

regards, tom lane




Re: LogwrtResult contended spinlock

2020-09-03 Thread Alvaro Herrera
Looking at patterns like this

if (XLogCtl->LogwrtRqst.Write < EndPos)
XLogCtl->LogwrtRqst.Write = EndPos;

It seems possible to implement with

do {
XLogRecPtr  currwrite;

currwrite = pg_atomic_read_u64(LogwrtRqst.Write);
if (currwrite > EndPos)
break;  // already done by somebody else
if (pg_atomic_compare_exchange_u64(LogwrtRqst.Write,
   currwrite, EndPos))
break;  // successfully updated
} while (true);

This assumes that LogwrtRqst.Write never goes backwards, so it doesn't
seem good material for a general routine.

This *seems* correct to me, though this is muddy territory to me.  Also,
are there better ways to go about this?
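
For comparison, here is the same loop written against C11 atomics (a
standalone illustration; pg_atomic_compare_exchange_u64 has the same
refresh-the-expected-value-on-failure contract):

#include <stdatomic.h>
#include <stdint.h>

/*
 * Monotonically advance *ptr to at least new_value without a lock.
 * If another process already advanced it further, do nothing.
 */
static void
advance_to(_Atomic uint64_t *ptr, uint64_t new_value)
{
    uint64_t    cur = atomic_load(ptr);

    while (cur < new_value)
    {
        /* on failure, cur is refreshed with the value actually seen */
        if (atomic_compare_exchange_weak(ptr, &cur, new_value))
            break;
    }
}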

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH] Allow queries in WHEN expression of FOR EACH STATEMENT triggers

2020-09-03 Thread Surafel Temesgen
Hi Joe,

This is my review of your patch
On Fri, Jul 17, 2020 at 1:22 AM Joe Wildish  wrote:

> Hi hackers,
>
> Attached is a patch for supporting queries in the WHEN expression of
> statement triggers.




- Currently, WHEN expressions cannot contain

- subqueries.

Subqueries in row triggers are not supported by your patch, so the
documentation has to reflect that.


+ UPDATE triggers are able to refer to both
OLD

+ and NEW

There is an opening/closing tag mismatch on the UPDATE and OLD literals,
so the documentation build fails. Please also update the documentation
in the server programming section.


+ /*

+ * Plan the statement. No need to rewrite as it can only refer to the

+ * transition tables OLD and NEW, and the relation which is being

+ * triggered upon.

+ */

+ stmt = pg_plan_query(query, trigger->tgqual, 0, NULL);

+ dest = CreateDestReceiver(DestTuplestore);

+ store = tuplestore_begin_heap(false, false, work_mem);

+ tupdesc = CreateTemplateTupleDesc(1);

+ whenslot = MakeSingleTupleTableSlot(tupdesc, );

Instead of planning every time the trigger fires, I suggest storing the
plan or a prepared statement node so planning time can be saved.
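
Something along these lines, as a rough sketch (assuming the WHEN
expression is planned through SPI; the names are illustrative, and a
real implementation would key a cache by trigger OID rather than use a
single static variable):

#include "postgres.h"

#include "executor/spi.h"

/* simplification: one cached plan; real code needs a per-trigger cache */
static SPIPlanPtr cached_when_plan = NULL;

static SPIPlanPtr
get_when_plan(const char *querytext)
{
    if (cached_when_plan == NULL)
    {
        SPIPlanPtr  plan = SPI_prepare(querytext, 0, NULL);

        if (plan == NULL)
            elog(ERROR, "SPI_prepare failed: %s",
                 SPI_result_code_string(SPI_result));
        /* move the plan out of the SPI procedure context so it survives */
        SPI_keepplan(plan);
        cached_when_plan = plan;
    }
    return cached_when_plan;
}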


There is a server crash with the following sequence of commands:

CREATE TABLE main_table (a int unique, b int);


CREATE FUNCTION trigger_func() RETURNS trigger LANGUAGE plpgsql AS '

BEGIN

RAISE NOTICE ''trigger_func(%) called: action = %, when = %, level = %'',
TG_ARGV[0], TG_OP, TG_WHEN, TG_LEVEL;

RETURN NULL;

END;';


INSERT INTO main_table DEFAULT VALUES;


CREATE TRIGGER after_insert AFTER INSERT ON main_table

REFERENCING NEW TABLE AS NEW FOR EACH STATEMENT

WHEN (500 <= ANY(SELECT b FROM NEW union SELECT a FROM main_table))

EXECUTE PROCEDURE trigger_func('after_insert');


INSERT INTO main_table (a, b) VALUES

(101, 498),

(102, 499);

server crashed


regards

Surafel


Re: INSERT ON CONFLICT and RETURNING

2020-09-03 Thread Alexander Korotkov
On Thu, Sep 3, 2020 at 7:56 PM Geoff Winkless  wrote:
>
> On Mon, 31 Aug 2020 at 14:53, Konstantin Knizhnik
>  wrote:
> > If we are doing such query:
> >
> > INSERT INTO jsonb_schemas (schema) VALUES (obj_schema)
> >ON CONFLICT (schema) DO UPDATE schema=jsonb_schemas.schema RETURNING id
> >
> >
> > Then as far as I understand no extra lookup is used to return ID:
>
> The conflict resolution checks the unique index on (schema) and
> decides whether or not a conflict will exist. For DO NOTHING it
> doesn't have to get the actual row from the table; however in order
> for it to return the ID it would have to go and get the existing row
> from the table. That's the "extra lookup", as you term it. The only
> difference from doing it with RETURNING id versus WITH... COALESCE()
> as you described is the simpler syntax.

As I know, conflict resolution still has to fetch heap tuples, see
_bt_check_unique().  As I understand it, the issues are as follows.
1) Conflict resolution uses the dirty snapshot.  It's unclear whether
we can return this tuple to the user, because the query has a
different snapshot.  Note that the CTE query by Konstantin at thread start
doesn't handle all the cases correctly, it can return no rows on
conflict. We probably should do the trick similar to the EPQ mechanism
for UPDATE.  For instance, UPDATE ... RETURNING old.* can return the
tuple, which doesn't match the query snapshot.  But INSERT ON CONFLICT
might have other caveats in this area, it needs careful analysis.
2) Checking unique conflicts inside the index am is already the
encapsulation-breaking hack.  Returning the heap tuple for index am
would be even worse hack.  We probably should refactor this whole area
before.

--
Regards,
Alexander Korotkov




Re: Get memory contexts of an arbitrary backend process

2020-09-03 Thread Tom Lane
Kasahara Tatsuhito  writes:
> Yes, but it's not only for future expansion, but also for the
> usability and the stability of this feature.
> For example, if you want to read one dumped file multiple times and analyze 
> it,
> you will want the ability to just read the dump.

If we design it to make that possible, how are we going to prevent disk
space leaks from never-cleaned-up dump files?

regards, tom lane




Re: Fix for configure error in 9.5/9.6 on macOS 11.0 Big Sur

2020-09-03 Thread Tom Lane
Jesse Zhang  writes:
>> Peter Eisentraut  writes:
>>> Where did the -Werror come from?

> If you noticed the full invocation of clang, you'd notice that Werror is
> nowhere on the command line, even though the error message suggests
> otherwise. I think this is a behavior from the new AppleClang,

Hmph.  If you explicitly say -Wno-error, does the error drop back down
to being a warning?

> I've heard reports of the same under the latest Xcode 12 on macOS
> Catalina, but I don't have my hands on such an env.

The latest thing available to the unwashed masses seems to be
Xcode 11.7 with

$ clang --version
Apple clang version 11.0.3 (clang-1103.0.32.62)

At least, that's what I got when I reinstalled Xcode just now on
my Catalina machine.  It does not exhibit this behavior.  I see

$ clang -c c.c
c.c:1:14: warning: implicitly declaring library function 'exit' with type 'void
  (int) __attribute__((noreturn))' [-Wimplicit-function-declaration]
int main() { exit(0); }
 ^
c.c:1:14: note: include the header  or explicitly provide a
  declaration for 'exit'
1 warning generated.

and PG configure and build goes through just fine.

Smells like an Apple bug from here.  Surely they're not expecting
that anyone will appreciate -Werror suddenly being the default.

regards, tom lane




Re: Maximum password length

2020-09-03 Thread Tom Lane
I wrote:
> This could be refined; in particular, I think that most of the
> password-prompting sites could drop their separate have_password
> flags in favor of checking whether the password pointer is NULL
> or not.  That would likely also prove that some of the free(password)
> calls I sprinkled in are unnecessary.

Hearing no objections to this general plan, I went ahead and did that
cleanup.  This version seems committable to me.

regards, tom lane

diff --git a/contrib/oid2name/oid2name.c b/contrib/oid2name/oid2name.c
index 91b7958c48..5a884e2904 100644
--- a/contrib/oid2name/oid2name.c
+++ b/contrib/oid2name/oid2name.c
@@ -12,6 +12,7 @@
 #include "catalog/pg_class_d.h"
 #include "common/connect.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
 #include "pg_getopt.h"
@@ -293,8 +294,7 @@ PGconn *
 sql_conn(struct options *my_opts)
 {
 	PGconn	   *conn;
-	bool		have_password = false;
-	char		password[100];
+	char	   *password = NULL;
 	bool		new_pass;
 	PGresult   *res;
 
@@ -316,7 +316,7 @@ sql_conn(struct options *my_opts)
 		keywords[2] = "user";
 		values[2] = my_opts->username;
 		keywords[3] = "password";
-		values[3] = have_password ? password : NULL;
+		values[3] = password;
 		keywords[4] = "dbname";
 		values[4] = my_opts->dbname;
 		keywords[5] = "fallback_application_name";
@@ -336,11 +336,10 @@ sql_conn(struct options *my_opts)
 
 		if (PQstatus(conn) == CONNECTION_BAD &&
 			PQconnectionNeedsPassword(conn) &&
-			!have_password)
+			!password)
 		{
 			PQfinish(conn);
-			simple_prompt("Password: ", password, sizeof(password), false);
-			have_password = true;
+			password = simple_prompt("Password: ", false);
 			new_pass = true;
 		}
 	} while (new_pass);
diff --git a/contrib/vacuumlo/vacuumlo.c b/contrib/vacuumlo/vacuumlo.c
index e4019fafaa..532cc596c4 100644
--- a/contrib/vacuumlo/vacuumlo.c
+++ b/contrib/vacuumlo/vacuumlo.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_class_d.h"
 #include "common/connect.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
 #include "pg_getopt.h"
@@ -69,15 +70,11 @@ vacuumlo(const char *database, const struct _param *param)
 	int			i;
 	bool		new_pass;
 	bool		success = true;
-	static bool have_password = false;
-	static char password[100];
+	static char *password = NULL;
 
 	/* Note: password can be carried over from a previous call */
-	if (param->pg_prompt == TRI_YES && !have_password)
-	{
-		simple_prompt("Password: ", password, sizeof(password), false);
-		have_password = true;
-	}
+	if (param->pg_prompt == TRI_YES && !password)
+		password = simple_prompt("Password: ", false);
 
 	/*
 	 * Start the connection.  Loop until we have a password if requested by
@@ -97,7 +94,7 @@ vacuumlo(const char *database, const struct _param *param)
 		keywords[2] = "user";
 		values[2] = param->pg_user;
 		keywords[3] = "password";
-		values[3] = have_password ? password : NULL;
+		values[3] = password;
 		keywords[4] = "dbname";
 		values[4] = database;
 		keywords[5] = "fallback_application_name";
@@ -115,12 +112,11 @@ vacuumlo(const char *database, const struct _param *param)
 
 		if (PQstatus(conn) == CONNECTION_BAD &&
 			PQconnectionNeedsPassword(conn) &&
-			!have_password &&
+			!password &&
 			param->pg_prompt != TRI_NO)
 		{
 			PQfinish(conn);
-			simple_prompt("Password: ", password, sizeof(password), false);
-			have_password = true;
+			password = simple_prompt("Password: ", false);
 			new_pass = true;
 		}
 	} while (new_pass);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 02b6c3f127..36565df4fc 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -698,7 +698,7 @@ recv_password_packet(Port *port)
 	}
 
 	initStringInfo(&buf);
-	if (pq_getmessage(&buf, 1000))	/* receive password */
+	if (pq_getmessage(&buf, 0)) /* receive password */
 	{
 		/* EOF - pq_getmessage already logged a suitable message */
 		pfree(buf.data);
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 786672b1b6..b62f8f6a5e 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -67,6 +67,7 @@
 #include "common/file_utils.h"
 #include "common/logging.h"
 #include "common/restricted_token.h"
+#include "common/string.h"
 #include "common/username.h"
 #include "fe_utils/string_utils.h"
 #include "getaddrinfo.h"
@@ -1481,23 +1482,25 @@ setup_auth(FILE *cmdfd)
 static void
 get_su_pwd(void)
 {
-	char		pwd1[100];
-	char		pwd2[100];
+	char	   *pwd1;
 
 	if (pwprompt)
 	{
 		/*
 		 * Read password from terminal
 		 */
+		char	   *pwd2;
+
 		printf("\n");
 		fflush(stdout);
-		simple_prompt("Enter new superuser password: ", pwd1, sizeof(pwd1), false);
-		simple_prompt("Enter it again: ", pwd2, sizeof(pwd2), false);
+		pwd1 = simple_prompt("Enter new superuser password: ", false);
+		pwd2 = simple_prompt("Enter it again: ", false);
 		if (strcmp(pwd1, pwd2) != 0)
 		{
 			fprintf(stderr, 

Re: Get memory contexts of an arbitrary backend process

2020-09-03 Thread Kasahara Tatsuhito
Hi,

On Thu, Sep 3, 2020 at 3:38 PM torikoshia  wrote:
> >> - Currently, "the signal transmission for dumping memory
> >> information"
> >> and "the read & output of dump information "
> >> are on the same interface, but I think it would be better to
> >> separate them.
> >> How about providing the following three types of functions for
> >> users?
> >> - send a signal to specified pid
> >> - check the status of the signal sent and received
> >> - read the dumped information
>
> Is this for future extensibility to make it possible to get
> other information like the current execution plan which was
> suggested by Pavel?
Yes, but it's not only for future expansion, but also for the
usability and the stability of this feature.
For example, if you want to read one dumped file multiple times and
analyze it, you will want the ability to just read the dump.
Moreover, when it takes a long time from receiving the signal to
producing the dump output, or when the dump output itself takes a long
time, users can investigate where the time is spent if each step is
separated.

> If so, I agree with considering extensibility, but I'm not
> sure it's necessary whether providing these types of
> functions for 'users'.
Of course, it is possible and may be necessary to provide a wrapped
sequence of steps, from sending a signal to reading the dump files.
But IMO, some users would like to perform the signal transmission,
state management and dump file reading steps separately.

Best regards,

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com




Re: shared-memory based stats collector

2020-09-03 Thread Stephen Frost
Greetings,

* Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> At Mon, 01 Jun 2020 18:00:01 +0900 (JST), Kyotaro Horiguchi 
>  wrote in 
> > Rebased on the current HEAD. 36ac359d36 conflicts with this. Tranche
> 
> Hmm. This conflicts with 0fd2a79a63. Reabsed on it.

Thanks for working on this and keeping it updated!

I've started taking a look and at least right off...

> >From 4926e50e7635548f86dcd0d36cbf56d168a5d242 Mon Sep 17 00:00:00 2001
> From: Kyotaro Horiguchi 
> Date: Mon, 16 Mar 2020 17:15:35 +0900
> Subject: [PATCH v35 1/7] Use standard crash handler in archiver.
> 
> Commit 8e19a82640 changed the SIGQUIT handler of almost all processes
> not to run atexit callbacks, for safety. The archiver process should
> behave the same way for the same reason. The exit status changes from
> 1 to 2, but that doesn't make any other behavioral change.

Shouldn't this:

a) be back-patched, as the other change was
b) also include a change to have the stats collector (which I realize is
   removed later on in this patch set, but we're talking about fixing
   existing things..) for the same reason, and because there isn't much
   point in trying to write out the stats after we get a SIGQUIT, since
   we're just going to blow them away again since we're going to go
   through crash recovery..?

Might be good to have a separate thread to address these changes.

I've looked through (some of) this thread and through the patches also
and hope to provide a review of the bits that should be targetting v14
(unlike the above) soon.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: INSERT ON CONFLICT and RETURNING

2020-09-03 Thread Geoff Winkless
On Mon, 31 Aug 2020 at 14:53, Konstantin Knizhnik
 wrote:
> If we are doing such query:
>
> INSERT INTO jsonb_schemas (schema) VALUES (obj_schema)
>ON CONFLICT (schema) DO UPDATE schema=jsonb_schemas.schema RETURNING id
>
>
> Then as far as I understand no extra lookup is used to return ID:

The conflict resolution checks the unique index on (schema) and
decides whether or not a conflict will exist. For DO NOTHING it
doesn't have to get the actual row from the table; however in order
for it to return the ID it would have to go and get the existing row
from the table. That's the "extra lookup", as you term it. The only
difference from doing it with RETURNING id versus WITH... COALESCE()
as you described is the simpler syntax.

I'm not saying the simpler syntax isn't nice, mind you. I was just
pointing out that it's not inherently any less efficient.

Geoff




Re: INSERT ON CONFLICT and RETURNING

2020-09-03 Thread Konstantin Knizhnik




On 03.09.2020 19:30, Marko Tiikkaja wrote:

There's prior art on this: https://commitfest.postgresql.org/15/1241/


.m

Ooops:(
Thank you.
I missed it.

But frankly speaking, I still didn't find an answer to my question in this 
thread: what are the dangerous scenarios with ON CONFLICT DO NOTHING/SELECT?
Yes, the record is not exclusively locked. But I just want to obtain the 
value of some column which is not a source of the conflict. I do not 
understand what can go wrong if some other transaction changed this column.

And I certainly can't agree with Peter's statement:
> Whereas here, with ON CONFLICT DO SELECT,
> I see a somewhat greater risk, and a much, much smaller benefit. A
> benefit that might actually be indistinguishable from zero.

From my point of view it is quite a common use case when we need to 
convert some long key to a small autogenerated record identifier.
Without UPSERT we have to perform two queries instead of just one. And 
even with the current implementation of INSERT ON CONFLICT...
we have to either perform an extra lookup or produce a new (useless) 
tuple version.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Disk-based hash aggregate's cost model

2020-09-03 Thread Jeff Davis
On Wed, 2020-09-02 at 17:35 -0700, Peter Geoghegan wrote:
> On Wed, Sep 2, 2020 at 5:18 PM Jeff Davis  wrote:
> > create table text10m(t text collate "C.UTF-8", i int, n numeric);
> > insert into text10m select s.g::text, s.g, s.g::numeric from
> > (select
> > (random()*10)::int as g from generate_series(1,1000))
> > s;
> > explain analyze select distinct t from text10m;
> 
> Note that you won't get what Postgres considers to be the C collation
> unless you specify "collate C" -- "C.UTF-8" is the C collation
> exposed
> by glibc. The difference matters a lot, because only the former can
> use abbreviated keys (unless you manually #define TRUST_STRXFRM). And
> even without abbreviated keys it's probably still significantly
> faster
> for other reasons.

Thank you. I reran with:

  create table text10m2(t text collate "C", i int, n numeric);
  -- same data, same queries

And the new table is:

Plan | work | 10M  | 100M INT4 | 100M | 10M  | 10M
 |  mem | INT4 | 10M grps  | INT4 | TEXT | TEXTC
-+--+--+---+--+--+--
HashAgg  | 4MB  |  88  |  63   |  82  |  78  |  80
HashAgg  | 1TB  |  41  |  37   |  33  |  38  |  43
Sort | 4MB  | 182  | 188   | 174  |  37  | 146
Sort | 1TB  | 184  | 231   | 189  |  30  | 149
HashAgg* | 4MB  | 192  | 131   | 178  | 166  | 176

*: patched

For the 'COLLATE "C"' case, the costs still come out the almost the
same between HashAgg and Sort, but the runtimes are much closer. So
even if it did flip the plan from HashAgg to Sort, it goes from 9.5s
(HashAgg) to 12s (Sort), which is not so bad.

So the patched version looks good to me at this point. It accounts for
Tomas's observations about IO:

  "The other thing is that sort seems to be doing only about half the
physical I/O (as measured by iosnoop) compared to hashagg, even though
the estimates of pages / input_bytes are exactly the same."

by penalizing HashAgg disk costs by 2X.

The patch also accounts for his other observation about missing CPU
costs by costing the spilling. Tomas framed the CPU costs as the cost
of the extra lookups, but the extra lookups are only in the cases where
it misses in the hash table and needs to spill. So I think it's
reasonable to consider the extra lookups as a part of the spill cost.
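
Schematically, the penalty has roughly this shape (an editorial
illustration with made-up names, not the actual costsize.c code, which
is more involved):

/*
 * Rough shape of the disk-cost adjustment discussed above: charge spill
 * I/O at twice the sequential rate, and charge CPU for the extra
 * lookups of tuples that had to be spilled.  seq_page_cost and
 * cpu_operator_cost are the planner GUCs.
 */
static double
hashagg_spill_cost(double pages_spilled, double tuples_spilled)
{
    return 2.0 * seq_page_cost * pages_spilled +
        cpu_operator_cost * tuples_spilled;
}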

The remaining problems are:

* comparison costs for Sort should be adjusted to make them relatively
consistent between data types
* in-memory HashAgg is unfairly favored in a lot of cases

I don't think either of those problems need to be (or should be) fixed
in 13, but we can revisit in 14 if there are reports of bad plans.

Regards,
Jeff Davis






Re: INSERT ON CONFLICT and RETURNING

2020-09-03 Thread Marko Tiikkaja
There's prior art on this: https://commitfest.postgresql.org/15/1241/


.m


Re: Allow continuations in "pg_hba.conf" files

2020-09-03 Thread Tom Lane
I wrote:
> Accordingly, I borrowed some code from that thread and present
> the attached revision.  I also added some test coverage, since
> that was lacking before, and wordsmithed docs and comments slightly.

Hearing no comments, pushed that way.

regards, tom lane
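
(For reference, the rule the committed feature implements is simply that
a backslash at the end of a line joins the next physical line.  Below is
a standalone sketch of that rule, not the committed tokenize_file()
code; it uses a fixed-size buffer for brevity where the real code grows
a StringInfo.)

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Read one logical line into out[], joining physical lines that end in
 * a backslash.  Returns false at EOF with nothing read.
 */
static bool
read_logical_line(char *out, size_t outsize, FILE *fp)
{
    char        buf[1024];
    bool        got_data = false;

    out[0] = '\0';
    while (fgets(buf, sizeof(buf), fp) != NULL)
    {
        size_t      len = strcspn(buf, "\n");

        got_data = true;
        buf[len] = '\0';
        if (len > 0 && buf[len - 1] == '\\')
        {
            buf[len - 1] = '\0';    /* drop the backslash, keep reading */
            strncat(out, buf, outsize - strlen(out) - 1);
            continue;
        }
        strncat(out, buf, outsize - strlen(out) - 1);
        break;
    }
    return got_data;
}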




Re: INSERT ON CONFLICT and RETURNING

2020-09-03 Thread Konstantin Knizhnik



On 22.08.2020 10:16, Konstantin Knizhnik wrote:

Hi hackers,

I am sorry for the question, which may already have been discussed 
multiple times.
But I have not found an answer for it either on the internet or in the 
pgsql-hackers archive.
The UPSERT (INSERT ... ON CONFLICT ...) clause was added to Postgres a 
long time ago.
As far as I remember there were long discussions about its syntax and 
functionality.
But today I found that there is still no way to perform one of the 
most frequently needed operations:
locate a record by key and return its autogenerated ID, or insert a new 
record if the key is absent.


Something like this:

  create table jsonb_schemas(id serial, schema bytea primary key);
  create index on jsonb_schemas(id);
  insert into jsonb_schemas (schema) values (?) on conflict(schema) do 
nothing returning id;


But it doesn't work because in case of conflict no value is returned.
It is possible to do something like this:

  with ins as (insert into jsonb_schemas (schema) values (obj_schema) 
on conflict(schema) do nothing returning id) select coalesce((select 
id from ins),(select id from jsonb_schemas where schema=obj_schema));


but it requires an extra lookup.
Or perform an update:

  insert into jsonb_schemas (schema) values (?) on conflict(schema) do 
update set schema=excluded.schema returning id;


But it is even worse because we have to perform a useless update and 
produce a new row version.


Maybe I am missing something, but according to stackoverflow:
https://stackoverflow.com/questions/34708509/how-to-use-returning-with-on-conflict-in-postgresql 


there is no better solution.

I wonder how it can happen that such a popular use case is not covered 
by PostgreSQL's UPSERT?

Are there some principal problems with it?
Why is it not possible to add one more on-conflict action, SELECT, 
making it possible to return data when the key is found?


Thanks in advance,
Konstantin


I'm sorry for being intrusive.
But can somebody familiar with the Postgres upsert mechanism explain to 
me why the current implementation doesn't support a very popular use case:
locate a record by some unique key and return its primary 
(autogenerated) key if found, otherwise insert a new tuple.

I have explained the possible workarounds for the problem above.
But all of them look awful or inefficient.

What I am suggesting is just add ON CONFLICT DO SELECT clause:

insert into jsonb_schemas (schema) values ('one') on conflict(schema) do 
select returning id;


I attached a small patch with a prototype implementation of this construction.
It seems to be very trivial. What's wrong with it?
Are there fundamental problems which I do not understand?

Below is small illustration of how this patch is working:

postgres=# create table jsonb_schemas(id serial, schema bytea primary key);
CREATE TABLE
postgres=# create index on jsonb_schemas(id);
CREATE INDEX
postgres=# insert into jsonb_schemas (schema) values ('some') on 
conflict(schema) do nothing returning id;

 id

  1
(1 row)

INSERT 0 1
postgres=# insert into jsonb_schemas (schema) values ('some') on 
conflict(schema) do nothing returning id;

 id

(0 rows)

INSERT 0 0
postgres=# insert into jsonb_schemas (schema) values ('some') on 
conflict(schema) do select returning id;

 id

  1
(1 row)

INSERT 0 1


Thanks in advance,
Konstantin

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index c98c9b5..8a22b8c 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -3755,8 +3755,10 @@ show_modifytable_info(ModifyTableState *mtstate, List *ancestors,
 	if (node->onConflictAction != ONCONFLICT_NONE)
 	{
 		ExplainPropertyText("Conflict Resolution",
-			node->onConflictAction == ONCONFLICT_NOTHING ?
-			"NOTHING" : "UPDATE",
+			node->onConflictAction == ONCONFLICT_NOTHING
+			? "NOTHING"
+			: node->onConflictAction == ONCONFLICT_SELECT
+			? "SELECT" : "UPDATE",
 			es);
 
 		/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 20a4c47..8e64061 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -545,6 +545,8 @@ ExecInsert(ModifyTableState *mtstate,
 	else
 		goto vlock;
 }
+/* committed conflict tuple found */
+
 else
 {
 	/*
@@ -558,11 +560,26 @@ ExecInsert(ModifyTableState *mtstate,
 	 * type. As there's no conflicting usage of
 	 * ExecGetReturningSlot() in the DO NOTHING case...
 	 */
-	Assert(onconflict == ONCONFLICT_NOTHING);
+	Assert(onconflict == ONCONFLICT_NOTHING || onconflict == ONCONFLICT_SELECT);
 	ExecCheckTIDVisible(estate, resultRelInfo, &conflictTid,
 		ExecGetReturningSlot(estate, resultRelInfo));
 	InstrCountTuples2(&mtstate->ps, 1);
-	return NULL;
+	if (onconflict == ONCONFLICT_SELECT && resultRelInfo->ri_projectReturning)
+	{
+		TupleTableSlot *existing = resultRelInfo->ri_onConflict->oc_Existing;
+		TM_FailureData tmfd;
+		TM_Result	   test = 

Re: Kerberos support broken on MSVC builds for Windows x64?

2020-09-03 Thread Dave Page
On Thu, Sep 3, 2020 at 4:15 PM Dave Page  wrote:

>
> So having rebuilt PostgreSQL against that, I'm now in the situation where
> the server never even attempts to get a ticket as far as I can see, and
> psql just crashes with nothing more than a useless error in the event log:
>
> Faulting application name: psql.exe, version: 14.0.0.20246, time stamp:
> 0x5f50e477
> Faulting module name: unknown, version: 0.0.0.0, time stamp: 0x
> Exception code: 0xc005
> Fault offset: 0x
> Faulting process id: 0xd10
> Faulting application start time: 0x01d681f189a17360
> Faulting application path: C:\pg\bin\psql.exe
> Faulting module path: unknown
> Report Id: eb68d787-1c82-420d-8878-bc0648932a5d
> Faulting package full name:
> Faulting package-relative application ID:
>
> So I'm going to have to break out the debugger, though I suspect this may
> require more effort than I have time for right now.
>

Yeah, this is almost certainly well beyond what I have the time to figure
out. Happy to do any testing etc. that may be needed, but I think this
needs someone familiar with the GSS API to take the lead.

Here's what I got from psql in the debugger:

Exception thrown at 0x in psql.exe: 0xC005: Access
violation executing location 0x. occurred

()
krb5_64.dll!51942807()
krb5_64.dll!5194214b()
krb5_64.dll!51980611()
krb5_64.dll!519766cb()
krb5_64.dll!519670ff()
gssapi64.dll!51bb1839()
gssapi64.dll!51bb48e4()
gssapi64.dll!51bb4575()
gssapi64.dll!51b993df()
libpq.dll!pqsecure_open_gss(pg_conn * conn) Line 632
at
c:\users\dpage\downloads\postgresql\src\interfaces\libpq\fe-secure-gssapi.c(632)
libpq.dll!PQconnectPoll(pg_conn * conn) Line 3173
at
c:\users\dpage\downloads\postgresql\src\interfaces\libpq\fe-connect.c(3173)
libpq.dll!connectDBComplete(pg_conn * conn) Line 2187
at
c:\users\dpage\downloads\postgresql\src\interfaces\libpq\fe-connect.c(2187)
libpq.dll!PQconnectdbParams(const char * const * keywords, const char *
const * values, int expand_dbname) Line 655
at
c:\users\dpage\downloads\postgresql\src\interfaces\libpq\fe-connect.c(655)
psql.exe!main(int argc, char * * argv) Line 266
at c:\users\dpage\downloads\postgresql\src\bin\psql\startup.c(266)
[External Code]

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com


Re: Fix for configure error in 9.5/9.6 on macOS 11.0 Big Sur

2020-09-03 Thread Jesse Zhang
Hi Tom and Peter,

On Wed, Sep 2, 2020 at 10:40 PM Tom Lane  wrote:
>
> Peter Eisentraut  writes:
> > On 2020-09-02 22:43, Jesse Zhang wrote:
> >> | conftest.c:184:3: error: implicitly declaring library function
> >> 'exit' with type 'void (int) __attribute__((noreturn))'
> >> [-Werror,-Wimplicit-function-declaration]
>
> > Where did the -Werror come from?
>
> Peter wasn't entirely explicit here, but note the advice at the end of
>
> https://www.postgresql.org/docs/devel/install-procedure.html
>
> that you cannot include -Werror in any CFLAGS you tell configure
> to use.  It'd be nice if autoconf was more robust about that,
> but it is not our bug to fix.
>
> regards, tom lane

If you noticed the full invocation of clang, you'd notice that Werror is
nowhere on the command line, even though the error message suggests
otherwise. I think this is a behavior from the new AppleClang, here's
the minimal repro:

int main() { exit(0); }

And boom!

$ clang -c c.c
c.c:1:14: error: implicitly declaring library function 'exit' with
type 'void (int) __attribute__((noreturn))'
[-Werror,-Wimplicit-function-declaration]
int main() { exit(0); }
 ^
c.c:1:14: note: include the header  or explicitly provide a
declaration for 'exit'
1 error generated.

My environment:

$ uname -rsv
Darwin 20.0.0 Darwin Kernel Version 20.0.0: Fri Aug 14 00:25:13 PDT
2020; root:xnu-7195.40.44.151.1~4/RELEASE_X86_64
$ clang --version
Apple clang version 12.0.0 (clang-1200.0.31.1)
Target: x86_64-apple-darwin20.0.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

I've heard reports of the same under the latest Xcode 12 on macOS
Catalina, but I don't have my hands on such an env.

Cheers,
Jesse
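
(For what it's worth, the same translation unit compiles cleanly once
exit() is declared, which is consistent with only the
implicit-declaration warning having been promoted to an error:)

#include <stdlib.h>     /* declares exit(), silencing the diagnostic */

int
main(void)
{
    exit(0);
}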




Re: [PATCH] Missing links between system catalog documentation pages

2020-09-03 Thread Dagfinn Ilmari Mannsåker
Hi Peter,

Peter Eisentraut  writes:

> On 2020-06-21 19:49, Dagfinn Ilmari Mannsåker wrote:
>> There were only three cases of multiple mentions of the same table in a
>> single paragraph, I've removed them in the attached patch.
>>
>> I've also added a second patch that makes the SQL commands links.  There
>> were some cases of the same commands being mentioned in the descriptions
>> of multiple columns in the same table, but I've left those in place,
>> since that feels less disruptive than in prose.
>
> Committed after some rebasing and tweaking.

Thanks!

I just noticed that both this and the command link patch modified the
same sentence about CREATE DATABASE and pg_database, so those changes
seem to have been lost in the merge.  Attached is a follow-up patch that
adds them both.

- ilmari
-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law

>From d043e64c925a21d42bfd28e6c4ca2e44e4b4d356 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Sun, 24 May 2020 21:48:29 +0100
Subject: [PATCH] Add two more missing command and catalog cross-links

The two earlier doc patches both modified this paragraph, so these two
changes appear to have been lost while resolving the conflict.
---
 doc/src/sgml/catalogs.sgml | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 508bea3bc6..6f2e0d19dc 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -14,13 +14,14 @@
tables.  You can drop and recreate the tables, add columns, insert
and update values, and severely mess up your system that way.
Normally, one should not change the system catalogs by hand, there
-   are normally SQL commands to do that.  (For example, CREATE
-   DATABASE inserts a row into the
-   pg_database catalog  and actually
-   creates the database on disk.)  There are some exceptions for
-   particularly esoteric operations, but many of those have been made
-   available as SQL commands over time, and so the need for direct manipulation
-   of the system catalogs is ever decreasing.
+   are normally SQL commands to do that.  (For example,  inserts a row into the pg_database
+   catalog  and actually creates the database on disk.)  There
+   are some exceptions for particularly esoteric operations, but many
+   of those have been made available as SQL commands over time, and so
+   the need for direct manipulation of the system catalogs is ever
+   decreasing.
   
 
  
-- 
2.27.0



Re: A micro-optimisation for walkdir()

2020-09-03 Thread Tom Lane
Thomas Munro  writes:
> On Thu, Sep 3, 2020 at 5:36 PM Tom Lane  wrote:
>> Both of these concerns would abate if we had get_dirent_type()
>> just throw an error itself when stat() fails, thereby removing the
>> PGFILETYPE_ERROR result code.  I'm not 100% sold either way on
>> that, but it's something to think about.  Is there ever going
>> to be a reason for the caller to ignore an error?

> Hmm.  Well I had it like that in an earlier version, but then I
> couldn't figure out the right way to write code that would work in
> both frontend and backend code, without writing two copies in two
> translation units, or putting the whole thing in a header.  What
> approach do you prefer?

Well, there are plenty of places in src/port/ where we do things like

#ifndef FRONTEND
ereport(ERROR,
(errcode_for_file_access(),
 errmsg("could not get junction for \"%s\": %s",
path, msg)));
#else
fprintf(stderr, _("could not get junction for \"%s\": %s\n"),
path, msg);
#endif

I don't see a compelling reason why this function couldn't report
stat() failures similarly, especially if we're just going to have
the callers do exactly the same thing as that anyway.

regards, tom lane
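
(For context, the micro-optimisation under discussion boils down to the
following POSIX-flavoured sketch, not the proposed src/common function:
trust d_type when the filesystem reports it, and fall back to stat()
only for DT_UNKNOWN.)

#include <dirent.h>
#include <sys/stat.h>

/*
 * Return 1 if the entry is a directory, 0 if not, -1 on stat() failure
 * (the caller reports the error, per the discussion above).
 */
static int
entry_is_dir(const char *path, const struct dirent *de)
{
    struct stat st;

#ifdef DT_DIR
    if (de->d_type != DT_UNKNOWN)
        return de->d_type == DT_DIR;    /* no extra syscall needed */
#endif
    if (lstat(path, &st) != 0)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}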




Re: Kerberos support broken on MSVC builds for Windows x64?

2020-09-03 Thread Dave Page
On Wed, Sep 2, 2020 at 5:21 PM Dave Page  wrote:

>
>
> On Wed, Sep 2, 2020 at 2:47 PM Stephen Frost  wrote:
>
>> Greetings,
>>
>> * Dave Page (dp...@pgadmin.org) wrote:
>> > On Tue, Sep 1, 2020 at 5:29 PM Stephen Frost 
>> wrote:
>> > > * Dave Page (dp...@pgadmin.org) wrote:
>> > > > Attached is a patch against 12.4 for the build system in case anyone
>> > > wants
>> > > > to play (I'll do it properly against the head branch later). I'm
>> guessing
>> > > > this will work for < 12, as with 12 I'm now getting the following
>> which
>> > > > looks like it's related to GSS encryption:
>> > > >
>> > > > "C:\Users\dpage\Downloads\postgresql-12.4\pgsql.sln" (default
>> target)
>> > > (1) ->
>> > > > "C:\Users\dpage\Downloads\postgresql-12.4\pgcrypto.vcxproj" (default
>> > > > target) (2) ->
>> > > > "C:\Users\dpage\Downloads\postgresql-12.4\postgres.vcxproj" (default
>> > > > target) (3) ->
>> > > > (Link target) ->
>> > > >   be-secure-gssapi.obj : error LNK2019: unresolved external symbol
>> setenv
>> > > > referenced in function secure_open_gssapi
>> > > > [C:\Users\dpage\Downloads\postgresql-12.4\postgres.vcxproj]
>> > > >   .\Release\postgres\postgres.exe : fatal error LNK1120: 1
>> unresolved
>> > > > externals
>> [C:\Users\dpage\Downloads\postgresql-12.4\postgres.vcxproj]
>> > > >
>> > > > I'll dig into that some more.
>> > >
>> > > Yes, that'd be in the GSSENC code, which I hadn't been expecting to be
>> > > used under Windows.  If you're successful, I don't have any issue
>> > > helping to make that work, though I'm curious if you're trying to
>> build
>> > > with MIT KfW (which is rather ancient these days, being based on krb5
>> > > 1.13 and not updated since..) or with a more current release...?
>> >
>> > I'm currently using the KFW 4.1 build from MIT. I've tried building it
>> > myself but it requires a very old toolchain (which defeated the point of
>> > what I was trying to do at the time).
>>
>> > I haven't yet looked to see if the source for krb5-1.8.2 will build or
>> even
>> > has the right bits in it for Windows - as I'm sure you know MIT seem to
>> > maintain an entirely different version for Windows for which I assume
>> > there's a reason.
>>
>> I'm a bit confused as to why you'd consider trying 1.8.2- did you mean
>> 1.18.2 there, perhaps..?
>
>
> Yes, typo.
>
>
>> That's what I would think to try, since, as I
>> understand it from following the Kerberos Dev list (which is pretty
>> responsive...) has been updated to work with newer Windows build
>> toolchains.
>>
>
> OK, will try to do that tomorrow.
>
> Thanks!
>

OK, so 1.18.2 builds OK. It's a bit of a faff, but nothing major. It seems
to work fine as a standalone set of tools.

Of course, they've changed the installation paths again - they've dropped
the i386 and amd64 parts from the library path :-/

So having rebuilt PostgreSQL against that, I'm now in the situation where
the server never even attempts to get a ticket as far as I can see, and
psql just crashes with nothing more than a useless error in the event log:

Faulting application name: psql.exe, version: 14.0.0.20246, time stamp:
0x5f50e477
Faulting module name: unknown, version: 0.0.0.0, time stamp: 0x
Exception code: 0xc005
Fault offset: 0x
Faulting process id: 0xd10
Faulting application start time: 0x01d681f189a17360
Faulting application path: C:\pg\bin\psql.exe
Faulting module path: unknown
Report Id: eb68d787-1c82-420d-8878-bc0648932a5d
Faulting package full name:
Faulting package-relative application ID:

So I'm going to have to break out the debugger, though I suspect this may
require more effort than I have time for right now.


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com


Re: Switch to multi-inserts for pg_depend

2020-09-03 Thread Alvaro Herrera
I agree, this version looks much better, thanks.  Two very minor things:

On 2020-Sep-03, Michael Paquier wrote:

> @@ -76,11 +77,23 @@ recordMultipleDependencies(const ObjectAddress *depender,
>  
>   dependDesc = table_open(DependRelationId, RowExclusiveLock);
>  
> + /*
> +  * Allocate the slots to use, but delay initialization until we know 
> that
> +  * they will be used.  The slot initialization is the costly part, and 
> the
> +  * exact number of dependencies inserted cannot be known in advance as 
> it
> +  * depends on what is pinned by the system.
> +  */

I'm not sure you need the second sentence in this comment; keeping the
"delay initialization until ..." part seems sufficient.  If you really
want to highlight that initialization is costly, maybe just say "delay
costly initialization".

> + /*
> +  * Record the Dependency.  Note we don't bother to check for 
> duplicate
> +  * dependencies; there's no harm in them.
> +  */

No need to uppercase "dependency".  (I know this is carried forward from
prior comment, but it was equally unnecessary there.)

>   /*
>* Allocate the slots to use, but delay initialization until we know 
> that
> -  * they will be used.
> +  * they will be used.  A full scan of pg_shdepend is done to find all 
> the
> +  * dependencies from the template database to copy.  Their number is not
> +  * known in advance and the slot initialization is the costly part.
>*/

As above, this change is not needed.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [NBTREE] Possible NULL pointer dereference (backend/access/nbtree/nbutils.c)

2020-09-03 Thread Ranier Vilela
On Wed, Sep 2, 2020 at 8:17 PM Peter Geoghegan  wrote:

> On Wed, Sep 2, 2020 at 3:16 PM Ranier Vilela  wrote:



> Perhaps you recall our discussion of a similar false positive in
> nbtsplitloc.c; that had a similar feel to it. For example, if your
> static analysis tool says that code that has been around for many
> years is riddled with bugs, then the problem is almost certainly with
> the tool (or with how the tool has been used).
>
I'm using about four static analysis tools.
Only one of them produced a report of 67690 pages.
Even optimistically assuming that only 1% are real failures,
that would still be about 670 real flaws in the entire code.
In addition, if the code is compiled with -Wextra, hundreds of type
mismatch alerts are reported, especially comparisons between signed
int and unsigned types.

If you check for leaks, you will find this:
==42483==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 576 byte(s) in 1 object(s) allocated from:
#0 0x7f3204fb7bc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
#1 0x561c58d5b766 in save_ps_display_args /usr/src/postgres/src/backend/utils/misc/ps_status.c:178
#2 0x561c584a90b3 in main /usr/src/postgres/src/backend/main/main.c:89
#3 0x7f3204b7f0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

Indirect leak of 14 byte(s) in 1 object(s) allocated from:
#0 0x7f3204f403dd in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x963dd)
#1 0x561c58d5b817 in save_ps_display_args /usr/src/postgres/src/backend/utils/misc/ps_status.c:186
#2 0x561c584a90b3 in main /usr/src/postgres/src/backend/main/main.c:89
#3 0x7f3204b7f0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

For each reported failure, I analyzed the code without focusing on the
"intention of the code", instead trying to abstract and concentrate on
following the flow of the variables to make sure the alert is invalid.
Maybe I'm missing something, but I'm not making these alerts up out of
thin air.


>
> High performance C code very often relies on representational
> invariants that may not be readily apparent to a compiler --
> especially when dealing with on-disk state. Barring some obvious
> problem like a hard crash, you have to do the work of assessing if the
> code is correct based on the actual assumptions/context of the code. A
> static analysis tool is of very little use if you're not going to put
> the work in.

I'm sure that all these problems will be solved in the future. But how many
others will be added?
I have realized that maybe this is hindering the course of development, so
I decided to stop.

> I went to a lot of trouble to clearly document the code
> in question here (the heap TID stuff in _bt_truncate()) -- did you
> even bother to read those comments once?
>
Yes, I read them.
Maybe in this specific case, and this case only, it would satisfy both the
code and the static analysis tool, but it's just a thought:

pivotheaptid = (ItemPointer) ((char *) tidpivot + IndexTupleSize(tidpivot) -
 sizeof(ItemPointerData));

Anyway, thanks for all the answers.

Ranier Vilela


Re: [PATCH] Detect escape of ErrorContextCallback stack pointers (and from PG_TRY() )

2020-09-03 Thread Tom Lane
Craig Ringer  writes:
> The attached patch series adds support for detecting coding errors where a
> stack-allocated ErrorContextCallback is not popped from the
> error_context_stack before the variable leaves scope.

So my immediate thoughts about this are

(1) It's mighty invasive for what it accomplishes.  AFAIK we have
had few of this class of bug, and so I'm not excited about touching
every callback use in the tree to add defenses.  It's also not great
that future code additions won't be protected unless they remember
to add these magic annotations.  The PG_TRY application is better since
it's wrapped into the existing macro.

(2) I don't like that this is adding runtime overhead to try to detect
such bugs.

(3) I think it's a complete failure that such a bug will only be
detected if the faulty code path is actually taken.  I think it's
quite likely that any such bugs that are lurking are in "can't
happen", or at least hard-to-hit, corner cases --- if they were in
routinely tested paths, we'd have noticed them by now.  So it'd be far
more helpful if we had a static-analysis way to detect such issues.

Thinking about (3), I wonder whether there's a way to instruct Coverity
to detect such errors.

regards, tom lane




Re: On login trigger: take three

2020-09-03 Thread Pavel Stehule
Hi

On Thu, Sep 3, 2020 at 3:43 PM Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

> Hi hackers,
>
> Recently I was asked once again by one of our customers about login
> triggers in Postgres. People are migrating to Postgres from Oracle and
> looking for a Postgres analog of this Oracle feature.
> This topic is not new:
>
>
> https://www.postgresql.org/message-id/flat/1570308356720-0.post%40n3.nabble.com#4748bcb0c5fc98cec0a735dbdffb0c68
>
> https://www.postgresql.org/message-id/flat/OSAPR01MB507373499CCCEA00EAE79875FE2D0%40OSAPR01MB5073.jpnprd01.prod.outlook.com#ed50c248be32be6955c385ca67d6cdc1
>
> and even session connect/disconnect hooks were at one point committed (but
> then reverted).
> As far as I understand, most of the concerns were related to the disconnect
> hook.
> Performing some action on session disconnect is actually much more
> complicated than on login.
> But customers do not need it, unlike actions performed at session start.
>
> I wonder if we are really going to make some steps in this direction?
> The discussion above was finished with "We haven't rejected the concept
> altogether, AFAICT"
>
> I have tried to resurrect this patch and implement on-connect trigger on
> top of it.
> The syntax is almost the same as proposed by Takayuki:
>
> CREATE EVENT TRIGGER mytrigger
> AFTER CONNECTION ON mydatabase
> EXECUTE {PROCEDURE | FUNCTION} myproc();
>
> I have replaced CONNECT with CONNECTION because the latter keyword is
> already recognized by Postgres, and made the ON clause mandatory to avoid
> shift-reduce conflicts.
>
> Actually, specifying the database name is redundant, because we can define
> an on-connect trigger only for the current database (triggers and
> functions are local to the database).
> It may be considered an argument against handling session start using a
> trigger. But it seems to be the most natural mechanism for users.
>
> On connect trigger can be dropped almost in the same way as normal (on
> relation) trigger, but with specifying name of the database instead of
> relation name:
>
> DROP TRIGGER mytrigger ON mydatabase;
>
> It is possible to define arbitrary number of on-connect triggers with
> different names.
>
> I attached my prototype implementation of this feature.
> I just want to be sure first that this feature will be of interest to the
> community.
> If so, I will continue working on it and prepare a new version of the patch
> for the commitfest.
>
>
I have a customer that requires this feature too. They currently use a
solution based on DLL session autoloading.  A native solution would be great.

+1

Pavel

Example
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: Transactions involving multiple postgres foreign servers, take 2

2020-09-03 Thread Masahiko Sawada
On Fri, 28 Aug 2020 at 17:50, Masahiro Ikeda  wrote:
>
> > I think there is a case we can't check orphaned foreign
> > prepared transaction in pg_foreign_xacts view on the new standby
> > server.
> > It confuses users and database administrators.
> >
> > If the primary coordinator crashes after preparing foreign transaction,
> > but before sending XLOG_FDWXACT_INSERT records to the standby server,
> > the standby server can't restore their transaction status and
> > pg_foreign_xacts view doesn't show the prepared foreign transactions.
> >
> > To send XLOG_FDWXACT_INSERT records asynchronously leads this problem.
>
> If the primary replicates XLOG_FDWXACT_INSERT to the standby
> asynchronously,
> some prepared transaction may be unsolved forever.
>
> Since I think to solve this inconsistency manually is hard operation,
> we need to support synchronous XLOG_FDWXACT_INSERT replication.
>
> I understood that there are a lot of impact to the performance,
> but users can control the consistency/durability vs performance
> with synchronous_commit parameter.
>
> What do you think?

I think the user can check such prepared transactions by looking for
transactions that exist in the foreign server's pg_prepared_xacts but
not in the coordinator server's pg_foreign_xacts, no? To make checking
such prepared transactions easy, perhaps we could include the
timestamp in the prepared transaction id. But I'm concerned about
duplicate transaction ids due to clock skew.

If there is a way to identify such unresolved foreign transactions and
it's not cumbersome, then given that the likelihood of the problem you're
concerned about is not high, I guess a certain number of users would be
able to accept it as a restriction. So I'd recommend not dealing with this
problem in the first version of the patch; we will be able to improve the
feature to deal with this problem later, as an additional feature. Thoughts?
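
As a rough sketch of the manual check I mean (pg_prepared_xacts is the
existing system view; the gid prefix here is only illustrative):

-- Run on the foreign server: prepared transactions that the coordinator
-- no longer tracks in pg_foreign_xacts are candidates for manual
-- resolution.
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
WHERE gid LIKE 'fx_%';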

> > Thank you for letting me know. I've attached the latest version patch
> > set.
>
> Thanks for updating.
> But, the latest patches failed to be applied to the master branch.

I'll submit the updated version patch.

Regards,
--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




On login trigger: take three

2020-09-03 Thread Konstantin Knizhnik

Hi hackers,

Recently I was asked once again by one of our customers about login
triggers in Postgres. People are migrating to Postgres from Oracle and
looking for a Postgres analog of this Oracle feature.

This topic is not new:

https://www.postgresql.org/message-id/flat/1570308356720-0.post%40n3.nabble.com#4748bcb0c5fc98cec0a735dbdffb0c68
https://www.postgresql.org/message-id/flat/OSAPR01MB507373499CCCEA00EAE79875FE2D0%40OSAPR01MB5073.jpnprd01.prod.outlook.com#ed50c248be32be6955c385ca67d6cdc1

and even session connect/disconnect hooks were at one point committed (but
then reverted).
As far as I understand, most of the concerns were related to the disconnect
hook.
Performing some action on session disconnect is actually much more
complicated than on login.

But customers do not need it, unlike actions performed at session start.

I wonder if we are really going to make some steps in this direction?
The discussion above was finished with "We haven't rejected the concept
altogether, AFAICT".


I have tried to resurrect this patch and implement on-connect trigger on 
top of it.

The syntax is almost the same as proposed by Takayuki:

CREATE EVENT TRIGGER mytrigger
AFTER CONNECTION ON mydatabase
EXECUTE {PROCEDURE | FUNCTION} myproc();

I have replaced CONNECT with CONNECTION because the latter keyword is
already recognized by Postgres, and made the ON clause mandatory to avoid
shift-reduce conflicts.

Actually, specifying the database name is redundant, because we can define
an on-connect trigger only for the current database (triggers and
functions are local to the database).
It may be considered an argument against handling session start using a
trigger. But it seems to be the most natural mechanism for users.


On connect trigger can be dropped almost in the same way as normal (on 
relation) trigger, but with specifying name of the database instead of 
relation name:


DROP TRIGGER mytrigger ON mydatabase;

It is possible to define an arbitrary number of on-connect triggers with
different names.
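
For illustration, a minimal usage sketch (the log table and the function
body are made up, not part of the patch):

CREATE TABLE connection_log(username text, logged_in timestamptz);

CREATE FUNCTION log_connection() RETURNS trigger AS $$
BEGIN
    INSERT INTO connection_log VALUES (session_user, now());
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER log_connections
    AFTER CONNECTION ON mydatabase
    EXECUTE FUNCTION log_connection();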


I attached my prototype implementation of this feature.
I just want to be sure first that this feature will be of interest to the
community.
If so, I will continue working on it and prepare a new version of the patch
for the commitfest.


Example

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index 289dd1d..b3c1fc2 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -27,7 +27,7 @@ PostgreSQL documentation
  
 
 CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event [ OR ... ] }
-ON table_name
+ON { table_name | database_name }
 [ FROM referenced_table_name ]
 [ NOT DEFERRABLE | [ DEFERRABLE ] [ INITIALLY IMMEDIATE | INITIALLY DEFERRED ] ]
 [ REFERENCING { { OLD | NEW } TABLE [ AS ] transition_relation_name } [ ... ] ]
@@ -41,6 +41,7 @@ CREATE [ CONSTRAINT ] TRIGGER name
 UPDATE [ OF column_name [, ... ] ]
 DELETE
 TRUNCATE
+CONNECTION
 
  
 
diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml
index 4a0e746..22caf4c 100644
--- a/doc/src/sgml/trigger.sgml
+++ b/doc/src/sgml/trigger.sgml
@@ -71,6 +71,23 @@

 

+ You can also define on connection trigger:
+ 
+   CREATE EVENT TRIGGER mytrigger
+   AFTER CONNECTION ON mydatabase
+   EXECUTE {PROCEDURE | FUNCTION} myproc();
+ 
+ On connection triggers can only be defined for the current database.
+ It can be only AFTER trigger,
+ and may not contain WHERE clause or list of columns.
+ Any number of triggers with unique names can be defined for the database.
+ On connection triggers can be dropped by specifying name of the database:
+ 
+   DROP TRIGGER mytrigger ON mydatabase;
+ 
+   
+
+   
 The trigger function must be defined before the trigger itself can be
 created.  The trigger function must be declared as a
 function taking no arguments and returning type trigger.
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index 6dfe1be..da57951 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -1433,11 +1433,23 @@ get_object_address_relobject(ObjectType objtype, List *object,
 
 	/* Extract relation name and open relation. */
 	relname = list_truncate(list_copy(object), nnames - 1);
-	relation = table_openrv_extended(makeRangeVarFromNameList(relname),
-	 AccessShareLock,
-	 missing_ok);
 
-	reloid = relation ? RelationGetRelid(relation) : InvalidOid;
+	address.objectId = InvalidOid;
+	if (objtype == OBJECT_TRIGGER && list_length(relname) == 1)
+	{
+		reloid = get_database_oid(strVal(linitial(relname)), true);
+		if (OidIsValid(reloid))
+			address.objectId = get_trigger_oid(reloid, depname, true);
+	}
+
+	if (!OidIsValid(address.objectId))
+	{
+		relation = 

Re: Reloptions for table access methods

2020-09-03 Thread gkokolatos






‐‐‐ Original Message ‐‐‐
On Tuesday, 1 September 2020 20:21, Jeff Davis  wrote:


>
> I'm fine removing the "validate" parameter completely for the sake of
> consistency.

FWIW, the more I think about this, the more I agree with the removal.
However, isn't this mechanism used for other things too, e.g.
attribute_reloptions and tablespace_reloptions, that rely on the validation
at that layer?  If my understanding is correct, then simply removing the
parameter would not cut it, and a more extensive refactoring would be
needed.

>
> > [snip]
>
> The problem case would be a situation like the following:
>
> 1.  Someone loads an extension that creates a new reloption
> "custom_reloption" of kind RELOPT_KIND_HEAP for their table access
> method "my_tableam".
>
> 2.  They then create a table and forget to specify "USING my_tableam",
> but use the option "custom_reloption=123".
>
> Ideally, that would throw an error because "custom_reloption" is only
> valid for "my_tableam"; but with my patch, no error would be thrown
> because the extension has already added the reloption. It would just
> create a normal heap and "custom_reloption=123" would be ignored.

This is something that I struggle to understand as an "error". In the example,
the set RELOPT_KIND_HEAP was extended for everyone, regardless of whether
the newly added member will be used or not.

I mean, if the intention was to add reloptions specific to the extension,
shouldn't a new RELOPT_KIND_XXX be introduced? You seem to be thinking
along the same lines. Please, correct me if I understand you wrong.
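
For what it's worth, a minimal sketch of what I have in mind, using the
existing add_reloption_kind()/add_int_reloption() API (the option name and
limits are made up):

#include "postgres.h"
#include "access/reloptions.h"

static relopt_kind my_tableam_relopt_kind;

void
_PG_init(void)
{
    /* Reserve a new kind instead of extending RELOPT_KIND_HEAP. */
    my_tableam_relopt_kind = add_reloption_kind();
    add_int_reloption(my_tableam_relopt_kind, "custom_reloption",
                      "An option only my_tableam understands",
                      0, 0, 1024, AccessExclusiveLock);
}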

What I am trying to say is that, with the current patch, I feel the behaviour
is neither strange nor unexpected.


> I went with the simple approach because fixing that problem seemed a
> bit over-engineered.

Fixing that problem seems worth it in the long run. I do see the benefit of
the simple approach in the meantime.

//Georgios

>
> Regards,
> Jeff Davis
>






Re: [PATCH] Missing links between system catalog documentation pages

2020-09-03 Thread Peter Eisentraut

On 2020-06-21 19:49, Dagfinn Ilmari Mannsåker wrote:

There were only three cases of multiple mentions of the same table in a
single paragraph, I've removed them in the attached patch.

I've also added a second patch that makes the SQL commands links.  There
were some cases of the same commands being mentioned in the descriptions
of multiple columns in the same table, but I've left those in place,
since that feels less disruptive than in prose.


Committed after some rebasing and tweaking.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: PoC: custom signal handler for extensions

2020-09-03 Thread Pavel Borisov
>
> Is there any chance you will restart work on this patch?
>

I noticed that despite interest in the matter, the discussion in this CF
thread stopped quite a while ago. So I'd like to push here a patch revised
according to the previous discussion. The patch has actually been used as
part of the pg_query_state extension across several major PostgreSQL
releases. So I'd like to raise the matter of reviewing this updated patch
and including it in version 14, as I see much use for this change.

I welcome all your considerations,

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 


v3-0001-Custom-signal-handler-for-extensions.patch
Description: Binary data


[PATCH] Detect escape of ErrorContextCallback stack pointers (and from PG_TRY() )

2020-09-03 Thread Craig Ringer
Hi folks

The attached patch series adds support for detecting coding errors where a
stack-allocated ErrorContextCallback is not popped from the
error_context_stack before the variable leaves scope.

With the attached patches this code will emit a backtrace and abort() on
cassert builds at the "return" statement where it leaks the stack pointer:

ErrorContextCallback cb pg_errcontext_check();
...
cb.previous = error_context_stack;
error_context_stack = &cb;
...
if (something)
    return; /* whoops! */
...
error_context_stack = error_context_stack->previous;

as I discuss on this old blog article here:
https://www.2ndquadrant.com/en/blog/dev-corner-error-context-stack-corruption/

Note the pg_errcontext_check() in the above, which enables the check.

The infrastructure added by the patch could well be useful for other
similar validation steps. In fact, I've added a similar validator that
detects escapes from PG_TRY() blocks, so you get a neat assertion failure
at the point of escape instead of a crash at a __longjmp() with a
nonsensical stack. Example using a deliberately introduced mistake in
postgres_fdw as a test:

Without the PG_TRY() patch:

#0  __longjmp () at ../sysdeps/x86_64/__longjmp.S:111
#1  0xc35a5d9e1a902cfc in ?? ()
Backtrace stopped: Cannot access memory at address 0xc0ea5d9e1a902cfc

With the PG_TRY() patch (abbreviated):

#3  0x00abbd4b in pg_try_check_return_guard
#4  0x7f8e2e20e41a in fetch_more_data
#5  0x7f8e2e20a80a in postgresIterateForeignScan
...

Note that valgrind's memcheck, gcc's -fstack-protector, etc are all quite
bad at detecting this class of error, so being able to do so automatically
should be nice. I've had an interesting time chasing down error context
stack mistakes a few times in the past

We unfortunately cannot use it for RAII-style coding because the underlying
__attribute__((cleanup(..))) is not available on MSVC. As usual. But it
remains useful for checking and assertions.

The patch is supplied as a small series:

(1)  Add some documentation on error context callback usage
(2)  Add pg_errcontext_check() to detect ErrorContextCallback escapes
(3)  Enable pg_errcontext_check() for all in-tree users
(4)  Document that generally PG_CATCH() should PG_RE_THROW()
(5)  Assert() if returning from inside a PG_TRY() / PG_CATCH() block
(6)  [RFC] Add a more succinct API for error context stack callbacks
(7)  [RFC] Use __has_attribute for compiler attribute detection where
possible

I also added a patch (6) on top as an RFC of sorts, which adds a more
succinct API for error context setup and teardown. It's not required for
these changes, but it might make sense to adopt it at the same time if
we're changing usage across the tree anyway.

A further patch (7) switches over to using __has_attribute instead of
explicit compiler version checks. Again, optional, more of an RFC.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise
From 65127dced44b737179af53f2d3aca6b1b9993fa2 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Thu, 3 Sep 2020 12:53:44 +0800
Subject: [PATCH 1/9] Add some documentation on error context callback usage

---
 src/include/utils/elog.h | 55 +++-
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 1e09ee0541..18276e1e93 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -221,7 +221,60 @@ extern void pre_format_elog_string(int errnumber, const char *domain);
 extern char *format_elog_string(const char *fmt,...) pg_attribute_printf(1, 2);
 
 
-/* Support for attaching context information to error reports */
+/*
+ * Support for attaching context information to error reports.
+ *
+ * Functions may append their own ErrorContextCallback to the
+ * error_context_stack chain to have callbacks invoked during ereport() or
+ * elog() processing, optionally with a pointer to local state.
+ *
+ * The callback typically calls errcontext() one or more times to append
+ * to the CONTEXT field of the error report.
+ *
+ * Error context callbacks may be omitted when Pg can show that it will not
+ * emit the error message, so it is not safe for an error context callback to
+ * have side-effects.
+ *
+ * Any memory allocated must be in the CurrentMemoryContext which is set to
+ * ErrorContext before the callback is invoked.
+ *
+ * The typical usage pattern is:
+ *
+ * struct MyFuncErrctxArg {
+ *// track your state here for error reporting
+ *int some_info_here;
+ * };
+ *
+ * static void
+ * my_func_errctx_callback(void *arg)
+ * {
+ * struct MyFuncErrctxArg *ctxarg = arg;
+ * errcontext("in my_func() with some_info=%d",
+ * ctxarg->some_info_here;
+ * }
+ *
+ * void
+ * my_func(void)
+ * {
+ * ErrorContextCallback 

Re: Switch to multi-inserts for pg_depend

2020-09-03 Thread Daniel Gustafsson
> On 3 Sep 2020, at 12:19, Michael Paquier  wrote:
> 
> On Thu, Sep 03, 2020 at 09:47:07AM +0200, Daniel Gustafsson wrote:
>> I think this version is a clear improvement.  Nothing more sticks out from a
>> read-through.
> 
> Thanks for taking the time to look at it, Daniel.  We of course could
> still try to figure out how we could group all dependencies without
> worrying about their type, but I'd like to leave that as future work
> for now.  This is much more complex than what's proposed on this
> thread, and I am not sure if we really need to make this stuff more
> complex for this purpose.

Agreed,  I think that's a separate piece of work and discussion.

cheers ./daniel



Re: Switch to multi-inserts for pg_depend

2020-09-03 Thread Michael Paquier
On Thu, Sep 03, 2020 at 09:47:07AM +0200, Daniel Gustafsson wrote:
> I think this version is a clear improvement.  Nothing more sticks out from a
> read-through.

Thanks for taking the time to look at it, Daniel.  We of course could
still try to figure out how we could group all dependencies without
worrying about their type, but I'd like to leave that as future work
for now.  This is much more complex than what's proposed on this
thread, and I am not sure if we really need to make this stuff more
complex for this purpose.
--
Michael


signature.asc
Description: PGP signature


Re: Removal of currtid()/currtid2() and some table AM cleanup

2020-09-03 Thread Michael Paquier
On Fri, Jun 26, 2020 at 01:11:55PM +0900, Michael Paquier wrote:
> From what I can see on this thread, we could just remove currtid() per
> the arguments of the RETURNING ctid clause supported since PG 8.2, but
> it would make more sense to me to just remove both currtid/currtid2()
> at once.

The CF bot is complaining, so here is a rebase for the main patch.
Opinions are welcome on the arguments made upthread.
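
For context, the RETURNING clause that superseded these functions lets a
driver fetch the new ctid directly, e.g. (table and column names made up):

UPDATE tab SET val = val + 1 WHERE id = 1 RETURNING ctid;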
--
Michael
From 9da62abad29e5b2b8a8b24498eba7a2c3f718f9b Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 3 Jun 2020 10:49:41 +0900
Subject: [PATCH v2] Remove currtid() and currtid2()

Those two functions have been used within the Postgres ODBC driver in
the old days, requiring them when connecting to Postgres 8.1 or older.
In 8.2, support for RETURNING has allowed the driver to fetch ctids
by directly using this clause.
---
 src/include/access/heapam.h|   1 -
 src/include/catalog/pg_proc.dat|   7 -
 src/backend/executor/nodeModifyTable.c |   1 -
 src/backend/utils/adt/tid.c| 177 -
 src/test/regress/expected/tid.out  |  88 
 src/test/regress/sql/tid.sql   |  52 
 6 files changed, 326 deletions(-)

diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 92b19dba32..54b2eb7378 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -129,7 +129,6 @@ extern bool heap_hot_search_buffer(ItemPointer tid, Relation relation,
    bool *all_dead, bool first_call);
 
 extern void heap_get_latest_tid(TableScanDesc scan, ItemPointer tid);
-extern void setLastTid(const ItemPointer tid);
 
 extern BulkInsertState GetBulkInsertState(void);
 extern void FreeBulkInsertState(BulkInsertState);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 687509ba92..d9ea8c8244 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -2542,13 +2542,6 @@
 { oid => '1292',
   proname => 'tideq', proleakproof => 't', prorettype => 'bool',
   proargtypes => 'tid tid', prosrc => 'tideq' },
-{ oid => '1293', descr => 'latest tid of a tuple',
-  proname => 'currtid', provolatile => 'v', proparallel => 'u',
-  prorettype => 'tid', proargtypes => 'oid tid', prosrc => 'currtid_byreloid' },
-{ oid => '1294', descr => 'latest tid of a tuple',
-  proname => 'currtid2', provolatile => 'v', proparallel => 'u',
-  prorettype => 'tid', proargtypes => 'text tid',
-  prosrc => 'currtid_byrelname' },
 { oid => '1265',
   proname => 'tidne', proleakproof => 't', prorettype => 'bool',
   proargtypes => 'tid tid', prosrc => 'tidne' },
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 20a4c474cc..fc53a12d56 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -629,7 +629,6 @@ ExecInsert(ModifyTableState *mtstate,
 	if (canSetTag)
 	{
 		(estate->es_processed)++;
-		setLastTid(&slot->tts_tid);
 	}
 
 	/*
diff --git a/src/backend/utils/adt/tid.c b/src/backend/utils/adt/tid.c
index 509a0fdffc..29dddb20e6 100644
--- a/src/backend/utils/adt/tid.c
+++ b/src/backend/utils/adt/tid.c
@@ -267,180 +267,3 @@ hashtidextended(PG_FUNCTION_ARGS)
 			 sizeof(BlockIdData) + sizeof(OffsetNumber),
 			 seed);
 }
-
-
-/*
- *	Functions to get latest tid of a specified tuple.
- *
- *	Maybe these implementations should be moved to another place
- */
-
-static ItemPointerData Current_last_tid = {{0, 0}, 0};
-
-void
-setLastTid(const ItemPointer tid)
-{
-	Current_last_tid = *tid;
-}
-
-/*
- *	Handle CTIDs of views.
- *		CTID should be defined in the view and it must
- *		correspond to the CTID of a base relation.
- */
-static Datum
-currtid_for_view(Relation viewrel, ItemPointer tid)
-{
-	TupleDesc	att = RelationGetDescr(viewrel);
-	RuleLock   *rulelock;
-	RewriteRule *rewrite;
-	int			i,
-natts = att->natts,
-tididx = -1;
-
-	for (i = 0; i < natts; i++)
-	{
-		Form_pg_attribute attr = TupleDescAttr(att, i);
-
-		if (strcmp(NameStr(attr->attname), "ctid") == 0)
-		{
-			if (attr->atttypid != TIDOID)
-elog(ERROR, "ctid isn't of type TID");
-			tididx = i;
-			break;
-		}
-	}
-	if (tididx < 0)
-		elog(ERROR, "currtid cannot handle views with no CTID");
-	rulelock = viewrel->rd_rules;
-	if (!rulelock)
-		elog(ERROR, "the view has no rules");
-	for (i = 0; i < rulelock->numLocks; i++)
-	{
-		rewrite = rulelock->rules[i];
-		if (rewrite->event == CMD_SELECT)
-		{
-			Query	   *query;
-			TargetEntry *tle;
-
-			if (list_length(rewrite->actions) != 1)
-elog(ERROR, "only one select rule is allowed in views");
-			query = (Query *) linitial(rewrite->actions);
-			tle = get_tle_by_resno(query->targetList, tididx + 1);
-			if (tle && tle->expr && IsA(tle->expr, Var))
-			{
-Var		   *var = (Var *) tle->expr;
-RangeTblEntry *rte;
-
-if (!IS_SPECIAL_VARNO(var->varno) &&
-	var->varattno == SelfItemPointerAttributeNumber)
-{
-	rte = 

Re: default partition and concurrent attach partition

2020-09-03 Thread Amit Langote
On Thu, Sep 3, 2020 at 6:50 PM Amit Langote  wrote:
>
> Hi,
>
> Starting a new thread to discuss a bug related to $subject that Hao Wu
> reported on thread titled "ALTER TABLE .. DETACH PARTITION
> CONCURRENTLY" [1].  I have been able to reproduce the bug using steps
> that Hao gave in that email:
>
> create table tpart (i int, j int) partition by range(i);
> create table tpart_1 (like tpart);
> create table tpart_2 (like tpart);
> create table tpart_default (like tpart);
> alter table tpart attach partition tpart_1 for values from (0) to (100);
> alter table tpart attach partition tpart_default default;
> insert into tpart_2 values (110,110), (120,120), (150,150);
>
> Session 1:
>
> begin;
> alter table tpart attach partition tpart_2 for values from (100) to (200);
>
> Session 2:
>
> begin;
> insert into tpart values (110,110), (120,120), (150,150);
> -- blocks, waiting for the lock taken by session 1
>
> Session 1:
>
> end;
>
> Session 2:
>
> select tableoid::regclass, * from tpart;
> end;
>
> The select will show that rows inserted by session 2 are inserted into
> tpart_default, whereas after successfully attaching tpart_2, they do
> not actually belong there.
>
> The problem is that when session 2 inserts those rows into tpart, it
> only knows about 2 partitions: tpart_1, tpart_default, of which it
> selects tpart_default to insert those rows into.  When tpart_default
> is locked to perform the insert, it waits for session 1 to release the
> lock taken on tpart_default during the attach command.  When it is
> unblocked, it proceeds to finish the insert without rechecking the
> partition constraint which would have been updated as result of a new
> partition having been added to the parent.
>
> Note that we don't normally check the partition constraint when
> inserting a row into a partition if the insert occurs via tuple
> routing, which makes sense for non-default partitions whose partition
> constraint cannot change due to concurrent activity.  But this test
> case has shown that the assumption is not safe for a default partition
> whose constraint is a function of other partitions that exist as of
> when the insert occurs.
>
> By the way, if you reverse the order of operations between session 1
> and 2 such that the insert by session 2 occurs first and then the
> attach by session 1, then you will correctly get this error from the
> attach command:
>
> ERROR:  updated partition constraint for default partition
> "tpart_default" would be violated by some row
>
> Attached is a patch to fix things on the insert side.

Forgot to mention that the problem exists as of v12 (commit: 898e5e329).

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com




Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2020-09-03 Thread Amit Langote
Hi Hao,

On Wed, Sep 2, 2020 at 5:25 PM Hao Wu  wrote:
>
> Not related to DETACH PARTITION, but I found a bug in ATTACH PARTITION.
> Here are the steps to reproduce the issue:
>
> create table tpart(i int, j int) partition by range(i);
> create table tpart_1(like tpart);
> create table tpart_2(like tpart);
> create table tpart_default(like tpart);
> alter table tpart attach partition tpart_1 for values from(0) to (100);
> alter table tpart attach partition tpart_default default;
> insert into tpart_2 values(110,110),(120,120),(150,150);
> 1: begin;
> 1: alter table tpart attach partition tpart_2 for values from(100) to (200);
> 2: begin;
> -- insert will be blocked by ALTER TABLE ATTACH PARTITION
> 2: insert into tpart values (110,110),(120,120),(150,150);
> 1: end;
> 2: select * from tpart_default;
> 2: end;
>
>
> After that the partition tpart_default contains (110,110),(120,120),(150,150)
> inserted in session 2, which obviously violates the partition constraint.

Thanks for the report.  That looks like a bug.

I have started another thread to discuss this bug and a patch to fix
it to keep the discussion here focused on the new feature.

Subject: default partition and concurrent attach partition
https://www.postgresql.org/message-id/CA%2BHiwqFqBmcSSap4sFnCBUEL_VfOMmEKaQ3gwUhyfa4c7J_-nA%40mail.gmail.com

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com




default partition and concurrent attach partition

2020-09-03 Thread Amit Langote
Hi,

Starting a new thread to discuss a bug related to $subject that Hao Wu
reported on thread titled "ALTER TABLE .. DETACH PARTITION
CONCURRENTLY" [1].  I have been able to reproduce the bug using steps
that Hao gave in that email:

create table tpart (i int, j int) partition by range(i);
create table tpart_1 (like tpart);
create table tpart_2 (like tpart);
create table tpart_default (like tpart);
alter table tpart attach partition tpart_1 for values from (0) to (100);
alter table tpart attach partition tpart_default default;
insert into tpart_2 values (110,110), (120,120), (150,150);

Session 1:

begin;
alter table tpart attach partition tpart_2 for values from (100) to (200);

Session 2:

begin;
insert into tpart values (110,110), (120,120), (150,150);
-- blocks, waiting for the lock taken by session 1

Session 1:

end;

Session 2:

select tableoid::regclass, * from tpart;
end;

The select will show that rows inserted by session 2 are inserted into
tpart_default, whereas after successfully attaching tpart_2, they do
not actually belong there.

The problem is that when session 2 inserts those rows into tpart, it
only knows about 2 partitions: tpart_1, tpart_default, of which it
selects tpart_default to insert those rows into.  When tpart_default
is locked to perform the insert, it waits for session 1 to release the
lock taken on tpart_default during the attach command.  When it is
unblocked, it proceeds to finish the insert without rechecking the
partition constraint which would have been updated as result of a new
partition having been added to the parent.

Note that we don't normally check the partition constraint when
inserting a row into a partition if the insert occurs via tuple
routing, which makes sense for non-default partitions whose partition
constraint cannot change due to concurrent activity.  But this test
case has shown that the assumption is not safe for a default partition
whose constraint is a function of other partitions that exist as of
when the insert occurs.
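
As an aside, the default partition's implicit constraint (the negation of
all sibling partitions' bounds) can be inspected directly, which makes the
violation easy to see once tpart_2 is attached:

SELECT pg_get_partition_constraintdef('tpart_default'::regclass);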

By the way, if you reverse the order of operations between session 1
and 2 such that the insert by session 2 occurs first and then the
attach by session 1, then you will correctly get this error from the
attach command:

ERROR:  updated partition constraint for default partition
"tpart_default" would be violated by some row

Attached is a patch to fix things on the insert side.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com


[1] 
https://www.postgresql.org/message-id/DM5PR0501MB3910E97A9EDFB4C775CF3D75A42F0%40DM5PR0501MB3910.namprd05.prod.outlook.com


v1-0001-Check-default-partition-s-constraint-even-after-t.patch
Description: Binary data


Re: Improving connection scalability: GetSnapshotData()

2020-09-03 Thread Michael Paquier
On Sun, Aug 16, 2020 at 02:26:57PM -0700, Andres Freund wrote:
> So we get some buildfarm results while thinking about this.

Andres, there is an entry in the CF for this thread:
https://commitfest.postgresql.org/29/2500/

A lot of work has been committed with 623a9ba, 73487a6, 5788e25, etc.
Now that PGXACT is done, how much work is remaining here?
--
Michael


signature.asc
Description: PGP signature


Making index_set_state_flags() transactional

2020-09-03 Thread Michael Paquier
Hi all,

For a couple of things I looked at lately, it would be really useful
to make index_set_state_flags() transactional and replace its use of
heap_inplace_update() by CatalogTupleUpdate():
- When dropping an index used in a replica identity, we could make the
reset of relreplident for the parent table consistent with the moment
the index is marked as invalid:
https://www.postgresql.org/message-id/20200831062304.ga6...@paquier.xyz
- In order to make CREATE INDEX CONCURRENTLY work for partitioned
tables, we need to be able to switch indisvalid for all the index
partitions in one final transaction so as we have a consistent
partition tree at any state of the operation.  Obviously,
heap_inplace_update() is a problem here.

The restriction we have in place now can be lifted thanks to 813fb03,
that removed SnapshotNow, and as far as I looked at the code paths
doing scans of pg_index, I don't see any reasons why we could not make
this stuff transactional.  Perhaps that's just because nobody had use
cases where it would be useful, and I have two of them lately.  Note
that 3c84046 was the original commit that introduced the use of
heap_inplace_update() and index_set_state_flags(), but we don't have
SnapshotNow anymore.  Note as well that we already make the index
swapping transactional in REINDEX CONCURRENTLY for the pg_index
entries.

I got the feeling that this is an issue different than the other
threads where this was discussed, which is why I am beginning a new
thread.  Any thoughts?  Attached is a proposal of patch.

Thanks,
--
Michael
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 1d662e9af4..9b956071f6 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3297,18 +3297,10 @@ validate_index_callback(ItemPointer itemptr, void *opaque)
  * index_set_state_flags - adjust pg_index state flags
  *
  * This is used during CREATE/DROP INDEX CONCURRENTLY to adjust the pg_index
- * flags that denote the index's state.  Because the update is not
- * transactional and will not roll back on error, this must only be used as
- * the last step in a transaction that has not made any transactional catalog
- * updates!
+ * flags that denote the index's state.
  *
- * Note that heap_inplace_update does send a cache inval message for the
+ * Note that CatalogTupleUpdate() does send a cache inval message for the
  * tuple, so other sessions will hear about the update as soon as we commit.
- *
- * NB: In releases prior to PostgreSQL 9.4, the use of a non-transactional
- * update here would have been unsafe; now that MVCC rules apply even for
- * system catalog scans, we could potentially use a transactional update here
- * instead.
  */
 void
 index_set_state_flags(Oid indexId, IndexStateFlagsAction action)
@@ -3317,9 +3309,6 @@ index_set_state_flags(Oid indexId, IndexStateFlagsAction action)
 	HeapTuple	indexTuple;
 	Form_pg_index indexForm;
 
-	/* Assert that current xact hasn't done any transactional updates */
-	Assert(GetTopTransactionIdIfAny() == InvalidTransactionId);
-
 	/* Open pg_index and fetch a writable copy of the index's tuple */
 	pg_index = table_open(IndexRelationId, RowExclusiveLock);
 
@@ -3383,8 +3372,8 @@ index_set_state_flags(Oid indexId, IndexStateFlagsAction action)
 			break;
 	}
 
-	/* ... and write it back in-place */
-	heap_inplace_update(pg_index, indexTuple);
+	/* ... and update it */
+	CatalogTupleUpdate(pg_index, >t_self, indexTuple);
 
 	table_close(pg_index, RowExclusiveLock);
 }


signature.asc
Description: PGP signature


Re: Switch to multi-inserts for pg_depend

2020-09-03 Thread Daniel Gustafsson
> On 3 Sep 2020, at 07:35, Michael Paquier  wrote:
> 
> On Tue, Sep 01, 2020 at 11:53:36AM +0200, Daniel Gustafsson wrote:
>> On 14 Aug 2020, at 20:23, Alvaro Herrera  wrote:
>> 
>>> The logic to keep track number of used slots used is baroque, though -- that
>>> could use a lot of simplification.
>> 
>> What if slot_init was an integer which increments together with the loop
>> variable until max_slots is reached?  If so, maybe it should be renamed
>> slot_init_count and slotCount renamed slot_stored_count to make the their use
>> clearer?
> 
> Good idea, removing slot_init looks like a good thing for readability.
> And the same can be done for pg_shdepend.

I think this version is a clear improvement.  Nothing more sticks out from a
read-through.

cheers ./daniel



Re: extension patch of CREATE OR REPLACE TRIGGER

2020-09-03 Thread Peter Smith
Hi Osumi-san.

Thanks for addressing my previous review comments in your new v08 patch.

I have checked it again.

My only remaining comments are for trivial stuff:



COMMENT trigger.c (tab alignment)

@@ -184,6 +185,13 @@ CreateTrigger(CreateTrigStmt *stmt, const char
*queryString,
  char*oldtablename = NULL;
  char*newtablename = NULL;
  bool partition_recurse;
+ bool is_update = false;
+ HeapTuple newtup;
+ TupleDesc tupDesc;
+ bool replaces[Natts_pg_trigger];
+ Oid existing_constraint_oid = InvalidOid;
+ bool trigger_exists = false;
+ bool trigger_deferrable = false;

Maybe my editor configuration is wrong, but the alignment of
"existing_constraint_oid" still does not look fixed to me.



COMMENT trigger.c (move some declarations)

>> 2. Maybe it is more convenient/readable to declare some of these in the scope
>> where they are actually used.
>> e.g. newtup, tupDesc, replaces.
>I cannot do this because those variables are used
>at the top level in this function. Anyway, thanks for the comment.

Are you sure it can't be done? It looks doable to me.



COMMENT trigger.c ("already exists" message)

In my v07 review I suggested adding a hint for the new "already exists" error.
Of course choice is yours, but since you did not add it I just wanted
to ask was that accidental or deliberate?



COMMENT triggers.sql/.out (typo in comment)

+-- test for detecting imcompatible replacement of trigger

"imcompatible" -> "incompatible"



COMMENT triggers.sql/.out (wrong comment)

+drop trigger my_trig on my_table;
+create or replace constraint trigger my_trig
+  after insert on my_table
+  for each row execute procedure funcB(); --should fail
+create or replace trigger my_trig
+  after insert on my_table
+  for each row execute procedure funcB(); --should fail
+ERROR:  trigger "my_trig" for relation "my_table" is a constraint trigger
+HINT:  use CREATE OR REPLACE CONSTRAINT TRIGGER to replace a costraint trigger

I think the first "--should fail" comment is misleading. First time is OK.



Kind Regards,
Peter Smith.
Fujitsu Australia




Re: New statistics for tuning WAL buffer size

2020-09-03 Thread Fujii Masao




On 2020/09/02 18:56, Masahiro Ikeda wrote:

+/* --
+ * Backend types
+ * --

You seem to have forgotten to add "*/" at the end of the above comment.
This issue could cause the following compiler warning:
../../src/include/pgstat.h:761:1: warning: '/*' within block comment [-Wcomment]


Thanks for the comment. I fixed.


Thanks for the fix! But why are those comments necessary?





The contents of pg_stat_walwrites are reset when the server
is restarted. Isn't this problematic? IMO since pg_stat_walwrites
is a collected statistics view, basically its contents should be
kept even in the case of server restart.


I agree with your opinion.
I modified the patch to use the statistics collector and persist the WAL
statistics.


I changed the view name from pg_stat_walwrites to pg_stat_walwriter.
I think it is better to match naming scheme with other views like 
pg_stat_bgwriter,
which is for bgwriter statistics but it has the statistics related to backend.


I prefer the view name pg_stat_walwriter for the consistency with
other view names. But we also have pg_stat_wal_receiver. Which
makes me think that maybe pg_stat_wal_writer is better for
the consistency. Thought? IMO either of them works for me.
I'd like to hear more opinions about this.




The pg_stat_walwriter view is not security restricted now, so ordinary users
can access it.
It has the same security level as pg_stat_archiver. If you have any comments,
please let me know.


+   dirty_writes bigint

I guess that the column name "dirty_writes" derived from
the DTrace probe name. Isn't this name confusing? We should
rename it to "wal_buffers_full" or something?


+/* --
+ * PgStat_MsgWalWriter Sent by the walwriter to update 
statistics.

This comment seems not accurate because backends also send it.

+/*
+ * WAL writes statistics counter is updated in XLogWrite function
+ */
+extern PgStat_MsgWalWriter WalWriterStats;

This comment seems not right because the counter is not updated in XLogWrite().

+-- There will surely and maximum one record
+select count(*) = 1 as ok from pg_stat_walwriter;

What about changing this comment to "There must be only one record"?

+   WalWriterStats.m_xlog_dirty_writes++;
LWLockRelease(WALWriteLock);

Since WalWriterStats.m_xlog_dirty_writes doesn't need to be protected
with WALWriteLock, isn't it better to increment that after releasing the lock?
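
That is, something like this (just a sketch, using the names from the
patch):

LWLockRelease(WALWriteLock);
/* plain counter; no need to hold WALWriteLock for this */
WalWriterStats.m_xlog_dirty_writes++;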

+CREATE VIEW pg_stat_walwriter AS
+SELECT
+pg_stat_get_xlog_dirty_writes() AS dirty_writes,
+pg_stat_get_walwriter_stat_reset_time() AS stats_reset;
+
 CREATE VIEW pg_stat_progress_vacuum AS

In system_views.sql, the definition of pg_stat_walwriter should be
placed just after that of pg_stat_bgwriter not pg_stat_progress_analyze.

}
-
/*
 * We found an existing collector stats file. Read it and put all the

You seem to accidentally have removed the empty line here.

-errhint("Target must be \"archiver\" or 
\"bgwriter\".")));
+errhint("Target must be \"archiver\" or \"bgwriter\" or 
\"walwriter\".")));

There are two "or" in the message, but the former should be replaced with ","?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Parallel copy

2020-09-03 Thread Greg Nancarrow
> On Wed, Sep 2, 2020 at 3:40 PM vignesh C  wrote:
> I have attached the scripts that I used for the test results I
> mentioned in my previous mail. create.sql file has the table that I
> used, insert_data_gen.txt has the insert data generation scripts. I
> varied the count in insert_data_gen to generate csv files of 1GB, 2GB
> & 5GB & varied the data to generate 1 char, 10 char & 100 char for
> each column for various testing. You can rename insert_data_gen.txt to
> insert_data_gen.sh & generate the csv file.


Hi Vignesh,

I used your script and table definition, multiplying the number of
records to produce a 5GB and 9.5GB CSV file.
I got the following results:


(1) Postgres default settings, 5GB CSV (53 rows):

Copy Type        Duration (s)   Load factor
===========================================
Normal Copy      132.197        -

Parallel Copy
(#workers)
1                98.428         1.34
2                52.753         2.51
3                37.630         3.51
4                33.554         3.94
5                33.636         3.93
6                33.821         3.91
7                34.270         3.86
8                34.465         3.84
9                34.315         3.85
10               33.543         3.94


(2) Postgres increased resources, 5GB CSV (53 rows):

shared_buffers = 20% of RAM (total RAM = 376GB) = 76GB
work_mem = 10% of RAM = 38GB
maintenance_work_mem = 10% of RAM = 38GB
max_worker_processes = 16
max_parallel_workers = 16
checkpoint_timeout = 30min
max_wal_size=2GB


Copy Type        Duration (s)   Load factor
===========================================
Normal Copy      131.835        -

Parallel Copy
(#workers)
1                98.301         1.34
2                53.261         2.48
3                37.868         3.48
4                34.224         3.85
5                33.831         3.90
6                34.229         3.85
7                34.512         3.82
8                34.303         3.84
9                34.690         3.80
10               34.479         3.82



(3) Postgres default settings, 9.5GB CSV (100 rows):

Copy Type        Duration (s)   Load factor
===========================================
Normal Copy      248.503        -

Parallel Copy
(#workers)
1                185.724        1.34
2                99.832         2.49
3                70.560         3.52
4                63.328         3.92
5                63.182         3.93
6                64.108         3.88
7                64.131         3.87
8                64.350         3.86
9                64.293         3.87
10               63.818         3.89


(4) Postgres increased resources, 9.5GB CSV (100 rows):

shared_buffers = 20% of RAM (total RAM = 376GB) = 76GB
work_mem = 10% of RAM = 38GB
maintenance_work_mem = 10% of RAM = 38GB
max_worker_processes = 16
max_parallel_workers = 16
checkpoint_timeout = 30min
max_wal_size=2GB


Copy Type        Duration (s)   Load factor
===========================================
Normal Copy      248.647        -

Parallel Copy
(#workers)
1                182.236        1.36
2                92.814         2.68
3                67.347         3.69
4                63.839         3.89
5                62.672         3.97
6                63.873         3.89
7                64.930         3.83
8                63.885         3.89
9                62.397         3.98
10               64.477         3.86



So as you found, with this particular table definition and data, 1
parallel worker always performs better than normal copy.
The different result obtained for this particular case seems to be
caused by the following factors:
- different table definition (I used a variety of column types)
- amount of data per row (I used less data per row, so more rows per
same size data file)

As I previously observed, if the target table has no indexes,
increasing resources beyond the default settings makes little
difference to the performance.
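
For reference, the parallel runs above used the COPY option added by the
patch set, along the lines of the following (option syntax as of the
current patches, so it may change):

COPY mytable FROM '/path/to/data.csv' WITH (FORMAT csv, PARALLEL 4);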

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Get memory contexts of an arbitrary backend process

2020-09-03 Thread torikoshia

Thanks for reviewing!

I'm going to modify the patch according to your comments.

On 2020-09-01 10:54, Andres Freund wrote:

Hi,

On 2020-08-31 20:22:18 +0900, torikoshia wrote:

After commit 3e98c0bafb28de, we can display the usage of the
memory contexts using pg_backend_memory_contexts system
view.

However, its target is limited to the  process attached to
the current session.

As discussed in the thread[1], it'll be useful to make it
possible to get the memory contexts of an arbitrary backend
process.

Attached PoC patch makes pg_get_backend_memory_contexts()
display memory contexts of the specified PID of the process.


Awesome!



It doesn't display contexts of all the backends but only
the contexts of specified process.
I think it would be enough because I suppose this function
is used after investigations using ps command or other OS
level utilities.


It can be used as a building block if all are needed. Getting the
infrastructure right is the big thing here, I think. Adding more
detailed views on top of that data later is easier.



diff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql

index a2d61302f9..88fb837ecd 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -555,10 +555,10 @@ REVOKE ALL ON pg_shmem_allocations FROM PUBLIC;
 REVOKE EXECUTE ON FUNCTION pg_get_shmem_allocations() FROM PUBLIC;

 CREATE VIEW pg_backend_memory_contexts AS
-SELECT * FROM pg_get_backend_memory_contexts();
+SELECT * FROM pg_get_backend_memory_contexts(-1);


-1 is odd. Why not use NULL or even 0?


+   else
+   {
+   int rc;
+   int parent_len = strlen(parent);
+   int name_len = strlen(name);
+
+   /*
+* write out the current memory context information.
+* Since some elements of values are reusable, we write it out.


Not sure what the second comment line here is supposed to mean?



+*/
+   fputc('D', fpout);
+   rc = fwrite(values, sizeof(values), 1, fpout);
+   rc = fwrite(nulls, sizeof(nulls), 1, fpout);
+
+		/* write out information which is not resuable from serialized 
values */


s/resuable/reusable/



+   rc = fwrite(_len, sizeof(int), 1, fpout);
+   rc = fwrite(name, name_len, 1, fpout);
+   rc = fwrite(, sizeof(int), 1, fpout);
+   rc = fwrite(clipped_ident, idlen, 1, fpout);
+   rc = fwrite(, sizeof(int), 1, fpout);
+   rc = fwrite(_len, sizeof(int), 1, fpout);
+   rc = fwrite(parent, parent_len, 1, fpout);
+   (void) rc;  /* we'll check for 
error with ferror */
+
+   }


This format is not descriptive. How about serializing to json or
something? Or at least having field names?

Alternatively, build the same tuple we build for the SRF, and serialize
that. Then there's basically no conversion needed.


@@ -117,6 +157,8 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate 
*tupstore,

 Datum
 pg_get_backend_memory_contexts(PG_FUNCTION_ARGS)
 {
+   int pid =  PG_GETARG_INT32(0);
+
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
TupleDesc   tupdesc;
Tuplestorestate *tupstore;
@@ -147,11 +189,258 @@ 
pg_get_backend_memory_contexts(PG_FUNCTION_ARGS)


MemoryContextSwitchTo(oldcontext);

-   PutMemoryContextsStatsTupleStore(tupstore, tupdesc,
-   
TopMemoryContext, NULL, 0);
+   if (pid == -1)
+   {
+   /*
+* Since pid -1 indicates target is the local process, simply
+* traverse memory contexts.
+*/
+   PutMemoryContextsStatsTupleStore(tupstore, tupdesc,
+   TopMemoryContext, 
"", 0, NULL);
+   }
+   else
+   {
+   /*
+* Send signal for dumping memory contexts to the target 
process,
+* and read the dumped file.
+*/
+   FILE   *fpin;
+   chardumpfile[MAXPGPATH];
+
+   SendProcSignal(pid, PROCSIG_DUMP_MEMORY, InvalidBackendId);
+
+   snprintf(dumpfile, sizeof(dumpfile), "pg_memusage/%d", pid);
+
+   while (true)
+   {
+   CHECK_FOR_INTERRUPTS();
+
+   pg_usleep(1L);
+


Need better signalling back/forth here.


Do you mean I should also send another signal from the dumped
process to the caller of the pg_get_backend_memory_contexts()
when it finishes dumping?
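
Or, instead of another signal, maybe a shared flag plus a condition
variable could replace the polling loop. Just a sketch -- memCxtDumpState
and the wait event name are hypothetical:

/* requester side: wait for the target process to finish dumping */
ConditionVariablePrepareToSleep(&memCxtDumpState->cv);
while (!memCxtDumpState->dumped)
    ConditionVariableSleep(&memCxtDumpState->cv,
                           WAIT_EVENT_MEMORY_CONTEXT_DUMP);
ConditionVariableCancelSleep();

/* target side: after writing the dump file */
memCxtDumpState->dumped = true;
ConditionVariableBroadcast(&memCxtDumpState->cv);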

Regards,



--
Atsushi Torikoshi
NTT DATA CORPORATION






+/*
+ * dump_memory_contexts
+ * Dumping local memory contexts to a file.
+ * 	This function does not delete the file as it is intended to be 

Re: Get memory contexts of an arbitrary backend process

2020-09-03 Thread torikoshia

On 2020-09-01 03:29, Pavel Stehule wrote:

Hi

On Mon, Aug 31, 2020 at 5:03 PM Kasahara Tatsuhito
 wrote:


Hi,

On Mon, Aug 31, 2020 at 8:22 PM torikoshia
 wrote:

As discussed in the thread[1], it'll be useful to make it
possible to get the memory contexts of an arbitrary backend
process.

+1


Attached PoC patch makes pg_get_backend_memory_contexts()
display memory contexts of the specified PID of the process.

Thanks, it's a very good patch for discussion.


It doesn't display contexts of all the backends but only
the contexts of specified process.

or we can  "SELECT (pg_get_backend_memory_contexts(pid)).* FROM
pg_stat_activity WHERE ...",
so I don't think it's a big deal.


The rough idea of implementation is like below:

1. send  a signal to the specified process
2. signaled process dumps its memory contexts to a file
3. read the dumped file and display it to the user

I agree with the overview of the idea.
Here are some comments and questions.


Thanks for the comments!



- Currently, "the signal transmission for dumping memory
information"
and "the read & output of dump information "
are on the same interface, but I think it would be better to
separate them.
How about providing the following three types of functions for
users?
- send a signal to specified pid
- check the status of the signal sent and received
- read the dumped information


Is this for future extensibility to make it possible to get
other information like the current execution plan which was
suggested by Pavel?

If so, I agree with considering extensibility, but I'm not
sure whether it's necessary to provide these types of
functions to 'users'.


- How about managing the status of signal send/receive and dump
operations on a shared hash or others ?
Sending and receiving signals, dumping memory information, and
referencing dump information all work asynchronously.
Therefore, it would be good to have management information to
check
the status of each process.
A simple idea is that ..
- send a signal to dump to a PID, it first record following
information into the shared hash.
pid (specified pid)
loc (dump location, currently might be ASAP)
recv (did the pid process receive a signal? first false)
dumped (did the pid process dump a mem information?  first
false)
- specified process receive the signal, update the status in the
shared hash, then dumped at specified location.
- specified process finish dump mem information,  update the
status
in the shared hash.


Adding management information in shared memory seems necessary
when we want to have more control over dumping, like the 'dump
location' or any other information such as the 'current execution
plan'.
I'm going to consider this.



- Does it allow one process to output multiple dump files?
It appears to be a specification to overwrite at present, but I
thought it would be good to be able to generate
multiple dump files in different phases (e.g., planning phase and
execution phase) in the future.
- How is the dump file cleaned up?


For a very long time there has been similar discussion about taking
the session query and session execution plans from other sessions.

I am not sure how necessary the information in a memory dump is, but I
am sure that the current execution plan and the complete text of the
current query are pretty necessary information.

But it would be great if this infrastructure could be used for any
debugging purpose.


Thanks!
It would be good if some part of this effort can be an infrastructure
of other debugging.
It may be hard, but I will keep your comment in mind.


Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION



Regards

Pavel


Best regards,

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com





Re: A micro-optimisation for walkdir()

2020-09-03 Thread Thomas Munro
On Thu, Sep 3, 2020 at 5:36 PM Tom Lane  wrote:
> [request for better comments]

Ack.

> Both of these concerns would abate if we had get_dirent_type()
> just throw an error itself when stat() fails, thereby removing the
> PGFILETYPE_ERROR result code.  I'm not 100% sold either way on
> that, but it's something to think about.  Is there ever going
> to be a reason for the caller to ignore an error?

Hmm.  Well I had it like that in an earlier version, but then I
couldn't figure out the right way to write code that would work in
both frontend and backend code, without writing two copies in two
translation units, or putting the whole thing in a header.  What
approach do you prefer?
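
For reference, what I had looked roughly like this (a sketch only,
following the #ifdef FRONTEND pattern used elsewhere in src/common, not
necessarily what the final patch should do):

/* in get_dirent_type(), if it is to report stat() failures itself */
if (stat(path, &statbuf) < 0)
{
#ifndef FRONTEND
    ereport(ERROR,
            (errcode_for_file_access(),
             errmsg("could not stat file \"%s\": %m", path)));
#else
    pg_log_error("could not stat file \"%s\": %m", path);
    exit(EXIT_FAILURE);
#endif
}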