Re: TRUNCATE on foreign table

2021-04-04 Thread Bharath Rupireddy
On Sun, Apr 4, 2021 at 10:23 PM Kazutaka Onishi  wrote:
> Sure. I've replaced the test command "SELECT * FROM ..." with
> "SELECT COUNT(*) FROM ...".
> However, for example, the "id" column is used to check the result after
> running TRUNCATE with the ONLY clause on the inherited table.
> Thus, I use "sum(id)" instead of "count(*)" to check the result when
> the table has records.

I still don't understand why we need sum(id), not count(*). Am I
missing something here?

Here are some more comments on the v12 patch:
1) Instead of a switch case, a simple if-else clause would reduce the code a bit:
if (behavior == DROP_RESTRICT)
appendStringInfoString(buf, " RESTRICT");
else if (behavior == DROP_CASCADE)
appendStringInfoString(buf, " CASCADE");

2) Some coding style comments:
It's better to have a new line after variable declarations,
assignments, and function calls, and before and after if blocks, for
better readability of the code.
+appendStringInfoString(buf, "TRUNCATE ");   ---> new line after this
+forboth (lc1, frels_list,

+} ---> new line after this
+appendStringInfo(buf, " %s IDENTITY",

+/* ensure the target foreign table is truncatable */
+truncatable = server_truncatable;---> new line after this
+foreach (cell, ft->options)

+}---> new line after this
+if (!truncatable)

+}---> new line after this
+/* set up remote query */
+initStringInfo(&sql);
+deparseTruncateSql(&sql, frels_list, frels_extra, behavior, restart_seqs);---> new line after this
+/* run remote query */
+if (!PQsendQuery(conn, sql.data))
+pgfdw_report_error(ERROR, NULL, conn, false, sql.data);
---> new line after this
+res = pgfdw_get_result(conn, sql.data);---> new line after this
+if (PQresultStatus(res) != PGRES_COMMAND_OK)
+pgfdw_report_error(ERROR, res, conn, true, sql.data);---> new line after this
+/* clean-up */
+PQclear(res);
+pfree(sql.data);
+}

and so on.

Add a space after "false," (i.e., before "NULL"):
+conn = GetConnection(user, false,NULL);

align lc2, frels_extra with lc1, frels_list:
+forboth (lc1, frels_list,
+ lc2, frels_extra)

3) I think we need truncatable behaviour that is consistent with updatable.
With your patch, it seems like the below is the behaviour for truncatable:
both server and foreign table are truncatable = truncated
server is not truncatable and foreign table is truncatable = not
truncated and error out
server is truncatable and foreign table is not truncatable = not
truncated and error out
server is not truncatable and foreign table is not truncatable = not
truncated and error out

Below is the behaviour for updatable:
both server and foreign table are updatable = updated
server is not updatable and foreign table is updatable = updated
server is updatable and foreign table is not updatable = not updated
server is not updatable and foreign table is not updatable = not updated

And also see comment in postgresIsForeignRelUpdatable
/*
 * By default, all postgres_fdw foreign tables are assumed updatable. This
 * can be overridden by a per-server setting, which in turn can be
 * overridden by a per-table setting.
 */

IMO, you could do the same thing for truncatable; this change is required
in your patch (see the sketch after the list):
both server and foreign table are truncatable = truncated
server is not truncatable and foreign table is truncatable = truncated
server is truncatable and foreign table is not truncatable = not
truncated and error out
server is not truncatable and foreign table is not truncatable = not
truncated and error out
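
To illustrate, here's a minimal sketch of that default-then-override
rule, modeled on postgresIsForeignRelUpdatable (the function name and
placement are illustrative, not the patch's actual code):

#include "postgres.h"
#include "commands/defrem.h"
#include "foreign/foreign.h"

static bool
foreign_rel_truncatable(ForeignServer *server, ForeignTable *table)
{
	/* By default, assume the foreign table is truncatable. */
	bool		truncatable = true;
	ListCell   *lc;

	/* A per-server setting overrides the default ... */
	foreach(lc, server->options)
	{
		DefElem    *def = (DefElem *) lfirst(lc);

		if (strcmp(def->defname, "truncatable") == 0)
			truncatable = defGetBoolean(def);
	}

	/* ... and a per-table setting overrides the per-server one. */
	foreach(lc, table->options)
	{
		DefElem    *def = (DefElem *) lfirst(lc);

		if (strcmp(def->defname, "truncatable") == 0)
			truncatable = defGetBoolean(def);
	}

	return truncatable;
}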

4) GetConnection needs to be done after all the error checking code;
otherwise, on error, we would have opened a new connection without
actually using it. Just move the below code outside of the for loop in
postgresExecForeignTruncate
+user = GetUserMapping(GetUserId(), server_id);
+conn = GetConnection(user, false,NULL);
to here:
+Assert(OidIsValid(server_id));
+
+/* get a connection to the server */
+user = GetUserMapping(GetUserId(), server_id);
+conn = GetConnection(user, false, NULL);
+
+/* set up remote query */
+initStringInfo(&sql);
+deparseTruncateSql(&sql, frels_list, frels_extra, behavior, restart_seqs);
+/* run remote query */
+if (!PQsendQuery(conn, sql.data))
+pgfdw_report_error(ERROR, NULL, conn, false, sql.data);
+res = pgfdw_get_result(conn, sql.data);
+if (PQresultStatus(res) != PGRES_COMMAND_OK)
+pgfdw_report_error(ERROR, res, conn, true, sql.data);

5) This assertion is bogus, because GetForeignServerIdByRelId will
return a valid server id; otherwise, it fails with a "cache lookup"
error. So please remove it.
+else
+{
+/* postgresExecForeignTruncate() is invoked for each server */
+Assert(server_id == GetForeignServerIdByRelId(frel_oid));
+}

6) You can add a comment here 

Re: Proposal: Save user's original authenticated identity for logging

2021-04-04 Thread Michael Paquier
On Sat, Apr 03, 2021 at 09:30:25PM +0900, Michael Paquier wrote:
> Slight rebase for this one to take care of the updates with the SSL
> error messages.

I have been looking again at that and applied it as c50624cd after
some slight modifications.  Attached is the main, refactored, patch
that plugs on top of the existing infrastructure.  connect_ok() and
connect_fails() gain two parameters each to match or to not match the
logs of the backend, with a truncation of the logs done before any
connection attempt.

I have spent more time reviewing the backend code while on it and
there was one thing that stood out:
+   ereport(FATAL,
+   (errmsg("connection was re-authenticated"),
+errdetail_log("previous ID: \"%s\"; new ID: \"%s\"",
+  port->authn_id, id)));
This message would not actually trigger because auth_failed() is the
code path in charge of showing an error here, so this could just be
replaced by an assertion on authn_id being NULL?  The contents of this
log were a bit in contradiction with the comments a couple of lines
above anyway.  Jacob, what do you think?
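
For reference, a minimal sketch of the suggested replacement, assuming
port and the new identity string are in scope at that point:

	/* Authentication can only succeed once per connection attempt. */
	Assert(port->authn_id == NULL);
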
--
Michael
From d8df487fb85ddd1a6ea2f9d7d5f30b72462117ea Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 5 Apr 2021 14:46:21 +0900
Subject: [PATCH v20] Log authenticated identity from all auth backends

The "authenticated identity" is the string used by an auth method to
identify a particular user. In many common cases this is the same as the
Postgres username, but for some third-party auth methods, the identifier
in use may be shortened or otherwise translated (e.g. through pg_ident
mappings) before the server stores it.

To help DBAs see who has actually interacted with the system, store the
original identity when authentication succeeds, and expose it via the
log_connections setting. The log entries look something like this
example (where a local user named "pchampion" is connecting to the
database as the "admin" user):

LOG:  connection received: host=[local]
LOG:  connection authenticated: identity="pchampion" method=peer (/data/pg_hba.conf:88)
LOG:  connection authorized: user=admin database=postgres application_name=psql

port->authn_id is set according to the auth method:

bsd: the Postgres username (which is the local username)
cert: the client's Subject DN
gss: the user principal
ident: the remote username
ldap: the final bind DN
pam: the Postgres username (which is the PAM username)
password (and all pw-challenge methods): the Postgres username
peer: the peer's pw_name
radius: the Postgres username (which is the RADIUS username)
sspi: either the down-level (SAM-compatible) logon name, if
  compat_realm=1, or the User Principal Name if compat_realm=0

The trust auth method does not set an authenticated identity. Neither
does using clientcert=verify-full (use the cert auth method instead).

PostgresNode::connect_ok/fails() have been modified to let tests check
the logfiles for required or prohibited patterns, using the respective
log_like and log_unlike parameters.
---
 src/include/libpq/hba.h   |   1 +
 src/include/libpq/libpq-be.h  |  13 +++
 src/backend/libpq/auth.c  | 122 --
 src/backend/libpq/hba.c   |  24 +
 src/test/authentication/t/001_password.pl |  59 +++
 src/test/kerberos/t/001_auth.pl   |  73 +
 src/test/ldap/t/001_auth.pl   |  29 +++--
 src/test/perl/PostgresNode.pm |  72 +
 src/test/ssl/t/001_ssltests.pl|  36 +--
 src/test/ssl/t/002_scram.pl   |  10 +-
 doc/src/sgml/config.sgml  |   3 +-
 11 files changed, 374 insertions(+), 68 deletions(-)

diff --git a/src/include/libpq/hba.h b/src/include/libpq/hba.h
index 1ec8603da7..63f2962139 100644
--- a/src/include/libpq/hba.h
+++ b/src/include/libpq/hba.h
@@ -137,6 +137,7 @@ typedef struct Port hbaPort;
 
 extern bool load_hba(void);
 extern bool load_ident(void);
+extern const char *hba_authname(hbaPort *port);
 extern void hba_getauthmethod(hbaPort *port);
 extern int	check_usermap(const char *usermap_name,
 		  const char *pg_role, const char *auth_user,
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 713c34fedd..02015efe13 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -159,6 +159,19 @@ typedef struct Port
 	 */
 	HbaLine*hba;
 
+	/*
+	 * Authenticated identity.  The meaning of this identifier is dependent on
+	 * hba->auth_method; it is the identity (if any) that the user presented
+	 * during the authentication cycle, before they were assigned a database
+	 * role.  (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+	 * -- though the exact string in use may be different, depending on pg_hba
+	 * options.)
+	 *
+	 * authn_id is NULL if the user has not actually been 

Re: badly calculated width of emoji in psql

2021-04-04 Thread Kyotaro Horiguchi
At Fri, 2 Apr 2021 11:51:26 +0200, Pavel Stehule  
wrote in 
> with this patch, the formatting is correct

I think the hardest point of this issue is that we don't have a
reasonable authoritative source that determines character width. And
that the presentation is heavily dependent on environment.

Unicode 9 and/or 10 defines the character properties "Emoji" and
"Emoji_Presentation", and tr51[1] says that

> Emoji are generally presented with a square aspect ratio, which
> presents a problem for flags.
...
> Current practice is for emoji to have a square aspect ratio, deriving
> from their origin in Japanese. For interoperability, it is recommended
> that this practice be continued with current and future emoji. They
> will typically have about the same vertical placement and advance
> width as CJK ideographs. For example:

Ok, even putting aside flags, the first table in [2] asserts that "#",
"*", "0-9" are emoji characters.  But we -- and, I think, everyone
else -- never present them in two columns.  And the table has many
mysterious holes I haven't looked into.

We could use Emoji_Presentation=yes for the purpose, but for example,
U+23E9 (BLACK RIGHT-POINTING DOUBLE TRIANGLE) has the property
Emoji_Presentation=yes while U+23ED (BLACK RIGHT-POINTING DOUBLE TRIANGLE
WITH VERTICAL BAR) does not, for a reason uncertain to me.  It looks
like nothing other than some kind of mistake.

About environment, for example, U+23E9 is an emoji with
Emoji_Presentation=yes, but it is shown in one column on my
xterm. (I'm not sure what font I'm using..)

[1] http://www.unicode.org/reports/tr51/
[2] https://unicode.org/Public/13.0.0/ucd/emoji/emoji-data.txt

A possible compromise is that we treat all Emoji=yes characters
excluding ASCII characters as double-width and manually merge the
fragmented regions into reasonably larger chunks.
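
For what it's worth, such merged regions would slot into the same shape
of lookup that src/common/wchar.c already uses for character widths; a
minimal sketch, with a made-up two-entry table standing in for the real
merged Emoji ranges:

#include <stdbool.h>

typedef struct
{
	unsigned int first;
	unsigned int last;
} codepoint_interval;

/* Hypothetical pre-merged Emoji=yes ranges, excluding ASCII. */
static const codepoint_interval emoji_wide[] = {
	{0x231A, 0x231B},
	{0x23E9, 0x23F3}
};

/*
 * Binary search: does ucs fall inside one of the sorted intervals?
 * e.g. ucs_in_intervals(0x23E9, emoji_wide, 2) returns true.
 */
static bool
ucs_in_intervals(unsigned int ucs,
				 const codepoint_interval *table, int size)
{
	int			min = 0;
	int			max = size - 1;

	if (ucs < table[min].first || ucs > table[max].last)
		return false;

	while (max >= min)
	{
		int			mid = (min + max) / 2;

		if (ucs > table[mid].last)
			min = mid + 1;
		else if (ucs < table[mid].first)
			max = mid - 1;
		else
			return true;
	}

	return false;
}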

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: New Table Access Methods for Multi and Single Inserts

2021-04-04 Thread Bharath Rupireddy
On Wed, Mar 10, 2021 at 10:21 AM Bharath Rupireddy
 wrote:
> Attaching the v4 patch set. Please review it further.

Attaching v5 patch set after rebasing onto the latest master. Please
review it further.
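
For reviewers skimming the patch below, the intended call sequence for
the single-insert path looks roughly like this (a sketch assuming rel
and slot are in scope; heap_insert_end() is assumed to be the matching
cleanup call in the patch set, it is not visible in the excerpt):

	TableInsertState *istate;

	/* is_multi = false, so no multi-insert buffering state is set up */
	istate = heap_insert_begin(rel, GetCurrentCommandId(true), 0, false);
	heap_insert_v2(istate, slot);	/* inserts one tuple per call */
	heap_insert_end(istate);		/* assumed counterpart freeing the state */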


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From 6518212583e24b017375512701d9fefa6de20e42 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 10 Mar 2021 09:53:48 +0530
Subject: [PATCH v5 1/3] New Table AMs for Multi and Single Inserts

This patch introduces new table access methods for multi and
single inserts. It also implements/rearranges the existing outside
code for heap AM into these new APIs.

The main design goal of these new APIs is to give flexibility to
tableam developers in implementing multi insert logic dependent on
the underlying storage engine. Currently, for all the underlying
storage engines, we follow the same multi insert logic (when and
how to flush the buffered tuples, tuple size calculation), and that
logic doesn't take the underlying storage engine's capabilities
into account.

We can also avoid duplicating multi insert code (for existing COPY,
and upcoming CTAS, CREATE/REFRESH MAT VIEW and INSERT SELECTs). We
can also move bulk insert state allocation and deallocation inside
these APIs.
---
 src/backend/access/heap/heapam.c | 212 +++
 src/backend/access/heap/heapam_handler.c |   5 +
 src/backend/access/table/tableamapi.c|   7 +
 src/backend/executor/execTuples.c|  83 -
 src/include/access/heapam.h  |  49 +-
 src/include/access/tableam.h |  87 ++
 src/include/executor/tuptable.h  |   1 +
 7 files changed, 438 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 3b435c107d..d8bfe17f22 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -67,6 +67,7 @@
 #include "utils/datum.h"
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
+#include "utils/memutils.h"
 #include "utils/relcache.h"
 #include "utils/snapmgr.h"
 #include "utils/spccache.h"
@@ -2669,6 +2670,217 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	pgstat_count_heap_insert(relation, ntuples);
 }
 
+/*
+ * heap_insert_begin - allocate and initialize TableInsertState
+ *
+ * For single inserts:
+ *  1) Specify is_multi as false, then multi insert state will be NULL.
+ *
+ * For multi inserts:
+ *  1) Specify is_multi as true, then multi insert state will be allocated and
+ * 	   initialized.
+ *
+ *  Other input parameters i.e. relation, command id, options are common for
+ *  both single and multi inserts.
+ */
+TableInsertState*
+heap_insert_begin(Relation rel, CommandId cid, int options, bool is_multi)
+{
+	TableInsertState *state;
+
+	state = palloc(sizeof(TableInsertState));
+	state->rel = rel;
+	state->cid = cid;
+	state->options = options;
+	/* Below parameters are not used for single inserts. */
+	state->mi_slots = NULL;
+	state->mistate = NULL;
+	state->mi_cur_slots = 0;
+	state->flushed = false;
+
+	if (is_multi)
+	{
+		HeapMultiInsertState *mistate;
+
+		mistate = palloc(sizeof(HeapMultiInsertState));
+		state->mi_slots =
+palloc0(sizeof(TupleTableSlot *) * MAX_BUFFERED_TUPLES);
+		mistate->max_slots = MAX_BUFFERED_TUPLES;
+		mistate->max_size = MAX_BUFFERED_BYTES;
+		mistate->cur_size = 0;
+		/*
+		 * Create a temporary memory context so that we can reset once per
+		 * multi insert batch.
+		 */
+		mistate->context = AllocSetContextCreate(CurrentMemoryContext,
+ "heap_multi_insert",
+ ALLOCSET_DEFAULT_SIZES);
+		state->mistate = mistate;
+	}
+
+	return state;
+}
+
+/*
+ * heap_insert_v2 - insert single tuple into a heap
+ *
+ * Insert tuple from slot into table. This is like heap_insert(), the only
+ * difference is that the parameters for insertion are inside table insert
+ * state structure.
+ */
+void
+heap_insert_v2(TableInsertState *state, TupleTableSlot *slot)
+{
+	bool		shouldFree = true;
+	HeapTuple	tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
+
+	Assert(state);
+
+	/* Update tuple with table oid */
+	slot->tts_tableOid = RelationGetRelid(state->rel);
+	tuple->t_tableOid = slot->tts_tableOid;
+
+	/* Perform insertion, and copy the resulting ItemPointer */
+	heap_insert(state->rel, tuple, state->cid, state->options, state->bistate);
+	ItemPointerCopy(&tuple->t_self, &slot->tts_tid);
+
+	if (shouldFree)
+		pfree(tuple);
+}
+
+/*
+ * heap_multi_insert_v2 - insert multiple tuples into a heap
+ *
+ * Compute size of tuple. See if the buffered slots can hold the tuple. If yes,
+ * store it in the buffers, otherwise flush i.e. insert the so far buffered
+ * tuples into heap.
+ *
+ * Flush can happen:
+ *  1) either if all the buffered slots are filled up
+ *  2) or if total tuple size of the currently buffered slots are >= max_size
+ */
+void
+heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
+{
+	TupleTableSlot  *batchslot;

Re: Can we remove extra memset in BloomInitPage, GinInitPage and SpGistInitPage when we have it in PageInit?

2021-04-04 Thread Bharath Rupireddy
On Sat, Apr 3, 2021 at 3:09 PM vignesh C  wrote:
>
> On Mon, Mar 22, 2021 at 10:16 AM Bharath Rupireddy
>  wrote:
> >
> > Hi,
> >
> > We are memset-ting the page's special space that's already set to zeros
> > by PageInit in BloomInitPage, GinInitPage and SpGistInitPage. We have
> > already removed the memset after PageInit in gistinitpage (see the
> > comment there).  Unless I'm missing something, IMO they are redundant.
> > I'm attaching a small patch that gets rid of the extra memset calls.
> >
> > While on it, I removed MAXALIGN(sizeof(SpGistPageOpaqueData)) in
> > SpGistInitPage because PageInit will anyway align the
> > specialSize. This change is in line with other places (such as
> > BloomInitPage, brin_page_init, GinInitPage, gistinitpage,
> > _hash_pageinit and so on) where we just pass the size of special space
> > data structure.
> >
> > I didn't see any regression test failure on my dev system with the
> > attached patch.
> >
> > Thoughts?
>
> The changes look fine to me.

Thanks!

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Get memory contexts of an arbitrary backend process

2021-04-04 Thread Fujii Masao




On 2021/04/05 12:20, Zhihong Yu wrote:

+        * This is just a warning so a loop-through-resultset will not abort
+        * if one backend logged its memory contexts during the run.

The pid given by arg 0 is not a PostgreSQL server process. Which other backend 
could it be ?


This is the comment that I added wrongly. So the comment should be
"This is just a warning so a loop-through-resultset will not abort
if one backend terminated on its own during the run.",
as in pg_signal_backend(). Thoughts?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Stronger safeguard for archive recovery not to miss data

2021-04-04 Thread Fujii Masao



On 2021/04/04 11:58, osumi.takami...@fujitsu.com wrote:

IMO it's better to comment why this server restart is necessary.
As far as I understand correctly, this is necessary to ensure the WAL file
containing the record about the change of wal_level (to minimal) is archived,
so that the subsequent archive recovery will be able to replay it.

OK, added some comments. Further, I felt the way I wrote this part was not
good at all, and developers who read this test would feel uneasy about that
point. So I fixed the test a little so that we can be more confident that
the WAL is archived.


LGTM. Thanks for updating the patch!

Attached is the updated version of the patch. I applied the following changes.
Could you review this version? Barring any objection, I'm thinking to
commit this.

+sub check_wal_level
+{
+   my ($target_wal_level, $explanation) = @_;
+
+   is( $node->safe_psql(
+   'postgres', q{show wal_level}),
+$target_wal_level,
+$explanation);

Do we really need this test? This test doesn't seem to be directly related
to the test purpose. It seems to be testing the behavior of the PostgresNode
functions. So I removed this from the patch.

+# This protection should apply to recovery mode
+my $another_node = get_new_node('another_node');

The same test is performed twice with different settings. So isn't it better
to merge the code for both tests into one common function, to simplify
the code? I did that.

I also applied some cosmetic changes.



By the way, when I build postgres with this patch and the --enable-coverage
option, the results of the regression tests become unstable. Does someone
know the reason?

When it fails, I get stderr like below


I have no idea about this. Does this happen even without the patch?

Unfortunately, no. I get this only with --enable-coverage and with my patch,
although the regression tests have passed with this patch.
OSS HEAD doesn't produce the stderr even with --enable-coverage.


Could you check whether the latest patch still causes this issue or not?
If it still causes, could you check which part (the change of xlog.c or
the addition of regression test) caused the issue?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6f8810e149..27d9ec9f91 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6403,9 +6403,11 @@ CheckRequiredParameterValues(void)
 	 */
 	if (ArchiveRecoveryRequested && ControlFile->wal_level == WAL_LEVEL_MINIMAL)
 	{
-		ereport(WARNING,
-				(errmsg("WAL was generated with wal_level=minimal, data may be missing"),
-				 errhint("This happens if you temporarily set wal_level=minimal without taking a new base backup.")));
+		ereport(FATAL,
+				(errmsg("WAL was generated with wal_level=minimal, cannot continue recovering"),
+				 errdetail("This happens if you temporarily set wal_level=minimal on the server."),
+				 errhint("Use a backup taken after setting wal_level to higher than minimal "
+						 "or recover to the point in time before wal_level was changed to minimal even though it may cause data loss.")));
 	}
 
 	/*
@@ -6414,11 +6416,6 @@ CheckRequiredParameterValues(void)
 	 */
 	if (ArchiveRecoveryRequested && EnableHotStandby)
 	{
-		if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
-			ereport(ERROR,
-					(errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
-					 errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
-
 		/* We ignore autovacuum_max_workers when we make this test. */
 		RecoveryRequiresIntParameter("max_connections",
 									 MaxConnections,
diff --git a/src/test/recovery/t/024_archive_recovery.pl b/src/test/recovery/t/024_archive_recovery.pl
new file mode 100644
index 0000000000..a00314ddc6
--- /dev/null
+++ b/src/test/recovery/t/024_archive_recovery.pl
@@ -0,0 +1,95 @@
+# Test for archive recovery of WAL generated with wal_level=minimal
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 2;
+use Time::HiRes qw(usleep);
+
+# Initialize and start node with wal_level = replica and WAL archiving
+# enabled.
+my $node = get_new_node('orig');
+$node->init(has_archiving => 1);
+my $replica_config = q[
+wal_level = replica
+archive_mode = on
+max_wal_senders = 10
+hot_standby = off
+];

Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation

2021-04-04 Thread Bharath Rupireddy
On Fri, Mar 26, 2021 at 9:25 AM Bharath Rupireddy
 wrote:
> Here's the v3 patch rebased on the latest master.

Here's the v4 patch rebased on the latest master, please review it further.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v4-0001-Improve-error-message-while-adding-tables-to-publ.patch
Description: Binary data


Re: A new function to wait for the backend exit after termination

2021-04-04 Thread Bharath Rupireddy
On Fri, Mar 19, 2021 at 11:37 AM Bharath Rupireddy
 wrote:
> Attaching v11 patch that removed the wait boolean flag in the
> pg_terminate_backend and timeout 0 indicates "no wait", negative value
> "errors out", positive value "waits for those many milliseconds". Also
> addressed other review comments that I received upthread. Please
> review v11 further.
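
For readers catching up on the thread, the timeout contract quoted above
maps to roughly this shape (an illustrative sketch with hypothetical
helper names, not the patch's actual code):

static bool
terminate_backend_with_timeout(int pid, int64 timeout_ms)
{
	if (timeout_ms < 0)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				 errmsg("\"timeout\" must not be negative")));

	/* signal_backend() is a hypothetical stand-in for sending SIGTERM */
	if (!signal_backend(pid))
		return false;

	if (timeout_ms == 0)
		return true;			/* signalled; do not wait */

	/* positive timeout: wait up to timeout_ms for the backend to exit */
	return wait_for_backend_exit(pid, timeout_ms);	/* hypothetical */
}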

Attaching v12 patch after rebasing onto the latest master.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v12-0001-pg_terminate_backend-with-wait-and-timeout.patch
Description: Binary data


Re: Get memory contexts of an arbitrary backend process

2021-04-04 Thread Zhihong Yu
On Sun, Apr 4, 2021 at 7:56 PM torikoshia 
wrote:

> On 2021-04-01 19:13, Fujii Masao wrote:
> > On 2021/03/31 15:16, Kyotaro Horiguchi wrote:
> >>> + The memory contexts will be logged based on the log configuration
> >>> set. For example:
> >>>
> >>> What do you think?
> >>
> >> How about "The memory contexts will be logged in the server log" ?
> >> I think "server log" doesn't suggest any concrete target.
> >
> > Or just using "logged" is enough?
> >
> > Also I'd like to document that one message for each memory context is
> > logged.
> > So what about the following?
> >
> > One message for each memory context will be logged. For example,
>
>
> Agreed.
>
> BTW, there was a conflict since c30f54ad732(Detect POLLHUP/POLLRDHUP
> while
> running queries), attached v9.
>
>
> Regards,


Hi,

+ * On receipt of this signal, a backend sets the flag in the signal
+ * handler, and then which causes the next CHECK_FOR_INTERRUPTS()

I think the 'and then' is not needed:

  handler which causes the next ...

+* This is just a warning so a loop-through-resultset will not abort
+* if one backend logged its memory contexts during the run.

The pid given by arg 0 is not a PostgreSQL server process. Which other
backend could it be ?

Thanks


Re: Get memory contexts of an arbitrary backend process

2021-04-04 Thread torikoshia

On 2021-04-01 19:13, Fujii Masao wrote:

On 2021/03/31 15:16, Kyotaro Horiguchi wrote:

+ The memory contexts will be logged based on the log configuration
set. For example:

What do you think?


How about "The memory contexts will be logged in the server log" ?
I think "server log" doesn't suggest any concrete target.


Or just using "logged" is enough?

Also I'd like to document that one message for each memory context is 
logged.

So what about the following?

One message for each memory context will be logged. For example,



Agreed.

BTW, there was a conflict since c30f54ad732(Detect POLLHUP/POLLRDHUP 
while

running queries), attached v9.


Regards,

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 3cf243a16a..a20be435ca 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24913,6 +24913,23 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backend_memory_contexts
+
+pg_log_backend_memory_contexts ( pid integer )
+boolean
+   
+   
+Logs the memory contexts whose backend process has the specified
+process ID.
+Memory contexts will be logged based on the log configuration set.
+See  for more information.
+Only superusers can log the memory contexts.
+   
+  
+
   

 
@@ -24983,6 +25000,36 @@ SELECT collation for ('foo' COLLATE "de_DE");
 pg_stat_activity view.

 
+   
+pg_log_backend_memory_contexts can be used
+to log the memory contexts of the backend process. For example,
+
+postgres=# SELECT pg_log_backend_memory_contexts(pg_backend_pid());
+ pg_log_backend_memory_contexts 
+--------------------------------
+ t
+(1 row)
+
+One message for each memory context will be logged. For example:
+
+LOG:  logging memory contexts of PID 10377
+STATEMENT:  SELECT pg_log_backend_memory_contexts(pg_backend_pid());
+LOG:  level: 0; TopMemoryContext: 80800 total in 6 blocks; 14432 free (5 chunks); 66368 used
+LOG:  level: 1; pgstat TabStatusArray lookup hash table: 8192 total in 1 blocks; 1408 free (0 chunks); 6784 used
+LOG:  level: 1; TopTransactionContext: 8192 total in 1 blocks; 7720 free (1 chunks); 472 used
+LOG:  level: 1; RowDescriptionContext: 8192 total in 1 blocks; 6880 free (0 chunks); 1312 used
+LOG:  level: 1; MessageContext: 16384 total in 2 blocks; 5152 free (0 chunks); 11232 used
+LOG:  level: 1; Operator class cache: 8192 total in 1 blocks; 512 free (0 chunks); 7680 used
+LOG:  level: 1; smgr relation table: 16384 total in 2 blocks; 4544 free (3 chunks); 11840 used
+LOG:  level: 1; TransactionAbortContext: 32768 total in 1 blocks; 32504 free (0 chunks); 264 used
+...
+LOG:  level: 1; ErrorContext: 8192 total in 1 blocks; 7928 free (3 chunks); 264 used
+LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560 used
+
+If there are more than 100 child contexts under the same parent,
+the first 100 child contexts and a summary of the remaining ones will be logged.
+   
+
   
 
   
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index c6a8d4611e..eac6895141 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -30,6 +30,7 @@
 #include "storage/shmem.h"
 #include "storage/sinval.h"
 #include "tcop/tcopprot.h"
+#include "utils/memutils.h"
 
 /*
  * The SIGUSR1 signal is multiplexed to support signaling multiple event
@@ -657,6 +658,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
 	if (CheckProcSignal(PROCSIG_BARRIER))
 		HandleProcSignalBarrierInterrupt();
 
+	if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
+		HandleLogMemoryContextInterrupt();
+
 	if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_DATABASE))
 		RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE);
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ad351e2fd1..330ec5b028 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3327,6 +3327,9 @@ ProcessInterrupts(void)
 
 	if (ParallelMessagePending)
 		HandleParallelMessages();
+
+	if (LogMemoryContextPending)
+		ProcessLogMemoryContextInterrupt();
 }
 
 
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index c02fa47550..fe9b7979e2 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -18,6 +18,8 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "mb/pg_wchar.h"
+#include "storage/proc.h"
+#include "storage/procarray.h"
 #include "utils/builtins.h"
 
 /* --
@@ -61,7 +63,7 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
 
 	/* Examine the context itself */
 	memset(&stat, 0, sizeof(stat));
-	(*context->methods->stats) (context, NULL, (void *) &level, &stat);
+	(*context->methods->stats) (context, NULL, (void *) &level, &stat, true);
 
 	memset(values, 0, sizeof(values));
 	memset(nulls, 0, sizeof(nulls));
@@ -155,3 +157,59 @@ 

Re: [PATCH] typo fix in collationcmds.c: "if they are distinct"

2021-04-04 Thread Michael Paquier
On Sun, Apr 04, 2021 at 03:49:35PM +0300, Anton Voloshin wrote:
> just a quick patch for a single-letter typo in a comment
> in src/backend/commands/collationcmds.c
> ...

Thanks, fixed.  This came from 51e225d.
--
Michael


signature.asc
Description: PGP signature


Re: Improve error matching patterns in the SSL tests

2021-04-04 Thread Michael Paquier
On Thu, Apr 01, 2021 at 11:59:15AM +0900, Michael Paquier wrote:
> Please find attached a patch to tighten a bit all that.  The errors
> produced by OpenSSL down to 1.0.1 are the same.  I have noticed one
> extra place where we just check for a FATAL, where the trust
> authentication failed after a CN mismatch.

Sorry for the late reply here.  This has been applied as of 8d3a4c3.
--
Michael


signature.asc
Description: PGP signature


Re: Autovacuum on partitioned table (autoanalyze)

2021-04-04 Thread Tomas Vondra



On 4/4/21 9:08 PM, Tomas Vondra wrote:
> On 4/3/21 9:42 PM, Alvaro Herrera wrote:
>> Thanks for the quick rework.  I like this design much better and I think
>> this is pretty close to committable.  Here's a rebased copy with some
>> small cleanups (most notably, avoid calling pgstat_propagate_changes
>> when the partition doesn't have a tabstat entry; also, free the lists
>> that are allocated in a couple of places).
>>
>> I didn't actually verify that it works.
>>> ...
> 
> 3) pgstat_recv_analyze
> 
> Shouldn't it propagate the counters before resetting them? I understand
> that for the just-analyzed relation we can't do better, but why not to
> propagate the counters to parents? (Not necessarily from this place in
> the stat collector, maybe the analyze process should do that.)
> 

FWIW the scenario I had in mind is something like this:

create table t (a int, b int) partition by hash (a);
create table p0 partition of t for values with (modulus 2, remainder 0);
create table p1 partition of t for values with (modulus 2, remainder 1);

insert into t select i, i from generate_series(1,100) s(i);

select relname, n_mod_since_analyze from pg_stat_user_tables;

test=# select relname, n_mod_since_analyze from pg_stat_user_tables;
 relname | n_mod_since_analyze
---------+---------------------
 t       |                   0
 p0      |              499375
 p1      |              500625
(3 rows)

test=# analyze p0, p1;
ANALYZE
test=# select relname, n_mod_since_analyze from pg_stat_user_tables;
 relname | n_mod_since_analyze
---------+---------------------
 t       |                   0
 p0      |                   0
 p1      |                   0
(3 rows)

This may seem a bit silly - who would analyze the hash partitions
directly? However, with other partitioning schemes (list, range) it's
quite plausible that people load data directly into partitions. They can
analyze the parent explicitly too, but with multi-level partitioning
that probably requires analyzing all the ancestors.

The other possible scenario is about rows inserted while p0/p1 are being
processed by autoanalyze. That may actually take quite a bit of time,
depending on vacuum cost limit. So I still think we should propagate the
delta after the analyze, before we reset the counters.


I also realized relation_needs_vacanalyze is not really doing what I
suggested - it propagates the counts, but does so in the existing loop
which checks which relations need vacuum/analyze.

That means we may skip the parent table in the *current* round, because
it'll see the old (not yet updated) counts. It's likely to be processed
in the next autovacuum round, but that may actually not happen. The
trouble is the reltuples for the parent is calculated using *current*
child reltuples values, but we're comparing it to the *old* value of
changes_since_analyze. So e.g. if enough rows were inserted into the
partitions, it may still be below the analyze threshold.

What I proposed is adding a separate loop that *only* propagates the
counts, and then re-read the current stats (perhaps only if we actually
propagated anything). And then decide which relations need analyze.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pg_amcheck contrib application

2021-04-04 Thread Mark Dilger


> On Apr 1, 2021, at 1:08 PM, Robert Haas  wrote:
> 
> 
> 
> - There are a total of two (2) calls in the current source code to
> palloc0fast, and hundreds of calls to palloc0. So I think you should
> forget about using the fast variant and just do what almost every
> other caller does.

Done.

> - If you want to make this code faster, a better idea would be to
> avoid doing all of this allocate and free work and just allocate an
> array that's guaranteed to be big enough, and then keep track of how
> many elements of that array are actually in use.

Sounds like premature optimization to me.  I only used palloc0fast because the 
argument is compile-time known.  I wasn't specifically attempting to speed 
anything up.

> - #ifdef DECOMPRESSION_CORRUPTION_CHECKING is not a useful way of
> introducing such a feature. Either we do it for real and expose it via
> SQL and pg_amcheck as an optional behavior, or we rip it out and
> revisit it later. Given the nearness of feature freeze, my vote is for
> the latter.
> 
> - I'd remove the USE_LZ4 bit, too. Let's not define the presence of
> LZ4 data in a non-LZ4-enabled cluster as corruption. If we do, then
> people will expect to be able to use this to find places where they
> are dependent on LZ4 if they want to move away from it -- and if we
> don't recurse into composite datums, that will not actually work.

Ok, I have removed this bit.  I also removed the part of the patch that 
introduced a new corruption check, decompressing the data to see if it 
decompresses without error.

> - check_toast_tuple() has an odd and slightly unclear calling
> convention for which there are no comments. I wonder if it would be
> better to reverse things and make bool *error the return value and
> what is now the return value into a pointer argument, but whatever we
> do I think it needs a few words in the comments. We don't need to
> slavishly explain every argument -- I think toasttup and ctx and tctx
> are reasonably clear -- but this is not.
...
> - Is there a reason we need a cross-check on both the number of chunks
> and on the total size? It seems to me that we should check that each
> individual chunk has the size we expect, and that the total number of
> chunks is what we expect. The overall size is then necessarily
> correct.

Good point.  I've removed the extra check on the total size, since it cannot be 
wrong if the checks on the individual chunk sizes were all correct.  This 
eliminates the need for the odd calling convention for check_toast_tuple(), so 
I've changed that to return void and not take any return-by-reference arguments.
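
For the archives, the invariant being relied on can be written down
compactly (a sketch, not the patch's code, using the usual
TOAST_MAX_CHUNK_SIZE chunking from access/toast_internals.h):

/*
 * Every chunk except the last must be exactly TOAST_MAX_CHUNK_SIZE
 * bytes, and the last must carry the remainder.  Once each chunk's size
 * and the total chunk count check out, the total size is necessarily
 * correct.
 */
static bool
toast_chunk_size_ok(int32 curchunk, int32 last_chunk,
					int32 chunksize, int32 extsize)
{
	int32		expected;

	if (curchunk < last_chunk)
		expected = TOAST_MAX_CHUNK_SIZE;
	else
		expected = extsize - last_chunk * TOAST_MAX_CHUNK_SIZE;

	return chunksize == expected;
}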

> - To me it would be more logical to reverse the order of the
> toast_pointer.va_toastrelid != ctx->rel->rd_rel->reltoastrelid and
> VARATT_EXTERNAL_GET_EXTSIZE(toast_pointer) > toast_pointer.va_rawsize
> - VARHDRSZ checks. Whether we're pointing at the correct relation
> feels more fundamental.

Done.

> - If we moved the toplevel foreach loop in check_toasted_attributes()
> out to the caller, say renaming the function to just
> check_toasted_attribute(), we'd save a level of indentation in that
> whole function and probably add a tad of clarity, too. You wouldn't
> feel the need to Assert(ctx.toasted_attributes == NIL) in the caller
> if the caller had just done list_free(ctx->toasted_attributes);
> ctx->toasted_attributes = NIL.

You're right.  It looks nicer that way.  Changed.

> - Why are all the changes to the tests in this patch? What do they
> have to do with getting the TOAST checks out from under the buffer
> lock? I really need you to structure the patch series so that each
> patch is about one topic and, equally, so that each topic is only
> covered by one patch. Otherwise it's just way too confusing.

v18-0001 - Finishes work started in commit 3b6c1259f9 that was overlooked owing 
to how I had separated the changes in v17-0002 vs. v17-0003

v18-0002 - Postpones the toast checks for a page until after the main table 
page lock is released

v18-0003 - Improves the corruption messages in ways already discussed earlier 
in this thread.  Changes the tests to expect the new messages, but adds no new 
checks

v18-0004 - Adding corruption checks of toast pointers.  Extends the regression 
tests to cover the new checks.

> 
> - I think some of these messages need a bit of word-smithing, too, but
> we can leave that for when we're closer to being done with this.

Ok.




v18-0001-amcheck-remove-duplicate-xid-bounds-checks.patch
Description: Binary data


v18-0002-amcheck-avoid-extra-work-while-holding-buffer-lo.patch
Description: Binary data


v18-0003-amcheck-improving-corruption-messages.patch
Description: Binary data


v18-0004-amcheck-adding-toast-pointer-corruption-checks.patch
Description: Binary data


--
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: Autovacuum on partitioned table (autoanalyze)

2021-04-04 Thread Alvaro Herrera
On 2021-Apr-04, Tomas Vondra wrote:

> In fact, one of the first posts in this thread links to this:
> 
> https://www.postgresql.org/message-id/4823.1262132964%40sss.pgh.pa.us
> 
> i.e. Tom actually proposed doing something like this back in 2009, so
> presumably he thought it was desirable back then.
> 
> OTOH he argued against adding another per-table counter and proposed
> essentially what the patch did before, i.e. propagating the counter
> after analyze. But we know that may trigger analyze too often ...

Yeah, I think that's a doomed approach.  The reason to avoid another
column is to avoid bloat, which is good but if we end up with an
unworkable design then we know we have to backtrack on it.

I was thinking that we could get away with having a separate pgstat
struct for partitioned tables, to avoid enlarging the struct for all
tables, but if we're going to also include legacy inheritance in the
feature clearly that doesn't work.

-- 
Álvaro Herrera   Valdivia, Chile
"After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was
amazing when I first started using it at 7.2, and I'm continually astounded by
learning new features and techniques made available by the continuing work of
the development team."
Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php




Re: ALTER TABLE ADD COLUMN fast default

2021-04-04 Thread Tom Lane
Andrew Dunstan  writes:
> On 4/4/21 12:05 PM, Tom Lane wrote:
>> I made CheckConstraintFetch likewise not install its array until
>> it's done.  I notice that it is throwing elog(ERROR) not WARNING
>> for the equivalent cases of not finding the right number of
>> entries.  I wonder if we ought to back that off to a WARNING too.
>> elog(ERROR) during relcache entry load is kinda nasty, because
>> it prevents essentially *all* use of the relation.  On the other
>> hand, failing to enforce check constraints isn't lovely either.
>> Anybody have opinions about that?

> None of this is supposed to happen, is it? If you can't fetch the
> constraints (and I think of a default as almost a kind of constraint)
> your database is already somehow corrupted. So maybe if any of these
> things happen we should ERROR out. Anything else seems likely to corrupt
> the database further.

Meh.  "pg_class.relchecks is inconsistent with the number of entries
in pg_constraint" does not seem to me like a scary enough situation
to justify a panic response.  Maybe there's an argument for failing
at the point where we'd need to actually apply the CHECK constraints
(similarly to what my patch is doing for missing defaults).
But preventing the user from, say, dumping the data in the table
seems to me to be making the situation worse not better.

regards, tom lane




Re: Extensions not dumped when --schema is used

2021-04-04 Thread Noah Misch
On Wed, Mar 31, 2021 at 09:37:44AM +0900, Michael Paquier wrote:
> On Tue, Mar 30, 2021 at 12:02:45PM +0900, Michael Paquier wrote:
> > Okay.  So I have looked at that stuff in details, and after fixing
> > all the issues reported upthread in the code, docs and tests I am
> > finishing with the attached.  The tests have been moved out of
> > src/bin/pg_dump/ to src/test/modules/test_pg_dump/, and include both
> > positive and negative tests (used the trick with plpgsql for the
> > latter to avoid the dump of the extension test_pg_dump or any data
> > related to it).
> 
> I have double-checked this stuff this morning, and did not notice any
> issues.  So, applied.

I noticed the patch's behavior for relations that are members of non-dumped
extensions and are also registered using pg_extension_config_dump().  It
depends on the schema:

- If extschema='public', "pg_dump -e plpgsql" makes no mention of the
  relations.
- If extschema='public', "pg_dump -e plpgsql --schema=public" includes
  commands to dump the relation data.  This surprised me.  (The
  --schema=public argument causes selectDumpableNamespace() to set
  nsinfo->dobj.dump=DUMP_COMPONENT_ALL instead of DUMP_COMPONENT_ACL.)
- If extschema is not any sort of built-in schema, "pg_dump -e plpgsql"
  includes commands to dump the relation data.  This surprised me.

I'm attaching a test case patch that demonstrates this.  Is this behavior
intentional?
commit d210c01 (demo-dumpext-public)
Author: Noah Misch 
AuthorDate: Thu Apr 1 17:36:13 2021 -0700
Commit: Noah Misch 
CommitDate: Thu Apr 1 17:36:13 2021 -0700

demo
---
 src/test/modules/test_pg_dump/t/001_base.pl | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/test/modules/test_pg_dump/t/001_base.pl 
b/src/test/modules/test_pg_dump/t/001_base.pl
index 7c053c4..600015e 100644
--- a/src/test/modules/test_pg_dump/t/001_base.pl
+++ b/src/test/modules/test_pg_dump/t/001_base.pl
@@ -203,11 +203,23 @@ my %pgdump_runs = (
},
 
# plgsql in the list blocks the dump of extension test_pg_dump
+   # TODO --no-sync needn't appear twice
without_extension => {
dump_cmd => [
'pg_dump', '--no-sync', 
"--file=$tempdir/without_extension.sql",
'--extension=plpgsql', '--no-sync', 'postgres',
],
+   },
+
+   without_extension_explicit_schema => {
+   dump_cmd => [
+   'pg_dump',
+   '--no-sync',
+   "--file=$tempdir/without_extension_explicit_schema.sql",
+   '--extension=plpgsql',
+   '--schema=public',
+   'postgres',
+   ],
},);
 
 ###
@@ -632,6 +644,8 @@ my %tests = (
pg_dumpall_globals => 1,
section_data   => 1,
section_pre_data   => 1,
+   # excludes this schema
+   without_extension_explicit_schema => 1,
},
},
 
@@ -646,6 +660,8 @@ my %tests = (
pg_dumpall_globals => 1,
section_data   => 1,
section_pre_data   => 1,
+   # excludes this schema
+   without_extension_explicit_schema => 1,
},
},
 


Re: debian bugrept involving fast default crash in pg11.7

2021-04-04 Thread Andrew Dunstan


On 4/9/20 4:39 PM, Justin Pryzby wrote:
> On Thu, Apr 09, 2020 at 02:36:26PM -0400, Tim Bishop wrote:
>> SELECT attrelid::regclass, * FROM pg_attribute WHERE atthasmissing;
>> -[ RECORD 1 ]-+-
>> attrelid  | download
>> attrelid  | 22749
>> attname   | filetype
> But that table isn't involved in the crashing query, right ?
> Are data_stage() and income_index() locally defined functions ?  PLPGSQL ??
> Do they access the download table (or view or whatever it is) ?
>

As requested I have reviewed this old thread. You are correct, this
table is not involved in the query. That doesn't mean that the changes
made by the fast default code haven't caused a problem, but it makes it
a bit less likely. If the OP is still interested and can provide a
self-contained recipe to reproduce the problem I can investigate.
Without that it's difficult to know what to look at.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Autovacuum on partitioned table (autoanalyze)

2021-04-04 Thread Tomas Vondra



On 4/4/21 10:05 PM, Alvaro Herrera wrote:
> On 2021-Apr-04, Tomas Vondra wrote:
> 
>> 1) I still don't understand why inheritance and declarative partitioning
>> are treated differently. Seems unnecessary and surprising, but maybe
>> there's a good reason?
> 
> I suppose the rationale is that for inheritance we have always done it
> that way -- similar things have been done that way for inheritance
> historically, to avoid messing with long-standing behavior.  We have
> done that in a bunch of places -- DDL behavior, FKs, etc.  Maybe in this
> case it's not justified.  It *will* change behavior, in the sense that
> we are going to capture stats that have never been captured before.
> That might or might not affect query plans for designs using regular
> inheritance.  But it seems reasonable to think that those changes will
> be for the good; and if it does break plans for some people and they
> want to revert to the original behavior, they can just set
> autovacuum_enabled to off for the parent tables.
> 
> So, I agree that we should enable this new feature for inheritance
> parents too.
> 

Not sure. AFAICS the missing stats on parents are an issue both for
inheritance and partitioning. Maybe there is a reason to maintain the
current behavior with inheritance, but I don't see it.

With the other features, I think the reason for not implementing that
for inheritance was that it'd be more complex, compared to declarative
partitioning (which has stricter limitations on the partitions, etc.).
But in this case I think there's no difference in complexity, the same
code can handle both cases.

In fact, one of the first posts in this thread links to this:

https://www.postgresql.org/message-id/4823.1262132964%40sss.pgh.pa.us

i.e. Tom actually proposed doing something like this back in 2009, so
presumably he thought it was desirable back then.

OTOH he argued against adding another per-table counter and proposed
essentially what the patch did before, i.e. propagating the counter
after analyze. But we know that may trigger analyze too often ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Autovacuum on partitioned table (autoanalyze)

2021-04-04 Thread Alvaro Herrera
On 2021-Apr-04, Tomas Vondra wrote:

> 1) I still don't understand why inheritance and declarative partitioning
> are treated differently. Seems unnecessary and surprising, but maybe
> there's a good reason?

I suppose the rationale is that for inheritance we have always done it
that way -- similar things have been done that way for inheritance
historically, to avoid messing with long-standing behavior.  We have
done that in a bunch of places -- DDL behavior, FKs, etc.  Maybe in this
case it's not justified.  It *will* change behavior, in the sense that
we are going to capture stats that have never been captured before.
That might or might not affect query plans for designs using regular
inheritance.  But it seems reasonable to think that those changes will
be for the good; and if it does break plans for some people and they
want to revert to the original behavior, they can just set
autovacuum_enabled to off for the parent tables.

So, I agree that we should enable this new feature for inheritance
parents too.


I can't comment on the other issues.  I hope to give this a closer look
tomorrow my time; with luck Hosoya-san will have commented by then.


-- 
Álvaro Herrera39°49'30"S 73°17'W
"La rebeldía es la virtud original del hombre" (Arthur Schopenhauer)




Re: Flaky vacuum truncate test in reloptions.sql

2021-04-04 Thread Arseny Sher

On Fri, Apr 2, 2021 at 9:46 AM Michael Paquier  wrote:

> Okay, applied and back-patched down to 12 then.

Thank you both. Unfortunately and surprisingly, the test still fails
(perhaps even more rarely, once in several hundred runs) under
multimaster. After scratching my head for some more time, it seems to
me the following happens: not only does vacuum encounter a locked page,
but there also exists a concurrent backend (as the parallel schedule is
run) which holds back oldestxmin, keeping it less than the xid of the
transaction which did the insertion

INSERT INTO reloptions_test VALUES (1, NULL), (NULL, NULL);

FreezeLimit can't be higher than oldestxmin, so lazy_check_needs_freeze
decides there is nothing to freeze on the page. multimaster commits are
quite heavy, which apparently shifts the timings, making the issue more
likely.

Currently we are testing the rather funny attached patch, which forces
all such old-snapshot-holders to finish. It is crutchy, but I doubt we
want to change vacuum logic (e.g. checking tuple liveness in
lazy_check_needs_freeze) due to this issue. (It is especially crutchy in
the xid::bigint casts, but wraparound is hardly expected in regression
test runs.)


-- cheers, arseny

diff --git a/src/test/regress/expected/reloptions.out b/src/test/regress/expected/reloptions.out
index bb7bd6e1e7e..78bbf4a5255 100644
--- a/src/test/regress/expected/reloptions.out
+++ b/src/test/regress/expected/reloptions.out
@@ -128,6 +128,20 @@ SELECT reloptions FROM pg_class WHERE oid = 'reloptions_test'::regclass;
 INSERT INTO reloptions_test VALUES (1, NULL), (NULL, NULL);
 ERROR:  null value in column "i" of relation "reloptions_test" violates not-null constraint
 DETAIL:  Failing row contains (null, null).
+do $$
+declare
+  my_xid bigint;
+  oldest_xmin bigint;
+begin
+  my_xid := txid_current();
+  while true loop
+    oldest_xmin := min(backend_xmin::text::bigint) from pg_stat_activity where pid != pg_backend_pid();
+    exit when oldest_xmin is null or oldest_xmin >= my_xid;
+    perform pg_sleep(0.1);
+    perform pg_stat_clear_snapshot();
+  end loop;
+end
+$$;
 -- Do an aggressive vacuum to prevent page-skipping.
 VACUUM FREEZE reloptions_test;
 SELECT pg_relation_size('reloptions_test') = 0;
diff --git a/src/test/regress/sql/reloptions.sql b/src/test/regress/sql/reloptions.sql
index 95f7ab4189e..96fb59d16ad 100644
--- a/src/test/regress/sql/reloptions.sql
+++ b/src/test/regress/sql/reloptions.sql
@@ -72,6 +72,20 @@ SELECT reloptions FROM pg_class WHERE oid =
 ALTER TABLE reloptions_test RESET (vacuum_truncate);
 SELECT reloptions FROM pg_class WHERE oid = 'reloptions_test'::regclass;
 INSERT INTO reloptions_test VALUES (1, NULL), (NULL, NULL);
+do $$
+declare
+  my_xid bigint;
+  oldest_xmin bigint;
+begin
+  my_xid := txid_current();
+  while true loop
+    oldest_xmin := min(backend_xmin::text::bigint) from pg_stat_activity where pid != pg_backend_pid();
+    exit when oldest_xmin is null or oldest_xmin >= my_xid;
+    perform pg_sleep(0.1);
+    perform pg_stat_clear_snapshot();
+  end loop;
+end
+$$;
 -- Do an aggressive vacuum to prevent page-skipping.
 VACUUM FREEZE reloptions_test;
 SELECT pg_relation_size('reloptions_test') = 0;


Re: policies with security definer option for allowing inline optimization

2021-04-04 Thread Noah Misch
On Fri, Apr 02, 2021 at 02:24:59PM -0700, Dan Lynch wrote:
> Does anyone know details of, or where to find more information about the
> implications of the optimizer on the quals/checks for the policies being
> functions vs inline?

Roughly, the PostgreSQL optimizer treats LANGUAGE SQL functions like a C
compiler treats "extern inline" functions.  Other PostgreSQL functions behave
like C functions in a shared library.  Non-SQL functions can do arbitrary
things, and the optimizer knows only facts like their volatility and the value
given in CREATE FUNCTION ... COST.

> If the
> get_group_ids_of_current_user() function is marked as STABLE, would the
> optimizer cache this value for every row in a SELECT that returned
> multiple rows?

While there was a patch to implement caching, it never finished.  The
optimizer is allowed to, and sometimes does, choose plan shapes that reduce
the number of function calls.

> Is it possible that if the function is sql vs plpgsql it
> makes a difference?

Yes; see inline_function() in the PostgreSQL source.  The hard part of
$SUBJECT is creating the infrastructure to inline across a SECURITY DEFINER
boundary.  Currently, a single optimizable statement operates under just one
user identity.  Somehow, the optimizer would need to translate the SECURITY
DEFINER call into a list of moments where the executor shall switch user ID,
then maintain that list across further optimization steps.  security_barrier
views are the most-similar thing, but as Joe Conway mentioned, views differ
from SECURITY DEFINER in crucial ways.




Re: GSoC 2021 - Student looking for a mentor - Magzum Assanbayev

2021-04-04 Thread Gavin Flower

On 03/04/2021 06:14, Magzum Assanbayev wrote:

Dear Sirs,


Note that there are some females that hack pg!




My name is Magzum Assanbayev, I am a Master Student at KIMEP 
University in Kazakhstan, expected to graduate in Spring 2022.


Having made some research into your organization I have deduced that 
my current skill set might be suitable to your needs.

[...]

Please let me know by replying to this email.

Thank you!







Re: Autovacuum on partitioned table (autoanalyze)

2021-04-04 Thread Tomas Vondra
On 4/3/21 9:42 PM, Alvaro Herrera wrote:
> Thanks for the quick rework.  I like this design much better and I think
> this is pretty close to committable.  Here's a rebased copy with some
> small cleanups (most notably, avoid calling pgstat_propagate_changes
> when the partition doesn't have a tabstat entry; also, free the lists
> that are allocated in a couple of places).
> 
> I didn't actually verify that it works.
> 

Yeah, this approach seems much simpler, I think. That being said, I
think there's a couple issues:

1) I still don't understand why inheritance and declarative partitioning
are treated differently. Seems unnecessary and surprising, but maybe
there's a good reason?


2) pgstat_recv_tabstat

Should it really reset changes_since_analyze_reported in both branches?
AFAICS if the "found" branch does this

tabentry->changes_since_analyze_reported = 0;

that means we lose the counter any time tabstats are received, no?
That'd be wrong, because we'd propagate the changes repeatedly.


3) pgstat_recv_analyze

Shouldn't it propagate the counters before resetting them? I understand
that for the just-analyzed relation we can't do better, but why not to
propagate the counters to parents? (Not necessarily from this place in
the stat collector, maybe the analyze process should do that.)


4) pgstat_recv_reportedchanges

I think this needs to be more careful when updating the value - the
stats collector might have received other messages modifying those
counters (including e.g. PGSTAT_MTYPE_ANALYZE with a reset), so maybe we
can get into situation with

  (changes_since_analyze_reported > changes_since_analyze)

if we just blindly increment the value. I'd bet that would lead to funny
stuff. So maybe this code should be careful never to let the reported
counter exceed changes_since_analyze?
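For illustration, a minimal sketch of such a guard (the field and message
names are assumed from the patch under discussion, not committed code):

    /* Sketch only: clamp so the reported count never exceeds the real one. */
    static void
    clamp_reported_changes(PgStat_StatTabEntry *tabentry, PgStat_Counter nchanges)
    {
        tabentry->changes_since_analyze_reported =
            Min(tabentry->changes_since_analyze_reported + nchanges,
                tabentry->changes_since_analyze);
    }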



regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: ALTER TABLE ADD COLUMN fast default

2021-04-04 Thread Justin Pryzby
On Sun, Apr 04, 2021 at 02:18:34PM -0400, Andrew Dunstan wrote:
> On 4/4/21 11:21 AM, Andrew Dunstan wrote:
> > On 4/4/21 9:19 AM, Justin Pryzby wrote:
> >> It reminded me of this thread, but nothing ever came of it.
> >> https://www.postgresql.org/message-id/20200328223052.GK20103%40telsasoft.com
> >>
> >> https://www.postgresql.org/message-id/87pmzaq4gx.fsf%40news-spur.riddles.org.uk
> >>
> > I don't recall seeing this. Around that time I was busy returning from
> > Australia at the start of the pandemic, so I might have missed it in
> > the hubbub.
> 
> If this is still an issue, is it possible to get a self-contained test
> to reproduce it?

Could you follow through with Tim on the other thread?

-- 
Justin




Re: [PATCH] Implement motd for PostgreSQL

2021-04-04 Thread Dagfinn Ilmari Mannsåker
Fabien COELHO  writes:

>>> The actual source looks pretty straightforward. I'm wondering whether pg
>>> style would suggest to write motd != NULL instead of just motd.
>>
>> That's what I had originally, but when reviewing my code verifying code 
>> style,
>> I noticed the other case is more common:
>>
>> if \([a-z]* != NULL &&
>> 119 results in 72 files
>>
>> if \([a-z]* &&
>> 936 results in 311 files
>
> Only if those other cases are indeed pointers. For pgbench, all direct "if (xxx &&"
> cases are simple booleans or integers, pointers seem to have "!=
> NULL". While looking quickly at the grep output, ISTM that most obvious
> pointers have "!= NULL" and other cases often look like booleans:
>
>   catalog/pg_operator.c:  if (isDelete && t->oprcom == baseId)
>   catalog/toasting.c: if (check && lockmode != AccessExclusiveLock)
>   commands/async.c:   if (amRegisteredListener && listenChannels == NIL)
>   commands/explain.c: if (es->analyze && es->timing)
>   …
>
> I'm sure there are exceptions, but ISTM that the local style is "!= NULL".

Looking specifically at code checking an expression before dereferencing
it, we get:

$ ag '(?:if|Assert)\s*\(\s*(\S+)\s*&&\s*\1->\w+' | wc -l
247

$ ag '(?:if|Assert)\s*\(\s*(\S+)\s*!=\s*NULL\s*&&\s*\1->\w+' | wc -l
74

So the shorter 'foo && foo->bar' form (which I personally prefer) is
considerably more common than the longer 'foo != NULL && foo->bar' form.
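To make the two spellings concrete (illustrative variable and function
names only, not taken from any particular file):

    /* shorter form, the more common one per the counts above */
    if (constr && constr->num_defval > 0)
        apply_defaults(constr);

    /* longer, explicit form */
    if (constr != NULL && constr->num_defval > 0)
        apply_defaults(constr);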

- ilmari
-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law




Re: SP-GiST confusion: indexed column's type vs. index column type

2021-04-04 Thread Tom Lane
I wrote:
> I propose changing things so that
> (B) We enforce that leafType agrees with the opclass opckeytype,
> ensuring the index tupdesc can be used for leaf tuples.

After looking at PostGIS I realized that a hard restriction of this
sort won't fly, because it'd make upgrades impossible for them.
They have some lossy SPGiST opclasses, in which leafType is returned
as different from the original input datatype.  Since up to now
we've disallowed the STORAGE clause for user-defined SPGiST
opclasses, they are unable to declare these opclasses honestly in
existing releases, but it didn't matter.  If we enforce that STORAGE
matches leafType then their existing opclass definitions won't work
in v14, but they can't fix them before upgrading either.

So I backed off the complaint about that to be just an amvalidate
warning, and pushed it.

This means the INCLUDE patch will still have to account for the
possibility that the index tupdesc is an inaccurate representation
of the actual leaf tuple contents, but we can treat that case less
efficiently without feeling bad about it.  So we should be able to
do something similar for the leaf tupdesc as for the index-only-scan
output tupdesc, that is use the relcache's tupdesc if it's got the
right first column type, otherwise copy-and-modify that tupdesc.
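As a rough sketch of that fallback (assumptions: "index" is the opened
index relation and the opclass leafType is at hand; a real version would
also need to fix attlen, attbyval and attalign, not just the type OID):

    TupleDesc   leafDesc = RelationGetDescr(index);

    if (TupleDescAttr(leafDesc, 0)->atttypid != leafType)
    {
        /* relcache tupdesc doesn't match: make a private, corrected copy */
        leafDesc = CreateTupleDescCopy(leafDesc);
        TupleDescAttr(leafDesc, 0)->atttypid = leafType;
    }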

regards, tom lane




Re: pgbench - add pseudo-random permutation function

2021-04-04 Thread Dean Rasheed
On Fri, 2 Apr 2021 at 06:38, Fabien COELHO  wrote:
>
> >>   r = (uint64) (pg_erand48(random_state.xseed) * size);
> >>
> >> I do not understand why the random values are multiplied by anything in
> >> the first place…
> >
> > These are just random integers in the range [0,mask] and [0,size-1],
> > formed in exactly the same way as getrand().
>
> Indeed, erand returns a double, this was the part I was missing. I did not
> realize that you had switched to doubles in your approach.
>
> I think that permute should only use integer operations. I'd suggest to
> use one of the integer variants instead of going through a double
> computation and casting back to int. The internal state is based on
> integers, I do not see the added value of going through floats, possibly
> enduring floating point issues (undeflow, rounding, normalization,
> whatever) on the way, whereas from start to finish we just need ints.
>

This is the already-established coding pattern used in getrand() to
pick a random number uniformly in some range that's not necessarily a
power of 2.

Floating point underflow and normalisation issues are not possible
because erand48() takes a 48-bit integer N and uses ldexp() to store
N/2^48 in a double, which is an exact operation since IEEE doubles
have 53-bit mantissas. This is then turned back into an
integer in the required range by multiplying by the desired maximum
value, so there's never any risk of underflow or normalisation issues.
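To illustrate why that multiply-and-truncate is safe (this mirrors what
erand48() does internally; n48 here stands for the raw 48-bit state value):

    double  u = ldexp((double) n48, -48);       /* exact: 48 < 53 mantissa bits */
    uint64  r = (uint64) (u * (double) size);   /* value in [0, size-1] */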
If you didn't do it that way, you'd need to rely on some larger
integer datatype, such as 128-bit integers.

I guess that there may be rounding variations once the required
maximum value exceeds something like 2^56 (although the comment in
getrand() is much more conservative than that), so it's possible that
a pgbench script calling random() with (ub-lb) larger than that might
give different results on different platforms. For the non-uniform
random functions, that effect might well kick in sooner. I'm not aware
of any field complaints about that though, possibly because real-world
data sizes are significantly smaller than that.

In practice, permute() is likely to take its input from one of the
non-uniform random functions, so it won't be permute() that first
introduces rounding issues.

> See attached v27 proposal.
>

This update has a number of flaws. For example, this:

+static uint64
+randu64(RandomState * state)
+{
+uint64 r1 = pg_jrand48((*state).xseed),
+   r2 = pg_jrand48((*state).xseed);
+return r1 << 51 | r2 << 13 | r1 >> 13;
+}

It still uses a 48-bit RandomState, so it doesn't improve on getrand()
in that respect.

It replaces a single erand48() call with 2 jrand48() calls, which
comes at a cost in terms of performance. (In fact, changing the number
of rounds in the previous version of permute() from 4 to 6 has a
smaller performance impact than this -- more about that below.)

jrand48() returns a signed 32-bit integer, which has a 50% chance of
being negative, so when that is cast to a uint64, there is a 50%
chance that the 32 most significant bits will be 1. When the various
parts are OR'ed together, that will then mask out any randomness from
the other parts. For example, 50% of the time, the jrand48() value
used for r1 will be negative, and so 32 bits in the middle of the
final result will all be set.
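The sign-extension effect is easy to demonstrate in isolation:

    long    j = -123456789;     /* a typical negative jrand48() result */
    uint64  r = (uint64) j;     /* 0xFFFFFFFFF8A432EB: bits 32..63 all set */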

There is essentially no randomness at all in bits 45..50, and the r1
and r2 values overlap in bits 13..18, giving them a 75% chance of
being set.

So overall, the results will be highly non-uniform, with less
randomness and poorer performance than erand48().

In addition, it returns a result in the range [0,2^64), which is not
really what's wanted. For example:

+/* Random offset */
+r = randu64(_state2);
+v = (v + r) % size;

The previous code used r = getrand(0, size-1), which gave a uniformly
random offset. However, the new code risks introducing additional
non-uniformity when size is not a power of 2.

Finally, worst of all, this random offset is no longer bijective, due
to 64-bit integer wrap-around. For example, suppose that size=100 and
r=(2^64-10), then the following 2 values both map to the same result:

  v = 20 -> (v + r) % size
  = (20 + (2^64 - 10)) % 100
  = (2^64 + 10) % 100
  = (10) % 100
  = 10

  v = 4 -> (v + r) % size
 = (4 + (2^64 - 10)) % 100
 = (2^64 - 6) % 100
 = (18446744073709551610) % 100
 = 10

So not only are the results no longer uniformly random, they might not
even form a permutation.
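For comparison, a sketch of a wraparound-safe variant: reducing the offset
modulo size before adding keeps both operands below 2^63, so the sum cannot
wrap and the mapping stays bijective (though the offset itself is still only
near-uniform whenever size does not divide 2^64):

    v = (v + r % size) % size;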

I did some more testing of the previous version (v26), this time
looking at much larger sizes, all the way up to the maximum, which is
2^63-1 since it comes from a signed int64. In general, the results
were very good, though I did notice some slight non-uniformity in the
way adjacent inputs were separated from one another when the size was
just under a power of 2. I think that's the hardest case for this
algorithm, because 

Re: ALTER TABLE ADD COLUMN fast default

2021-04-04 Thread Andrew Dunstan


On 4/4/21 11:21 AM, Andrew Dunstan wrote:
> On 4/4/21 9:19 AM, Justin Pryzby wrote:
>> It reminded me of this thread, but nothing ever came of it.
>> https://www.postgresql.org/message-id/20200328223052.GK20103%40telsasoft.com
>>
>>
>> https://www.postgresql.org/message-id/87pmzaq4gx.fsf%40news-spur.riddles.org.uk
>>
>>
> I don't recall seeing this. Around that time I was busy returning from
> Australia at the start of the pandemic, so I might have missed it in
> the hubbub.
>
>

If this is still an issue, is it possible to get a self-contained test
to reproduce it?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: possible repalloc() in icu_convert_case()

2021-04-04 Thread Anton Voloshin

On 04.04.2021 19:20, Tom Lane wrote:

> repalloc is likely to be more expensive, since it implies copying
> data which isn't helpful here.  I think this code is fine as-is.


Oh, you are right, thanks. I did not think properly about copying in 
repalloc.


--
Anton Voloshin
Postgres Professional: https://www.postgrespro.com
Russian Postgres Company




Re: Crash in BRIN minmax-multi indexes

2021-04-04 Thread Tomas Vondra
BTW, for the inet data type, I considered simply calling the "minus"
function, but that does not work because of this strange behavior:


int4=# select '10.1.1.102/32'::inet > '10.1.1.142/24'::inet;
 ?column?
----------
 t
(1 row)

int4=# select '10.1.1.102/32'::inet - '10.1.1.142/24'::inet;
 ?column?
----------
      -40
(1 row)


That is, (a>b) but then (a-b) < 0. AFAICS it's due to the comparator
considering the mask, while the minus operator ignores it. I find it a bit
strange, but I assume it's intentional.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Crash in BRIN minmax-multi indexes

2021-04-04 Thread Tomas Vondra
On 4/4/21 7:25 AM, Jaime Casanova wrote:
> ...
> Changing to using month of 30 days on the formula fixed it.
> 

I've pushed fixes for all the bugs reported in this thread so far
(mostly distance calculations, ...), and one bug (swapped operator
parameters in one place) I discovered while working on the fixes.

> and I found another issue, this time involves autovacuum which makes it
> a little more complicated to reproduce.
> 
> Currently the only stable way to reproduce it is using pgbench:
> 
> pgbench -i postgres
> psql -c "CREATE INDEX ON pgbench_history USING brin (tid 
> int4_minmax_multi_ops);" postgres
> pgbench -c2 -j2 -T 300 -n postgres
> 

Fixed and pushed too.

Turned out to be a silly bug (forgetting to store the number of ranges
after deduplication), which sometimes resulted in an assert failure.
It's a bit hard to trigger because concurrency / good timing is needed
while summarizing the range, requiring a call to the "union" function. Just
running pgbench did not trigger the issue for me; I had to call
brin_summarize_new_values() manually.

For the record, I did a lot of testing with data randomized in various
ways - the scripts are available here:

https://github.com/tvondra/brin-randomized-tests

It was focused on discovering issues in the distance functions, and
comparing results with/without the index. Maybe the next step should be
adding some changes to the data, which might trigger more issues like
this one.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: ALTER TABLE ADD COLUMN fast default

2021-04-04 Thread Zhihong Yu
Hi,
Thanks for the cleanup.

if (found != ncheck)
elog(ERROR, "%d constraint record(s) missing for rel %s",
 ncheck - found, RelationGetRelationName(relation));

Since there is a check on found being smaller than ncheck inside the loop,
the if condition can be written as:
if (found < ncheck)

Actually, the above is implied by the expression ncheck - found.

One minor comment w.r.t. the variable name: found is normally a bool
variable. Maybe rename it to nfound, which would make its meaning clearer.

+   if (found != ndef)
+   elog(WARNING, "%d attrdef record(s) missing for rel %s",
+ndef - found, RelationGetRelationName(relation));

Since only a warning is logged, there seems to be some wasted space in
attrdef. Would such space accumulate, resulting in a memory leak?
I think the attrdef should be copied to another array of the proper size so
that there is no chance of a memory leak.
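One way to implement that suggestion (a sketch; the variable names are
assumed from the patch under review):

    /* shrink the array to just the entries actually found */
    if (found < ndef)
        attrdef = (AttrDefault *) repalloc(attrdef, found * sizeof(AttrDefault));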

Thanks

On Sun, Apr 4, 2021 at 9:05 AM Tom Lane  wrote:

> Andrew Dunstan  writes:
> > On 4/3/21 10:09 PM, Tom Lane wrote:
> >> Looking around at the other touches of AttrDefault.adbin in the backend
> >> (of which there are not that many), some of them are prepared for it to
> be
> >> NULL and some are not.  I don't immediately have a strong opinion
> whether
> >> that should be an allowed state; but if it is not allowed then it's not
> >> okay for AttrDefaultFetch to leave such a state behind.  Alternatively,
> >> if it is allowed then equalTupleDesc is in the wrong.  We should choose
> >> one definition and make all the relevant code match it.
>
> > There's already a warning if it sets an explicit NULL value, so I'm
> > inclined to say your suggestion "it's not okay for AttrDefaultFetch to
> > leave such a state behind" is what we should go with.
>
> Yeah, I came to the same conclusion after looking around a bit more.
> There are two places in tupdesc.c that defend against adbin being NULL,
> which seems a bit silly when their sibling equalTupleDesc does not.
> Every other touch in the backend will segfault on a NULL value,
> if it doesn't Assert first :-(
>
> Another thing that annoyed me while I was looking at this is the
> potentially O(N^2) behavior in equalTupleDesc due to not wanting
> to assume that the AttrDefault arrays are in the same order.
> It seems far smarter to have AttrDefaultFetch sort the arrays.
>
> So attached is a proposed patch to clean all this up.
>
> * Don't link the AttrDefault array into the relcache entry until
> it's fully built and valid.
>
> * elog, don't just Assert, if we fail to find an expected default
> value.  (Perhaps there's a case for ereport instead.)
>
> * Remove now-useless null checks in tupdesc.c.
>
> * Sort the AttrDefault array, remove double loops in equalTupleDesc.
>
> I made CheckConstraintFetch likewise not install its array until
> it's done.  I notice that it is throwing elog(ERROR) not WARNING
> for the equivalent cases of not finding the right number of
> entries.  I wonder if we ought to back that off to a WARNING too.
> elog(ERROR) during relcache entry load is kinda nasty, because
> it prevents essentially *all* use of the relation.  On the other
> hand, failing to enforce check constraints isn't lovely either.
> Anybody have opinions about that?
>
> regards, tom lane
>
>


Re: Additional Chapter for Tutorial - arch-dev.sgml

2021-04-04 Thread Alvaro Herrera
On 2021-Apr-04, Jürgen Purtz wrote:

> The small patch 'arch-dev.sgml.20210121.diff' contains only some clearing up
> concerning the used terminology and its alignments with the glossary. The
> patch was rejected by Heikki.

This comment is not helpful, because it's not obvious where I would find
that patch.  Also, you say "the patch was rejected by Heikki" but
upthread he said he committed it.  His comment was that he left out some
paragraphs because of a style issue.  Did you re-post that patch after
fixing the style issues?  If you did, I couldn't find it.


> The latest version of the huge patch '0013-architecture.patch' is valid and
> doesn't contain merge conflicts.

Yeah, OK, but I have to dive deep in the thread to find it.  Please post
it again.  When you have a patch series, please post it as a whole every
time -- that makes it easier for a committer to review it.

You seem to be making your life hard by not using git to assist you.  Do
you know you can have several commits in a branch of your own, rebase it
to latest master, merge master to it, rebase on top of master, commit
fixups, "rebase -i" and change commit ordering to remove unnecessary
fixup commits, and so on?  Such techniques are extremely helpful when
dealing with a patch series.  When you want to post a new version to the
list, you can just do "git format-patch -v14 origin/master" to produce a
set of patch files.  You don't need to manually give names to your patch
files, or come up with a versioning scheme.  Just increment the argument
to -v by +1 each time you (or somebody else) posts a new version of the
patch series.

-- 
Álvaro Herrera   Valdivia, Chile




Re: ALTER TABLE ADD COLUMN fast default

2021-04-04 Thread Andrew Dunstan


On 4/4/21 12:05 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 4/3/21 10:09 PM, Tom Lane wrote:
>>> Looking around at the other touches of AttrDefault.adbin in the backend
>>> (of which there are not that many), some of them are prepared for it to be
>>> NULL and some are not.  I don't immediately have a strong opinion whether
>>> that should be an allowed state; but if it is not allowed then it's not
>>> okay for AttrDefaultFetch to leave such a state behind.  Alternatively,
>>> if it is allowed then equalTupleDesc is in the wrong.  We should choose
>>> one definition and make all the relevant code match it.
>> There's already a warning if it sets an explicit NULL value, so I'm
>> inclined to say your suggestion "it's not okay for AttrDefaultFetch to
>> leave such a state behind" is what we should go with.
> Yeah, I came to the same conclusion after looking around a bit more.
> There are two places in tupdesc.c that defend against adbin being NULL,
> which seems a bit silly when their sibling equalTupleDesc does not.
> Every other touch in the backend will segfault on a NULL value,
> if it doesn't Assert first :-(
>
> Another thing that annoyed me while I was looking at this is the
> potentially O(N^2) behavior in equalTupleDesc due to not wanting
> to assume that the AttrDefault arrays are in the same order.
> It seems far smarter to have AttrDefaultFetch sort the arrays.


+1 The O(N^2) loops also bothered me.


>
> So attached is a proposed patch to clean all this up.
>
> * Don't link the AttrDefault array into the relcache entry until
> it's fully built and valid.
>
> * elog, don't just Assert, if we fail to find an expected default
> value.  (Perhaps there's a case for ereport instead.)
>
> * Remove now-useless null checks in tupdesc.c.
>
> * Sort the AttrDefault array, remove double loops in equalTupleDesc.


This all looks a lot cleaner and simpler to follow. I like not
allocating the array until AttrDefaultFetch.


>
> I made CheckConstraintFetch likewise not install its array until
> it's done.  I notice that it is throwing elog(ERROR) not WARNING
> for the equivalent cases of not finding the right number of
> entries.  I wonder if we ought to back that off to a WARNING too.
> elog(ERROR) during relcache entry load is kinda nasty, because
> it prevents essentially *all* use of the relation.  On the other
> hand, failing to enforce check constraints isn't lovely either.
> Anybody have opinions about that?



None of this is supposed to happen, is it? If you can't fetch the
constraints (and I think of a default as almost a kind of constraint),
your database is already somehow corrupted. So maybe if any of these
things happen we should ERROR out. Anything else seems likely to corrupt
the database further.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: TRUNCATE on foreign table

2021-04-04 Thread Kazutaka Onishi
Thank you for checking v11!
I've updated it and attached v12.

> I usually follow these steps:
> 1) write code 2) git diff --check (will give if there are any white
> space or indentation errors) 3) git add -u 4) git commit (will enter a
> commit message) 5) git format-patch -1 <> -v
> <> 6) to apply patch, git apply <>.patch

Thanks, I've removed these whitespaces and confirmed that no warnings
occurred when I ran "git apply <>.patch"

> If you don't destroy the hash, you are going to cause a memory leak.
> Because, the pointer to hash tableft_htab is local to
> ExecuteTruncateGuts (note that it's not a static variable) and you are
> creating a memory for the hash table and leaving the function without
> cleaning it up. IMHO, we should destroy the hash memory at the end of
> ExecuteTruncateGuts.

Sure. I've added hash_destroy().

> Another comment for tests, why do we need to do full outer join of two
> tables to just show up there are some rows in the table? I would
> suggest that all the tests introduced in the patch can be something
> like this: 1) before truncate, just show the count(*) from the table
> 2) truncate the foreign tables 3) after truncate, just show the
> count(*) which should be 0. Because we don't care what the data is in
> the tables, we only care about whether truncate is happened or not.

Sure. I've replaced with the test command "SELECT * FROM ..." to
"SELECT COUNT(*) FROM ..."
However, for example, the "id" column is used to check after running
TRUNCATE with ONLY clause to the inherited table.
Thus, I use "sum(id)" instead of "count(*)"  to check the result when
the table has records.

On Mon, Apr 5, 2021 at 0:20 Bharath Rupireddy :
>
> On Sun, Apr 4, 2021 at 12:00 PM Kazutaka Onishi  wrote:
> > > 5) Can't we use do_sql_command function after making it non static? We
> > > could go extra mile, that is we could make do_sql_command little more
> > > generic by passing some enum for each of PQsendQuery,
> > > PQsendQueryParams, PQsendQueryPrepared and PQsendPrepare and replace
> > > the respective code chunks with do_sql_command in postgres_fdw.c.
> >
> > I've skipped this for now.
> > I feel it sounds cool, but not easy.
> > It should be added by another patch because it's not only related to 
> > TRUNCATE.
>
> Fair enough! I will give it a thought and provide a patch separately.
>
> > > 6) A white space error when the patch is applied.
> > > contrib/postgres_fdw/postgres_fdw.c:2913: trailing whitespace.
> >
> > I've checked the patch and cleaned up the spaces.
> > But I couldn't reproduce this message when applying (patch -p1 < ...) my v8
> > patch.
> > If this still occurs, please tell me how you apply the patch.
>
> I usually follow these steps:
> 1) write code 2) git diff --check (will give if there are any white
> space or indentation errors) 3) git add -u 4) git commit (will enter a
> commit message) 5) git format-patch -1 <> -v
> <> 6) to apply patch, git apply <>.patch
>
> > > 7) I may be missing something here. Why do we need a hash table at
> > > all? We could just do it with a linked list right? Is there a specific
> > > reason to use a hash table? IIUC, the hash table entries will be lying
> > > around until the local session exits since we are not doing
> > > hash_destroy.
> >
> > I've skipped this for now.
>
> If you don't destroy the hash, you are going to cause a memory leak,
> because the pointer to the hash table ft_htab is local to
> ExecuteTruncateGuts (note that it's not a static variable) and you are
> allocating memory for the hash table and leaving the function without
> cleaning it up. IMHO, we should destroy the hash memory at the end of
> ExecuteTruncateGuts.
>
> Another comment for tests, why do we need to do full outer join of two
> tables to just show up there are some rows in the table? I would
> suggest that all the tests introduced in the patch can be something
> like this: 1) before truncate, just show the count(*) from the table
> 2) truncate the foreign tables 3) after truncate, just show the
> count(*) which should be 0. Because we don't care what the data is in
> the tables, we only care about whether truncate is happened or not.
>
> +SELECT * FROM tru_ftable a FULL OUTER JOIN tru_pk_ftable b ON a.id = b.id ORDER BY a.id;
> + id |                x                 | id |                z
> +----+----------------------------------+----+----------------------------------
> +  1 | eccbc87e4b5ce2fe28308fd9f2a7baf3 |    |
> +  2 | a87ff679a2f3e71d9181a67b7542122c |    |
> +  3 | e4da3b7fbbce2345d7772b0674a318d5 |  3 | 1679091c5a880faf6fb5e6087eb1b2dc
> +  4 | 1679091c5a880faf6fb5e6087eb1b2dc |  4 | 8f14e45fceea167a5a36dedd4bea2543
> +  5 | 8f14e45fceea167a5a36dedd4bea2543 |  5 | c9f0f895fb98ab9159f51fd0297e236d
> +  6 | c9f0f895fb98ab9159f51fd0297e236d |  6 | 45c48cce2e2d7fbdea1afc51c7c6ad26
> +  7 | 45c48cce2e2d7fbdea1afc51c7c6ad26 |  7 | d3d9446802a44259755d38e6d163e820
> +  8 | d3d9446802a44259755d38e6d163e820 |  8 | 6512bd43d9caa6e02c990b0a82652dca
> +    |

Re: ModifyTable overheads in generic plans

2021-04-04 Thread Tom Lane
Amit Langote  writes:
> On Sun, Apr 4, 2021 at 10:20 AM Tom Lane  wrote:
>> In some desultory performance testing here, it seemed like a
>> significant part of the cost is ExecOpenIndices, and I don't see
>> a reason offhand why we could not delay/skip that.  I also concur
>> with delaying construction of ri_ChildToRootMap and the
>> partition_tuple_routing data structures, since many queries will
>> never need those at all.

> As I mentioned in [1], creating ri_projectNew can be expensive too,
> especially as column count (and partition count for the generic plan
> case) grows.  I think we should have a static inline
> initialize-on-first-access accessor function for that field too.

> Actually, I remember considering having such accessor functions (all
> static inline) for ri_WithCheckOptionExprs, ri_projectReturning,
> ri_onConflictArbiterIndexes, and ri_onConflict (needed by ON CONFLICT
> UPDATE) as well, prompted by Heikki's comments earlier in the
> discussion.  I also remember, before even writing this patch, not
> liking that WCO and RETURNING expressions are initialized in their own
> separate loops, rather than being part of the earlier loop that says:

Sure, we might as well try to improve the cosmetics here.

>> Anyway, I think the way to proceed for now is to grab the low-hanging
>> fruit of things that clearly won't change any semantics.  But tail end
>> of the dev cycle is no time to be making really fundamental changes
>> in how FDW direct modify works.

> I agree.

OK.  Do you want to pull out the bits of the patch that we can still
do without postponing BeginDirectModify?

Another thing we could consider, perhaps, is keeping the behavior
the same for foreign tables but postponing init of local ones.
To avoid opening the relations to figure out which kind they are,
we'd have to rely on the RTE copies of relkind, which is a bit
worrisome --- I'm not certain that those are guaranteed to be
up-to-date --- but it's probably okay since there is no way to
convert a regular table to foreign or vice versa.  Anyway, that
idea seems fairly messy so I'm inclined to just pursue the
lower-hanging fruit for now.

regards, tom lane




Re: possible repalloc() in icu_convert_case()

2021-04-04 Thread Tom Lane
Anton Voloshin  writes:
> in src/backend/utils/adt/formatting.c, in icu_convert_case() I see:
>  if (status == U_BUFFER_OVERFLOW_ERROR)
>  {
>  /* try again with adjusted length */
>  pfree(*buff_dest);
>  *buff_dest = palloc(len_dest * sizeof(**buff_dest));
>  ...

> Is there any reason why this should not be repalloc()?

repalloc is likely to be more expensive, since it implies copying
data which isn't helpful here.  I think this code is fine as-is.

regards, tom lane




Re: Allow batched insert during cross-partition updates

2021-04-04 Thread Zhihong Yu
Hi,
In the description:

cross-partition update of partitioned tables can't use batching
because ExecInitRoutingInfo() which initializes the insert target

'which' should be dropped since 'because' should start a sentence.

+-- Check that batched inserts also works for inserts made during

inserts also works -> inserts also work

+   Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind ==
+  RELKIND_PARTITIONED_TABLE);

The level of nested field accesses is quite deep. If the assertion fails,
it would be hard to know which field is null.
Maybe use several assertions:
   Assert(node->rootResultRelInfo)
   Assert(node->rootResultRelInfo->ri_RelationDesc)
   Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind ==
...

Cheers

On Sun, Apr 4, 2021 at 8:06 AM Amit Langote  wrote:

> On Tue, Mar 16, 2021 at 6:13 PM  wrote:
> > Status updated to RfC in the commitfest app.
>
> Patch fails to apply per cfbot, so rebased.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com
>


Re: Unused variable found in AttrDefaultFetch

2021-04-04 Thread Tom Lane
Zhihong Yu  writes:
> Andrew:
> Can you let me know which thread you were referring to?

I assume he meant
https://www.postgresql.org/message-id/flat/31e2e921-7002-4c27-59f5-51f08404c858%402ndQuadrant.com

which was last added to just moments ago.

regards, tom lane




Re: ALTER TABLE ADD COLUMN fast default

2021-04-04 Thread Tom Lane
Andrew Dunstan  writes:
> On 4/3/21 10:09 PM, Tom Lane wrote:
>> Looking around at the other touches of AttrDefault.adbin in the backend
>> (of which there are not that many), some of them are prepared for it to be
>> NULL and some are not.  I don't immediately have a strong opinion whether
>> that should be an allowed state; but if it is not allowed then it's not
>> okay for AttrDefaultFetch to leave such a state behind.  Alternatively,
>> if it is allowed then equalTupleDesc is in the wrong.  We should choose
>> one definition and make all the relevant code match it.

> There's already a warning if it sets an explicit NULL value, so I'm
> inclined to say your suggestion "it's not okay for AttrDefaultFetch to
> leave such a state behind" is what we should go with.

Yeah, I came to the same conclusion after looking around a bit more.
There are two places in tupdesc.c that defend against adbin being NULL,
which seems a bit silly when their sibling equalTupleDesc does not.
Every other touch in the backend will segfault on a NULL value,
if it doesn't Assert first :-(

Another thing that annoyed me while I was looking at this is the
potentially O(N^2) behavior in equalTupleDesc due to not wanting
to assume that the AttrDefault arrays are in the same order.
It seems far smarter to have AttrDefaultFetch sort the arrays.
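For concreteness, a sketch of the comparator such a sort could use (the
attached patch may differ); with the defval array qsort'ed by adnum,
equalTupleDesc can then walk both arrays in lockstep:

    static int
    attrdef_cmp(const void *a, const void *b)
    {
        const AttrDefault *da = (const AttrDefault *) a;
        const AttrDefault *db = (const AttrDefault *) b;

        return (int) da->adnum - (int) db->adnum;
    }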

So attached is a proposed patch to clean all this up.

* Don't link the AttrDefault array into the relcache entry until
it's fully built and valid.

* elog, don't just Assert, if we fail to find an expected default
value.  (Perhaps there's a case for ereport instead.)

* Remove now-useless null checks in tupdesc.c.

* Sort the AttrDefault array, remove double loops in equalTupleDesc.

I made CheckConstraintFetch likewise not install its array until
it's done.  I notice that it is throwing elog(ERROR) not WARNING
for the equivalent cases of not finding the right number of
entries.  I wonder if we ought to back that off to a WARNING too.
elog(ERROR) during relcache entry load is kinda nasty, because
it prevents essentially *all* use of the relation.  On the other
hand, failing to enforce check constraints isn't lovely either.
Anybody have opinions about that?

regards, tom lane

diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index cb76465050..d81f6b8a60 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -174,10 +174,7 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 			cpy->defval = (AttrDefault *) palloc(cpy->num_defval * sizeof(AttrDefault));
 			memcpy(cpy->defval, constr->defval, cpy->num_defval * sizeof(AttrDefault));
 			for (i = cpy->num_defval - 1; i >= 0; i--)
-			{
-if (constr->defval[i].adbin)
-	cpy->defval[i].adbin = pstrdup(constr->defval[i].adbin);
-			}
+cpy->defval[i].adbin = pstrdup(constr->defval[i].adbin);
 		}
 
 		if (constr->missing)
@@ -328,10 +325,7 @@ FreeTupleDesc(TupleDesc tupdesc)
 			AttrDefault *attrdef = tupdesc->constr->defval;
 
 			for (i = tupdesc->constr->num_defval - 1; i >= 0; i--)
-			{
-if (attrdef[i].adbin)
-	pfree(attrdef[i].adbin);
-			}
+pfree(attrdef[i].adbin);
 			pfree(attrdef);
 		}
 		if (tupdesc->constr->missing)
@@ -412,7 +406,6 @@ bool
 equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
 {
 	int			i,
-j,
 n;
 
 	if (tupdesc1->natts != tupdesc2->natts)
@@ -488,22 +481,13 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
 		n = constr1->num_defval;
 		if (n != (int) constr2->num_defval)
 			return false;
+		/* We assume here that both AttrDefault arrays are in adnum order */
 		for (i = 0; i < n; i++)
 		{
 			AttrDefault *defval1 = constr1->defval + i;
-			AttrDefault *defval2 = constr2->defval;
-
-			/*
-			 * We can't assume that the items are always read from the system
-			 * catalogs in the same order; so use the adnum field to identify
-			 * the matching item to compare.
-			 */
-			for (j = 0; j < n; defval2++, j++)
-			{
-if (defval1->adnum == defval2->adnum)
-	break;
-			}
-			if (j >= n)
+			AttrDefault *defval2 = constr2->defval + i;
+
+			if (defval1->adnum != defval2->adnum)
 return false;
 			if (strcmp(defval1->adbin, defval2->adbin) != 0)
 return false;
@@ -534,25 +518,21 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
 		n = constr1->num_check;
 		if (n != (int) constr2->num_check)
 			return false;
+
+		/*
+		 * Similarly, we rely here on the ConstrCheck entries being sorted by
+		 * name.  If there are duplicate names, the outcome of the comparison
+		 * is uncertain, but that should not happen.
+		 */
 		for (i = 0; i < n; i++)
 		{
 			ConstrCheck *check1 = constr1->check + i;
-			ConstrCheck *check2 = constr2->check;
-
-			/*
-			 * Similarly, don't assume that the checks are always read in the
-			 * same order; match them up by name and contents. (The name
-			 * *should* be unique, but...)
-			 */
-			for (j = 0; j < n; check2++, j++)
-			{
-			

Re: Unused variable found in AttrDefaultFetch

2021-04-04 Thread Zhihong Yu
I found the recent thread under 'ALTER TABLE ADD COLUMN fast default' which
hasn't appeared in the message chain yet.

I will watch that thread.

Cheers



On Sun, Apr 4, 2021 at 8:47 AM Zhihong Yu  wrote:

> Andrew:
> Can you let me know which thread you were referring to?
>
> I navigated the thread mentioned in your commit. It has been more than 3
> years since the last response:
>
>
> https://www.postgresql.org/message-id/CAA8%3DA7-OPsGeazXxiojQNMus51odNZVn8EVNSoWZ2y9yRL%2BBvQ%40mail.gmail.com
>
> Can you let me know the current plan ?
>
> Cheers
>
> On Sun, Apr 4, 2021 at 8:13 AM Andrew Dunstan  wrote:
>
>>
>> On 4/4/21 9:39 AM, Zhihong Yu wrote:
>> > Andrew:
>> > Can you chime in which direction to go ?
>> >
>> > Once consensus is reached, I can provide a new patch, if needed.
>> >
>> > Cheers
>> >
>> >
>>
>> [ please don't top-post ]
>>
>>
>> I don't think we need a new patch. We'll clean this up one way or
>> another as part of the cleanup on the other thread.
>>
>>
>> cheers
>>
>>
>> andrew
>>
>> --
>> Andrew Dunstan
>> EDB: https://www.enterprisedb.com
>>
>>


Re: Unused variable found in AttrDefaultFetch

2021-04-04 Thread Zhihong Yu
Andrew:
Can you let me know which thread you were referring to?

I navigated the thread mentioned in your commit. It has been more than 3
years since the last response:

https://www.postgresql.org/message-id/CAA8%3DA7-OPsGeazXxiojQNMus51odNZVn8EVNSoWZ2y9yRL%2BBvQ%40mail.gmail.com

Can you let me know the current plan ?

Cheers

On Sun, Apr 4, 2021 at 8:13 AM Andrew Dunstan  wrote:

>
> On 4/4/21 9:39 AM, Zhihong Yu wrote:
> > Andrew:
> > Can you chime in which direction to go ?
> >
> > Once consensus is reached, I can provide a new patch, if needed.
> >
> > Cheers
> >
> >
>
> [ please don't top-post ]
>
>
> I don't think we need a new patch. We'll clean this up one way or
> another as part of the cleanup on the other thread.
>
>
> cheers
>
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>


possible repalloc() in icu_convert_case()

2021-04-04 Thread Anton Voloshin

Hello,

in src/backend/utils/adt/formatting.c, in icu_convert_case() I see:
if (status == U_BUFFER_OVERFLOW_ERROR)
{
/* try again with adjusted length */
pfree(*buff_dest);
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
...

Is there any reason why this should not be repalloc()?

In case it should be, I've attached a corresponding patch.

--
Anton Voloshin
Postgres Professional: https://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/utils/adt/formatting.c 
b/src/backend/utils/adt/formatting.c
index 783c7b5e7a..409067e4a0 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -1588,8 +1588,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t 
mylocale,
if (status == U_BUFFER_OVERFLOW_ERROR)
{
/* try again with adjusted length */
-   pfree(*buff_dest);
-   *buff_dest = palloc(len_dest * sizeof(**buff_dest));
+   *buff_dest = repalloc(*buff_dest, len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
mylocale->info.icu.locale, &status);


Re: ALTER TABLE ADD COLUMN fast default

2021-04-04 Thread Andrew Dunstan


On 4/4/21 9:19 AM, Justin Pryzby wrote:
> It reminded me of this thread, but nothing ever came of it.
> https://www.postgresql.org/message-id/20200328223052.GK20103%40telsasoft.com
>
>
> https://www.postgresql.org/message-id/87pmzaq4gx.fsf%40news-spur.riddles.org.uk
>
>

I don't recall seeing this. Around that time I was busy returning from
Australia at the start of the pandemic, so I might have missed it in
the hubbub.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: TRUNCATE on foreign table

2021-04-04 Thread Bharath Rupireddy
On Sun, Apr 4, 2021 at 12:00 PM Kazutaka Onishi  wrote:
> > 5) Can't we use do_sql_command function after making it non static? We
> > could go extra mile, that is we could make do_sql_command little more
> > generic by passing some enum for each of PQsendQuery,
> > PQsendQueryParams, PQsendQueryPrepared and PQsendPrepare and replace
> > the respective code chunks with do_sql_command in postgres_fdw.c.
>
> I've skipped this for now.
> I feel it sounds cool, but not easy.
> It should be added by another patch because it's not only related to TRUNCATE.

Fair enough! I will give it a thought and provide a patch separately.

> > 6) A white space error when the patch is applied.
> > contrib/postgres_fdw/postgres_fdw.c:2913: trailing whitespace.
>
> I've checked the patch and cleaned up the spaces.
> But I couldn't reproduce this message when applying (patch -p1 < ...) my v8 patch.
> If this still occurs, please tell me how you apply the patch.

I usually follow these steps:
1) write code 2) git diff --check (will give if there are any white
space or indentation errors) 3) git add -u 4) git commit (will enter a
commit message) 5) git format-patch -1 <> -v
<> 6) to apply patch, git apply <>.patch

> > 7) I may be missing something here. Why do we need a hash table at
> > all? We could just do it with a linked list right? Is there a specific
> > reason to use a hash table? IIUC, the hash table entries will be lying
> > around until the local session exits since we are not doing
> > hash_destroy.
>
> I've skipped this for now.

If you don't destroy the hash, you are going to cause a memory leak,
because the pointer to the hash table ft_htab is local to
ExecuteTruncateGuts (note that it's not a static variable) and you are
allocating memory for the hash table and leaving the function without
cleaning it up. IMHO, we should destroy the hash memory at the end of
ExecuteTruncateGuts.
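For concreteness, the kind of cleanup being requested (a sketch; ft_htab is
the patch's hash-table pointer):

    /* at the end of ExecuteTruncateGuts */
    if (ft_htab != NULL)
        hash_destroy(ft_htab);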

Another comment for tests, why do we need to do full outer join of two
tables to just show up there are some rows in the table? I would
suggest that all the tests introduced in the patch can be something
like this: 1) before truncate, just show the count(*) from the table
2) truncate the foreign tables 3) after truncate, just show the
count(*) which should be 0. Because we don't care what the data is in
the tables, we only care about whether truncate is happened or not.

+SELECT * FROM tru_ftable a FULL OUTER JOIN tru_pk_ftable b ON a.id = b.id ORDER BY a.id;
+ id |                x                 | id |                z
+----+----------------------------------+----+----------------------------------
+  1 | eccbc87e4b5ce2fe28308fd9f2a7baf3 |    |
+  2 | a87ff679a2f3e71d9181a67b7542122c |    |
+  3 | e4da3b7fbbce2345d7772b0674a318d5 |  3 | 1679091c5a880faf6fb5e6087eb1b2dc
+  4 | 1679091c5a880faf6fb5e6087eb1b2dc |  4 | 8f14e45fceea167a5a36dedd4bea2543
+  5 | 8f14e45fceea167a5a36dedd4bea2543 |  5 | c9f0f895fb98ab9159f51fd0297e236d
+  6 | c9f0f895fb98ab9159f51fd0297e236d |  6 | 45c48cce2e2d7fbdea1afc51c7c6ad26
+  7 | 45c48cce2e2d7fbdea1afc51c7c6ad26 |  7 | d3d9446802a44259755d38e6d163e820
+  8 | d3d9446802a44259755d38e6d163e820 |  8 | 6512bd43d9caa6e02c990b0a82652dca
+    |                                  |  9 | c20ad4d76fe97759aa27a0c99bff6710
+    |                                  | 10 | c51ce410c124a10e0db5e4b97fc2af39
+(10 rows)
+
+TRUNCATE tru_ftable, tru_pk_ftable CASCADE;
+SELECT * FROM tru_ftable a FULL OUTER JOIN tru_pk_ftable b ON a.id =
b.id ORDER BY a.id;  -- empty


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: TRUNCATE on foreign table

2021-04-04 Thread Kazutaka Onishi
Oops... sorry.
I haven't merged my working git branch with the remote master branch.
Please check this v11.

On Sun, Apr 4, 2021 at 23:56 Bharath Rupireddy :
>
> On Sun, Apr 4, 2021 at 12:48 PM Kazutaka Onishi  wrote:
> >
> > v9 has also typo because I haven't checked about doc... sorry.
>
> I think v9 has some changes not related to the foreign table truncate
> feature. If so, please remove those changes and provide a proper
> patch.
>
> diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
> diff --git a/src/backend/bootstrap/bootstrap.c
> b/src/backend/bootstrap/bootstrap.c
> diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
> 
> 
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com


pgsql14-truncate-on-foreign-table.v11.patch
Description: Binary data


Re: Unused variable found in AttrDefaultFetch

2021-04-04 Thread Andrew Dunstan


On 4/4/21 9:39 AM, Zhihong Yu wrote:
> Andrew:
> Can you chime in which direction to go ?
>
> Once consensus is reached, I can provide a new patch, if needed.
>
> Cheers
>
>

[ please don't top-post ]


I don't think we need a new patch. We'll clean this up one way or
another as part of the cleanup on the other thread.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: ALTER TABLE ADD COLUMN fast default

2021-04-04 Thread Andrew Dunstan


On 4/3/21 10:09 PM, Tom Lane wrote:
> Andrew Gierth  writes:
>> I just got through diagnosing a SEGV crash with someone on IRC, and the
>> cause turned out to be exactly this - a table had (for some reason we
>> could not determine within the available resources) lost its pg_attrdef
>> record for the one column it had with a default (which was a serial
>> column, so none of the fast-default code is actually implicated). 


I don't recall why the check was removed. In general the fast default
code doesn't touch adbin.


I'd be curious to know how this state came about. From the description
it sounds like something removed the pg_attrdef entry without adjusting
pg_attribute, which shouldn't happen.


>> Any
>> attempt to alter the table resulted in a crash in equalTupleDesc on this
>> line:
>> if (strcmp(defval1->adbin, defval2->adbin) != 0)
>> due to trying to compare adbin values which were NULL pointers.
> Ouch.
>
>> Does equalTupleDesc need to be more defensive about this, or does the
>> above check need to be reinstated?
> Looking around at the other touches of AttrDefault.adbin in the backend
> (of which there are not that many), some of them are prepared for it to be
> NULL and some are not.  I don't immediately have a strong opinion whether
> that should be an allowed state; but if it is not allowed then it's not
> okay for AttrDefaultFetch to leave such a state behind.  Alternatively,
> if it is allowed then equalTupleDesc is in the wrong.  We should choose
> one definition and make all the relevant code match it.
>
>   


There's already a warning if it sets an explicit NULL value, so I'm
inclined to say your suggestion "it's not okay for AttrDefaultFetch to
leave such a state behind" is what we should go with.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Allow batched insert during cross-partition updates

2021-04-04 Thread Amit Langote
On Tue, Mar 16, 2021 at 6:13 PM  wrote:
> Status updated to RfC in the commitfest app.

Patch fails to apply per cfbot, so rebased.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


v3-0001-Allow-batching-of-inserts-during-cross-partition-.patch
Description: Binary data


Re: TRUNCATE on foreign table

2021-04-04 Thread Bharath Rupireddy
On Sun, Apr 4, 2021 at 12:48 PM Kazutaka Onishi  wrote:
>
> v9 has also typo because I haven't checked about doc... sorry.

I think v9 has some changes not related to the foreign table truncate
feature. If so, please remove those changes and provide a proper
patch.

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
diff --git a/src/backend/bootstrap/bootstrap.c
b/src/backend/bootstrap/bootstrap.c
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c



With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: ModifyTable overheads in generic plans

2021-04-04 Thread Amit Langote
On Sun, Apr 4, 2021 at 10:20 AM Tom Lane  wrote:
> Amit Langote  writes:
> > On Thu, Apr 1, 2021 at 3:12 AM Tom Lane  wrote:
> >> Amit Langote  writes:
> > [ v14-0002-Initialize-result-relation-information-lazily.patch ]
> >> Needs YA rebase over 86dc90056.
>
> > Done.
>
> I spent some time looking this over.

Thanks.

> There are bits of it we can
> adopt without too much trouble, but I'm afraid that 0001 (delay
> FDW BeginDirectModify until the first actual update) is a nonstarter,
> which makes the main idea of delaying ExecInitResultRelation unworkable.
>
> My fear about 0001 is that it will destroy any hope of direct updates
> on different remote partitions executing with consistent semantics
> (i.e. compatible snapshots), because some row updates triggered by the
> local query may have already happened before a given partition gets to
> start its remote query.  Maybe we can work around that, but I do not
> want to commit a major restructuring that assumes we can dodge this
> problem when we don't yet even have a fix for cross-partition updates
> that does rely on the assumption of synchronous startup.

Hmm, okay, I can understand the concern.

> In some desultory performance testing here, it seemed like a
> significant part of the cost is ExecOpenIndices, and I don't see
> a reason offhand why we could not delay/skip that.  I also concur
> with delaying construction of ri_ChildToRootMap and the
> partition_tuple_routing data structures, since many queries will
> never need those at all.

As I mentioned in [1], creating ri_projectNew can be expensive too,
especially as column count (and partition count for the generic plan
case) grows.  I think we should have a static inline
initialize-on-first-access accessor function for that field too.
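Something like this minimal sketch (the accessor name is made up; it
assumes the PG14-era ExecInitUpdateProjection()):

    static inline ProjectionInfo *
    ExecGetNewTupleProjection(ModifyTableState *mtstate, ResultRelInfo *rri)
    {
        /* build the projection only when an UPDATE actually needs it */
        if (rri->ri_projectNew == NULL)
            ExecInitUpdateProjection(mtstate, rri);

        return rri->ri_projectNew;
    }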

Actually, I remember considering having such accessor functions (all
static inline) for ri_WithCheckOptionExprs, ri_projectReturning,
ri_onConflictArbiterIndexes, and ri_onConflict (needed by ON CONFLICT
UPDATE) as well, prompted by Heikki's comments earlier in the
discussion.  I also remember, before even writing this patch, not
liking that WCO and RETURNING expressions are initialized in their own
separate loops, rather than being part of the earlier loop that says:

/*
 * Do additional per-result-relation initialization.
 */
for (i = 0; i < nrels; i++)
{

I guess ri_RowIdAttNo initialization can go into the same loop.

> > * PartitionTupleRouting.subplan_resultrel_htab is removed in favor
> > of using ModifyTableState.mt_resultOidHash to look up an UPDATE
> > result relation by OID.
>
> Hmm, that sounds promising too, though I didn't look at the details.
>
> Anyway, I think the way to proceed for now is to grab the low-hanging
> fruit of things that clearly won't change any semantics.  But tail end
> of the dev cycle is no time to be making really fundamental changes
> in how FDW direct modify works.

I agree.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

[1] 
https://www.postgresql.org/message-id/CA+HiwqHLUNhMxy46Mrb04VJpN=hudm9bd7xdz6f5h2o4imx...@mail.gmail.com




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-04-04 Thread Julien Rouhaud
On Fri, Apr 02, 2021 at 01:33:28PM +0800, Julien Rouhaud wrote:
> On Thu, Apr 01, 2021 at 03:27:11PM -0400, Bruce Momjian wrote:
> > 
> > OK, I am happy with your design decisions, thanks.
> 
> Thanks!  While double checking I noticed that I failed to remove a (now)
> useless include of pgstat.h in nodeGatherMerge.c in last version.  I'm
> attaching v22 to fix that, no other change.

There was a conflict since e1025044c (Split backend status and progress related
functionality out of pgstat.c).

Attached v23 is a rebase against current HEAD, and I also added a few
UINT64CONST() macro usage for consistency.
From 29eda2d08f3ed38bbf443898dfad645f5d279d96 Mon Sep 17 00:00:00 2001
From: Bruce Momjian 
Date: Mon, 22 Mar 2021 17:43:22 -0400
Subject: [PATCH v23 1/3] Move pg_stat_statements query jumbling to core.

A new compute_query_id GUC is also added, to control whether a query identifier
should be computed by the core.  It's therefore now possible to disable core
queryid computation and use pg_stat_statements with a different algorithm to
compute the query identifier by using a third-party module.

To ensure that a single source of query identifier can be used and is well
defined, modules that calculate a query identifier should throw an error if
compute_query_id is enabled or if a query identifier was already calculated.
---
 .../pg_stat_statements/pg_stat_statements.c   | 805 +
 .../pg_stat_statements.conf   |   1 +
 doc/src/sgml/config.sgml  |  25 +
 doc/src/sgml/pgstatstatements.sgml|  20 +-
 src/backend/parser/analyze.c  |  14 +-
 src/backend/tcop/postgres.c   |   6 +-
 src/backend/utils/misc/Makefile   |   1 +
 src/backend/utils/misc/guc.c  |  10 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/backend/utils/misc/queryjumble.c  | 834 ++
 src/include/parser/analyze.h  |   4 +-
 src/include/utils/guc.h   |   1 +
 src/include/utils/queryjumble.h   |  58 ++
 13 files changed, 995 insertions(+), 785 deletions(-)
 create mode 100644 src/backend/utils/misc/queryjumble.c
 create mode 100644 src/include/utils/queryjumble.h

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 1141d2b067..0f8bac0cca 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -8,24 +8,9 @@
  * a shared hashtable.  (We track only as many distinct queries as will fit
  * in the designated amount of shared memory.)
  *
- * As of Postgres 9.2, this module normalizes query entries.  Normalization
- * is a process whereby similar queries, typically differing only in their
- * constants (though the exact rules are somewhat more subtle than that) are
- * recognized as equivalent, and are tracked as a single entry.  This is
- * particularly useful for non-prepared queries.
- *
- * Normalization is implemented by fingerprinting queries, selectively
- * serializing those fields of each query tree's nodes that are judged to be
- * essential to the query.  This is referred to as a query jumble.  This is
- * distinct from a regular serialization in that various extraneous
- * information is ignored as irrelevant or not essential to the query, such
- * as the collations of Vars and, most notably, the values of constants.
- *
- * This jumble is acquired at the end of parse analysis of each query, and
- * a 64-bit hash of it is stored into the query's Query.queryId field.
- * The server then copies this value around, making it available in plan
- * tree(s) generated from the query.  The executor can then use this value
- * to blame query costs on the proper queryId.
+ * Starting in Postgres 9.2, this module normalized query entries.  As of
+ * Postgres 14, the normalization is done by the core if compute_query_id is
+ * enabled, or optionally by third-party modules.
  *
  * To facilitate presenting entries to users, we create "representative" query
  * strings in which constants are replaced with parameter symbols ($n), to
@@ -116,8 +101,6 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define USAGE_DEALLOC_PERCENT	5	/* free this % of entries at once */
 #define IS_STICKY(c)	((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0)
 
-#define JUMBLE_SIZE1024	/* query serialization buffer size */
-
 /*
  * Extension version number, for supporting older extension versions' objects
  */
@@ -237,40 +220,6 @@ typedef struct pgssSharedState
 	pgssGlobalStats stats;		/* global statistics for pgss */
 } pgssSharedState;
 
-/*
- * Struct for tracking locations/lengths of constants during normalization
- */
-typedef struct pgssLocationLen
-{
-	int			location;		/* start offset in query text */
-	int			length;			/* length in bytes, or -1 to ignore */
-} pgssLocationLen;
-
-/*
- * Working state for computing a query 

Re: Unused variable found in AttrDefaultFetch

2021-04-04 Thread Zhihong Yu
Andrew:
Can you chime in which direction to go ?

Once consensus is reached, I can provide a new patch, if needed.

Cheers

On Sat, Apr 3, 2021 at 9:54 PM Michael Paquier  wrote:

> On Sun, Apr 04, 2021 at 10:13:26AM +0530, Bharath Rupireddy wrote:
> > +1 to remove it and the patch LGTM.
>
> Indeed, there is no point to keep that around.  I'll go clean up that
> as you propose.
> --
> Michael
>


Re: ALTER TABLE ADD COLUMN fast default

2021-04-04 Thread Justin Pryzby
It reminded me of this thread, but nothing ever came of it.
https://www.postgresql.org/message-id/20200328223052.GK20103%40telsasoft.com


https://www.postgresql.org/message-id/87pmzaq4gx.fsf%40news-spur.riddles.org.uk




[PATCH] typo fix in collationcmds.c: "if they are distinct"

2021-04-04 Thread Anton Voloshin

Hello,

just a quick patch for a single-letter typo in a comment
in src/backend/commands/collationcmds.c
...
 * set of language+region combinations, whereas the latter only returns
-* language+region combinations of they are distinct from the language's
+* language+region combinations if they are distinct from the language's
 * base collation.  So there might not be a de-DE or en-GB, which 
would be

...
(please see the attached patch).

--
Anton Voloshin
Postgres Professional: https://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/commands/collationcmds.c 
b/src/backend/commands/collationcmds.c
index 55a0e24a35..b8ec6f5756 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -577,7 +577,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 * We use uloc_countAvailable()/uloc_getAvailable() rather than
 * ucol_countAvailable()/ucol_getAvailable().  The former returns a full
 * set of language+region combinations, whereas the latter only returns
-* language+region combinations of they are distinct from the language's
+* language+region combinations if they are distinct from the language's
 * base collation.  So there might not be a de-DE or en-GB, which would be
 * confusing.
 */


Re: Additional Chapter for Tutorial - arch-dev.sgml

2021-04-04 Thread Jürgen Purtz

On 03.04.21 21:01, Alvaro Herrera wrote:
> On 2021-Apr-03, Jürgen Purtz wrote:
>
>> On 03.04.21 15:39, Alvaro Herrera wrote:
>>
>>> Yes, there is.  AFAICS Heikki committed a small wordsmithing patch --
>>> not the large patch with the additional chapter.
>>
>> What can I do to move the matter forward?
>
> Please post a version that applies to the current sources.  If the
> latest version posted does, please state so.

The small patch 'arch-dev.sgml.20210121.diff' contains only some cleanup of
the terminology used and its alignment with the glossary. That patch was
rejected by Heikki.


The latest version of the huge patch '0013-architecture.patch' is still
valid and applies without merge conflicts.


--
Jürgen Purtz





Re: proposal: schema variables

2021-04-04 Thread Pavel Stehule
Hi

On Sat, Mar 13, 2021 at 7:01 AM, Pavel Stehule wrote:

> Hi
>
> fresh rebase
>

Only a rebase.

Regards

Pavel


>
> Pavel
>


schema-variables-20210404.patch.gz
Description: application/gzip


Re: simplifying foreign key/RI checks

2021-04-04 Thread Amit Langote
On Fri, Apr 2, 2021 at 11:55 PM Zhihong Yu  wrote:
>
> Hi,
>
> +   skip = !ExecLockTableTuple(erm->relation, &tid, markSlot,
> +  estate->es_snapshot, estate->es_output_cid,
> +  lockmode, erm->waitPolicy, &epq_needed);
> +   if (skip)
>
> It seems the variable skip is only used above. The variable is not needed - 
> the if statement can directly check the return value (see the sketch after 
> this quoted review).
>
> + * Locks tuple with given TID with given lockmode following given 
> wait
>
> given appears three times in the above sentence. Maybe the following is a 
> bit easier to read:
>
> Locks a tuple with the specified TID and lockmode, following the given wait policy
>
> + * Checks whether a tuple containing the same unique key as extracted from 
> the
> + * tuple provided in 'slot' exists in 'pk_rel'.
>
> I think 'same' is not needed here since the remaining part of the sentence 
> has adequately identified the key.
>
> +   if (leaf_pk_rel == NULL)
> +   goto done;
>
> It would be better to avoid the goto by including the cleanup statements in
> the if block and returning.
>
> +   if (index_getnext_slot(scan, ForwardScanDirection, outslot))
> +   found = true;
> +
> +   /* Found tuple, try to lock it in key share mode. */
> +   if (found)
>
> Since found is only assigned in one place, the two if statements can be 
> combined into one.
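
To illustrate the review's first and "goto" points, here is a minimal sketch
of the suggested shape (the identifiers come from the quoted snippet; the
surrounding loop, the cleanup, and the &tid/&epq_needed arguments restored
above are best guesses, since the archive mangled the original call):

    /* Test the lock result directly instead of storing it in "skip" */
    if (!ExecLockTableTuple(erm->relation, &tid, markSlot,
                            estate->es_snapshot, estate->es_output_cid,
                            lockmode, erm->waitPolicy, &epq_needed))
        continue;           /* tuple not lockable here: skip it */

    /* Return early with inline cleanup instead of "goto done" */
    if (leaf_pk_rel == NULL)
    {
        /* perform the same cleanup the "done" label would, then bail out */
        return false;
    }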

Thanks for taking a look.  I agree with most of your suggestions and
have incorporated them in the v8 just posted.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: simplifying foreign key/RI checks

2021-04-04 Thread Amit Langote
Hi Alvaro,

On Sat, Apr 3, 2021 at 12:01 AM Alvaro Herrera  wrote:
> On 2021-Apr-02, Amit Langote wrote:
>
> > On Sat, Mar 20, 2021 at 10:21 PM Amit Langote  
> > wrote:
> > > Updated patches attached.  Sorry about the delay.
> >
> > Rebased over the recent DETACH PARTITION CONCURRENTLY work.
> > Apparently, ri_ReferencedKeyExists() was using the wrong snapshot.
>
> Hmm, I wonder if that stuff should be using a PartitionDirectory?  (I
> didn't actually understand what your code is doing, so please forgive if
> this is a silly question.)

No problem, I wondered about that too when rebasing.

My instinct *was* that there's no need for it, because
find_leaf_pk_rel()'s use of a PartitionDesc is limited enough in
duration, and in the scope of what it calls, that there's no need to
worry about it getting invalidated while in use.  But I may be wrong
about that, because get_partition_for_tuple() can call arbitrary
user-defined functions, which may result in invalidation messages
being processed and an unguarded PartitionDesc getting wiped out under
us.

So, I've added PartitionDirectory protection in find_leaf_pk_rel() in
the attached updated version.
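
For context, the PartitionDirectory pattern being referred to looks roughly
like this (an illustrative sketch, not the patch's code; partitioned_rel is a
hypothetical Relation, and the two-argument CreatePartitionDirectory reflects
the signature after the DETACH CONCURRENTLY work):

    PartitionDirectory pdir;
    PartitionDesc pdesc;

    /* The directory pins the descriptors it hands out for its lifetime */
    pdir = CreatePartitionDirectory(CurrentMemoryContext, true);

    /*
     * pdesc stays valid even if invalidation messages are processed
     * meanwhile, e.g. by user-defined functions reached through
     * get_partition_for_tuple(); it cannot be wiped out under us
     * while pdir lives.
     */
    pdesc = PartitionDirectoryLookup(pdir, partitioned_rel);

    /* ... route the tuple to a leaf partition using pdesc ... */

    DestroyPartitionDirectory(pdir);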

--
Amit Langote
EDB: http://www.enterprisedb.com


v8-0001-Export-get_partition_for_tuple.patch
Description: Binary data


v8-0002-Avoid-using-SPI-for-some-RI-checks.patch
Description: Binary data


Re: [PATCH] Implement motd for PostgreSQL

2021-04-04 Thread Joel Jacobson
On Sun, Apr 4, 2021, at 09:16, Fabien COELHO wrote:
> That is, if those other cases are indeed pointers. For pgbench, all direct 
> "if (xxx &&" cases are simple booleans or integers; pointers seem to use 
> "!= NULL". Looking quickly at the grep output, ISTM that the obvious 
> pointers use "!= NULL" and the other cases often look like booleans:
> 
>catalog/pg_operator.c:  if (isDelete && t->oprcom == baseId)
>catalog/toasting.c: if (check && lockmode != AccessExclusiveLock)
>commands/async.c:   if (amRegisteredListener && listenChannels == NIL)
>commands/explain.c: if (es->analyze && es->timing)
>…
> 
> I'm sure there are exceptions, but ISTM that the local style is "!= NULL".

Many thanks for explaining.

> 
> >> I'm wondering whether it should be possible to designate (1) a file the
> >> content of which would be shown, or (2) a command, the output of which
> >> would be shown [ok, there might be security implications on this one].
> >
> > Can't we just do that via plpgsql and EXECUTE somehow?
> 
> Hmmm.
> 
> Should we really want to forcibly execute some PL/pgSQL on every new connection? 

Oh, of course: you want the command to be executed for each new connection.

My idea was to use PL/pgSQL only when you wanted to update the stored motd
value, but of course, if you want a fresh value from the command for each new
connection, that doesn't work (and it doesn't work anyway, since ALTER SYSTEM
cannot be executed from functions).

> Not sure this is really desirable. I was thinking of something more 
> trivial, like letting the "motd" directive designate a file instead of the 
> message itself.
> 
> There could be a hook system to execute some user code on new connections 
> and other events. It could be a new kind of event trigger, e.g. with 
> connection_start, connection_end… That could be nice for other purposes,
> e.g. auditing. Now, event triggers are not global; they work inside a 
> database, which suggests that if extended, a new connection event would 
> be fired per database connection, not just once per connection. Not sure 
> that would be a bad thing.

Such a hook sounds like a good idea.
If we had such a hook, then another possibility would be to implement 
motd as an extension, right?

/Joel
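
(To make the extension idea concrete: there is no connection event trigger
today, but a rough, untested sketch could approximate one with the existing
ClientAuthentication_hook. The GUC name motd.message and the NOTICE delivery
below are assumptions for illustration, not the proposal's design.)

    #include "postgres.h"
    #include "fmgr.h"
    #include "libpq/auth.h"
    #include "utils/guc.h"

    PG_MODULE_MAGIC;

    static char *motd_message = NULL;
    static ClientAuthentication_hook_type prev_client_auth_hook = NULL;

    /* Send the configured message as a NOTICE once authentication succeeds */
    static void
    motd_client_auth(Port *port, int status)
    {
        if (prev_client_auth_hook)
            prev_client_auth_hook(port, status);

        if (status == STATUS_OK && motd_message && motd_message[0] != '\0')
            ereport(NOTICE, (errmsg("%s", motd_message)));
    }

    void
    _PG_init(void)
    {
        DefineCustomStringVariable("motd.message",
                                   "Message shown to clients on connection.",
                                   NULL,
                                   &motd_message,
                                   "",
                                   PGC_SIGHUP,
                                   0,
                                   NULL, NULL, NULL);

        prev_client_auth_hook = ClientAuthentication_hook;
        ClientAuthentication_hook = motd_client_auth;
    }

Whether a NOTICE sent at authentication time is an acceptable delivery
channel is exactly the sort of question the proposed hook or event-trigger
design would have to settle.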

Re: [PATCH] Implement motd for PostgreSQL

2021-04-04 Thread Fabien COELHO


Hello Joel,

My 0.02€:


>> If such a feature gets considered, I'm not sure I'd like to actually edit
>> the pg configuration file to change the message.
>
> For the ALTER SYSTEM case, the value would be written to
> postgresql.auto.conf, and that file we shouldn't edit manually anyway,
> right?


Sure. I meant change the configuration in any way, through manual editing, 
alter system, whatever.



>> The actual source looks pretty straightforward. I'm wondering whether pg
>> style would suggest writing motd != NULL instead of just motd.
>
> That's what I had originally, but when reviewing my code for style, I
> noticed the other form is more common:


That is, if those other cases are indeed pointers. For pgbench, all direct 
"if (xxx &&" cases are simple booleans or integers; pointers seem to use 
"!= NULL". Looking quickly at the grep output, ISTM that the obvious 
pointers use "!= NULL" and the other cases often look like booleans:


  catalog/pg_operator.c:  if (isDelete && t->oprcom == baseId)
  catalog/toasting.c: if (check && lockmode != AccessExclusiveLock)
  commands/async.c:   if (amRegisteredListener && listenChannels == NIL)
  commands/explain.c: if (es->analyze && es->timing)
  …

I'm sure there are exceptions, but ISTM that the local style is "!= NULL".
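
To make the two styles concrete, a trivial standalone C illustration (the
motd variable here is hypothetical, not taken from the patch):

    #include <stdio.h>

    int
    main(void)
    {
        const char *motd = "Welcome!";  /* hypothetical pointer */

        if (motd != NULL)               /* explicit test: the local pg style */
            puts(motd);

        if (motd)                       /* terse form; reads like a boolean test */
            puts(motd);

        return 0;
    }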


>> I'm wondering whether it should be possible to designate (1) a file the
>> content of which would be shown, or (2) a command, the output of which
>> would be shown [ok, there might be security implications on this one].
>
> Can't we just do that via plpgsql and EXECUTE somehow?


Hmmm.

Should we really want to forcibly execute some PL/pgSQL on every new 
connection? Not sure this is really desirable. I was thinking of something 
more trivial, like letting the "motd" directive designate a file instead of 
the message itself.


There could be a hook system to execute some user code on new connections 
and other events. It could be a new kind of event trigger, e.g. with 
connection_start, connection_end… That could be nice for other purposes,
e.g. auditing. Now, event triggers are not global; they work inside a 
database, which suggests that if extended, a new connection event would 
be fired per database connection, not just once per connection. Not sure 
that would be a bad thing.


--
Fabien.

Re: [PATCH] Implement motd for PostgreSQL

2021-04-04 Thread Joel Jacobson
On Sun, Apr 4, 2021, at 07:55, Joel Jacobson wrote:
> On Sat, Apr 3, 2021, at 17:50, Fabien COELHO wrote:
>> I'm wondering whether it should be possible to designate (1) a file the 
>> content of which would be shown, or (2) a command, the output of which 
>> would be shown [ok, there might be security implications on this one].
> 
> Can't we just do that via plpgsql and EXECUTE somehow?

Right, we can't, since:

ERROR:  ALTER SYSTEM cannot be executed from a function

I wrongly thought a PROCEDURE would work, but it gives the same error.

/Joel