Re: Initial Schema Sync for Logical Replication

2023-03-20 Thread Euler Taveira
On Mon, Mar 20, 2023, at 10:10 PM, Kumar, Sachin wrote:
> > From: Alvaro Herrera 
> > Subject: RE: [EXTERNAL]Initial Schema Sync for Logical Replication
> > On 2023-Mar-15, Kumar, Sachin wrote:
> > 
> > > 1. In CreateSubscription(), when we create the replication
> > > slot (walrcv_create_slot()), we should use CRS_EXPORT_SNAPSHOT so that we
> > can use this snapshot later in pg_dump.
> > >
> > > 2. Now we can call pg_dump with the above snapshot from CreateSubscription.
> > 
> > Overall I'm not on board with the idea that logical replication would 
> > depend on
> > pg_dump; that seems like it could run into all sorts of trouble (what if 
> > calling
> > external binaries requires additional security setup?  what about pg_hba
> > connection requirements? what about max_connections in tight
> > circumstances?).
> > what if calling external binaries requires additional security setup
> I am not sure what kind of security restriction would apply in this case;
> maybe the pg_dump binary can be changed?
Using pg_dump as part of this implementation is not acceptable because we
expect the backend to be decoupled from the client. Besides that, pg_dump
provides all table dependencies (such as tablespaces, privileges, security
labels, comments), and not all dependencies should be replicated. You would
have to exclude them by removing these objects from the TOC before running
pg_restore or by adding a few pg_dump options to exclude these objects. Another
issue is related to different versions. Let's say the publisher is on a version
ahead of the subscriber; new table syntax can easily break your logical
replication setup. IMO pg_dump doesn't seem like a good solution for initial
synchronization.
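
To make the version issue concrete, a hedged example (the feature choice is
just an illustration): generated columns were added in v12, so a dump taken
from a newer publisher cannot be restored on a v11 subscriber.

-- On a v12+ publisher:
CREATE TABLE t (a int, b int GENERATED ALWAYS AS (a * 2) STORED);
-- Restoring the dumped CREATE TABLE on a v11 subscriber fails with a
-- syntax error, breaking the logical replication setup.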

Instead, the backend should provide infrastructure to obtain the required DDL
commands for the specific (set of) tables. This can work around the issues from
the previous paragraph:

* you can selectively choose dependencies;
* it doesn't require additional client packages;
* it doesn't need to worry about different versions.

This infrastructure can also be useful for other use cases such as:

* client tools that provide create commands (such as psql, pgAdmin);
* other logical replication solutions;
* other logical backup solutions.
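
As a sketch, such an interface could be exposed as an SQL-callable function
(the name and signature below are hypothetical; no such function exists):

-- hypothetical: return the DDL command(s) that recreate a relation,
-- optionally including selected dependencies
SELECT pg_get_relation_ddl('public.mytable'::regclass);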


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Allow logical replication to copy tables in binary format

2023-03-15 Thread Euler Taveira
On Wed, Mar 8, 2023, at 11:50 PM, Amit Kapila wrote:
> On Wed, Mar 8, 2023 at 6:17 PM Melih Mutlu  wrote:
> >
> > On 7 Mar 2023 Tue at 04:10 Amit Kapila  wrote:
> >>
> >> As per what I could read in this thread, most people prefer to use the
> >> existing binary option rather than inventing a new way (option) to
> >> binary copy in the initial sync phase. Do you agree?
> >
> >
> > I agree.
> > What do you think about the version checks? I removed any kind of check
> > since it's currently a different option. Should we check the publisher version
> > before doing binary copy to ensure that the publisher node supports the
> > binary option of the COPY command?
> >
> 
> It is not clear to me which version check you wanted to add because we
> seem to have a binary option in COPY from the time prior to logical
> replication. I feel we need a publisher version 14 check as that is
> where we start to support binary mode transfer in logical replication.
> See the check in function libpqrcv_startstreaming().
... then you are breaking existing cases. Even if you have a convincing
argument, you are introducing a behavior change in prior versions (commit
messages should always indicate that you are breaking backward compatibility).

+
+   /*
+    * The binary option for replication is supported since v14.
+    */
+   if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 14 &&
+       MySubscription->binary)
+   {
+       appendStringInfo(&cmd, " WITH (FORMAT binary)");
+       options = lappend(options, makeDefElem("format", (Node *) makeString("binary"), -1));
+   }
+

What are the arguments for supporting this since v14 instead of the
to-be-released version? I read the thread but it is not clear. Something was
said about the restrictive nature of this feature, and it will be frustrating
to see that the same setup (with the same commands) works with v14 and v15 but
doesn't with v16. IMO it should be >= 16, and the documentation should explain
that v14/v15 use the text format during initial table synchronization even if
binary = true.

Should there be a fallback mode (text) if initial table synchronization fails
because of the binary option? Maybe a different setting (auto) could support
such behavior.
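
A sketch of what that could look like (the auto value is only a suggestion
from this thread, not implemented syntax):

-- hypothetical: request binary COPY but fall back to text format if the
-- publisher does not support it
CREATE SUBSCRIPTION sub CONNECTION 'host=pub dbname=postgres'
    PUBLICATION pub WITH (binary = auto);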


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: zstd compression for pg_dump

2023-02-25 Thread Euler Taveira
On Sat, Feb 25, 2023, at 7:31 AM, Tomas Vondra wrote:
> On 2/24/23 20:18, Justin Pryzby wrote:
> > This is a draft patch - review is welcome and would help to get this
> > ready to be considered for v16, if desired.
> > 
> > I'm going to add this thread to the old CF entry.
> > https://commitfest.postgresql.org/31/2888/
> > 
> 
> Thanks. Sadly cfbot is unhappy - the windows and cplusplus builds failed
> because of some issue in pg_backup_archiver.h. But it's a bit bizarre
> because the patch does not modify that file at all ...
cpluspluscheck says

# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.

Hence, the patch should exclude the new header file from it.

--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -153,6 +153,7 @@ do
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
test "$f" = src/bin/pg_dump/compress_lz4.h && continue
+   test "$f" = src/bin/pg_dump/compress_zstd.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-01-23 Thread Euler Taveira
ious code because it provides a unit for external representation. I
understand that using the same representation as recovery_min_apply_delay is
good, but the current code does not handle the external representation
accordingly. (recovery_min_apply_delay uses the GUC machinery to add the unit,
but min_apply_delay doesn't.)
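
In other words, with the proposed parameter both commands below should mean
the same delay (a sketch, assuming milliseconds become the default unit):

ALTER SUBSCRIPTION sub SET (min_apply_delay = 5000);   -- 5000 ms
ALTER SUBSCRIPTION sub SET (min_apply_delay = '5s');   -- also 5000 ms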

# Setup for streaming case
-$node_publisher->append_conf('postgres.conf',
+$node_publisher->append_conf('postgresql.conf',
'logical_decoding_mode = immediate');
$node_publisher->reload;

Fix configuration file name.

Maybe tests should do a better job. I think check_apply_delay_time is fragile
because it does not guarantee that time is not shifted. Time-delayed
replication is a subscriber feature and to check its correctness it should
check the logs.

# Note that we cannot call check_apply_delay_log() here because there is a
# possibility that the delay is skipped. The event happens when the WAL
# replication between publisher and subscriber is delayed due to a mechanical
# problem. The log output will be checked later - substantial delay-time case.

If you cannot use the logs for it, the test should adjust min_apply_delay, no?

The tests do not exercise min_apply_delay combined with parallel streaming mode.

+   /*
+    * The combination of parallel streaming mode and
+    * min_apply_delay is not allowed.
+    */
+   if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+       if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+           (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+           ereport(ERROR,
+                   errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                   errmsg("cannot enable %s mode for subscription with %s",
+                          "streaming = parallel", "min_apply_delay"));
+

Is this code correct? I also didn't like this message. "cannot enable streaming
= parallel mode for subscription with min_apply_delay" is far from a good error
message. How about referring to parallelism as "parallel streaming mode"?


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 5024325284ee3b4a4dc0a6a1cc6457ed5608cb46 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 23 Jan 2023 15:52:55 -0300
Subject: [PATCH] Euler's review

---
 doc/src/sgml/catalogs.sgml |   2 +-
 doc/src/sgml/config.sgml   |  20 ++-
 doc/src/sgml/logical-replication.sgml  |   7 +-
 doc/src/sgml/ref/create_subscription.sgml  |  13 +-
 src/backend/commands/subscriptioncmds.c|  13 +-
 src/backend/replication/logical/worker.c   |  40 +++---
 src/bin/psql/describe.c|   2 +-
 src/test/regress/expected/subscription.out | 160 ++---
 src/test/subscription/t/032_apply_delay.pl |   8 +-
 9 files changed, 133 insertions(+), 132 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index bf3c05241c..0bdb683296 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7878,7 +7878,7 @@ SCRAM-SHA-256$iteration count:
subminapplydelay int8
   
   
-   The length of time (ms) to delay the application of changes.
+   Total time spent delaying the application of changes, in milliseconds
   
  
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 39244bf64a..a15723d74f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4788,17 +4788,15 @@ ANY num_sync (  for details.
+   For time-delayed logical replication, the apply worker sends a feedback
+   message to the publisher every
+   wal_receiver_status_interval milliseconds. Make sure
+   to set wal_receiver_status_interval less than the
+   wal_sender_timeout on the publisher, otherwise, the
+   walsender will repeatedly terminate due to timeout
+   error. If wal_receiver_status_interval is set to
+   zero, the apply worker doesn't send any feedback messages during the
+   min_apply_delay interval.
   
   
  
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 863af11a47..d8ae93f88d 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -248,10 +248,9 @@
   
 
   
-   The subscriber replication can be instructed to lag behind the publisher
-   side changes by specifying the min_apply_delay
-   subscription parameter. See  for
-   details.
+   A logical replication subscription can delay the application of changes by
+   specifying the min_apply_delay subscription parameter.
+   See  for details.
   
 
   
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subsc

Re: is_superuser is not documented

2022-09-09 Thread Euler Taveira
On Fri, Sep 9, 2022, at 2:28 AM, bt22kawamotok wrote:
> The is_superuser function checks whether a user is a superuser or not, and 
> is commonly used. However, is_superuser is not documented and is set to 
> UNGROUPED in guc.c. I think is_superuser should be added to the 
> documentation and set to PRESET OPTIONS. What are your thoughts on this?
There is no such function. Are you referring to the GUC? I agree that it should
be added to the documentation. The main reason is that it is reported by
ParameterStatus along with some other GUCs described in the Preset Options
section.

postgres=# \dfS is_superuser
                        List of functions
 Schema | Name | Result data type | Argument data types | Type
--------+------+------------------+---------------------+------
(0 rows)

postgres=# SHOW is_superuser;
 is_superuser
--------------
 on
(1 row)

Do you mind writing a patch?


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: including pid's for `There are XX other sessions using the database`

2022-08-19 Thread Euler Taveira
On Fri, Aug 19, 2022, at 2:10 PM, Zhihong Yu wrote:
> I want to poll the community on whether including proc->pid's in the error 
> message would be useful for troubleshooting.
Such a message is only useful as a parameter for a pg_stat_activity query. You
don't need the PID list if you already have the most important piece of
information: the database name. I don't think revealing the current session
PIDs from the database you want to drop will buy you anything. It could be a
long list, and it does not help you to solve the issue: why wasn't that
database removed?
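
For example, the database name alone already lets you inspect the sessions
that block the drop:

SELECT pid, usename, application_name, backend_start
FROM pg_stat_activity
WHERE datname = 'foo';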
 
Besides that, if you know that there is a possibility that a connection is
open, you can always use the FORCE option. The old/other alternative is to use
a query like
 
   SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'foo';
 
(possibly combined with a REVOKE CONNECT or pg_hba.conf modification) before
executing DROP DATABASE.
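
For completeness, a sketch of both alternatives (FORCE requires v13 or later):

-- terminate the remaining sessions automatically:
DROP DATABASE foo WITH (FORCE);

-- or block new connections before terminating the existing ones:
REVOKE CONNECT ON DATABASE foo FROM PUBLIC;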


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: fix typos

2022-08-12 Thread Euler Taveira
On Fri, Aug 12, 2022, at 3:59 AM, John Naylor wrote:
> This is really a straw-man proposal, since I'm not volunteering to do
> the work, or suggest anybody else should do the same. That being the
> case, it seems we should just go ahead with Justin's patch for
> consistency. Possibly we could also change the messages to say "ID"?
... or say

could not drop replication origin %u, in use by PID %d

AFAICS there is no "with ID" but there is "with identifier". I personally
prefer to omit these additional words; it seems clear without them.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Allow logical replication to copy tables in binary format

2022-08-11 Thread Euler Taveira
On Thu, Aug 11, 2022, at 10:46 AM, Melih Mutlu wrote:
> If such restrictions are already the case for replication phase after initial 
> table sync, then it shouldn't prevent us from enabling binary option for 
> table sync. Right?
I didn't carefully examine the COPY code but I don't expect significant
differences (related to text vs binary mode) from the logical replication
protocol. After inspecting the git history, I take my argument back: the
initial commit 9de77b54531 imposed some restrictions (user-defined arrays and
composite types), as mentioned in its commit message, but commit 670c0a1d474
removed them. My main concern is to break a scenario that was previously
working (14 -> 15) but won't work after a subscriber upgrade (14 -> 16). I
would say that you should test some scenarios: 014_binary.pl and also custom
data types, same column with different data types, etc.
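
One of those scenarios, sketched (binary COPY is stricter about type matching
than text mode, so the subscriber may reject the data):

-- publisher:
CREATE TABLE t (a int);
-- subscriber:
CREATE TABLE t (a bigint);
-- the text format converts int to bigint via the textual representation;
-- the binary format sends int4 data that the bigint recv function rejects.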

> How is any issue that might occur due to version mismatch being handled right
> now in replication after table sync?
> What I understand from the documentation is if replication can fail due to 
> using different pg versions, it just fails. So binary option cannot be used. 
> [1]
> Do you think that this is more serious for table sync and we need to restrict 
> binary option with different publisher and subscriber versions? But not for 
> replication?
It is a conservative argument. Since we didn't allow a publisher to run COPY in
binary mode on previous Postgres versions, we know that it works. (At
least there aren't bug reports for logical replication using the binary option.)
Since one of the main use cases for logical replication is migration, I'm
concerned that it may not work (even if the binary option defaults to false,
someone can decide to use it for performance reasons).

I did a quick test and the failure while using binary mode is not clear. Since
you are modifying this code, you could probably provide additional patch(es) to
make it clear that there is an error (due to some documented restriction).


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Allow logical replication to copy tables in binary format

2022-08-11 Thread Euler Taveira
On Thu, Aug 11, 2022, at 8:04 AM, Amit Kapila wrote:
> On Thu, Aug 11, 2022 at 7:34 AM Euler Taveira  wrote:
> >
> > The reason to use the text format is that the binary format is error
> > prone. There are restrictions while using the binary format. For example,
> > if your schema has different data types for a certain column, the copy
> > will fail.
> >
> 
> Won't such restrictions hold true even during replication?
I expect that the COPY code matches the proto.c code. The point is that table
sync is decoupled from logical replication. Hence, we should emphasize in
the documentation that the restrictions *also* apply to the initial table
synchronization.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Allow logical replication to copy tables in binary format

2022-08-10 Thread Euler Taveira
On Wed, Aug 10, 2022, at 12:03 PM, Melih Mutlu wrote:
> I see that logical replication subscriptions have an option to enable binary 
> [1]. 
> When it's enabled, subscription requests publisher to send data in binary 
> format. 
> But this is only the case for apply phase. In tablesync, tables are still 
> copied as text.
This option could have been included in the commit 9de77b54531; it wasn't.
Maybe it wasn't considered because the initial table synchronization can be a
separate step in your logical replication setup idk. I agree that the binary
option should be available for the initial table synchronization.

> To copy tables, COPY command is used and that command supports copying in 
> binary. So it seemed to me possible to copy in binary for tablesync too.
> I'm not sure if there is a reason to always copy tables in text format. But I 
> couldn't see why not to do it in binary if it's enabled.
The reason to use the text format is that the binary format is error prone.
There are restrictions while using the binary format. For example, if your
schema has different data types for a certain column, the copy will fail. Even
with such restrictions, I think it is worth adding it.

> You can find the small patch that only enables binary copy attached.  
I have a few points about your implementation.

* Are we considering supporting prior Postgres versions too? These releases
  support binary mode but it could be an unexpected behavior (initial sync in
  binary mode) for a publisher using 14 or 15 and a subscriber using 16. IMO
  you should only allow it for publishers on 16 or later.
* Docs should say that the binary option also applies to initial table
  synchronization and possibly emphasize some of the restrictions.
* Tests. Are the current tests enough? 014_binary.pl.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical replication restrictions

2022-08-10 Thread Euler Taveira
On Wed, Aug 10, 2022, at 9:39 AM, osumi.takami...@fujitsu.com wrote:
> Minor review comments for v6.
Thanks for your review. I'm attaching v7.

> "If the subscriber sets min_apply_delay parameter, ..."
> 
> I suggest we use subscription rather than subscriber, because
> this parameter refers to and is used for one subscription.
> My suggestion is
> "If one subscription sets min_apply_delay parameter, ..."
> In case if you agree, there are other places to apply this change.
I changed the terminology to subscription. I also checked other "subscriber"
occurrences but I don't think they should be changed. Some of them are used as
a publisher/subscriber pair. If you think there is another sentence to
consider, point it out.

> It might be better to write a note for committer
> like "Bump catalog version" at the bottom of the commit message.
It is a committer task to bump the catalog number. IMO it is easy to notice
(using a git hook?) that it must be bumped when the catalog is modified.
AFAICS there is no recommendation to add such a warning.

> The former interprets input number as milliseconds in case of no units,
> while the latter takes it as seconds without units.
> I feel it would be better to make them aligned.
In a previous version I decided not to add code to attach a unit when there
isn't one. Instead, I changed the documentation to reflect what interval_in
uses (seconds as the unit). On reflection, let's use ms as the default unit if
the user doesn't specify one.

I fixed all the other suggestions too.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From a28987c8adb70d6932558f5e39f9dd4c55223a30 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v7] Time-delayed logical replication subscriber

Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).

If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.

The delay is calculated between the WAL time stamp and the current time
on the subscriber.

The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.

Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=xtyaxkih5ctm_vm8kjkh7aaitckvmch4rzr...@mail.gmail.com
---
 doc/src/sgml/catalogs.sgml |  10 ++
 doc/src/sgml/ref/alter_subscription.sgml   |   5 +-
 doc/src/sgml/ref/create_subscription.sgml  |  43 +-
 src/backend/catalog/pg_subscription.c  |   1 +
 src/backend/catalog/system_views.sql   |   7 +-
 src/backend/commands/subscriptioncmds.c|  59 +++-
 src/backend/replication/logical/worker.c   | 100 +
 src/backend/utils/adt/timestamp.c  |  32 
 src/bin/pg_dump/pg_dump.c  |  17 ++-
 src/bin/pg_dump/pg_dump.h  |   1 +
 src/bin/psql/describe.c|   9 +-
 src/bin/psql/tab-complete.c|   4 +-
 src/include/catalog/pg_subscription.h  |   3 +
 src/include/datatype/timestamp.h   |   2 +
 src/include/utils/timestamp.h  |   2 +
 src/test/regress/expected/subscription.out | 165 -
 src/test/regress/sql/subscription.sql  |  20 +++
 src/test/subscription/t/032_apply_delay.pl | 129 
 18 files changed, 526 insertions(+), 83 deletions(-)
 create mode 100644 src/test/subscription/t/032_apply_delay.pl

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index cd2cc37aeb..291ebdafad 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7833,6 +7833,16 @@ SCRAM-SHA-256$iteration count:
   
  
 
+ 
+  
+   subapplydelay int8
+  
+  
+   Application delay of changes by a specified amount of time. The
+   unit is in milliseconds.
+  
+ 
+
  
   
subname name
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 64efc21f53..8901e1361c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -208,8 +208,9 @@ ALTER SUBSCRIPTION name RENAME TO <
   are slot_name,
   synchronous_commit,
   binary, streaming,
-  disable_on_error, and
-  origin.
+  disable_on_error,
+  origin, and
+  min_apply_delay.
  
 

diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 7390c715bc..a794c07042 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -317,7 +317,36 @@ CREATE SUBSCRIPTION subscri

Re: logical replication restrictions

2022-08-08 Thread Euler Taveira
On Wed, Aug 3, 2022, at 10:27 AM, Amit Kapila wrote:
> Your explanation makes sense to me. The other point to consider is
> that there can be cases where we may not apply operation for the
> transaction because of empty transactions (we don't yet skip empty
> xacts for prepared transactions). So, won't it be better to apply the
> delay just before we apply the first change for a transaction? Do we
> want to apply the delay during table sync as we sometimes do need to
> enter apply phase while doing table sync?
I thought about the empty transactions but decided not to complicate the code,
mainly because skipping transactions is not a code path that will slow down
this feature. As explained in the documentation, there is no harm in delaying a
transaction for more than min_apply_delay; it cannot be applied earlier. Having
said that, I decided to do nothing. I'm also not sure if it deserves a comment
or if this email is a possible explanation for this decision.

Regarding the table sync that was mentioned by Melih, I sent a new version (v6)
that fixes this oversight. The current logical replication worker design makes
it difficult to apply the delay in the catchup phase; tablesync workers are not
closed as soon as the COPY finishes (which means possibly running out of
workers sooner). After all tablesync workers have reached the READY state, the
apply delay is activated. The documentation was correct; the code wasn't.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical replication restrictions

2022-08-08 Thread Euler Taveira
On Wed, Jul 13, 2022, at 2:34 PM, Melih Mutlu wrote:

[Sorry for the delay...]

> 22. src/test/subscription/t/032_apply_delay.pl
>> 
>> I received the following error when trying to run these 'subscription' tests:
>> 
>> t/032_apply_delay.pl ... No such class log_location at
>> t/032_apply_delay.pl line 49, near "my log_location"
>> syntax error at t/032_apply_delay.pl line 49, near "my log_location ="
> 
> I'm having these errors too. Seems like some declarations are missing.
Fixed in v5.

> 
>>> +  specified amount of time. If this value is specified without 
>>> units,
>>> +  it is taken as milliseconds. The default is zero, adding no 
>>> delay.
>>> + 
> I'm also having an issue when I give min_apply_delay parameter without units.
> I expect that if I set  min_apply_delay to 5000 (without any unit), it will 
> be interpreted as 5000 ms.
Good catch. I fixed it in v5.

> 
> Lastly, I have a question about this delay during tablesync. 
> It's stated here that apply delays are not for initial tablesync.
> 
>>>  
>>> +  The delay occurs only on WAL records for transaction begins and 
>>> after
>>> +  the initial table synchronization. It is possible that the
>>> +  replication delay between publisher and subscriber exceeds the 
>>> value
>>> +  of this parameter, in which case no delay is added. Note that the
>>> +  delay is calculated between the WAL time stamp as written on
>>> +  publisher and the current time on the subscriber. Delays in 
>>> logical
>>> +  decoding and in transfer the transaction may reduce the actual 
>>> wait
>>> +  time. If the system clocks on publisher and subscriber are not
>>> +  synchronized, this may lead to apply changes earlier than 
>>> expected.
>>> +  This is not a major issue because a typical setting of this 
>>> parameter
>>> +  are much larger than typical time deviations between servers.
>>> + 
> 
> There might be a case where tablesync workers are in SYNCWAIT state and 
> waiting for apply worker to tell them to CATCHUP. 
> And if apply worker is waiting in apply_delay function, tablesync workers 
> will be stuck at SYNCWAIT state and this might delay tablesync at least 
> "min_apply_delay" amount of time or more.
> Is it something we would want? What do you think?
Good catch. That's an oversight. It should wait for the initial table
synchronization before starting to apply the delay. The main reason is the
current logical replication worker design. It only closes the tablesync workers
after the catchup phase. As you noticed, we cannot impose the delay as soon as
the COPY finishes because it would take a long time to finish due to a possible
lack of workers. Instead, let's wait for the READY state for all tables and
then apply the delay. I added an explanation for it.

I also modified the test a bit to use the new function
wait_for_subscription_sync introduced in the commit
0c20dd33db1607d6a85ffce24238c1e55e384b49.

I attached a v6.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 465a6d5aa491855e2925da24785103cac0c520a2 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v6] Time-delayed logical replication subscriber

Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).

If the subscriber sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.

The delay is calculated between the WAL time stamp and the current time
on the subscriber.

The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.

Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=xtyaxkih5ctm_vm8kjkh7aaitckvmch4rzr...@mail.gmail.com
---
 doc/src/sgml/catalogs.sgml |  10 ++
 doc/src/sgml/ref/alter_subscription.sgml   |   5 +-
 doc/src/sgml/ref/create_subscription.sgml  |  43 +-
 src/backend/catalog/pg_subscription.c  |   1 +
 src/backend/catalog/system_views.sql   |   7 +-
 src/backend/commands/subscriptioncmds.c|  48 +-
 src/backend/replication/logical/worker.c   | 100 +
 src/backend/utils/adt/timestamp.c  |  32 
 src/bin/pg_dump/pg_dump.c  |  17 ++-
 src/bin/pg_dump/pg_dump.h  |   1 +
 src/bin/psql/describe.c|  

Re: logical replication restrictions

2022-08-01 Thread Euler Taveira
On Tue, Jul 5, 2022, at 9:29 AM, Amit Kapila wrote:
> I wonder why we don't apply the delay on commit/commit_prepared
> records only similar to physical replication. See recoveryApplyDelay.
> One more advantage would be then we don't need to worry about
> transactions that we are going to skip due SKIP feature for
> subscribers.
I added an explanation at the top of apply_delay(). I didn't read the "parallel
apply" patch yet. I'll do so soon to understand how the current design for
streamed transactions conflicts with the parallel apply patch.

+ * It applies the delay for the next transaction but before starting the
+ * transaction. The main reason for this design is to avoid a long-running
+ * transaction (which can cause some operational challenges) if the user sets a
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
+ */
+static void
+apply_delay(TimestampTz ts)

Regarding the skip transaction feature, we could certainly skip the
transactions combined with the apply delay. However, it introduces complexity
for a rare use case IMO. Besides that, the skip transaction code path is fast;
hence, it is very unlikely that the current patch will impose issues on
the skip transaction feature. (Remember that the main goal for this feature is
to provide an old state of the database.)

> One more thing that might be worth discussing is whether introducing a
> new subscription parameter for this feature is a better idea or can we
> use guc (either an existing or a new one). Users may want to set this
> only for a particular subscription or set of subscriptions in which
> case it is better to have this as a subscription level parameter.
> OTOH, I was slightly worried that if this will be used for all
> subscriptions on a subscriber then it will be burdensome for users.
That's a good point. Logical replication is per database, which is slightly
different from physical replication, which is per cluster. In physical
replication, you have no choice but to have a GUC. It is very unlikely that
someone wants to delay all logical replicas. Therefore, the benefit of having a
GUC is quite small.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical replication restrictions

2022-08-01 Thread Euler Taveira
> static bool MySubscriptionValid = false;
> +TimestampTz MySubscriptionMinApplyDelayUntil = 0;
> 
> Looking at the only usage of this variable (in apply_delay) and how it
> is used, I didn't see why this cannot just be a local member of the
> apply_delay function?
Good catch. A previous patch used that variable outside that function scope.

> +/*
> + * Apply the informed delay for the transaction.
> + *
> + * A regular transaction uses the commit time to calculate the delay. A
> + * prepared transaction uses the prepare time to calculate the delay.
> + */
> +static void
> +apply_delay(TimestampTz ts)
> 
> I didn't think it needs to mention here about the different kinds of
> transactions because where it comes from has nothing really to do with
> this function's logic.
Fixed.

> Refer to comment #14 about MySubscriptionMinApplyDelayUntil.
Fixed.

> It seems to rely on the spooling happening at the end. But won't this
> cause a problem later when/if the "parallel apply" patch [1] is pushed
> and the stream bgworkers are doing stuff on the fly instead of
> spooling at the end?
> 
> Or are you expecting that the "parallel apply" feature should be
> disabled if there is any min_apply_delay parameter specified?
I didn't read the "parallel apply" patch yet.

> Let's keep the alphabetical order of the parameters in COMPLETE_WITH, as per 
> [2]
Fixed.

> + int64 subapplydelay; /* Replication apply delay */
> +
> 
> IMO the comment should mention the units "(ms)"
I'm not sure. It should be documented in the catalogs. It is important
information for a user-visible interface. There are a few places in the
documentation where the unit is mentioned.

> There are some test cases for CREATE SUBSCRIPTION but there are no
> test cases for ALTER SUBSCRIPTION changing this new parameter.
I added a test to cover ALTER SUBSCRIPTION and also for disabling a
subscription that has a delay set.

> I received the following error when trying to run these 'subscription' tests:
Fixed.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 7a79e6b640826c5602805d2ff27ed226bdb1c940 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v5] Time-delayed logical replication subscriber

Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).

If the subscriber sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.

The delay is calculated between the WAL time stamp and the current time
on the subscriber.

The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.

Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=xtyaxkih5ctm_vm8kjkh7aaitckvmch4rzr...@mail.gmail.com
---
 doc/src/sgml/catalogs.sgml |  10 ++
 doc/src/sgml/ref/alter_subscription.sgml   |   5 +-
 doc/src/sgml/ref/create_subscription.sgml  |  43 +-
 src/backend/catalog/pg_subscription.c  |   1 +
 src/backend/catalog/system_views.sql   |   7 +-
 src/backend/commands/subscriptioncmds.c|  48 +-
 src/backend/replication/logical/worker.c   |  85 +++
 src/backend/utils/adt/timestamp.c  |  32 
 src/bin/pg_dump/pg_dump.c  |  17 ++-
 src/bin/pg_dump/pg_dump.h  |   1 +
 src/bin/psql/describe.c|   9 +-
 src/bin/psql/tab-complete.c|   4 +-
 src/include/catalog/pg_subscription.h  |   3 +
 src/include/datatype/timestamp.h   |   2 +
 src/include/utils/timestamp.h  |   2 +
 src/test/regress/expected/subscription.out | 165 -
 src/test/regress/sql/subscription.sql  |  20 +++
 src/test/subscription/t/032_apply_delay.pl | 130 
 18 files changed, 501 insertions(+), 83 deletions(-)
 create mode 100644 src/test/subscription/t/032_apply_delay.pl

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index cd2cc37aeb..efc745a9ea 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7833,6 +7833,16 @@ SCRAM-SHA-256$iteration count:
   
  
 
+ 
+  
+   subapplydelay int8
+  
+  
+   Delay the application of changes by a specified amount of time. The
+   unit is in milliseconds.
+  
+ 
+
  
   
subname name
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 64efc21f53..8901e1361c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -208,8 +208,9 @@ ALTER SUBSCRIPTION name RENAME TO <

Re: Introduce "log_connection_stages" setting.

2022-07-14 Thread Euler Taveira
On Thu, Jul 14, 2022, at 8:20 AM, Sergey Dudoladov wrote:
> I've taken connection stages and terminology from the existing log messages.
> The reason I have separated "authorized" and "authenticated" are [1]
> and [2] usages of "log_connections";
> "received" is mentioned at [3].
After checking the commit 9afffcb833d, I agree that you need 4 stages:
connected, authorized, authenticated, and disconnected.

> I have thought about enums too, but we need to cover arbitrary
> combinations of message types, for example log only "received" and
> "disconnected".
> Hence the proposed type "string" with individual values within the
> string really drawn from an enum.
Ooops. I said enum but I meant string list.

> Are there any specific deprecation guidelines ? I have not found any
> after a quick search for GUC deprecation in Google and commit history.
> A deprecation scheme could look like that:
> 1. Mention in the docs "log_(dis)connections" are deprecated in favor
> of "log_connection_stages"
> 2. Map "log_(dis)connections" to relevant values of
> "log_connection_stages" in code if the latter is unset.
> 3. Complain in the logs if a conflict arises between the old params
> and "log_connection_stages", with "log_connection_stages"
> taking the precedence.
No. AFAICS in this case, you just remove log_connections and log_disconnections
and create the new one (see for example the commit 88e98230268 that replaced
checkpoint_segments with min_wal_size and max_wal_size). We don't generally
keep ConfigureNames* entries for deprecated GUCs, unless you are renaming a GUC
(see map_old_guc_names), which is not the case here. When we remove a GUC, we
are introducing an incompatibility, so the only place it will be mentioned is
the release notes (there is a section called "Migration to Version X" that
lists all incompatibilities). From the developer's point of view, you only need
to mention in the commit message that this commit is introducing an
incompatibility. Hence, when it is time to write the release notes, the
information about the removal and the new replacement will be added.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Introduce "log_connection_stages" setting.

2022-07-12 Thread Euler Taveira
On Tue, Jul 12, 2022, at 10:52 AM, Sergey Dudoladov wrote:
> The problem we face is excessive logging of connection information
> that clutters the logs and in corner cases with many short-lived
> connections leads to disk space exhaustion.
You are proposing a fine-grained control over connection stages reported when
log_connections = on. It seems useful if you're only interested in (a)
malicious access or (b) authorized access (for audit purposes).

> I would like to get feedback on the following idea:
> 
> Add the `log_connection_stages` setting of type “string” with possible
> values "received", "authorized", "authenticated", "disconnected", and
> "all", with "all" being the default.
> The setting would have effect only when `log_connections` is on.
> Example: log_connection_stages=’authorized,disconnected’.
> That also implies there would be no need for a separate
> "log_disconnection" setting.
Your proposal will add more confusion to the already-confusing logging-related
GUCs. If you are proposing to introduce fine-grained control, the first step
should be to merge log_connections and log_disconnections into a new GUC (?)
and deprecate them. (I wouldn't introduce a new GUC that depends on the state
of another GUC as you proposed.) There are 3 stages: connect (received),
authorized (authenticated), disconnect. You can also add 'all', which mimics
log_connections / log_disconnections enabled. Another question is whether we
can reuse log_connections for this improvement instead of a new GUC. In this
case, you would turn the boolean value into an enum value. Will it cause
trouble while upgrading to this new version? That is one of the reasons to
create a new GUC. I would suggest log_connection_messages; log_connection
(without the 's' at the end) is too similar to the current GUC name, so I'm
afraid it is not a good name for it.
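
A sketch of how the suggested GUC could be used (the name and the value list
are only a proposal in this thread, not an implemented setting):

-- hypothetical: log only authorized connections and disconnections
ALTER SYSTEM SET log_connection_messages = 'authorized,disconnected';
SELECT pg_reload_conf();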


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical replication restrictions

2022-07-04 Thread Euler Taveira
On Wed, Mar 23, 2022, at 6:19 PM, Euler Taveira wrote:
> On Mon, Mar 21, 2022, at 10:09 PM, Euler Taveira wrote:
>> On Mon, Mar 21, 2022, at 10:04 PM, Andres Freund wrote:
>>> On 2022-03-20 21:40:40 -0300, Euler Taveira wrote:
>>> > On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
>>> > > Long time, no patch. Here it is. I will provide documentation in the 
>>> > > next
>>> > > version. I would appreciate some feedback.
>>> > This patch is broken since commit 
>>> > 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33. I
>>> > rebased it.
>>> 
>>> This fails tests, specifically it seems psql crashes:
>>> https://cirrus-ci.com/task/6592281292570624?logs=cores#L46
>> Yeah. I forgot to test this patch with cassert before sending it. :( I didn't
>> send a new patch because there is another issue (with int128) that I'm
>> currently reworking. I'll send another patch soon.
> Here is another version after rebasing it. In this version I fixed the psql
> issue and rewrote interval_to_ms function.
From the previous version, I added support for streamed transactions. For
streamed transactions, the delay is applied during STREAM COMMIT message.
That's ok if we add the delay before applying the spooled messages. Hence, we
guarantee that the delay is applied *before* each transaction. The same logic
is applied to prepared transactions. The delay is introduced before applying
the spooled messages in STREAM PREPARE message.

Tests were refactored a bit. A test for streamed transaction was included too.

Version 4 is attached.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 7dd7a3523ed8e7a3494e7ec25ddc0af8ed4cf4d3 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v4] Time-delayed logical replication subscriber

Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (specially to fix
errors that might cause data loss).

If the subscriber sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.

The delay is calculated between the WAL time stamp and the current time
on the subscriber.

The delay occurs only on WAL records for transaction begins. Regular and
prepared transactions are covered. Streamed transactions are also
covered.

Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=xtyaxkih5ctm_vm8kjkh7aaitckvmch4rzr...@mail.gmail.com
---
 doc/src/sgml/catalogs.sgml |   9 ++
 doc/src/sgml/ref/alter_subscription.sgml   |   5 +-
 doc/src/sgml/ref/create_subscription.sgml  |  31 -
 src/backend/catalog/pg_subscription.c  |   1 +
 src/backend/catalog/system_views.sql   |   2 +-
 src/backend/commands/subscriptioncmds.c|  46 ++-
 src/backend/replication/logical/worker.c   |  82 
 src/backend/utils/adt/timestamp.c  |  32 +
 src/bin/pg_dump/pg_dump.c  |  16 ++-
 src/bin/pg_dump/pg_dump.h  |   1 +
 src/bin/psql/describe.c|   8 +-
 src/bin/psql/tab-complete.c|   7 +-
 src/include/catalog/pg_subscription.h  |   3 +
 src/include/datatype/timestamp.h   |   2 +
 src/include/utils/timestamp.h  |   2 +
 src/test/regress/expected/subscription.out | 149 -
 src/test/regress/sql/subscription.sql  |  20 +++
 src/test/subscription/t/032_apply_delay.pl | 110 +++
 18 files changed, 455 insertions(+), 71 deletions(-)
 create mode 100644 src/test/subscription/t/032_apply_delay.pl

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 25b02c4e37..9b94b7aef2 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7833,6 +7833,15 @@ SCRAM-SHA-256$iteration count:
   
  
 
+ 
+  
+   subapplydelay int8
+  
+  
+   Delay the application of changes by a specified amount of time.
+  
+ 
+
  
   
subname name
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 353ea5def2..ae9d625f9d 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -207,8 +207,9 @@ ALTER SUBSCRIPTION name RENAME TO <
   information.  The parameters that can be altered
   are slot_name,
   synchronous_commit,
-  binary, streaming, and
-  disable_on_error.
+  binary, streaming,
+  disable_on_error, and
+  min_apply_delay.
  
 

diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 34b3264b26..ae80db8a3d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -302,7 +302,36 @@ CREATE SUBSCRIPTION s

Re: Re-order "disable_on_error" in tab-complete COMPLETE_WITH

2022-07-04 Thread Euler Taveira
On Mon, Jul 4, 2022, at 5:37 AM, Amit Kapila wrote:
> Yeah, it seems we have overlooked this point. I think we can do this
> just for HEAD but as the feature is introduced in PG-15 so there is no
> harm in pushing it to PG-15 as well especially because it is a
> straightforward change. What do you or others think?
No objection. It is a good thing for future backpatches.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Error from the foreign RDBMS on a foreign table I have no privilege on

2022-06-07 Thread Euler Taveira


On Tue, Jun 7, 2022, at 12:03 AM, Laurenz Albe wrote:
> On Sat, 2022-06-04 at 21:18 +, Phil Florent wrote:
> > I opened an issue with an attached code on oracle_fdw git page : 
> > https://github.com/laurenz/oracle_fdw/issues/534 
> > Basically I expected to obtain a "no privilege" error from PostgreSQL when 
> > I have no read privilege
> > on the postgres foreign table but I obtained an Oracle error instead.
> > Laurenz investigated and closed the issue but he suggested perhaps I should 
> > post that on
> > the hackers list since it also occurs with postgres-fdw on some occasion(I 
> > have investigated some more,
> > and postgres_fdw does the same thing when you turn onuse_remote_estimate.). 
> > Hence I do...
> 
> To add more details: permissions are checked at query execution time, but if 
> "use_remote_estimate"
> is used, the planner already accesses the remote table, even if the user has 
> no permissions
> on the foreign table.
> 
> I feel that that is no bug, but I'd be curious to know if others disagree.
You should expect an error (like in the example), probably just not at that
point. It is behaving accordingly. However, that error is exposing an
implementation detail (the FDW has to access the remote table at that phase). I
don't think that changing the current design (permission check after planning)
for FDWs to provide a good UX is worth it. IMO it is up to the FDW author to
hide such cases if it doesn't cost much to do it.
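
For reference, use_remote_estimate is a per-server (or per-table) postgres_fdw
option, so the behavior is easy to reproduce (remote_srv is a placeholder
name):

-- with this option, the planner contacts the remote server during
-- planning, before local privileges on the foreign table are checked
ALTER SERVER remote_srv OPTIONS (ADD use_remote_estimate 'true');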


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: tablesync copy ignores publication actions

2022-06-07 Thread Euler Taveira
On Tue, Jun 7, 2022, at 1:10 AM, Peter Smith wrote:
> The logical replication tablesync ignores the publication 'publish'
> operations during the initial data copy.
> 
> This is current/known PG behaviour (e.g. as recently mentioned [1])
> but it was not documented anywhere.
Initial data synchronization != replication. The publish parameter is a
replication property; it is not an initial data synchronization property. Maybe
we should make it clear like you are suggesting.
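
For example (a sketch): even if a publication restricts the published
operations, the initial copy still transfers every existing row:

CREATE PUBLICATION pub FOR TABLE t WITH (publish = 'insert');
-- a new subscription to pub copies all existing rows of t, including
-- rows that were produced by UPDATEs or DELETEs on the publisher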

> This patch just documents the existing behaviour and gives some examples.
Why did you add this information to that specific paragraph? IMO it belongs to
a separate paragraph; I would add it as the first paragraph in that subsection.

I suggest the following paragraph:


The initial data synchronization does not take into account the 
publish parameter to copy the existing data.


There is no point in linking to the Initial Snapshot subsection. That
subsection explains the initial copy steps, and you want to inform about the
effect of a publication parameter on the initial copy. Although both are
talking about the same topic (initial copy), a link to the Initial Snapshot
subsection won't add information about the publish parameter. You could expand
the suggested sentence to make it clear what the publish parameter is, or even
add a link to the CREATE PUBLICATION synopsis (which explains the publish
parameter).

You added an empty paragraph. Remove it.

I'm not sure it deserves an example. It is an easy-to-understand concept and a
good description is better than ~ 80 new lines.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Ignore heap rewrites for materialized views in logical replication

2022-05-31 Thread Euler Taveira
On Tue, May 31, 2022, at 11:13 AM, Amit Kapila wrote:
> I think we don't need the retry logic to check the error; a simple
> wait_for_caught_up should be sufficient, as we are doing in other
> tests. See attached. I have slightly modified the commit message as
> well. Kindly let me know what you think?
Your modification will hang until the test timeout without the patch. That's
why I avoided using wait_for_caught_up and used a loop for a fast exit on
success or failure. I'm fine with a simple test case like you proposed.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Ignore heap rewrites for materialized views in logical replication

2022-05-30 Thread Euler Taveira
On Sat, May 28, 2022, at 7:07 AM, Amit Kapila wrote:
> I agree with your analysis and the fix looks correct to me.
Thanks for checking.

> Instead of waiting for an error, we can try to insert into a new table
> created by the test case after the 'Refresh ..' command and wait for
> the change to be replicated by using wait_for_caught_up.
That's a good idea. [modifying the test...] I used the same table. Whenever the
new row arrives on the subscriber or it reads that error message, it bails out.

> Let's try to see if we can simplify the test so that it can be
> committed along with a fix. If we are not able to find any reasonable
> way then we can think of skipping it.
The new test is attached.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From fe13dc44f210b5f14b97bf9a29a242c2211ba700 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Fri, 27 May 2022 11:35:27 -0300
Subject: [PATCH v2 1/2] Ignore heap rewrites for materialized views in logical
 replication

If you have a publication that specifies FOR ALL TABLES clause, a REFRESH
MATERIALIZED VIEW can break your setup with the following message

ERROR:  logical replication target relation "public.pg_temp_NNN" does not exist

Commit 1a499c2520 introduced a heuristic to prevent transient heaps (due to
table rewrites) from being included in a FOR ALL TABLES publication. However,
it forgot to exclude materialized views (heap rewrites occur when you execute
REFRESH MATERIALIZED VIEW). Since materialized views are not supported in
logical decoding, it is appropriate to filter them too.

As explained in the commit message, a more robust solution was included in
version 11 (commit 325f2ec555), hence only version 10 is affected.
---
 src/backend/replication/pgoutput/pgoutput.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index f96fde3df0..e03ff4a11e 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -540,7 +540,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 				if (sscanf(relname, "pg_temp_%u%n", &u, &n) == 1 &&
 					relname[n] == '\0')
 				{
-					if (get_rel_relkind(u) == RELKIND_RELATION)
+					char		relkind = get_rel_relkind(u);
+
+					if (relkind == RELKIND_RELATION || relkind == RELKIND_MATVIEW)
 						break;
 				}
 			}
-- 
2.30.2

From 04b6b6566f4cf4ddc8cf3ddc51420e93c599024d Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Fri, 27 May 2022 14:43:10 -0300
Subject: [PATCH v2 2/2] test: heap rewrite for materialized view in logical
 replication

---
 src/test/subscription/t/006_rewrite.pl | 35 +-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/src/test/subscription/t/006_rewrite.pl b/src/test/subscription/t/006_rewrite.pl
index 5e3211aefa..27eb6a1cb7 100644
--- a/src/test/subscription/t/006_rewrite.pl
+++ b/src/test/subscription/t/006_rewrite.pl
@@ -3,7 +3,8 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 2;
+use Time::HiRes qw(usleep);
+use Test::More tests => 3;
 
 sub wait_for_caught_up
 {
@@ -26,6 +27,13 @@ my $ddl = "CREATE TABLE test1 (a int, b text);";
 $node_publisher->safe_psql('postgres', $ddl);
 $node_subscriber->safe_psql('postgres', $ddl);
 
+$ddl = "CREATE TABLE test2 (a int, b text);";
+$node_publisher->safe_psql('postgres', $ddl);
+$node_subscriber->safe_psql('postgres', $ddl);
+
+$node_publisher->safe_psql('postgres', q{INSERT INTO test2 (a, b) VALUES (10, 'ten'), (20, 'twenty');});
+$node_publisher->safe_psql('postgres', 'CREATE MATERIALIZED VIEW test3 AS SELECT a, b FROM test2;');
+
 my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
 my $appname   = 'encoding_test';
 
@@ -69,5 +77,30 @@ is($node_subscriber->safe_psql('postgres', q{SELECT a, b, c FROM test1}),
 3|three|33),
'data replicated to subscriber');
 
+$node_publisher->safe_psql('postgres', 'REFRESH MATERIALIZED VIEW test3;');
+
+# an additional row to check if the REFRESH worked
+$node_publisher->safe_psql('postgres', q{INSERT INTO test2 (a, b) VALUES (30, 'thirty');});
+
+my $max_attempts = 10 * $TestLib::timeout_default;
+my $attempts = 0;
+my $ret  = 0;
+my $errmsg = qr/logical replication target relation.*does not exist/;
+while ($attempts < $max_attempts)
+{
+	# Succeed. Bail out.
+	if ($node_subscriber->safe_psql('postgres', 'SELECT COUNT(1) FROM test2') == 3) {
+		$ret = 1;
+		last;
+	}
+
+	# Failed. Bail out.
+	last if (slurp_file($node_subscriber->logfile) =~ $errmsg);
+
+	usleep(100_000);
+	$attempts++;
+}
+is($ret, 1, 'heap rewrite for materialized view on subscriber');
+
 $node_subscriber->stop;
 $node_publisher->stop;
-- 
2.30.2



Ignore heap rewrites for materialized views in logical replication

2022-05-27 Thread Euler Taveira
Hi,

While investigating an internal report, I concluded that it is a bug. The
reproducible test case is simple (check 0002); it consists of a FOR ALL
TABLES publication and a non-empty materialized view on the publisher. After
the setup, if you refresh the MV, you get the following message on the
subscriber:

ERROR:  logical replication target relation "public.pg_temp_NNN" does not exist
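
A condensed sketch of the attached test case (0002):

-- on the v10 publisher:
CREATE TABLE t (a int, b text);
INSERT INTO t VALUES (1, 'one');
CREATE MATERIALIZED VIEW mv AS SELECT a, b FROM t;
CREATE PUBLICATION pub FOR ALL TABLES;
-- after a subscriber has subscribed to pub, the rewrite creates a
-- transient pg_temp_NNN heap that leaks into the replication stream:
REFRESH MATERIALIZED VIEW mv;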

That's because the commit 1a499c2520 (that fixes the heap rewrite for tables)
forgot to consider that materialized views can also create transient heaps and
they should also be skipped. The affected version is only 10 because 11
contains a different solution (commit 325f2ec555) that provides a proper fix
for the heap rewrite handling in logical decoding.

0001 is a patch to skip MVs too. I attached 0002 to demonstrate the issue but
it doesn't seem appropriate to be included. The test was written to detect the
error and bail out. After this fix, it takes a considerable amount of time to
finish the test because it waits for a message that never arrives. Since nobody
reported this bug in 5 years and considering that version 10 will be EOL in 6
months, I don't think an additional test is crucial here.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 652efe45665d91f2f4ae865dba078fcaffdc0a17 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Fri, 27 May 2022 11:35:27 -0300
Subject: [PATCH v1 1/2] Ignore heap rewrites for materialized views in logical
 replication

If you have a publication that specifies FOR ALL TABLES clause, a REFRESH
MATERIALIZED VIEW can break your setup with the following message

ERROR:  logical replication target relation "public.pg_temp_NNN" does not exist

Commit 1a499c2520 introduced a heuristic to prevent transient heaps (due to
table rewrites) from being included in a FOR ALL TABLES publication. However,
it forgot to exclude materialized views (heap rewrites occur when you execute
REFRESH MATERIALIZED VIEW). Since materialized views are not supported in
logical decoding, it is appropriate to filter them too.

As explained in the commit message, a more robust solution was included in
version 11 (commit 325f2ec555), hence only version 10 is affected.
---
 src/backend/replication/pgoutput/pgoutput.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index f96fde3df0..e03ff4a11e 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -540,7 +540,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
				if (sscanf(relname, "pg_temp_%u%n", &u, &n) == 1 &&
 	relname[n] == '\0')
 {
-	if (get_rel_relkind(u) == RELKIND_RELATION)
+	char	relkind = get_rel_relkind(u);
+
+	if (relkind == RELKIND_RELATION || relkind == RELKIND_MATVIEW)
 		break;
 }
 			}
-- 
2.30.2

From 7ba28dea6800359ae631c1a132f8f6c6d418548b Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Fri, 27 May 2022 14:43:10 -0300
Subject: [PATCH v1 2/2] test: heap rewrite for materialized view in logical
 replication

---
 src/test/subscription/t/006_rewrite.pl | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/src/test/subscription/t/006_rewrite.pl b/src/test/subscription/t/006_rewrite.pl
index 5e3211aefa..e5e0fdabe7 100644
--- a/src/test/subscription/t/006_rewrite.pl
+++ b/src/test/subscription/t/006_rewrite.pl
@@ -3,7 +3,8 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 2;
+use Time::HiRes qw(usleep);
+use Test::More tests => 3;
 
 sub wait_for_caught_up
 {
@@ -26,6 +27,13 @@ my $ddl = "CREATE TABLE test1 (a int, b text);";
 $node_publisher->safe_psql('postgres', $ddl);
 $node_subscriber->safe_psql('postgres', $ddl);
 
+$ddl = "CREATE TABLE test2 (a int, b text);";
+$node_publisher->safe_psql('postgres', $ddl);
+$node_subscriber->safe_psql('postgres', $ddl);
+
+$node_publisher->safe_psql('postgres', q{INSERT INTO test2 (a, b) VALUES (10, 'ten'), (20, 'twenty');});
+$node_publisher->safe_psql('postgres', 'CREATE MATERIALIZED VIEW test3 AS SELECT a, b FROM test2;');
+
 my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
 my $appname   = 'encoding_test';
 
@@ -69,5 +77,21 @@ is($node_subscriber->safe_psql('postgres', q{SELECT a, b, c FROM test1}),
 3|three|33),
'data replicated to subscriber');
 
+$node_publisher->safe_psql('postgres', 'REFRESH MATERIALIZED VIEW test3;');
+
+my $max_attempts = 2 * $TestLib::timeout_default;
+my $attempts = 0;
+my $errmsg = qr/logical replication target relation.*does not exist/;
+{
+	while ($attempts < $max_attempts)
+	{
+		last if (slurp_file($node_subscriber->logfile) =~ $errmsg);
+
+		usleep(100_000);
+		$attempts++;
+	}
+}
+is($attempts, $max_attempts, 'heap rewrite for materialized view on subscriber');
+
 $node_subscriber->stop;
 $node_publisher->stop;
-- 
2.30.2



Re: Privileges on PUBLICATION

2022-05-18 Thread Euler Taveira
On Wed, May 18, 2022, at 6:44 AM, Antonin Houska wrote:
> ok, please see the next version.
The new paragraph looks good to me. I'm not sure if the CREATE PUBLICATION is
the right place to provide such information. As I suggested in a previous email
[1], you could add it to "Logical Replication > Security".

[1] https://postgr.es/m/d96103fe-99e2-4119-bd76-952d326b7...@www.fastmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Privileges on PUBLICATION

2022-05-18 Thread Euler Taveira
On Wed, May 18, 2022, at 6:16 AM, Antonin Houska wrote:
> The patch is attached to this message.
Great. Add it to the next CF. I'll review it when I have some spare time.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Privileges on PUBLICATION

2022-05-13 Thread Euler Taveira
On Fri, May 13, 2022, at 3:36 AM, Antonin Houska wrote:
> Attached is my proposal. It tries to be more specific and does not mention the
> absence of the privileges explicitly.
You explained the current issue but said nothing about the limitation. This
information will likely trigger a question in one of the MLs. IMO if you add
something like the following sentence at the end, it will make it clear why that
setup exposes all data (there is no access control to publications) and
explicitly say there is a TODO here.

Additional privileges might be added to control access to table data in a
future version of PostgreSQL.

I also wouldn't use the warning tag because it fits in the same category as the
other restrictions listed on the page.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: First draft of the PG 15 release notes

2022-05-12 Thread Euler Taveira
On Thu, May 12, 2022, at 11:22 AM, Bruce Momjian wrote:
> On Thu, May 12, 2022 at 11:12:54AM -0300, Euler Taveira wrote:
> OB> On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote:
> > 
> > I looked at that but thought that everyone would already assume we
> > skipped replication of empty transactions, and I didn't see much impact
> > for the user, so I didn't include it.
> > 
> > It certainly has an impact on heavy workloads that replicate tables with few
> > modifications. It receives a high traffic of 'begin' and 'commit' messages 
> > that
> > the previous Postgres versions have to handle (discard). I would classify 
> > it as
> > a performance improvement for logical replication. Don't have a strong 
> > opinion
> > if it should be mentioned or not.
> 
> Oh, so your point is that a transaction that only has SELECT would
> previously send an empty transaction?  I thought this was only for apps
> that create literal empty transactions, which seem rare.
No. It should be a write transaction. If you have a replication setup that
publish only table foo (that isn't modified often) and most of your
workload does not contain table foo, Postgres sends 'begin' and 'commit'
messages to subscriber even if there is no change to replicate.

Let me show you an example:

postgres=# CREATE TABLE foo (a integer primary key, b text);
CREATE TABLE
postgres=# CREATE TABLE bar (c integer primary key, d text);
CREATE TABLE
postgres=# CREATE TABLE baz (e integer primary key, f text);
CREATE TABLE
postgres=# CREATE PUBLICATION pubfoo FOR TABLE foo;
CREATE PUBLICATION
postgres=# SELECT pg_create_logical_replication_slot('slotfoo', 'pgoutput');
 pg_create_logical_replication_slot 
------------------------------------
 (slotfoo,0/E709AC50)
(1 row)

Let's create a transaction without table foo:

postgres=# BEGIN;
BEGIN
postgres=*# INSERT INTO bar (c, d) VALUES(1, 'blah');
INSERT 0 1
postgres=*# INSERT INTO baz (e, f) VALUES(2, 'xpto');
INSERT 0 1
postgres=*# COMMIT;
COMMIT

As you can see, the replication slot contains messages for that transaction.
Although tables bar and baz are NOT published, the begin (B) and commit (C)
messages that refer to this transaction are sent to the subscriber.

postgres=# SELECT chr(get_byte(data, 0)) FROM 
pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL, 
'proto_version', '1', 'publication_names', 'pubfoo');
 chr 
-----
 B
 C
(2 rows)

If you execute another transaction without table foo, there will be another B/C
pair.

postgres=# DELETE FROM baz WHERE e = 2;
DELETE 1
postgres=# SELECT chr(get_byte(data, 0)) FROM 
pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL, 
'proto_version', '1', 'publication_names', 'pubfoo');
 chr 
-----
 B
 C
 B
 C
(4 rows)

Let's create a transaction that uses table foo but also table bar:

postgres=# BEGIN;
BEGIN
postgres=*# INSERT INTO foo (a, b) VALUES(100, 'asdf');
INSERT 0 1
postgres=*# INSERT INTO bar (c, d) VALUES(200, 'qwert');
INSERT 0 1
postgres=*# COMMIT;
COMMIT

In this case, there will be other messages since the publication pubfoo
publishes table foo. ('I' means there is an INSERT for table foo).

postgres=# SELECT chr(get_byte(data, 0)), length(data) FROM 
pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL, 
'proto_version', '1', 'publication_names', 'pubfoo');
 chr | length 
-----+--------
 B   |     21
 C   |     26
 B   |     21
 C   |     26
 B   |     21
 R   |     41
 I   |     25
 C   |     26
(8 rows)


In summary, a logical replication setup sends 47 bytes per skipped transaction.
v15 won't send the first 2 B/C pairs. Discussion started here [1].

[1] 
https://postgr.es/m/CAMkU=1yohp9-dv48flosprmqyeyys5zwkazgd41rjr10xin...@mail.gmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: First draft of the PG 15 release notes

2022-05-12 Thread Euler Taveira
On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote:
> I looked at that but thought that everyone would already assume we
> skipped replication of empty transactions, and I didn't see much impact
> for the user, so I didn't include it.
It certainly has an impact on heavy workloads that replicate tables with few
modifications. It receives a high traffic of 'begin' and 'commit' messages that
the previous Postgres versions have to handle (discard). I would classify it as
a performance improvement for logical replication. Don't have a strong opinion
if it should be mentioned or not.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Privileges on PUBLICATION

2022-05-12 Thread Euler Taveira
On Tue, May 10, 2022, at 5:37 AM, Antonin Houska wrote:
> My understanding is that the rows/columns filtering is a way for the
> *publisher* to control which data is available to particular replica. From
> this point of view, the publication privileges would just make the control
> complete.
I agree. IMO it is a new feature. We already require a high privilege for
logical replication. Hence, we expect the replication user to have access to all
data. Unfortunately, nobody mentioned this requirement during the row filter /
column list development; someone could have written a patch for GRANT ... ON
PUBLICATION.

I understand your concern. Like I said in the last sentence of my previous
email: it is fine-grained access control on the publisher. Keep in mind that it
will *only* work for non-superusers (REPLICATION attribute). It is not exposing
something that we didn't expose before. In this particular case, there is no
mechanism to prevent the subscriber from obtaining data provided by the various
row filters if they know the publication names. We could probably add a sentence
to the "Logical Replication > Security" section:

There are no privileges for publications. If you have multiple publications in a
database, a subscription can use all available publications.
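
To illustrate the suggested sentence, a hypothetical subscriber (names are made
up) can simply list every publication it knows about:

CREATE SUBSCRIPTION mysub
    CONNECTION 'host=pub.example.com dbname=postgres user=repuser'
    PUBLICATION pub_filtered, pub_unfiltered;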


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Privileges on PUBLICATION

2022-05-09 Thread Euler Taveira
On Mon, May 9, 2022, at 11:09 AM, Antonin Houska wrote:
> Now that the user can specify rows and columns to be omitted from the logical
> replication [1], I suppose hiding rows and columns from the subscriber is an
> important use case. However, since the subscription connection user (i.e. the
> user specified in the CREATE SUBSCRIPTION ... CONNECTION ... command) needs
> SELECT permission on the replicated table (on the publication side), he can
> just use another publication (which has different filters or no filters at
> all) to get the supposedly-hidden data replicated.
The required privileges were not relaxed on the publisher after the row filter
and column list features. It is not as simple as "create another publication".
Creating a publication requires the CREATE privilege on the database (which is
*not* granted to PUBLIC). If you have an untrusted user that could bypass your
rules about hidden data, it is better to review your user privileges.

postgres=# CREATE ROLE foo REPLICATION LOGIN;
CREATE ROLE
postgres=# \c - foo
You are now connected to database "postgres" as user "foo".
postgres=> CREATE PUBLICATION pub1;
ERROR:  permission denied for database postgres

The documentation [1] says

"The role used for the replication connection must have the REPLICATION
attribute (or be a superuser)."

You can use role foo for the replication connection, but role foo cannot be a
superuser. In this case, even if role foo opens a connection to database
postgres, it cannot create a publication due to lack of privileges.

> Don't we need privileges on publication (e.g GRANT USAGE ON PUBLICATION ...)
> now?
Maybe. We rely on the CREATE privilege on databases right now. If you are saying
that GRANT USAGE ON PUBLICATION is just a command that will have the same effect
as the REPLICATION property [1] has right now, I would say it won't. Are you
aiming at fine-grained access control on the publisher?


[1] https://www.postgresql.org/docs/devel/logical-replication-security.html


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: 2022-05-12 release announcement draft

2022-05-09 Thread Euler Taveira
On Mon, May 9, 2022, at 11:54 AM, Jonathan S. Katz wrote:
> You raise a good point, which is in the release announcement, should we 
> prefix contrib modules with "contrib"? Perhaps instead of shorthand we 
> spell it out, e.g. "Several fixes for the `pageinspect` module..."
Nowadays 'contrib' refers to the physical structure (directory). I wouldn't
mention it because it is a development/packaging detail. We use the terminology
additional supplied module/program to refer to this piece of software which is
kept in the Postgres repository but can be optionally installed. pageinspect
(possibly with the URL) is clear enough. However, if you don't like the
shorthand, 'pageinspect extension' or 'pageinspect module' are good options.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: 2022-05-12 release announcement draft

2022-05-09 Thread Euler Taveira
On Sun, May 8, 2022, at 5:51 PM, Jonathan S. Katz wrote:
> Hi,
> 
> Please see attached draft for the 2022-05-12 release announcement.
LGTM.

> Please provide feedback on accuracy and if there are any notable omissions.
* Several fixes for `contrib/pageinspect` to improve overall stability.
* Disable batch insertion in `contrib/postgres_fdw` when
`BEFORE INSERT ... FOR EACH ROW` triggers exist on the foreign table.

Should you omit 'contrib'? You mentioned ltree but didn't say 'contrib/ltree'.
It also isn't used in a previous release announcement [1] (see postgres_fdw).


[1] 
https://www.postgresql.org/about/news/postgresql-142-136-1210-1115-and-1020-released-2402/


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Logical replication timeout problem

2022-05-09 Thread Euler Taveira
On Mon, May 9, 2022, at 3:47 AM, Amit Kapila wrote:
> Thanks. The patch LGTM. I'll push and back-patch this after the
> current minor release is done unless there are more comments related
> to this work.
Looks sane to me. (I only tested the HEAD version)

+   bool		end_xact = ctx->end_xact;

Do you really need a new variable here? It has the same name and the new one
isn't changed during the execution.

Does this issue deserve a test? Perhaps one with a small wal_receiver_timeout,
although I'm not sure how stable such a test would be.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Configuration Parameter/GUC value validation hook

2022-05-06 Thread Euler Taveira
On Wed, May 4, 2022, at 8:12 AM, Bharath Rupireddy wrote:
> How about we provide a sample extension (limiting some important
> parameters say shared_buffers, work_mem and so on to some
> "reasonable/recommended" limits) in the core along with the
> set_config_option_hook? This way, all the people using open source
> postgres out-of-the-box will benefit and whoever wants, can modify
> that sample extension to suit their needs. The sampe extension can
> also serve as an example to implement set_config_option_hook.
I agree with Robert that providing a feature for it instead of a hook is the
way to go. It is not just one or two vendors that will benefit from it; almost
all, if not all, vendors will use this feature. Hooks should be used for niche
features; that's not the case here.

The commit a0ffa885e47 introduced the GRANT SET ON PARAMETER command. It could
be used for this purpose. Despite accepting GRANT on PGC_USERSET GUCs, it has no
use there. That doesn't mean that additional properties couldn't be added to the
current syntax. This additional use case should be enforced before or while
executing set_config_option(). Is it ok to extend this SQL command?

The syntax could be:

GRANT SET ON PARAMETER work_mem (MIN '1MB', MAX '512MB') TO bob;

The NULL keyword can be used to remove the MIN and MAX limits. The idea is to
avoid a verbose syntax (adding an "action" to MIN/MAX -- ADD MIN 1, DROP MAX
234, SET MIN 456).
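
A sketch of how that might look under this proposed (not implemented) grammar;
the role and values are made up:

GRANT SET ON PARAMETER work_mem (MIN '1MB', MAX '512MB') TO bob;
-- later, drop the upper limit but keep the lower one
GRANT SET ON PARAMETER work_mem (MAX NULL) TO bob;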

The other alternative is ALTER USER ... SET and ALTER DATABASE ... SET. The
current user can set parameters for himself, so he could adjust the limits.
Besides that, the purpose of these SQL commands is to apply initial settings for
a combination of user/database. I'm afraid it is out of scope for them to check
values after the session is established.
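
For comparison, a sketch of the existing commands (values are made up): they
only set session defaults and impose no bounds.

ALTER ROLE bob SET work_mem = '64MB';        -- initial value for bob's sessions
ALTER DATABASE appdb SET work_mem = '32MB';  -- initial value in database appdb
-- nothing prevents overriding it after the session is established:
SET work_mem = '2GB';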


--
Euler Taveira
EDB   https://www.enterprisedb.com/


trivial comment fix

2022-04-27 Thread Euler Taveira
Hi,

While reading worker.c, I noticed that the SQL command referred to was wrong:
it should be ALTER SUBSCRIPTION ... REFRESH PUBLICATION instead of ALTER TABLE
... REFRESH PUBLICATION. Trivial fix attached.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 4171371296..7da7823c35 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -93,7 +93,7 @@
  * ReorderBufferFinishPrepared.
  *
  * If the subscription has no tables then a two_phase tri-state PENDING is
- * left unchanged. This lets the user still do an ALTER TABLE REFRESH
+ * left unchanged. This lets the user still do an ALTER SUBSCRIPTION REFRESH
  * PUBLICATION which might otherwise be disallowed (see below).
  *
  * If ever a user needs to be aware of the tri-state value, they can fetch it


Re: Inconsistent "ICU Locale" output on older server versions

2022-04-15 Thread Euler Taveira
On Fri, Apr 15, 2022, at 11:58 AM, Christoph Berg wrote:
> When running psql 15 against PG 14, the output is this:
> 
> $ psql -l
> List of databases
>Name│  Owner   │ Encoding │  Collate   │   Ctype│ ICU Locale │ 
> Locale Provider │   Access privileges
> ───┼──┼──┼┼┼┼─┼───
> postgres  │ postgres │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ de_DE.utf8 │ libc 
>│
> template0 │ postgres │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ de_DE.utf8 │ libc 
>│ =c/postgres  ↵
>│  │  ││││ 
> │ postgres=CTc/postgres
> template1 │ postgres │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ de_DE.utf8 │ libc 
>│ =c/postgres  ↵
>│  │  ││││ 
> │ postgres=CTc/postgres
> (3 rows)
> 
> The "ICU Locale" column is now populated, that seems wrong.
Good catch!

> The problem is in the else branch in src/bin/psql/describe.c around
> line 900:
> 
> +   if (pset.sversion >= 15)
> +   appendPQExpBuffer(&buf,
> + "   d.daticulocale as \"%s\",\n"
> + "   CASE d.datlocprovider WHEN 'c' THEN 'libc' 
> WHEN 'i' THEN 'icu' END AS \"%s\",\n",
> + gettext_noop("ICU Locale"),
> + gettext_noop("Locale Provider"));
> +   else
> +   appendPQExpBuffer(&buf,
> + "   d.datcollate as \"%s\",\n"  <--- there
> + "   'libc' AS \"%s\",\n",
> + gettext_noop("ICU Locale"),
> + gettext_noop("Locale Provider"));
> 
> I'd think this should rather be
> 
> + "   '' as \"%s\",\n"
Since daticulocale allows NULL, my suggestion is to use NULL instead of an
empty string. It is consistent with a cluster whose locale provider is libc.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Skipping schema changes in publication

2022-04-14 Thread Euler Taveira
On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote:
> On 12.04.22 08:23, vignesh C wrote:
> > I have also included the implementation for skipping a few tables from
> > all tables publication, the 0002 patch has the implementation for the
> > same.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > tables.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
> 
> We have already allocated the "skip" terminology for skipping 
> transactions, which is a dynamic run-time action.  We are also using the 
> term "skip" elsewhere to skip locked rows, which is similarly a run-time 
> action.  I think it would be confusing to use the term SKIP for DDL 
> construction.
I didn't like the SKIP choice either. We already have EXCEPT for IMPORT FOREIGN
SCHEMA, and if I were to suggest a keyword, it would be EXCEPT.

> I would also think about this in broader terms.  For example, sometimes 
> people want features like "all columns except these" in certain places. 
> The syntax for those things should be similar.
The questions are:
What kind of issues does it solve?
Do we have a workaround for it?

> That said, I'm not sure this feature is worth the trouble.  If this is 
> useful, what about "whole database except these schemas"?  What about 
> "create this database from this template except these schemas".  This 
> could get out of hand.  I think we should encourage users to group their 
> object the way they want and not offer these complicated negative 
> selection mechanisms.
I have the same impression too. We already provide a way to:

* include individual tables;
* include all tables;
* include all tables in a certain schema.

Doesn't it cover the majority of the use cases? We don't need to cover all
possible cases in one DDL command. IMO the current grammar for CREATE
PUBLICATION is already complicated after the ALL TABLES IN SCHEMA addition. You
are proposing to add "ALL TABLES SKIP ALL TABLES", which sounds repetitive but
is not; it doesn't seem well thought out. I'm also concerned about possible
gotchas in this proposal. The first command below suggests that it skips all
tables in a certain schema. What happens if I decide to include a particular
table from the skipped schema (the second command)?

ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
ALTER PUBLICATION pub1 ADD TABLE s1.foo;

Having said that I'm not wedded to this proposal. Unless someone provides
compelling use cases for this additional syntax, I think we should leave the
publication syntax as is.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Support logical replication of DDLs

2022-04-14 Thread Euler Taveira
On Thu, Apr 14, 2022, at 6:26 AM, Dilip Kumar wrote:
> I agree.  But here the bigger question is what is the correct behavior
> in case of the Alter Table?  I mean for example in the publisher the
> table gets rewritten due to the Access Method change then what should
> be the behavior of the subscriber.  One expected behavior is that on
> subscriber also the access method gets changed and the data remains
> the same as on the subscriber table(table can be locally rewritten
> based on new AM).  Which seems quite sensible behavior to me.  But if
> we want this behavior then we can not replay the logical messages
> generated by DML WAL because of table rewrite, otherwise we will get
> duplicate data, unless we plan to get rid of the current data and just
> get all new data from the publisher. And if we do that then the data
> will be as per the latest data in the table based on the publisher, so
> I think first we need to define the correct behavior and then we can
> design it accordingly.
You should forbid it unless you can decompose the command into multiple SQL
commands that make it a safe operation for logical replication.

Let's say you want to add a column with a volatile default.

ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random();

If you replicate the DDL command as is, you will have different data
downstream. You should forbid it. However, this operation can be supported if
the DDL command is decomposed in multiple steps.

-- add a new column without DEFAULT to avoid rewrite
ALTER TABLE foo ADD COLUMN bar double precision;

-- future rows could use the DEFAULT expression
-- it also doesn't rewrite the table
ALTER TABLE foo ALTER COLUMN bar SET DEFAULT random();

-- it effectively rewrites the table
-- all rows are built from one source node
-- data will be the same on all nodes
UPDATE foo SET bar = random();

The ALTER TABLE ... ALTER COLUMN ... TYPE command has a similar issue. This DDL
command can be decomposed to avoid the rewrite. If you are changing the data
type, in general, you add a new column and update all rows doing the proper
conversion. (If you are updating in batches, you usually add a trigger to
automatically adjust the new column value for INSERTs and UPDATEs.) Another case
is when you are reducing the typmod (for example, varchar(100) to varchar(20)).
In this case, the DDL command can be decomposed by removing the typmod
information (ALTER TABLE ... ALTER COLUMN ... TYPE varchar) and replacing it
with a CHECK constraint, as sketched below.
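
A minimal sketch of that decomposition, assuming a column bar of type
varchar(100) being reduced to 20 characters:

-- drop the typmod; no table rewrite is required
ALTER TABLE foo ALTER COLUMN bar TYPE varchar;
-- enforce the new limit with a constraint instead
ALTER TABLE foo ADD CONSTRAINT bar_max_length CHECK (length(bar) <= 20);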

I didn't review this patch in depth but we certainly need to impose some DDL
restrictions if we are replicating DDLs. There are other cases that should be
treated accordingly such as a TABLESPACE specification or a custom data type.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Logical replication timeout problem

2022-04-14 Thread Euler Taveira
On Wed, Apr 13, 2022, at 7:45 AM, Amit Kapila wrote:
> On Mon, Apr 11, 2022 at 12:09 PM wangw.f...@fujitsu.com
>  wrote:
> >
> > So I skip tracking lag during a transaction just like the current HEAD.
> > Attach the new patch.
> >
> 
> Thanks, please find the updated patch where I have slightly modified
> the comments.
> 
> Sawada-San, Euler, do you have any opinion on this approach? I
> personally still prefer the approach implemented in v10 [1] especially
> due to the latest finding by Wang-San that we can't update the
> lag-tracker apart from when it is invoked at the transaction end.
> However, I am fine if we like this approach more.
It seems v15 is simpler and less error prone than v10. v10 has a mix of
OutputPluginUpdateProgress() and the new function update_progress(). v10 also
calls update_progress() for every change action in pgoutput_change(). That is
not a good approach for maintainability -- new changes like sequences will need
extra calls. However, as you mentioned, it should handle the lag tracking case.

Both patches change OutputPluginUpdateProgress() so they cannot be backpatched.
Are you planning to backpatch it? If so, the boolean variable (last_write or
end_xacts depending on which version you are considering) could be added to
LogicalDecodingContext. (You should probably consider this approach for
skipped_xact too.)

+ * For a large transaction, if we don't send any change to the downstream for a
+ * long time then it can timeout. This can happen when all or most of the
+ * changes are either not published or got filtered out.

We should probably mention that "long time" means wal_receiver_timeout on the
subscriber.

+* change as that can have overhead. Testing reveals that there is no
+* noticeable overhead in doing it after continuously processing 100 or so
+* changes.

Tests revealed that ...

+* We don't have a mechanism to get the ack for any LSN other than end xact
+* lsn from the downstream. So, we track lag only for end xact lsn's.

s/lsn/LSN/ and s/lsn's/LSNs/

I would say "end of transaction LSN".

+ * If too many changes are processed then try to send a keepalive message to
+ * receiver to avoid timeouts.

In logical replication, if too many changes are processed then try to send a
keepalive message. It might avoid a timeout in the subscriber.

Does this same issue occur for long transactions? I mean keep a long
transaction open and execute thousands of transactions.

BEGIN;
INSERT INTO foo (a) VALUES(1);
-- wait a few hours while executing 10^x transactions
INSERT INTO foo (a) VALUES(2);
COMMIT;


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: PG DOCS - logical replication filtering

2022-04-13 Thread Euler Taveira
On Wed, Apr 13, 2022, at 12:24 AM, Peter Smith wrote:
> PSA patch v10 which addresses the remaining review comments from Euler [1]
Looks good to me.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: PG DOCS - logical replication filtering

2022-04-12 Thread Euler Taveira
On Tue, Apr 12, 2022, at 5:30 AM, Peter Smith wrote:
> Not changed. Because in fact, I copied most of this sentence
> (including the uppercase, "operations", "and/or") from existing
> documentation [1]
> e.g. see "The tables added to a publication that publishes UPDATE
> and/or DELETE operations must ..."
> [1] https://www.postgresql.org/docs/current/sql-createpublication.html
Hmm. You are checking the Notes. I'm referring to the publish parameter. IMO
this sentence should use operations in lowercase letters too because even if you
create it with uppercase letters, Postgres will provide a lowercase word when
you dump it.

> Yeah, I know the information is the same in the summary and in the
> text. Personally, I find big slabs of technical text difficult to
> digest, so I'd have to spend 5 minutes of careful reading and drawing
> the exact same summary on a piece of paper anyway just to visualize
> what the text says. The summary makes it easy to understand at a
> glance. But I have modified the summary in [v9] to remove the "case"
> and to add other column headings as suggested.
Isn't it better to use a table instead of synopsis?

> Not changed. The readers of this docs page are all users who will be
> familiar with the filter expressions, so I felt the "OR'ed together"
> part will be perfectly clear to the intended audience.
If you want to keep it, change it to "ORed". It is used in indices.sgml. Let's
keep the consistency.

> But I did not understand your point about “If, for some reason,
> someone decided to change it, this section will contain obsolete
> information”, because IIUC that will be equally true for both the
> explanation and the output, so I did not understand why you say "psql
> output is fine, the explanation is not". It is the business of this
> documentation to help the user to know how and where they can find the
> row filter information they may need to know.
You are describing a psql command here. My point is to keep the psql explanation
in the psql section. This section describes the row filter feature. If we start
describing features in other sections, we will have outdated information when
the referred feature is changed and someone fails to find all references. I tend
to concentrate detailed explanations in the feature section. If I have to add
links in other sections, I use "See Section 1.23 for details ...".

> Not changed. The publisher and subscriber programlistings are always
> separated. If you are looking at the rendered HTML I think it is quite
> clear that one is at the publisher and one is at the subscriber. OTOH,
> if we omitted creating the tables on the subscriber then I think that
> really would cause some confusion.
The difference is the extra space. By default, the CSS does not include a border
for programlisting. That's why I complained about it. I noticed that the website
CSS includes one. However, the PDF will not include the border. I would add a
separate description for the subscriber just to be clear.

One last suggestion: you are using identifiers in uppercase letters but
"primary key" is in lowercase.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: PG DOCS - logical replication filtering

2022-04-11 Thread Euler Taveira
;s" x "p"). The publication and subscription definitions are fine there.

I think reusing the same tables and publication introduces complexity.
Shouldn't we just use different tables and publication to provide an "easy"
example? It would avoid DROP PUBLICATION, ALTER SUBSCRIPTION and TRUNCATE.

> Do the inserts same as before.

We should indicate the node (publisher) to be clear.

[1] https://www.postgresql.org/docs/devel/sql-createpublication.html


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Support logical replication of DDLs

2022-04-11 Thread Euler Taveira
On Mon, Apr 11, 2022, at 2:00 AM, Amit Kapila wrote:
> On Thu, Apr 7, 2022 at 3:46 PM Amit Kapila  wrote:
> >
> > On Wed, Mar 23, 2022 at 10:39 AM Japin Li  wrote:
> >
> > 2. For DDL replication, do we need to wait for a consistent point of
> > snapshot? For DMLs, that point is a convenient point to initialize
> > replication from, which is why we export a snapshot at that point,
> > which is used to read normal data. Do we have any similar needs for
> > DDL replication?
> >
> 
> I have thought a bit more about this and I think we need to build the
> snapshot for DML replication as we need to read catalog tables to
> decode the corresponding WAL but it is not clear to me if we have a
> similar requirement for DDL replication. If the catalog access is
> required then it makes sense to follow the current snapshot model,
> otherwise, we may need to think differently for DDL replication.
> 
> One more related point is that for DML replication, we do ensure that
> we copy the entire data of the table (via initial sync) which exists
> even before the publication for that table exists, so do we want to do
> something similar for DDLs? How do we sync the schema of the table
> before the user has defined the publication? Say the table has been
> created before the publication is defined and after that, there are
> only Alter statements, so do we expect, users to create the table on
> the subscriber and then we can replicate the Alter statements? And
> even if we do that it won't be clear which Alter statements will be
> replicated after publication is defined especially if those Alters
> happened concurrently with defining publications?
The *initial* DDL replication is a different problem from DDL replication. The
former requires a snapshot to read the current catalog data and build a CREATE
command as part of the subscription process. The subsequent DDLs on that object
will be handled by a different approach that is being discussed here.

I'm planning to work on the initial DDL replication. I'll open a new thread as
soon as I write a design for it. Just as an example, the pglogical approach is
to use pg_dump behind the scenes to provide the schema [1]. It is a reasonable
approach, but an optimal solution should be an API to provide the initial DDL
commands. I mean the main point of this feature is to have an API to create an
object that logical replication can use for initial schema synchronization.
This "DDL to create an object" was already discussed in the past [2].


[1] 
https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_sync.c#L942
[2] https://www.postgresql.org/message-id/4E69156E.5060509%40dunslane.net


--
Euler Taveira
EDB   https://www.enterprisedb.com/


merge documentation fix

2022-04-01 Thread Euler Taveira
Hi,

While inspecting the MERGE documentation, I noticed that there is an extra
semicolon in one of the examples that shouldn't be there.

diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
index c547122c9b..ac1c0a83dd 100644
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -596,7 +596,7 @@ ON s.winename = w.winename
WHEN NOT MATCHED AND s.stock_delta > 0 THEN
   INSERT VALUES(s.winename, s.stock_delta)
WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
-  UPDATE SET stock = w.stock + s.stock_delta;
+  UPDATE SET stock = w.stock + s.stock_delta
WHEN MATCHED THEN
   DELETE;



--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Logical replication timeout problem

2022-03-31 Thread Euler Taveira
On Thu, Mar 31, 2022, at 11:27 PM, Amit Kapila wrote:
> This is exactly our initial analysis and we have tried a patch on
> these lines and it has a noticeable overhead. See [1]. Calling this
> for each change or each skipped change can bring noticeable overhead
> that is why we decided to call it after a certain threshold (100) of
> skipped changes. Now, surely as mentioned in my previous reply we can
> make it generic such that instead of calling this (update_progress
> function as in the patch) for skipped cases, we call it always. Will
> that make it better?
That's what I have in mind but using a different approach.

> > The functions CreateInitDecodingContext and CreateDecodingContext receives 
> > the
> > update_progress function as a parameter. These functions are called in 2
> > places: (a) streaming replication protocol (CREATE_REPLICATION_SLOT) and (b)
> > SQL logical decoding functions (pg_logical_*_changes). Case (a) uses
> > WalSndUpdateProgress as a progress function. Case (b) does not have one 
> > because
> > it is not required -- local decoding/communication. There is no custom 
> > update
> > progress routine for each output plugin which leads me to the question:
> > couldn't we encapsulate the update progress call into the callback 
> > functions?
> >
> 
> Sorry, I don't get your point. What exactly do you mean by this?
> AFAIS, currently we call this output plugin API in pgoutput functions
> only, do you intend to get it invoked from a different place?
It seems I didn't make myself clear. The callbacks I'm referring to are the
*_cb_wrapper functions. After every ctx->callbacks.foo_cb() call inside a
*_cb_wrapper() function, we would have something like:

if (ctx->progress & PGOUTPUT_PROGRESS_FOO)
NewUpdateProgress(ctx, false);

The NewUpdateProgress function would contain logic similar to the
update_progress() from the proposed patch. (A different function name is used
here just to avoid confusion.)

The output plugin is responsible for setting ctx->progress with the callback
flags (for example, PGOUTPUT_PROGRESS_CHANGE for change_cb()) for which we
would like to run NewUpdateProgress.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Logical replication timeout problem

2022-03-31 Thread Euler Taveira
On Thu, Mar 31, 2022, at 9:24 AM, Masahiko Sawada wrote:
> The patch basically looks good to me. But the only concern to me is
> that once we get the patch committed, we will have to call
> update_progress() at all paths in callbacks that process changes.
> Which seems poor maintainability.
I didn't like the current fix for the same reason. We need a robust feedback
system for logical replication. We had this discussion in the "skip empty
transactions" thread [1].

> On the other hand, possible another solution would be to add a new
> callback that is called e.g., every 1000 changes so that walsender
> does its job such as timeout handling while processing the decoded
> data in reorderbuffer.c. The callback is set only if the walsender
> does logical decoding, otherwise NULL. With this idea, other plugins
> will also be able to benefit without changes. But I’m not really sure
> it’s a good design, and adding a new callback introduces complexity.
No new callback is required.

In the current code, each output plugin callback is responsible for calling
OutputPluginUpdateProgress. It is up to the output plugin author to add calls to
this function. The lack of a call in a callback might cause issues like the one
described in the initial message.

The functions CreateInitDecodingContext and CreateDecodingContext receive the
update_progress function as a parameter. These functions are called in 2 places:
(a) streaming replication protocol (CREATE_REPLICATION_SLOT) and (b) SQL logical
decoding functions (pg_logical_*_changes). Case (a) uses WalSndUpdateProgress as
a progress function. Case (b) does not have one because it is not required --
local decoding/communication. There is no custom update progress routine for
each output plugin, which leads me to the question: couldn't we encapsulate the
update progress call into the callback functions? If so, we could have an output
plugin parameter to inform which callbacks we would like to call the update
progress routine for. This would simplify the code, make it less error prone and
wouldn't impose a burden on maintainability.

[1] 
https://www.postgresql.org/message-id/20200309183018.tzkzwu635sd366ej%40alap3.anarazel.de


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical replication restrictions

2022-03-23 Thread Euler Taveira
On Mon, Mar 21, 2022, at 10:09 PM, Euler Taveira wrote:
> On Mon, Mar 21, 2022, at 10:04 PM, Andres Freund wrote:
>> On 2022-03-20 21:40:40 -0300, Euler Taveira wrote:
>> > On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
>> > > Long time, no patch. Here it is. I will provide documentation in the next
>> > > version. I would appreciate some feedback.
>> > This patch is broken since commit 
>> > 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33. I
>> > rebased it.
>> 
>> This fails tests, specifically it seems psql crashes:
>> https://cirrus-ci.com/task/6592281292570624?logs=cores#L46
> Yeah. I forgot to test this patch with cassert before sending it. :( I didn't
> send a new patch because there is another issue (with int128) that I'm
> currently reworking. I'll send another patch soon.
Here is another version after rebasing it. In this version I fixed the psql
issue and rewrote the interval_to_ms function.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 6718e96d3682af9094c02b0e308a8c814148a197 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v3] Time-delayed logical replication subscriber

Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (especially to fix
errors that might cause data loss).

If the subscriber sets the min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.

The delay is calculated between the WAL time stamp and the current time
on the subscriber.

The delay occurs only on WAL records for transaction begins. Regular and
prepared transactions are covered. Streamed transactions are not delayed;
support for them should be implemented in future releases.

Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=xtyaxkih5ctm_vm8kjkh7aaitckvmch4rzr...@mail.gmail.com
---
 doc/src/sgml/ref/alter_subscription.sgml   |   5 +-
 doc/src/sgml/ref/create_subscription.sgml  |  33 +
 src/backend/catalog/pg_subscription.c  |   1 +
 src/backend/catalog/system_views.sql   |   4 +-
 src/backend/commands/subscriptioncmds.c|  46 ++-
 src/backend/replication/logical/worker.c   |  61 +
 src/backend/utils/adt/timestamp.c  |  32 +
 src/bin/pg_dump/pg_dump.c  |  13 +-
 src/bin/pg_dump/pg_dump.h  |   1 +
 src/bin/psql/describe.c|  13 +-
 src/include/catalog/pg_subscription.h  |   2 +
 src/include/datatype/timestamp.h   |   2 +
 src/include/utils/timestamp.h  |   2 +
 src/test/regress/expected/subscription.out | 149 -
 src/test/regress/sql/subscription.sql  |  20 +++
 src/test/subscription/t/030_apply_delay.pl |  76 +++
 16 files changed, 389 insertions(+), 71 deletions(-)
 create mode 100644 src/test/subscription/t/030_apply_delay.pl

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ac2db249cb..83e4e352ca 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -205,8 +205,9 @@ ALTER SUBSCRIPTION name RENAME TO <
   information.  The parameters that can be altered
   are slot_name,
   synchronous_commit,
-  binary, streaming, and
-  disable_on_error.
+  binary, streaming,
+  disable_on_error, and
+  min_apply_delay.
  
 

diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index b701752fc9..2bc8deb132 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -302,6 +302,39 @@ CREATE SUBSCRIPTION subscription_name
 

+
+   
+min_apply_delay (integer)
+
+ 
+  By default, the subscriber applies changes as soon as possible. Similar
+  to the physical replication feature
+  (), it may be useful to
+  have a time-delayed copy of data for logical replication. This
+  parameter allows you to delay the application of changes by a
+  specified amount of time. If this value is specified without units,
+  it is taken as milliseconds. The default is zero, adding no delay.
+ 
+ 
+  The delay occurs only after the initial table synchronization. It is
+  possible that the replication delay between publisher and subscriber
+  exceeds the value of this parameter, in which case no delay is added.
+  Note that the delay is calculated between the WAL time stamp as
+  written on publisher and the current time on the subscriber. Delays
+  in logical decoding and in transferring the transaction may reduce the
+  actual wait time. If the system clocks on publisher and subscriber
+  are not synchronized,

Re: logical replication restrictions

2022-03-21 Thread Euler Taveira
On Mon, Mar 21, 2022, at 10:04 PM, Andres Freund wrote:
> On 2022-03-20 21:40:40 -0300, Euler Taveira wrote:
> > On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
> > > Long time, no patch. Here it is. I will provide documentation in the next
> > > version. I would appreciate some feedback.
> > This patch is broken since commit 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33. 
> > I
> > rebased it.
> 
> This fails tests, specifically it seems psql crashes:
> https://cirrus-ci.com/task/6592281292570624?logs=cores#L46
Yeah. I forgot to test this patch with cassert before sending it. :( I didn't
send a new patch because there is another issue (with int128) that I'm
currently reworking. I'll send another patch soon.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Skipping logical replication transactions on subscriber side

2022-03-21 Thread Euler Taveira
On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
> I have fixed all the above comments as per your suggestion in the
> attached. Do let me know if something is missed?
Looks good to me.

> > src/test/subscription/t/029_disable_on_error.pl |  94 --
> > src/test/subscription/t/029_on_error.pl | 183 +++
> >
> > It seems you are removing a test for 
> > 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
> > You should also rename 029_on_error.pl to something else such as 
> > 030_skip_lsn.pl or
> > a generic name 030_skip_option.pl.
> >
> 
> As explained in my previous email, I don't think any change is
> required for this comment but do let me know if you still think so?
Oh, sorry about the noise. I saw mixed tests between the 2 new features and I
was confused if it was intentional or not.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Skipping logical replication transactions on subscriber side

2022-03-20 Thread Euler Taveira
On Thu, Mar 17, 2022, at 3:03 AM, Amit Kapila wrote:
> On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada  wrote:
> >
> > I've attached an updated version patch.
> >
> 
> The patch LGTM. I have made minor changes in comments and docs in the
> attached patch. Kindly let me know what you think of the attached?
> 
> I am planning to commit this early next week (on Monday) unless there
> are more comments/suggestions.
I reviewed this last version and I have a few comments.

+* If the user set subskiplsn, we do a sanity check to make
+* sure that the specified LSN is a probable value.

... user *sets*...

+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+    errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+           LSN_FORMAT_ARGS(remote_lsn))));

Shouldn't we add the LSN to be skipped in the "(LSN)"?

+* Start a new transaction to clear the subskipxid, if not started
+* yet.

It seems it means subskiplsn.

+ * subskipxid in order to inform users for cases e.g., where the user 
mistakenly
+ * specified the wrong subskiplsn.

It seems it means subskiplsn.

+sub test_skip_xact
+{

It seems this function should be named test_skip_lsn. Unless the intention is
to cover other skip options in the future.

src/test/subscription/t/029_disable_on_error.pl |  94 --
src/test/subscription/t/029_on_error.pl | 183 +++

It seems you are removing a test for 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
You should also rename 029_on_error.pl to something else such as 030_skip_lsn.pl
or a generic name 030_skip_option.pl.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical replication restrictions

2022-03-20 Thread Euler Taveira
On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
> Long time, no patch. Here it is. I will provide documentation in the next
> version. I would appreciate some feedback.
This patch is broken since commit 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33. I
rebased it.

I added documentation that explains how this parameter works. I decided to
rename the parameter from apply_delay to min_apply_delay to use the same
terminology as physical replication. IMO the new name makes it clear that there
is no guarantee that we are always x ms behind the publisher. Indeed, due to
processing/transfer time, the delay might be higher than the specified
interval.
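
For example, with the attached patch a subscriber could be created like this
(connection string and names are illustrative):

CREATE SUBSCRIPTION delayed_sub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION mypub
    WITH (min_apply_delay = '8h');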

I refactored the way the delay is applied. The previous patch only covers
regular transactions. This new one also covers prepared transactions. The
current design intercepts the transaction during the first change (at the time
it will start the transaction to apply the changes) and applies the delay before
effectively starting the transaction. The previous patch used
begin_replication_step() as this point. However, to support prepared
transactions I changed the apply_delay signature to accept a timestamp parameter
(because we use another variable to calculate the delay for prepared
transactions -- prepare_time). Hence, the apply_delay() calls moved to other
places -- apply_handle_begin() and apply_handle_begin_prepare().

The new code does not apply the delay in 2 situations:

* STREAM START: streamed transactions might not have commit_time or
  prepare_time set. I'm afraid it is not possible to use the referred variables
  because at STREAM START time we don't have a transaction commit time. The
  protocol could provide a timestamp that indicates when it starts streaming
  the transaction then we could use it to apply the delay. Unfortunately, we
  don't have it. Having said that, this new patch does not apply the delay to
  streamed transactions.
* non-transactional messages: the delay could be applied to non-transactional
  messages too. They are sent independently of the transaction that contains
  them. Since logical replication does not send messages to the subscriber,
  this is not an issue. However, consumers that use pgoutput and want to
  implement a delay will require it.

I'm still looking for a way to support streamed transactions without much
surgery into the logical replication protocol.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 9ad783c1259aed9bef81877265c648aae84ed5d8 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v2] Time-delayed logical replication subscriber

Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (especially to fix
errors that might cause data loss).

If the subscriber sets the min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.

The delay is calculated between the WAL time stamp and the current time
on the subscriber.

The delay occurs only on WAL records for transaction begins. Regular and
prepared transactions are covered. Streamed transactions are not delayed;
support for them should be implemented in future releases.

Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=xtyaxkih5ctm_vm8kjkh7aaitckvmch4rzr...@mail.gmail.com
---
 doc/src/sgml/ref/alter_subscription.sgml   |   5 +-
 doc/src/sgml/ref/create_subscription.sgml  |  33 +
 src/backend/catalog/pg_subscription.c  |   1 +
 src/backend/catalog/system_views.sql   |   4 +-
 src/backend/commands/subscriptioncmds.c|  46 ++-
 src/backend/replication/logical/worker.c   |  61 +
 src/backend/utils/adt/timestamp.c  |   8 ++
 src/bin/pg_dump/pg_dump.c  |  13 +-
 src/bin/pg_dump/pg_dump.h  |   1 +
 src/bin/psql/describe.c|  11 +-
 src/include/catalog/pg_subscription.h  |   2 +
 src/include/utils/timestamp.h  |   2 +
 src/test/regress/expected/subscription.out | 141 +
 src/test/regress/sql/subscription.sql  |  20 +++
 src/test/subscription/t/030_apply_delay.pl |  76 +++
 15 files changed, 358 insertions(+), 66 deletions(-)
 create mode 100644 src/test/subscription/t/030_apply_delay.pl

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 58b78a94ea..b764f8eba8 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -204,8 +204,9 @@ ALTER SUBSCRIPTION name RENAME TO <
   information.  The parameters that can be altered
   are slot_name,
   synchronous_commit,
-  binary, streaming, and
-  disable_on_error.
+  binary, streaming,
+  disable_on_error, and
+  min_apply_delay.
  
 

diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index b701752fc9..2bc8deb1

Re: PG DOCS - logical replication filtering

2022-03-10 Thread Euler Taveira
On Fri, Mar 4, 2022, at 12:41 AM, Peter Smith wrote:
> PSA patch v3 to address all comments received so far.
Peter,

I started reviewing this documentation for row filter and I noticed that I was
changing too many lines to submit a patch on top of it. Hence, I'm attaching a
new version based on this one.

Here are some of the changes that I made:

* links: I renamed the ids to use "logical-replication-row-filter" instead of
  "logical-replication-rf" because it is used in the anchors. A compound word
  is better than an abbreviation.
* titles: I changed all titles because of some stylish issue (words are
  generally capitalized) or because it reads better.
* sections: I moved the "Restrictions" section to the top because it seems more
  important than the other sections. I removed the "psql" section. The psql
  commands are shown in the "Examples" section so I don't think we should
  provide a section for it.
* sentences: I expanded several sentences to provide details about the specific
  topic. I also reordered some sentences and removed some duplicated sentences.
* replica identity: I added a link to replica identity. It is a key concept for
  row filter.
* transformations: I added a few sentences explaining when/why a transformation
  is required. I removed the "Cases" part (same as in the source code) because
  it is redundant with the new sentences.
* partitioned table: the title should be _partitioned_ table. I replaced the
  bullets with sentences in the same paragraph.
* initial data sync: I removed "subscriber" from the section title. When you
  say "initial data synchronization" it seems clear we're talking about the
  subscriber. I also added a sentence saying why pre-15 does not consider row
  filters.
* combining row filters: I renamed the section and decided to remove "for the
  same table". When reading the first sentences from this section, it is clear
  that row filtering is per table. So if you are combining multiple row
  filters, it should be for the same table. I also added a sentence saying why
  some clauses make the row filter irrelevant.
* examples: I combined the psql commands that show row filter information
  together (\dRp+ and \d). I changed the connection string to avoid "localhost".
  Why? It does not seem like a separate service (there is no port) and setting
  up pub/sub in the same cluster requires additional steps. It is better to
  illustrate different clusters (at least it seems so since we don't provide
  details about the publisher). I changed a value in an UPDATE because both
  UPDATEs used 999.

We could probably reduce the number of rows in the example but I didn't bother
to remove them.

It seems we can remove some sentences from the CREATE PUBLICATION because we
have a new section that explains all of it. I think the link that was added by
this patch is sufficient.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From aea0c822489fbdeba14141e215f76e3777521bb2 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Thu, 10 Mar 2022 17:38:24 -0300
Subject: [PATCH v4] doc: new section for row filter.

---
 doc/src/sgml/logical-replication.sgml| 444 +++
 doc/src/sgml/ref/create_publication.sgml |   2 +
 2 files changed, 446 insertions(+)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 82326c3901..38ac77b6e1 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -118,6 +118,8 @@
any combination of INSERT, UPDATE,
DELETE, and TRUNCATE, similar to how triggers are fired by
particular event types.  By default, all operation types are replicated.
+   (Row filters have no effect for TRUNCATE. See 
+   ).
   
 
   
@@ -317,6 +319,448 @@
   
  
 
+ 
+  Row Filter
+
+  
+   The published tables replicate all data to the subscribers. The replicated
+   data can be selected using a row filter. The user
+   might choose to use row filters for behavioral, security or performance
+   reasons.
+  
+
+  
+   If a published table sets a row filter, a row is replicated if its data
+   satisfies the row filter expression. It means if row filters are used in a
+   set of tables, data will be partially replicated.
+  
+
+  
+   The row filter is defined per table. Use a WHERE clause
+   after the table name for each published table that requires data to be
+   filtered out. The WHERE clause must be enclosed by
+   parentheses. See  for details.
+  
+
+  
+   Row Filter Rules
+
+   
+Row filter is applied before publishing the changes.
+   
+
+   
+If the row filter evaluates to false or
+NULL then the row is not replicated.
+   
+
+   
+The WHERE clause expression is evaluated with the same
+role used for the replication connection. It means the role specified in the
+CONNECTION clause of the .
+   
+
+   
+Row filter has no effect for TRUNCATE

Re: Add the replication origin name and commit-LSN to logical replication worker errcontext

2022-03-04 Thread Euler Taveira
On Fri, Mar 4, 2022, at 2:54 AM, Amit Kapila wrote:
> The LSN of the transaction that contains the change violating the
> constraint and the replication origin name can be found from the
> server log (LSN 0/14C0378 and replication origin
> pg_16395 in the above case).  To skip the
> transaction, the subscription needs to be disabled temporarily by
> ALTER SUBSCRIPTION ... DISABLE first. Then, the
> transaction can be skipped by calling the  linkend="pg-replication-origin-advance">
> pg_replication_origin_advance() function
> with the node_name (i.e.,
> pg_16395) and the next LSN of the commit LSN (i.e.,
> LSN 0/14C0379).
You could also add:

After that the replication can be resumed by ALTER SUBSCRIPTION ...
ENABLE.

Let's provide complete instructions.
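
For instance (using the subscription name mysub as a placeholder, with the
origin name and LSN from the quoted paragraph), the complete sequence would be:

ALTER SUBSCRIPTION mysub DISABLE;
SELECT pg_replication_origin_advance('pg_16395', '0/14C0379');
ALTER SUBSCRIPTION mysub ENABLE;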


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2022-03-03 Thread Euler Taveira
On Thu, Mar 3, 2022, at 7:47 AM, Amit Kapila wrote:
> LGTM. I'll push this tomorrow unless Tomas or Euler feels otherwise.
Sounds good to me.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2022-03-02 Thread Euler Taveira
On Wed, Mar 2, 2022, at 8:45 AM, Tomas Vondra wrote:
> While working on the column filtering patch, which touches about the
> same places, I noticed two minor gaps in testing:
> 
> 1) The regression tests do perform multiple ALTER PUBLICATION commands,
> tweaking the row filter. But there are no checks the row filter was
> actually modified / stored in the catalog. It might be just thrown away
> and no one would notice.
The test that the row filter was modified is available in a previous section. The
one that you modified (0001) is testing the supported objects.

153 ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
154 \dRp+ testpub5
155 ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
156 \dRp+ testpub5
157 -- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
158 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
159 \dRp+ testpub5

IIRC this test was written before adding the row filter information to psql
output. We could add \d+ testpub_rf_tbl3 before and after the modification.
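
Something like this sketch (assuming the row filter shows up in the table's
\d+ "Publications" footer):

\d+ testpub_rf_tbl3
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
\d+ testpub_rf_tbl3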

> 2) There are no pg_dump tests.
WFM.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical replication restrictions

2022-03-01 Thread Euler Taveira
On Tue, Mar 1, 2022, at 3:27 AM, osumi.takami...@fujitsu.com wrote:
> $ git am v1-0001-Time-delayed-logical-replication-subscriber.patch
I generally use -3 to fall back on 3-way merge. Doesn't it work for you?


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical replication restrictions

2022-02-28 Thread Euler Taveira
On Wed, Sep 22, 2021, at 1:57 PM, Euler Taveira wrote:
> On Wed, Sep 22, 2021, at 1:18 AM, Amit Kapila wrote:
>> On Tue, Sep 21, 2021 at 4:21 PM Marcos Pegoraro  wrote:
>>> No, I´m talking about that configuration you can have on standby servers
>>> recovery_min_apply_delay = '8h'
>>> 
>> 
>> oh okay, I think this can be useful in some cases where we want to avoid 
>> data loss similar to its use for physical standby. For example, if the user 
>> has by mistake truncated the table (or deleted some required data) on the 
>> publisher, we can always recover it from the subscriber if we have such a feature.
>> 
>> Having said that, I am not sure if we can call it a restriction. It is more 
>> of a TODO kind of thing. It doesn't sound advisable to me to keep growing 
>> the current Restrictions page [1].
> It is a new feature. pglogical supports it and it is useful for delayed
> secondary server and if, for some business reason, you have to delay when data
> is available. There might be other use cases but these are the ones I 
> regularly
> heard from customers.
> 
> BTW, I have a WIP patch for this feature. I didn't have enough time to post it
> because it lacks documentation and tests. I'm planning to do it as soon as 
> this
> CF ends.
Long time, no patch. Here it is. I will provide documentation in the next
version. I would appreciate some feedback.
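
For reference, usage would presumably look like this (the option name and
millisecond unit are taken from the patch; subscription, connection and
publication names are placeholders):

CREATE SUBSCRIPTION sub_delayed
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub
    WITH (apply_delay = 3600000);  -- apply changes 1 hour after commit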


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 9635fec1a031b82ec5d67cdfe16aa1f553ffa936 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v1] Time-delayed logical replication subscriber

Similar to physical replication, a time-delayed copy of the data is
useful for some scenarios (especially to fix errors that might cause data
loss).

If the subscriber sets the apply_delay parameter, the logical replication
worker will delay the transaction commit for apply_delay milliseconds.

Discussion: https://postgr.es/m/CAB-JLwYOYwL=xtyaxkih5ctm_vm8kjkh7aaitckvmch4rzr...@mail.gmail.com
---
 src/backend/catalog/pg_subscription.c  |  1 +
 src/backend/catalog/system_views.sql   |  3 +-
 src/backend/commands/subscriptioncmds.c| 44 +-
 src/backend/replication/logical/worker.c   | 48 +++
 src/backend/utils/adt/timestamp.c  |  8 ++
 src/bin/pg_dump/pg_dump.c  | 16 +++-
 src/bin/pg_dump/pg_dump.h  |  1 +
 src/bin/psql/describe.c|  8 +-
 src/include/catalog/pg_subscription.h  |  3 +
 src/include/utils/timestamp.h  |  2 +
 src/test/regress/expected/subscription.out | 96 +++---
 src/test/subscription/t/029_apply_delay.pl | 71 
 12 files changed, 248 insertions(+), 53 deletions(-)
 create mode 100644 src/test/subscription/t/029_apply_delay.pl

diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..0788384579 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
 	sub->binary = subform->subbinary;
 	sub->stream = subform->substream;
 	sub->twophasestate = subform->subtwophasestate;
+	sub->applydelay = subform->subapplydelay;
 
 	/* Get conninfo */
 	datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3cb69b1f87..1cc0d86f2e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
 -- All columns of pg_subscription except subconninfo are publicly readable.
 REVOKE ALL ON pg_subscription FROM public;
 GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
-  substream, subtwophasestate, subslotname, subsynccommit, subpublications)
+  substream, subtwophasestate, subslotname, subsynccommit,
+  subapplydelay, subpublications)
 ON pg_subscription TO public;
 
 CREATE VIEW pg_stat_subscription_workers AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3ef6607d24..19916f04a8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -46,6 +46,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
+#include "utils/timestamp.h"
 
 /*
  * Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -61,6 +62,7 @@
 #define SUBOPT_BINARY0x0080
 #define SUBOPT_STREAMING			0x0100
 #define SUBOPT_TWOPHASE_COMMIT		0x0200
+#define SUBOPT_APPLY_DELAY			0x0400
 
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
@@

Re: Commitfest manager for 2022-03

2022-02-27 Thread Euler Taveira
On Sat, Feb 26, 2022, at 9:37 PM, Justin Pryzby wrote:
> |https://commitfest.postgresql.org/37/2906/
> |Row filtering for logical replication
> If I'm not wrong, this is merged and should be closed?
I think Amit forgot to mark it as committed. Done.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Design of pg_stat_subscription_workers vs pgstats

2022-02-25 Thread Euler Taveira
On Fri, Feb 25, 2022, at 11:52 AM, Greg Stark wrote:
> On Tue, 25 Jan 2022 at 01:32, Andres Freund  wrote:
> >
> > Hi,
> >
> > I was looking the shared memory stats patch again.
> 
> Can you point me to this thread? I looked for it but couldn't find it.
https://postgr.es/m/20180629.173418.190173462.horiguchi.kyot...@lab.ntt.co.jp


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical decoding and replication of sequences

2022-02-23 Thread Euler Taveira
On Wed, Feb 23, 2022, at 4:18 PM, Tomas Vondra wrote:
> On 2/23/22 18:33, Euler Taveira wrote:
> > On Wed, Feb 23, 2022, at 1:07 PM, Tomas Vondra wrote:
> >> Maybe, but I don't think it's very common to have that many
> >> schemas added to the same publication. And it probably does not
> >> make much difference whether you have 1000 or 2000 items in the
> >> list - either both are acceptable or unacceptable, I think.
> >
> > Wouldn't it confuse users? Hey, duplicate publication. How? Wait.
> > Doh.
> > 
> 
> I don't follow. Duplicate publications? This talks about rows in
> pg_publication_namespace, not pg_publication.
I was referring to

  Publications:
  "testpub_schemas" (sequences)
  "testpub_schemas" (tables)

versus

  Publications:
  "testpub_schemas" (tables, sequences)

I prefer the latter.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: logical decoding and replication of sequences

2022-02-23 Thread Euler Taveira
On Wed, Feb 23, 2022, at 1:07 PM, Tomas Vondra wrote:
> Maybe, but I don't think it's very common to have that many schemas
> added to the same publication. And it probably does not make much
> difference whether you have 1000 or 2000 items in the list - either both
> are acceptable or unacceptable, I think.
Wouldn't it confuse users? Hey, duplicate publication. How? Wait. Doh.

> I think just hard-coding this into CreatePublicationStmt would work, but
> it'll be an issue once/if we start adding more options. I'm not sure if
> it makes sense to replicate other relkinds, but maybe DDL?
Materialized view? As you mentioned DDL, maybe we can use the CREATE
PUBLICATION syntax to select which DDL commands we want to replicate.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Frontend error logging style

2022-02-23 Thread Euler Taveira
On Wed, Feb 23, 2022, at 12:30 PM, Tom Lane wrote:
> Updating the patch is going to be a bit tedious, so I'm not
> going to do it without buy-in that this solution would be
> okay to commit.  Any objections?
No. I was confused by the multiple log functions spread in the client programs
while I was working on pg_subscriber [1]. Your proposal looks good to me.

[1] https://postgr.es/m/5ac50071-f2ed-4ace-a8fd-b892cffd33eb%40www.fastmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: List of all* PostgreSQL EXTENSIONs in the world

2022-02-23 Thread Euler Taveira
On Wed, Feb 23, 2022, at 6:00 AM, Joel Jacobson wrote:
> On Fri, Feb 11, 2022, at 04:46, Noah Misch wrote:
> > How did you make the list?  (I'd imagine doing it by searching for
> > repositories containing evidence like \bpgxs\b matches.)
> 
> Searching Github for repos with a *.control file in the root dir and a 
> Makefile containing ^PGXS
Interesting. What's an extension? It is something that contains user-defined
objects. It would be good if your list was expanded to contain addons (modules)
that are basically plugins that don't create additional objects in the database,
e.g. an output plugin or a module that uses any hooks (such as auth_delay).
They generally don't provide a control file (for example, wal2json). I don't
know if you can rely only on the PGXS check because there are client programs
that use the PGXS infrastructure to build.

> Hmm, now that you say it, maybe the ^PGXS regex should be case-insensitive,
> if pgxs can be written in e.g. lower case?
Makefile variable names are case-sensitive. You cannot write pgxs or PgXs; it
should be PGXS.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: speed up a logical replica setup

2022-02-21 Thread Euler Taveira
On Mon, Feb 21, 2022, at 8:28 PM, Andres Freund wrote:
> I think the system identifier should also be changed, otherwise you can way
> too easily get into situations trying to apply WAL from different systems to
> each other. Not going to end well, obviously.
Good point.

> > This tool does not take a base backup. It can certainly be included later.
> > There is already a tool to do it: pg_basebackup.
> 
> It would make sense to allow to call pg_basebackup from the new tool. Perhaps
> with a --pg-basebackup-parameters or such.
Yeah. I'm planning to do that in the near future. There are a few questions on
my mind. Should we call pg_basebackup directly (like pglogical_create_subscriber
does) or use the base backup machinery to obtain the backup? If we choose the
former, it should probably sanitize --pg-basebackup-parameters to allow only a
subset of the command-line options (?). AFAICS the latter requires some
refactoring in the pg_basebackup code -- e.g. expose at least one function
(BaseBackup?) that accepts a struct of command-line options as a parameter and
returns success/failure. Another possibility is to issue a plain BASE_BACKUP
command via the replication protocol. The disadvantages are: (a) it could
duplicate code and (b) it might require maintenance if new options are added to
the BASE_BACKUP command.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2022-02-21 Thread Euler Taveira
On Thu, Feb 17, 2022, at 3:36 AM, Amit Kapila wrote:
> As there is a new version, I would like to wait for a few more days
> before committing. I am planning to commit this early next week (by
> Tuesday) unless others or I see any more things that can be improved.
Amit, I don't have additional comments or suggestions. Let's move on. Next
topic. :-)

> I would once like to mention the replica identity handling of the
> patch. Right now, (on HEAD) we are not checking the replica identity
> combination at DDL time, they are checked at execution time in
> CheckCmdReplicaIdentity(). This patch follows the same scheme and
> gives an error at the time of update/delete if the table publishes
> update/delete and the publication(s) has a row filter that contains
> non-replica-identity columns. We had earlier thought of handling it at
> DDL time but that won't follow the existing scheme and has a lot of
> complications as explained in emails [1][2]. Do let me know if you see
> any problem here?
IMO it is not an issue that this patch needs to solve. The conclusion of the
discussion about checking the RI at DDL time vs execution time is that:

* the current patch just follows the same pattern used in the current logical
  replication implementation;
* it is easier to check during execution time (a central point) versus a lot of
  combinations for DDL commands;
* the check during DDL time might eventually break if new subcommands are
  added;
* the execution time does not have the maintenance burden imposed by new DDL
  subcommands;
* we might change the RI check to execute at DDL time if the current
  implementation imposes a significant penalty in certain workloads.

Again, it is material for another patch.

Thanks for taking care of a feature that has been discussed for 4 years [1].

[1] 
https://www.postgresql.org/message-id/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/


speed up a logical replica setup

2022-02-21 Thread Euler Taveira
Hi,

Logical replication has been used for migrations with minimal downtime. However,
if you are dealing with a big database, the amount of required resources (disk
-- due to WAL retention) increases as the backlog (WAL) increases. Unless you
have a generous amount of resources and can wait for a long period of time until
the new replica catches up, creating a logical replica is impracticable on
large databases.

The general idea is to create and convert a physical replica or a base backup
(archived WAL files available) into a logical replica. The initial data copy
and catchup tends to be faster on a physical replica. This technique has been
successfully used in pglogical_create_subscriber [1].

A new tool called pg_subscriber does this conversion and is tightly integrated
with Postgres.

DESIGN

The conversion requires 8 steps.

1. Check if the target data directory has the same system identifier as the
source data directory.
2. Stop the target server if it is running as a standby server. (Modifying
recovery parameters requires a restart.)
3. Create one replication slot per specified database on the source server. One
additional replication slot is created at the end to get the consistent LSN
(This consistent LSN will be used as (a) a stopping point for the recovery
process and (b) a starting point for the subscriptions).
4. Write recovery parameters into the target data directory and start the
target server (Wait until the target server is promoted).
5. Create one publication (FOR ALL TABLES) per specified database on the source
server.
6. Create one subscription per specified database on the target server (Use
replication slot and publication created in a previous step. Don't enable the
subscriptions yet).
7. Set the replication progress to the consistent LSN obtained in a
previous step.
8. Enable the subscription for each specified database on the target server.
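
To illustrate steps 5-8, here is a rough SQL sketch of what the tool would run
per database (all names and LSNs are placeholders; the tool builds these
commands internally):

-- step 5, on the source server:
CREATE PUBLICATION pg_subscriber_pub FOR ALL TABLES;

-- step 6, on the target server, reusing the slot created in step 3:
CREATE SUBSCRIPTION pg_subscriber_sub
    CONNECTION 'host=source dbname=mydb'
    PUBLICATION pg_subscriber_pub
    WITH (create_slot = false, slot_name = 'pg_subscriber_slot',
          copy_data = false, enabled = false);

-- step 7, set the replication progress to the consistent LSN:
SELECT pg_replication_origin_advance('pg_16390', '0/3000148');

-- step 8:
ALTER SUBSCRIPTION pg_subscriber_sub ENABLE;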

This tool does not take a base backup. It can certainly be included later.
There is already a tool to do it: pg_basebackup.

There is a --subscriber-conninfo option to specify the subscriber connection
string; however, we could remove it since this tool runs on the subscriber and
we can build the connection string ourselves.

NAME

I'm not sure about the proposed name. I came up with this one because it is not
so long. The most recently added tools use a pg_ prefix, a verb (action), and an
object. pg_initsubscriber and pg_createsubscriber are names I considered, but
I'm not excited about them.

DOCUMENTATION

It is available and describes this tool.

TESTS

Basic tests are included, but more tests are needed to fully exercise this tool.


Comments?

[1] https://github.com/2ndQuadrant/pglogical


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 5827245b8a06f906f603100c8fb27be533a6c0a1 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Fri, 11 Feb 2022 03:05:58 -0300
Subject: [PATCH v1 1/2] Move readfile() and free_readfile() to file_utils.h

Allow these functions to be used by other binaries.

There is a static function called readfile() in initdb.c too. Rename it
to avoid conflicting with the exposed function.
---
 src/bin/initdb/initdb.c |  26 +++
 src/bin/pg_ctl/pg_ctl.c | 122 +---
 src/common/file_utils.c | 119 +++
 src/include/common/file_utils.h |   3 +
 4 files changed, 136 insertions(+), 134 deletions(-)

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 97f15971e2..a4cbfeb954 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -243,8 +243,8 @@ static char **replace_token(char **lines,
 #ifndef HAVE_UNIX_SOCKETS
 static char **filter_lines_with_token(char **lines, const char *token);
 #endif
-static char **readfile(const char *path);
-static void writefile(char *path, char **lines);
+static char **read_text_file(const char *path);
+static void write_text_file(char *path, char **lines);
 static FILE *popen_check(const char *command, const char *mode);
 static char *get_id(void);
 static int	get_encoding_id(const char *encoding_name);
@@ -453,7 +453,7 @@ filter_lines_with_token(char **lines, const char *token)
  * get the lines from a text file
  */
 static char **
-readfile(const char *path)
+read_text_file(const char *path)
 {
 	char	  **result;
 	FILE	   *infile;
@@ -500,7 +500,7 @@ readfile(const char *path)
  * so that the resulting configuration files are nicely editable on Windows.
  */
 static void
-writefile(char *path, char **lines)
+write_text_file(char *path, char **lines)
 {
 	FILE	   *out_file;
 	char	  **line;
@@ -1063,7 +1063,7 @@ setup_config(void)
 
 	/* postgresql.conf */
 
-	conflines = readfile(conf_file);
+	conflines = read_text_file(conf_file);
 
 	snprintf(repltok, sizeof(repltok), "max_connections = %d", n_connections);
 	conflines = replace_token(conflines, "#max_connections = 100", repltok);
@@ -1214,7 +1214,7 @@ setup_config(void)
 
 	snprintf(path, sizeof(path), "%s/postgresql.conf"

Re: [PATCH] Add support to table_to_xmlschema regex when timestamp has time zone

2022-02-18 Thread Euler Taveira
On Fri, Feb 18, 2022, at 2:47 PM, Renan Soares Lopes wrote:
> Hello,
> 
> I added a patch to fix table_to_xmlschema, could you point me how to add a 
> unit test to that?
You should edit src/test/regress/expected/xmlmap.out. In this case, you should
also modify src/test/regress/expected/xmlmap_1.out, which is the output from this
test when you build without libxml support. Run 'make check' to test your fix
after building with/without libxml support.

Regarding this fix, it looks good to me. FWIW, character class escape is
defined here [1].

[1] https://www.w3.org/TR/xmlschema11-2/#cces


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Identify missing publications from publisher while create/alter subscription.

2022-02-09 Thread Euler Taveira
On Wed, Feb 9, 2022, at 12:06 PM, Ashutosh Sharma wrote:
> Just wondering if we should also be detecting the incorrect conninfo
> set with ALTER SUBSCRIPTION command as well. See below:
> 
> -- try creating a subscription with incorrect conninfo. the command fails.
> postgres=# create subscription sub1 connection 'host=localhost
> port=5490 dbname=postgres' publication pub1;
> ERROR:  could not connect to the publisher: connection to server at
> "localhost" (::1), port 5490 failed: Connection refused
> Is the server running on that host and accepting TCP/IP connections?
> connection to server at "localhost" (127.0.0.1), port 5490 failed:
> Connection refused
> Is the server running on that host and accepting TCP/IP connections?
That's because the 'connect' parameter is true by default.

The important step for all SUBSCRIPTION commands that handle a connection
string is to validate the connection string, e.g. check that all parameters are
correct. See walrcv_check_conninfo, which calls PQconninfoParse.

The connection string is syntactically correct. Hence, no error. It could be
the case that the service is temporarily down. It is a useful and common
scenario that I wouldn't want to forbid.
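
To illustrate the distinction (sub1 is a placeholder; the error text is
approximate):

-- syntactically invalid connection string: rejected immediately
ALTER SUBSCRIPTION sub1 CONNECTION 'host=localhost port';
-- ERROR:  invalid connection string syntax: missing "=" after "port" ...

-- syntactically valid but currently unreachable: accepted
ALTER SUBSCRIPTION sub1 CONNECTION 'host=localhost port=5490 dbname=postgres';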

> -- reset the connninfo in the subscription to some wrong value. the
> command succeeds.
> postgres=# alter subscription sub1 connection 'host=localhost
> port=5490 dbname=postgres';
> ALTER SUBSCRIPTION
> postgres=#
> 
> postgres=# drop subscription sub1;
> ERROR:  could not connect to publisher when attempting to drop
> replication slot "sub1": connection to server at "localhost" (::1),
> port 5490 failed: Connection refused
> Is the server running on that host and accepting TCP/IP connections?
> connection to server at "localhost" (127.0.0.1), port 5490 failed:
> Connection refused
> Is the server running on that host and accepting TCP/IP connections?
> HINT:  Use ALTER SUBSCRIPTION ... SET (slot_name = NONE) to
> disassociate the subscription from the slot.
Again, dropping a subscription that is associated with a replication slot
requires a connection to remove the replication slot. If the publisher is gone
(and so the replication slot), follow the HINT advice.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: [BUG]Update Toast data failure in logical replication

2022-02-08 Thread Euler Taveira
On Tue, Feb 8, 2022, at 10:18 PM, tanghy.f...@fujitsu.com wrote:
> 2)
> + /*
> + * Check if the old tuple's attribute is stored externally and is a
> + * member of external_cols.
> + */
> + if (VARATT_IS_EXTERNAL((struct varlena *) DatumGetPointer(value1)) &&
> + bms_is_member(attrnum - FirstLowInvalidHeapAttributeNumber,
> +   external_cols))
> + *has_external = true;
> 
> If has_external is already true, it seems we don't need this check, so should 
> we
> check has_external first?
Is it worth it? I don't think so. It complicates a non-critical path. In
general, the condition will be executed once or twice.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: [BUG]Update Toast data failure in logical replication

2022-01-24 Thread Euler Taveira
On Mon, Jan 24, 2022, at 12:58 AM, Michael Paquier wrote:
> FWIW, I find the API changes of HeapDetermineModifiedColumns() and
> ExtractReplicaIdentity() a bit grotty.  Shouldn't we try to flatten
> the old tuple instead?  There are things like
> toast_flatten_tuple_to_datum() for this purpose if a tuple satisfies
> HeapTupleHasExternal(), or just heap_copy_tuple_as_datum().
> 
I checked v4 and I don't like the HeapDetermineModifiedColumns() and
heap_tuple_attr_equals() changes either. It seems it is hijacking these
functions for something else. I would suggest changing the signatures to

static void
heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum,
   HeapTuple tup1, HeapTuple tup2,
   bool *is_equal, bool *key_has_external);

and

static void
HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols,
 HeapTuple oldtup, HeapTuple newtup,
 Bitmapset *modified_attrs, bool *key_has_external);

I didn't figure out a cheap way to check if the key has an external value other
than slightly modifying the HeapDetermineModifiedColumns() function and its
subroutine heap_tuple_attr_equals(). As Alvaro said I don't think adding
HeapTupleHasExternal() (as in v3) is a good idea because it does not optimize
genuine cases such as a table whose PK is an integer and contains a single
TOAST column.

Another suggestion is to keep key_changed and add another attribute
(key_has_external) to ExtractReplicaIdentity(). If we need key_changed in the
future we'll have to decompose it again. You also encapsulate that
optimization into the function, which helps with future improvements/fixes.

static HeapTuple
ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
   bool key_has_external, bool *copy);


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Logging replication state changes

2022-01-08 Thread Euler Taveira
On Sat, Jan 8, 2022, at 6:29 PM, SATYANARAYANA NARLAPURAM wrote:
> We need the historical information to analyze and root cause in addition to 
> the live debugging. It would be good to have better control over replication 
> messages. 
I think the answer to this demand is not to change the message level but to
implement a per-module log_min_messages. This idea has been in the TODO [1] for
more than a decade. Check the archives.

I agree with Tom that the referred messages are noisy; hence, DEBUG1 is fine
for them.

[1] https://wiki.postgresql.org/wiki/Todo


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2022-01-07 Thread Euler Taveira
On Fri, Jan 7, 2022, at 6:05 AM, Amit Kapila wrote:
> Euler, I have one more question about this patch for you. I see that
> in the patch we are calling coerce_to_target_type() in
> pgoutput_row_filter_init_expr() but do we really need the same? We
> already do that via
> transformPubWhereClauses->transformWhereClause->coerce_to_boolean
> before storing where clause expression. It is not clear to me why that
> is required? We might want to add a comment if that is required.
It is redundant. It seems an additional safeguard that should be removed.
Good catch.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2022-01-07 Thread Euler Taveira
On Fri, Jan 7, 2022, at 3:35 AM, Amit Kapila wrote:
> On Fri, Jan 7, 2022 at 9:44 AM Amit Kapila  wrote:
> >
> > On Thu, Jan 6, 2022 at 6:42 PM Euler Taveira  wrote:
> > >
> > > IMO we shouldn't reuse ReorderBufferChangeType. For a long-term solution, 
> > > it is
> > > fragile. ReorderBufferChangeType has values that do not matter for row 
> > > filter
> > > and it relies on the fact that REORDER_BUFFER_CHANGE_INSERT,
> > > REORDER_BUFFER_CHANGE_UPDATE and REORDER_BUFFER_CHANGE_DELETE are the 
> > > first 3
> > > values from the enum, otherwise, it breaks rfnodes and no_filters in
> > > pgoutput_row_filter().
> > >
> >
> > I think you mean to say it will break in pgoutput_row_filter_init(). I
> > see your point but OTOH, if we do what you are suggesting then don't
> > we need an additional mapping between ReorderBufferChangeType and
> > RowFilterPublishAction as row filter and pgoutput_change API need to
> > use those values.
> >
> 
> Can't we use 0,1,2 as indexes for rfnodes/no_filters based on change
> type as they are local variables as that will avoid the fragileness
> you are worried about. I am slightly hesitant to introduce new enum
> when we are already using reorder buffer change type in pgoutput.c.
WFM. I used numbers + comments in a previous patch set [1]. I suggested the enum
because each command would be self-explanatory.

[1] 
https://www.postgresql.org/message-id/49ba49f1-8bdb-40b7-ae9e-f17d88b3afcd%40www.fastmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-01-06 Thread Euler Taveira
On Thu, Jan 6, 2022, at 9:48 PM, Bossart, Nathan wrote:
> After a quick glance, I didn't see an easy way to hold a session open
> while the test does other things.  If there isn't one, modifying
> backup_fs_hot() to work with non-exclusive mode might be more trouble
> than it is worth.
You can use IPC::Run to start psql in the background. See examples in
src/test/recovery.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


fix libpq comment

2022-01-06 Thread Euler Taveira
Hi,

While checking the PQping code I noticed that pg_ctl has not relied on PQping
since commit f13ea95f9e4 (v10), so the attached patch removes an obsolete
comment from internal_ping().


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 3055de48fe71f47df357114c7a42db05edcdb290 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Thu, 6 Jan 2022 20:50:30 -0300
Subject: [PATCH] pg_ctl does not rely on PQping anymore

Since commit f13ea95f9e4, PQping is not used by pg_ctl.
---
 src/interfaces/libpq/fe-connect.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9b6a6939f0..dfebfd1096 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3919,8 +3919,7 @@ internal_ping(PGconn *conn)
 		return PQPING_NO_RESPONSE;
 
 	/*
-	 * Report PQPING_REJECT if server says it's not accepting connections. (We
-	 * distinguish this case mainly for the convenience of pg_ctl.)
+	 * Report PQPING_REJECT if server says it's not accepting connections.
 	 */
 	if (strcmp(conn->last_sqlstate, ERRCODE_CANNOT_CONNECT_NOW) == 0)
 		return PQPING_REJECT;
-- 
2.20.1



Re: row filtering for logical replication

2022-01-06 Thread Euler Taveira
On Thu, Jan 6, 2022, at 1:18 AM, Amit Kapila wrote:
> On Thu, Jan 6, 2022 at 8:43 AM Peter Smith  wrote:
> >
> > On Wed, Jan 5, 2022 at 9:52 PM Amit Kapila  wrote:
> > >
> > ...
> >
> > > Another minor comment:
> > > +static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype,
> > >
> > > Do we need to specify the 'enum' type before changetype parameter?
> > >
> >
> > That is because there is currently no typedef for the enum
> > ReorderBufferChangeType.
> >
> 
> But I see that the 0002 patch is already adding the required typedef.
IMO we shouldn't reuse ReorderBufferChangeType. For a long-term solution, it is
fragile. ReorderBufferChangeType has values that do not matter for row filter
and it relies on the fact that REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE and REORDER_BUFFER_CHANGE_DELETE are the first 3
values from the enum, otherwise, it breaks rfnodes and no_filters in
pgoutput_row_filter(). I suggest a separate enum that contains only these 3
values.

enum RowFilterPublishAction {
   PUBLISH_ACTION_INSERT,
   PUBLISH_ACTION_UPDATE,
   PUBLISH_ACTION_DELETE
};


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2021-12-26 Thread Euler Taveira
On Sun, Dec 26, 2021, at 1:09 PM, Alvaro Herrera wrote:
> On 2021-Dec-26, Euler Taveira wrote:
> 
> > On Sat, Dec 25, 2021, at 1:20 AM, Amit Kapila wrote:
> > > On Fri, Dec 24, 2021 at 11:04 AM Peter Smith  
> > > wrote:
> > > >
> > > > So, IMO the PG docs wording for this part should be relaxed a bit.
> > > >
> > > > e.g.
> > > > BEFORE:
> > > > +   A nullable column in the WHERE clause could 
> > > > cause the
> > > > +   expression to evaluate to false; avoid using columns without 
> > > > not-null
> > > > +   constraints in the WHERE clause.
> > > > AFTER:
> > > > +   A nullable column in the WHERE clause could 
> > > > cause the
> > > > +   expression to evaluate to false. To avoid unexpected results, any 
> > > > possible
> > > > +   null values should be accounted for.
> 
> Is this actually correct?  I think a null value would cause the
> expression to evaluate to null, not false; the issue is that the filter
> considers a null value as not matching (right?).  Maybe it's better to
> spell that out explicitly; both these wordings seem distracting.
[Reading it again...] I think it is referring to the
pgoutput_row_filter_exec_expr() return. That's not accurate because it is
talking about the expression and the expression returns true, false and null.
However, the referred function returns only true or false. I agree that we
should explicitly mention that a null return means the row won't be published.

> You have this elsewhere:
> 
> +  If the optional WHERE clause is specified, only rows
> +  that satisfy the  class="parameter">expression 
> +  will be published. Note that parentheses are required around the 
> +  expression. It has no effect on TRUNCATE commands.
> 
> Maybe this whole thing is clearer if you just say "If the optional WHERE
> clause is specified, rows for which the expression returns false or null
> will not be published."  With that it should be fairly clear what
> happens if you have NULL values in the columns used in the expression,
> and you can just delete that phrase you're discussing.
Your proposal sounds good to me.
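
For illustration, the behavior being documented (the table t and its nullable
column a are placeholders):

CREATE TABLE t (a int, b text);
CREATE PUBLICATION pub_rf FOR TABLE t WHERE (a > 10);
-- (a > 10) evaluates to NULL for this row, so the row is not published:
INSERT INTO t VALUES (NULL, 'skipped');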


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2021-12-26 Thread Euler Taveira
On Sat, Dec 25, 2021, at 1:20 AM, Amit Kapila wrote:
> On Fri, Dec 24, 2021 at 11:04 AM Peter Smith  wrote:
> >
> > So, IMO the PG docs wording for this part should be relaxed a bit.
> >
> > e.g.
> > BEFORE:
> > +   A nullable column in the WHERE clause could cause the
> > +   expression to evaluate to false; avoid using columns without not-null
> > +   constraints in the WHERE clause.
> > AFTER:
> > +   A nullable column in the WHERE clause could cause the
> > +   expression to evaluate to false. To avoid unexpected results, any 
> > possible
> > +   null values should be accounted for.
> >
> 
> Your suggested wording sounds reasonable to me. Euler, others, any thoughts?
+1.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Delay the variable initialization in get_rel_sync_entry

2021-12-23 Thread Euler Taveira
On Wed, Dec 22, 2021, at 10:11 AM, houzj.f...@fujitsu.com wrote:
> When reviewing some logical replication patches. I noticed that in
> function get_rel_sync_entry() we always invoke get_rel_relispartition() 
> and get_rel_relkind() at the beginning which could cause unnecessary
> cache access.
> 
> ---
> get_rel_sync_entry(PGOutputData *data, Oid relid)
> {
> RelationSyncEntry *entry;
> bool am_partition = get_rel_relispartition(relid);
> char relkind = get_rel_relkind(relid);
> ---
> 
> The extra cost could sometimes be noticeable because get_rel_sync_entry is a
> hot function which is executed for each change. And the 'am_partition' and
> 'relkind' are necessary only when we need to rebuild the RelationSyncEntry.
> 
> Here is the perf result for the case when inserted large amounts of data into 
> a
> un-published table in which case the cost is noticeable.
> 
> --12.83%--pgoutput_change
> |--11.84%--get_rel_sync_entry
> |--4.76%--get_rel_relispartition
> |--4.70%--get_rel_relkind
Good catch. WFM. Deferring variable initialization close to its first use is
good practice.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: correct the sizes of values and nulls arrays in pg_control_checkpoint

2021-12-23 Thread Euler Taveira
On Thu, Dec 23, 2021, at 8:39 AM, Bharath Rupireddy wrote:
> pg_control_checkpoint emits 18 columns whereas the values and nulls
> arrays are defined to be of size 19. Although it's not critical,
> attaching a tiny patch to fix this.
Good catch! I'm wondering if a constant wouldn't be useful for such a case.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: pg_archivecleanup - add the ability to detect, archive and delete the unneeded wal files on the primary

2021-12-23 Thread Euler Taveira
On Thu, Dec 23, 2021, at 9:58 AM, Bharath Rupireddy wrote:
> pg_archivecleanup currently takes a WAL file name as input to delete
> the WAL files prior to it [1]. As suggested by Satya (cc-ed) in
> pg_replslotdata thread [2], can we enhance the pg_archivecleanup to
> automatically detect the last checkpoint (from control file) LSN,
> calculate the lowest restart_lsn required by the replication slots, if
> any (by reading the replication slot info from pg_logical directory),
> archive the unneeded (an archive_command similar to that of the one
> provided in the server config can be provided as an input) WAL files
> before finally deleting them? Making pg_archivecleanup tool as an
> end-to-end solution will help greatly in disk full situations because
> of WAL files growth (inactive replication slots, archive command
> failures, infrequent checkpoint etc.).
pg_archivecleanup is a tool to remove WAL files from the *archive*. Are you
suggesting using it to remove files from the pg_wal directory too? No, thanks.
WAL files are a key component for backup and replication. Hence, you cannot
deliberately allow a tool to remove WAL files from PGDATA. IMO this issue
wouldn't occur if you have a monitoring system and alerts and someone to keep
an eye on it. If the disk full situation was caused by a failed archive command
or a disconnected standby, it is easy to figure out; the fix is simple.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: relcache not invalidated when ADD PRIMARY KEY USING INDEX

2021-12-20 Thread Euler Taveira
On Mon, Dec 20, 2021, at 3:45 AM, houzj.f...@fujitsu.com wrote:
> Hi,
> 
> When reviewing some replica identity related patches, I found that when adding
> primary key using an existing unique index on not null columns, the
> target table's relcache won't be invalidated.
Good catch.

It seems you can simplify your check by testing indexForm->indisprimary directly, no?

Why did you add new tests for test_decoding? I think the TAP tests alone are
fine. BTW, this test is similar to publisher3/subscriber3. Isn't it better to
use the same pub/sub to reduce the test execution time?


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Confused comment about drop replica identity index

2021-12-20 Thread Euler Taveira
On Mon, Dec 20, 2021, at 8:11 AM, Michael Paquier wrote:
> On Mon, Dec 20, 2021 at 03:46:13AM +, wangw.f...@fujitsu.com wrote:
> > Here is a patch to correct wrong comment about
> > REPLICA_IDENTITY_INDEX, And improve the pg-doc.
> 
> That's mostly fine.  I have made some adjustments as per the
> attached.
Your patch looks good to me.

> Pondering more about this thread, I don't think we should change the
> existing behavior in the back-branches, but I don't have any arguments
> about doing such changes on HEAD to help the features being worked
> on, either.  So I'd like to apply and back-patch the attached, as a
> first step, to fix the inconsistency.
> 
What do you think about the attached patch? It forbids DROP INDEX. We might
add a DETAIL message, but I didn't in this patch.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 2ba00335746cf8348e019582a253721047f1a0ac Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Thu, 16 Dec 2021 16:54:47 -0300
Subject: [PATCH v1] Disallow dropping an index that is used by replica
 identity

Dropping an index that is used by a replica identity can break
replication for UPDATE and DELETE operations. This may be undesirable
from the usability standpoint. It introduces a behaviour change.
---
 src/backend/commands/tablecmds.c   | 14 +-
 src/test/regress/expected/replica_identity.out |  3 +++
 src/test/regress/sql/replica_identity.sql  |  3 +++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index bf42587e38..727ad6625b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1530,15 +1530,21 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	   rel->relname);
 
 	/*
+	 * An index used by a replica identity cannot be dropped because some
+	 * operations (UPDATE and DELETE) rely on the index properties
+	 * (uniqueness and NOT NULL) to record old values. These old values are
+	 * essential for identifying the row on the subscriber.
+	 *
 	 * Check the case of a system index that might have been invalidated by a
 	 * failed concurrent process and allow its drop. For the time being, this
 	 * only concerns indexes of toast relations that became invalid during a
 	 * REINDEX CONCURRENTLY process.
 	 */
-	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	if (relkind == RELKIND_INDEX)
 	{
 		HeapTuple	locTuple;
 		Form_pg_index indexform;
+		bool		indisreplident;
 		bool		indisvalid;
 
 		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
@@ -1550,8 +1556,14 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 
 		indexform = (Form_pg_index) GETSTRUCT(locTuple);
 		indisvalid = indexform->indisvalid;
+		indisreplident = indexform->indisreplident;
 		ReleaseSysCache(locTuple);
 
+		if (indisreplident)
+			ereport(ERROR,
+	(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+	 errmsg("could not drop index \"%s\" because it is required by the replica identity", rel->relname)));
+
 		/* Mark object as being an invalid index of system catalogs */
 		if (!indisvalid)
 			invalid_system_index = true;
diff --git a/src/test/regress/expected/replica_identity.out b/src/test/regress/expected/replica_identity.out
index e25ec06a84..56900451f7 100644
--- a/src/test/regress/expected/replica_identity.out
+++ b/src/test/regress/expected/replica_identity.out
@@ -227,6 +227,9 @@ Indexes:
 -- used as replica identity.
 ALTER TABLE test_replica_identity3 ALTER COLUMN id DROP NOT NULL;
 ERROR:  column "id" is in index used as replica identity
+-- DROP INDEX is not allowed if the index is used as REPLICA IDENTITY
+DROP INDEX test_replica_identity3_id_key;
+ERROR:  could not drop index "test_replica_identity3_id_key" because it is required by the replica identity
 DROP TABLE test_replica_identity;
 DROP TABLE test_replica_identity2;
 DROP TABLE test_replica_identity3;
diff --git a/src/test/regress/sql/replica_identity.sql b/src/test/regress/sql/replica_identity.sql
index 33da829713..f46769e965 100644
--- a/src/test/regress/sql/replica_identity.sql
+++ b/src/test/regress/sql/replica_identity.sql
@@ -98,6 +98,9 @@ ALTER TABLE test_replica_identity3 ALTER COLUMN id TYPE bigint;
 -- used as replica identity.
 ALTER TABLE test_replica_identity3 ALTER COLUMN id DROP NOT NULL;
 
+-- DROP INDEX is not allowed if the index is used as REPLICA IDENTITY
+DROP INDEX test_replica_identity3_id_key;
+
 DROP TABLE test_replica_identity;
 DROP TABLE test_replica_identity2;
 DROP TABLE test_replica_identity3;
-- 
2.20.1



Re: Confused comment about drop replica identity index

2021-12-16 Thread Euler Taveira
On Thu, Dec 16, 2021, at 8:55 PM, Michael Paquier wrote:
> On Thu, Dec 16, 2021 at 03:08:46PM -0300, Alvaro Herrera wrote:
> > Hmm, so if a table has REPLICA IDENTITY INDEX and there is a publication
> > with an explicit column list, then we need to forbid the DROP INDEX for
> > that index.
> 
> Hmm.  I have not followed this thread very closely.
> 
> > I wonder why don't we just forbid DROP INDEX of an index that's been
> > defined as replica identity.  It seems quite silly an operation to
> > allow.
It would avoid pilot errors.

> The commit logs talk about b23b0f55 here for this code, to ease the
> handling of relcache entries for rd_replidindex.  07cacba is the
> origin of the logic (see RelationGetIndexList).  Andres?
> 
> I don't think that this is really an argument against putting more
> restrictions as anything that deals with an index drop, including the
> internal ones related to constraints, would need to go through
> index_drop(), and new features may want more restrictions in place as
> you say.
> 
> Now, I don't see a strong argument in changing this behavior either
> (aka I have not looked at what this implies for the new publication
> types), and we still need to do something for the comment/docs in
> existing branches, anyway.  So I would still fix this gap as a first
> step, then deal with the rest on HEAD as necessary.
> 
I've never understood the weak dependency between the REPLICA IDENTITY and the
index used by it. I'm afraid we will receive complaints about this unexpected
behavior (my logical replication setup is broken because I dropped an index) as
new logical replication features are added.  Row filtering imposes some
restrictions in UPDATEs and DELETEs (an error message is returned and the
replication stops) if a column used in the expression isn't part of the REPLICA
IDENTITY anymore.

It seems we already have some code in RangeVarCallbackForDropRelation() that
deals with a system index error condition. We could save a syscall and provide
a test for indisreplident there.

If this restriction is undesirable, we should at least document this choice and
probably emit a WARNING for DROP INDEX.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2021-12-06 Thread Euler Taveira
On Mon, Dec 6, 2021, at 3:44 AM, Amit Kapila wrote:
> I think what you said as (b) is wrong because we want to allow builtin
> immutable functions. See discussion [1].
It was a typo. I meant "non-immutable" function.

> True, but that is the main reason the review and development are being
> done as separate sub-features. I suggest still keeping the similar
> separation till some of the reviews of each of the patches are done,
> otherwise, we need to rethink how to divide for easier review. We need
> to retain the 0005 patch because that handles many problems without
> which the main patch is incomplete and buggy w.r.t replica identity.
IMO we should merge sub-features as soon as we reach consensus. Every new
sub-feature breaks comments, tests and documentation if you want to remove or
rearrange patches. It seems I misread 0005. I agree that it is important. I'll
check it.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2021-12-06 Thread Euler Taveira
On Mon, Dec 6, 2021, at 3:35 AM, Dilip Kumar wrote:
> On Mon, Dec 6, 2021 at 6:49 AM Euler Taveira  wrote:
> >
> > On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
> >
> > PS> I will update the commit message in the next version. I barely changed 
> > the
> > documentation to reflect the current behavior. I probably missed some 
> > changes
> > but I will fix in the next version.
> >
> > I realized that I forgot to mention a few things about the UPDATE behavior.
> > Regardless of 0003, we need to define which tuple will be used to evaluate 
> > the
> > row filter for UPDATEs. We already discussed it circa [1]. This current 
> > version
> > chooses *new* tuple. Is it the best choice?
> 
> But with 0003, we are using both the tuple for evaluating the row
> filter, so instead of fixing 0001, why we don't just merge 0003 with
> 0001?  I mean eventually, 0003 is doing what is the agreed behavior,
> i.e. if just OLD is matching the filter then convert the UPDATE to
> DELETE OTOH if only new is matching the filter then convert the UPDATE
> to INSERT.  Do you think that even we merge 0001 and 0003 then also
> there is an open issue regarding which row to select for the filter?
Maybe I was not clear. IIUC we are still discussing 0003 and I would like to
propose a different default based on the conclusion I came up with. If we merge
0003, that's fine; this change will be useless. If we don't, or it is optional,
it still has its merit.

Do we want to pay the overhead of evaluating both tuples for UPDATEs? I'm still
assessing if it is worth it. If you think that in general the row filter
contains the primary key and it is rare to change it, it will waste cycles
evaluating the same expression twice. It seems this behavior could be
controlled by a parameter.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2021-12-02 Thread Euler Taveira
On Thu, Dec 2, 2021, at 4:18 AM, Peter Smith wrote:
> PSA a new v44* patch set.
> 
> It includes a new patch 0006 which implements the idea above.
> 
> ExprState cache logic is basically all the same as before (including
> all the OR combining), but there are now 4x ExprState caches keyed and
> separated by the 4x different pubactions.
The row filter is not applied for TRUNCATE, so it is just 3 operations.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: O(n) tasks cause lengthy startups and checkpoints

2021-12-01 Thread Euler Taveira
On Wed, Dec 1, 2021, at 9:19 PM, Bossart, Nathan wrote:
> On 12/1/21, 2:56 PM, "Andres Freund"  wrote:
> > On 2021-12-01 20:24:25 +, Bossart, Nathan wrote:
> >> I realize adding a new maintenance worker might be a bit heavy-handed,
> >> but I think it would be nice to have somewhere to offload tasks that
> >> really shouldn't impact startup and checkpointing.  I imagine such a
> >> process would come in handy down the road, too.  WDYT?
> >
> > -1. I think the overhead of an additional worker is disproportional here. 
> > And
> > there's simplicity benefits in having a predictable cleanup interlock as 
> > well.
> 
> Another idea I had was to put some upper limit on how much time is
> spent on such tasks.  For example, a checkpoint would only spend X
> minutes on CheckPointSnapBuild() before giving up until the next one.
> I think the main downside of that approach is that it could lead to
> unbounded growth, so perhaps we would limit (or even skip) such tasks
> only for end-of-recovery and shutdown checkpoints.  Perhaps the
> startup tasks could be limited in a similar fashion.
Saying that a certain task is O(n) doesn't mean it needs a separate process to
handle it. Do you have a use case or, even better, numbers (% of checkpoint /
startup time) that make your proposal worthwhile?

I would try to optimize (1) and (2). However, delayed removal can be a
long-term issue if the new routine cannot keep up with the pace of file
creation (especially if the checkpoints are far apart).

For (3), there is already a GUC that would avoid the slowdown during startup.
Use it if you think the startup time is more important than the disk space
occupied by useless files.

For (4), you are forgetting that the on-disk state of replication slots is
stored in pg_replslot/SLOTNAME/state. It seems you cannot just rename the
replication slot directory and copy the state file. What happens if there is a
crash before copying the state file?

While we are talking about items (1), (2) and (4), we could probably have an
option to create some ephemeral logical decoding files into ramdisk (similar to
statistics directory). I wouldn't like to hijack this thread but this proposal
could alleviate the possible issues that you pointed out. If people are
interested in this proposal, I can start a new thread about it.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2021-11-30 Thread Euler Taveira
On Tue, Nov 30, 2021, at 7:25 AM, Amit Kapila wrote:
> On Tue, Nov 30, 2021 at 11:37 AM Dilip Kumar  wrote:
> >
> > What about the initial table sync? during that, we are going to
> > combine all the filters or we are going to apply only the insert
> > filters?
> >
> 
> AFAIK, currently, initial table sync doesn't respect publication
> actions so it should combine all the filters. What do you think?
I agree. If a row might be needed to apply DML commands (UPDATE, DELETE) in the
future, or should be available on the subscriber because of a row filter
(INSERT-only case), it makes sense to send all rows that satisfy any row filter.

The current code already works this way. All row filters are combined into a
WHERE clause using OR. If any of the publications doesn't have a row filter,
there is no WHERE clause.
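
A sketch of the effect, per the behavior described above (table and column
names are placeholders):

CREATE PUBLICATION pub_ins FOR TABLE t WHERE (a > 10) WITH (publish = 'insert');
CREATE PUBLICATION pub_upd FOR TABLE t WHERE (b = 5) WITH (publish = 'update');
-- initial sync for a subscription to both publications effectively copies:
--   SELECT * FROM t WHERE (a > 10) OR (b = 5)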


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2021-11-29 Thread Euler Taveira
On Mon, Nov 29, 2021, at 7:11 AM, Amit Kapila wrote:
> I don't think it is a good idea to combine the row-filter from the
> publication that publishes just 'insert' with the row-filter that
> publishes 'updates'. We shouldn't apply the 'insert' filter for
> 'update' and similarly for publication operations. We can combine the
> filters when the published operations are the same. So, this means
> that we might need to cache multiple row-filters but I think that is
> better than having another restriction that publish operation 'insert'
> should also honor RI columns restriction.
That's exactly what I meant to say but apparently I didn't explain it in detail.
If a subscriber has multiple publications and a table is part of these
publications with different row filters, it should check the publication action
*before* including it in the row filter list. It means that an UPDATE operation
cannot apply a row filter that is part of a publication that has only INSERT as
an action. Having said that, we cannot always combine multiple row filter
expressions into one. Instead, it should cache individual row filter expressions
and apply the OR during the row filter execution (as I did in the initial
patches before this caching stuff). The other idea is to have multiple caches,
one for each action. The main disadvantage of this approach is creating 4x
entries.

I'm experimenting with the first approach, which stores multiple row filters and
their publication actions, right now. Unfortunately we cannot use the
relentry->pubactions because it aggregates this information if you have
multiple entries. It seems a separate array should store this information that
will be used later while evaluating the row filter -- around
pgoutput_row_filter_exec_expr() call.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2021-11-25 Thread Euler Taveira
On Thu, Nov 25, 2021, at 10:39 AM, houzj.f...@fujitsu.com wrote:
> When researching and writing a top-up patch about this.
> I found a possible issue which I'd like to confirm first.
> 
> It's possible the table is published in two publications A and B, publication 
> A
> only publish "insert" , publication B publish "update". When UPDATE, both row
> filter in A and B will be executed. Is this behavior expected?
Good question. No. The code should check the action before combining the
multiple row filters.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Should we improve "PID XXXX is not a PostgreSQL server process" warning for pg_terminate_backend(<>)?

2021-11-17 Thread Euler Taveira
On Mon, Nov 15, 2021, at 4:27 AM, Bharath Rupireddy wrote:
> As there is some interest shown in this thread at [1], I'm attaching a
> new v3 patch here. Please review it.
I took a look at this patch. I have a few comments.

+ ereport(WARNING,
+ (errmsg("signalling postmaster with PID %d is not allowed", pid)));

I would say "signal postmaster PID 1234 is not allowed". It is not an
in-progress action.

s/shared-memory/shared memory/

AuxiliaryPidGetProc() does not return all auxiliary processes; syslogger and
statistics collector don't have a procArray entry, so you could probably provide
a new function that checks whether a PID belongs to an auxiliary process. You
can use their PIDs (SysLoggerPID and PgStatPID) to provide accurate information.

+ if (proc)
+ ereport(WARNING,
+ (errmsg("signalling PostgreSQL server process with PID %d is not allowed",

I would say "signal PostgreSQL auxiliary process PID 1234 is not allowed".

+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL server process", pid)));

I would say "PID 1234 is not a PostgreSQL backend process". That's the glossary
terminology.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Allow users to choose what happens when recovery target is not reached

2021-11-13 Thread Euler Taveira
On Sat, Nov 13, 2021, at 10:15 AM, Bharath Rupireddy wrote:
> Firstly, the proposed patch adds no new behaviour as such, it just
> gives the ability that is existing today on v12 and below (prior to
> commit dc78866 which went into v13 and later).
It reintroduces an awkward behavior [1].

> I think performing PITR is the user's wish - whether the primary is
> available or not, it is completely the user's choice. The user might
> start the PITR, when the primary is available, thinking that it sends
> all the WAL files required for achieving recovery target. But imagine
> a disaster happens and the primary server crashes, say the recovery
> has replayed a huge bunch of WAL records (a TB may be), and the
> primary failed without sending the last one or few WAL files, should
> the PITR target server be failing this case after replaying a huge
> bunch of WAL records? The user might want the target server to be
> available instead of FATALly shutting down. This is the exact problem
> the proposed patch is trying to solve.
Are you archiving on the primary server? You are risking your customer's
business by suggesting such a setup. You should store the WAL files on your
backup server.

It seems your setup has a flaw. You set a recovery target but accept a scenario
that is not what you initially asked for. If it is a real PITR, it is awkward,
as Peter [1] said. You could validate your recovery settings by checking the
timestamp of the last WAL file as a rough approximation of the maximum recovery
target time. The other option is to run pg_waldump to obtain the last commit
timestamp.

If you care about your customer's data, you won't use such an option. Otherwise,
I repeat Julien's question [2]: isn't it better to simply not specify a target
and let the recovery go as far as possible?

> As I said earlier, the behaviour is not too dangerous as it is not
> something new that the patch is proposing, it exists today in v12 and
> below. In fact, it gives a way out of a "dangerous situation" if the
> user ever gets stuck in it without wasting recovery cycles and compute
> resources, by quickly getting the database to be available(of course,
> the responsibility lies with the user to deal with the missing WAL
> files).
With your proposal, the user seems to be shooting in the dark. If a FATAL
message was emitted, it means the user missed the target. Even then, the user
can accept the situation, remove the target parameters, and start the server
again. I think promote or even pause might lead to incorrect expectations (if
the user doesn't carefully inspect the log messages).

A disadvantage of this proposal is that if you have it set to 'promote' and
start the recovery, the server may get promoted before reaching the target.
While inspecting your server configuration, you might realize that you are
pointing to the incorrect archive or that the WAL files were not available in
time (due to timing issues). You have no option but to start from scratch.

[1] https://postgr.es/m/234a0c50-1160-86c2-4e4b-35e9684f1799%402ndquadrant.com
[2] https://postgr.es/m/CAOBaU_ZDkyoQvEsYT0-p1Hb0m_nGtQJ4tTGm2-Ay6v%3DTCjmsWg%40mail.gmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation

2021-11-13 Thread Euler Taveira
On Sat, Nov 13, 2021, at 12:00 AM, Bharath Rupireddy wrote:
> On Sat, Nov 13, 2021 at 12:06 AM Euler Taveira  wrote:
> > > Here's a rebased v8 patch. Please review it.
> >
> > This improves the user experience by increasing the granularity of error
> > reporting, and maps well with the precedent set in 81d5995b4.  I'm marking this
> > Ready for Committer and will go ahead and apply this unless there are
> > objections.
> >
> > Shouldn't we modify errdetail_relkind_not_supported() to include relpersistence
> > as a 2nd parameter and move those messages to it? I experiment this idea with
> > the attached patch. The idea is to provide a unique function that reports
> > accurate detail messages.
> 
> Thanks. It is a good idea to use errdetail_relkind_not_supported. I
> slightly modified the API to "int errdetail_relkind_not_supported(Oid
> relid, Form_pg_class rd_rel);" to simplify things and increase the
> usability of the function further. For instance, it can report the
> specific error for the catalog tables as well. And, also added "int
> errdetail_relkind_not_supported_v2(Oid relid, char relkind, char
> relpersistence);" so that the callers not having Form_pg_class (there
> are 3 callers exist) can pass the parameters directly.
Do we really need 2 functions? I don't think so. Besides that, relid is
redundant since this information is available in the Form_pg_class struct.

int errdetail_relkind_not_supported(Oid relid, Form_pg_class rd_rel);

My suggestion is to keep only the 3-parameter function:

int errdetail_relkind_not_supported(Oid relid, char relkind, char relpersistence);

Multiple functions that are just wrappers for a central one are a good idea for
backward compatibility. That's not the case here.
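
A call site would then look like this (the message text is illustrative, not
taken from the patch):

ereport(ERROR,
		(errcode(ERRCODE_WRONG_OBJECT_TYPE),
		 errmsg("cannot add relation \"%s\" to publication",
				RelationGetRelationName(rel)),
		 errdetail_relkind_not_supported(RelationGetRelid(rel),
										 rel->rd_rel->relkind,
										 rel->rd_rel->relpersistence)));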


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: [BUG]Invalidate relcache when setting REPLICA IDENTITY

2021-11-12 Thread Euler Taveira
On Fri, Nov 12, 2021, at 3:10 AM, houzj.f...@fujitsu.com wrote:
> On Fri, Nov 12, 2021 1:33 PM Amit Kapila  wrote:
> > On Fri, Nov 12, 2021 at 10:50 AM Amit Kapila  
> > wrote:
> > > But won't that generate invalidation for the rel twice in the case
> > > (change Replica Identity from Nothing to some index) you mentioned in
> > > the previous email?
> > >
> > 
> > Oh, I see the point. I think this is okay because
> > AddRelcacheInvalidationMessage doesn't allow to add duplicate rel
> > invalidation. If that is the case I wonder why not simply register
> > invalidation without any check in the for loop as was the case with
> > Tang's original patch?
> 
> OK, I also think the code in Tang's original patch is fine.
> Attach the patch which register invalidation without any check in the for 
> loop.
WFM.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation

2021-11-12 Thread Euler Taveira
On Fri, Nov 12, 2021, at 9:41 AM, Daniel Gustafsson wrote:
> > On 4 Nov 2021, at 05:24, Bharath Rupireddy 
> >  wrote:
> > 
> > On Wed, Nov 3, 2021 at 6:21 PM Daniel Gustafsson  wrote:
> >> 
> >>> On 26 Jul 2021, at 09:33, Bharath Rupireddy 
> >>>  wrote:
> >> 
> >>> PSA v7.
> >> 
> >> This patch no longer applies on top of HEAD, please submit a rebased 
> >> version.
> > 
> > Here's a rebased v8 patch. Please review it.
> 
> This improves the user experience by increasing the granularity of error
> reporting, and maps well with the precedent set in 81d5995b4.  I'm marking this
> Ready for Committer and will go ahead and apply this unless there are
> objections.
Shouldn't we modify errdetail_relkind_not_supported() to include relpersistence
as a 2nd parameter and move those messages to it? I experimented with this idea
in the attached patch. The idea is to provide a single function that reports
accurate detail messages.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From f068d4688a95c8e8c5a98d8d6f1847ddfafda43c Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 4 Nov 2021 04:22:21 +
Subject: [PATCH v9] Improve publication error messages

Provide accurate messages for CREATE PUBLICATION commands. While at it,
modify errdetail_relkind_not_supported to distinguish tables based on
its persistence property. It provides accurate detail messages for
future work. It also includes tests to cover temporary, unlogged, system
and foreign tables.
---
 contrib/amcheck/verify_heapam.c|  2 +-
 contrib/pageinspect/rawpage.c  |  2 +-
 contrib/pg_surgery/heap_surgery.c  |  2 +-
 contrib/pg_visibility/pg_visibility.c  |  2 +-
 contrib/pgstattuple/pgstatapprox.c |  2 +-
 contrib/pgstattuple/pgstatindex.c  |  2 +-
 contrib/pgstattuple/pgstattuple.c  |  2 +-
 contrib/postgres_fdw/expected/postgres_fdw.out |  6 ++
 contrib/postgres_fdw/sql/postgres_fdw.sql  |  5 +
 src/backend/catalog/pg_class.c | 10 --
 src/backend/catalog/pg_publication.c   | 13 +++--
 src/backend/commands/comment.c |  2 +-
 src/backend/commands/indexcmds.c   |  2 +-
 src/backend/commands/lockcmds.c|  5 +++--
 src/backend/commands/seclabel.c|  2 +-
 src/backend/commands/sequence.c|  2 +-
 src/backend/commands/statscmds.c   |  2 +-
 src/backend/commands/tablecmds.c   |  9 +
 src/backend/commands/trigger.c |  6 +++---
 src/backend/executor/execReplication.c |  2 +-
 src/backend/parser/parse_utilcmd.c |  2 +-
 src/backend/rewrite/rewriteDefine.c|  4 ++--
 src/include/catalog/pg_class.h |  2 +-
 src/test/regress/expected/publication.out  | 16 
 src/test/regress/sql/publication.sql   | 14 ++
 25 files changed, 80 insertions(+), 38 deletions(-)

diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index bae5340111..43854d1361 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -314,7 +314,7 @@ verify_heapam(PG_FUNCTION_ARGS)
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("cannot check relation \"%s\"",
 		RelationGetRelationName(ctx.rel)),
- errdetail_relkind_not_supported(ctx.rel->rd_rel->relkind)));
+ errdetail_relkind_not_supported(ctx.rel->rd_rel->relkind, ctx.rel->rd_rel->relpersistence)));
 
 	/*
 	 * Sequences always use heap AM, but they don't show that in the catalogs.
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 4bfa346c24..2259068d73 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -160,7 +160,7 @@ get_raw_page_internal(text *relname, ForkNumber forknum, BlockNumber blkno)
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("cannot get raw page from relation \"%s\"",
 		RelationGetRelationName(rel)),
- errdetail_relkind_not_supported(rel->rd_rel->relkind)));
+ errdetail_relkind_not_supported(rel->rd_rel->relkind, rel->rd_rel->relpersistence)));
 
 	/*
 	 * Reject attempts to read non-local temporary relations; we would be
diff --git a/contrib/pg_surgery/heap_surgery.c b/contrib/pg_surgery/heap_surgery.c
index 7edfe4f326..1c90c87c04 100644
--- a/contrib/pg_surgery/heap_surgery.c
+++ b/contrib/pg_surgery/heap_surgery.c
@@ -110,7 +110,7 @@ heap_force_common(FunctionCallInfo fcinfo, HeapTupleForceOption heap_force_opt)
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("cannot operate on relation \"%s\"",
 		RelationGetRelationName(rel)),
- errdetail_relkind_not_supported(rel->rd_re

Re: [BUG]Invalidate relcache when setting REPLICA IDENTITY

2021-11-11 Thread Euler Taveira
On Thu, Nov 11, 2021, at 9:01 AM, houzj.f...@fujitsu.com wrote:
> Also attach the patches for back branch and remove some unnecessary
> changes from pgindent.
I reviewed your patch and I think the fix could be simplified to:

if (OidIsValid(indexOid))
	CacheInvalidateRelcache(rel);

If indexOid is valid, it is a REPLICA IDENTITY index. A few lines above there is
a check for a valid indexOid that makes sure the index is already marked as a
replica identity; if so, it bails out. If it is not, the relation should be
invalidated. Am I missing something?

I also modified your test case to include a DELETE command, wait for the initial
table sync to avoid failing a subsequent test, and improve some comments.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From e81e0bb6c6796e7fd692b7d6643db8fb6770d9a1 Mon Sep 17 00:00:00 2001
From: "houzj.fnst" 
Date: Thu, 11 Nov 2021 18:21:37 +0800
Subject: [PATCH v3] Invalidate relcache entry when changing REPLICA IDENTITY
 INDEX

When changing REPLICA IDENTITY INDEX to another one, the table relcache
entry was not invalidated which cause unexpected behaviour when
replicating UPDATE or DELETE changes.

If we are replacing indexes on the REPLICA IDENTITY, invalidate the
relcache entry to make sure the index Bitmapset is rebuild.
---
 src/backend/commands/tablecmds.c|  9 
 src/test/subscription/t/100_bugs.pl | 80 -
 2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 857cc5ce6e..97a05da4b1 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15492,6 +15492,15 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
 		heap_freetuple(pg_index_tuple);
 	}
 
+	/*
+	 * If the correct index was not marked as replica identity, table relcache
+	 * should be invalidated. Hence, all sessions will refresh the table
+	 * replica identity before any UPDATE or DELETE on the table that was
+	 * invalidated.
+	 */
+	if (OidIsValid(indexOid))
+		CacheInvalidateRelcache(rel);
+
 	table_close(pg_index, RowExclusiveLock);
 }
 
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 34a60fd9ab..00de20e9b5 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -6,7 +6,7 @@ use strict;
 use warnings;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
-use Test::More tests => 5;
+use Test::More tests => 7;
 
 # Bug #15114
 
@@ -224,3 +224,81 @@ $node_sub->safe_psql('postgres', "DROP TABLE tab1");
 $node_pub->stop('fast');
 $node_pub_sub->stop('fast');
 $node_sub->stop('fast');
+
+# https://postgr.es/m/OS0PR01MB61133CA11630DAE45BC6AD95FB939%40OS0PR01MB6113.jpnprd01.prod.outlook.com'
+
+# The bug was that when changing the REPLICA IDENTITY INDEX to another one, the
+# target table relcache would not be invalidated. This could cause unexpected
+# behaviour when replicating UPDATE/DELETE change.
+$node_publisher = PostgreSQL::Test::Cluster->new('publisher3');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+$node_subscriber = PostgreSQL::Test::Cluster->new('subscriber3');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab_replidentity_index(a int not null, b int not null)");
+$node_publisher->safe_psql('postgres',
+	"CREATE UNIQUE INDEX idx_replidentity_index_a ON tab_replidentity_index(a)"
+);
+$node_publisher->safe_psql('postgres',
+	"CREATE UNIQUE INDEX idx_replidentity_index_b ON tab_replidentity_index(b)"
+);
+
+# use index idx_replidentity_index_a as REPLICA IDENTITY on publisher.
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab_replidentity_index REPLICA IDENTITY USING INDEX idx_replidentity_index_a"
+);
+
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab_replidentity_index VALUES(1, 1),(2, 2)");
+
+$node_subscriber->safe_psql('postgres',
+	"CREATE TABLE tab_replidentity_index(a int not null, b int not null)");
+$node_subscriber->safe_psql('postgres',
+	"CREATE UNIQUE INDEX idx_replidentity_index_a ON tab_replidentity_index(a)"
+);
+$node_subscriber->safe_psql('postgres',
+	"CREATE UNIQUE INDEX idx_replidentity_index_b ON tab_replidentity_index(b)"
+);
+# use index idx_replidentity_index_b as REPLICA IDENTITY on subscriber because
+# it reflects the future scenario we are testing: changing REPLICA IDENTITY
+# INDEX.
+$node_subscriber->safe_psql('postgres',
+	"ALTER TABLE tab_replidentity_index REPLICA IDENTITY USING INDEX idx_replidentity_index_b"
+);
+
+$publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION tap_pub FOR TABLE tab_

Re: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display

2021-11-11 Thread Euler Taveira
On Thu, Nov 11, 2021, at 2:50 AM, Bharath Rupireddy wrote:
> But for the snprintf(activitymsg, sizeof(activitymsg), "archiving %s",
> xlog); we have elog(DEBUG1, "archived write-ahead log file \"%s\"",
> xlog); after the archiving command. It is also good to have a similar
> debug message before archive command execution, we can get to know
> whether archive command is executed or not, if yes how much time did
> it take etc.?
If you are only interested in how much time it took to archive a WAL file, you
don't need 2 messages (before and after); one message is sufficient if you
calculate the difference and print it as part of the existing message. Calculate
the difference only if the log level is at least DEBUG1 (the same level as the
log message).
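
An untested sketch of what I mean; xlog, xlogarchcmd and rc are assumed from
the surrounding pgarch.c context, and message_level_is_interesting() avoids
the clock calls when DEBUG1 output is disabled:

instr_time	start_time,
			duration;
bool		timeit = message_level_is_interesting(DEBUG1);

if (timeit)
	INSTR_TIME_SET_CURRENT(start_time);

rc = system(xlogarchcmd);

if (rc == 0 && timeit)
{
	INSTR_TIME_SET_CURRENT(duration);
	INSTR_TIME_SUBTRACT(duration, start_time);
	ereport(DEBUG1,
			(errmsg("archived write-ahead log file \"%s\" in %.3f ms",
					xlog, INSTR_TIME_GET_MILLISEC(duration))));
}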

For activities that run through system(), e.g. the archive and recovery
commands, you could also measure the execution time in your own script/program.

I like Alvaro's idea of implementing a ring buffer for this kind of activity.
This implementation could be applied to checkpoints, archiving, recovery,
replication, base backup, WAL usage, connections and others.
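
To sketch the idea (purely illustrative; nothing like this exists in core):

#define ACTIVITY_RING_SIZE	128
#define ACTIVITY_MSG_LEN	256

typedef struct ActivityRing
{
	uint32		next;			/* slot to overwrite next */
	char		msgs[ACTIVITY_RING_SIZE][ACTIVITY_MSG_LEN];
} ActivityRing;

static void
activity_ring_add(ActivityRing *ring, const char *msg)
{
	strlcpy(ring->msgs[ring->next], msg, ACTIVITY_MSG_LEN);
	ring->next = (ring->next + 1) % ACTIVITY_RING_SIZE;
}

Each module would append its progress messages to a buffer like this in shared
memory (with proper locking), and a system view could expose the most recent
entries.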


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display

2021-11-10 Thread Euler Taveira
On Wed, Nov 10, 2021, at 2:38 PM, Bharath Rupireddy wrote:
> As discussed in [1], isn't it a better idea to add some of activity
> messages [2] such as recovery, archive, backup, streaming etc. to
> server logs at LOG level? They are currently being set into ps display
> which is good if the postgres is being run on a standalone box/VM
> where users can see the ps display, but it doesn't help much in case
> the postgres is being run on a cloud environment where users don't
> have access to ps display output. Moreover, the ps display is
> transient and will not help to analyze after an issue occurs.
Besides recovery, the other activities already provide this information through
views. I might be missing something, but it seems the current views already
expose what ps displays.

> Having the above messages in the server logs will be useful to
> understand how the system is/was doing/progressing with these
> (sometimes time-intensive) operations.
It could be useful for root cause analysis; however, you are also increasing
the number of LOG messages in a system that doesn't/won't require such
information. It is fine to add additional DEBUG messages if there isn't a
similar one yet. If at least the message level were module-controlled, you
could modify a setting to gather more messages from a specific module.
Unfortunately, that's not possible, so we should avoid superfluous messages.


--
Euler Taveira
EDB   https://www.enterprisedb.com/

