Re: row filtering for logical replication

2022-04-13 Thread Amit Kapila
On Wed, Apr 13, 2022 at 10:01 PM Alvaro Herrera  wrote:
>
> BTW I just noticed that AlterPublicationOptions acquires only
> AccessShareLock on the publication object.  I think this is too lax ...
> what if two of them run concurrently? (say to specify different
> published actions)  Do they overwrite the other's update?
>

No, they won't overwrite each other. First, the AccessShareLock on the
publication object is not related to concurrent changes of the
publication object. Those are protected by the normal row-update rules
(until the first transaction finishes, the other will wait). See the
example below:

Session-1
postgres=# Begin;
BEGIN
postgres=*# Alter publication pub1 set (publish = 'insert');
ALTER PUBLICATION

Session-2:
postgres=# Begin;
BEGIN
postgres=*# Alter publication pub1 set (publish = 'update');

The ALTER in Session-2 will wait until we end the transaction in Session-1.


-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-04-13 Thread Alvaro Herrera
On 2022-Apr-13, Amit Kapila wrote:

> Thanks, this will work and fix the issue. I think this looks better
> than the current code, 

Thanks for looking!  Pushed.

> however, I am not sure if the handling for the
> concurrently dropped tables is better (both get_rel_relkind() and
> get_rel_name() can fail due to those reasons). I understand this won't
> fail because of the protection you have in the patch,

Well, the point is that these routines return NULL if the relation
cannot be found in the cache, so just doing "continue" (without raising
any error) when that happens is sufficient for correct behavior.
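That skip-on-NULL pattern can be illustrated with a small standalone mock (the cache, names, and relids here are invented for illustration; this is not the actual PostgreSQL lookup code):

```c
#include <stddef.h>

/*
 * Mock of a get_rel_name()-style lookup: it returns NULL when the
 * relation cannot be found (e.g. concurrently dropped), instead of
 * raising an error.
 */
static const char *
mock_get_rel_name(int relid)
{
	static const char *names[] = {NULL, "parent", "child"};

	if (relid < 1 || relid > 2)
		return NULL;			/* relation concurrently dropped */
	return names[relid];
}

/*
 * Scan a list of relations, skipping any that vanished concurrently
 * with a plain "continue" -- no error is raised for them.
 */
static int
scan_relations(const int *relids, int n)
{
	int			found = 0;

	for (int i = 0; i < n; i++)
	{
		const char *relname = mock_get_rel_name(relids[i]);

		if (relname == NULL)
			continue;			/* table concurrently dropped: skip it */
		found++;
	}
	return found;
}
```

Here relids 1 and 2 "exist" and everything else counts as dropped, so a scan over `{1, 3, 2, 99}` quietly processes only the two survivors.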

BTW I just noticed that AlterPublicationOptions acquires only
AccessShareLock on the publication object.  I think this is too lax ...
what if two of them run concurrently? (say to specify different
published actions)  Do they overwrite the other's update?  I think it'd
be better to acquire ShareUpdateExclusive to ensure only one is running
at a time.
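For reference, such a lock could be taken along these lines (a rough sketch against the backend's lock-manager API, not a tested patch; the exact call site in AlterPublicationOptions is assumed):

```c
/*
 * Sketch: serialize concurrent ALTER PUBLICATION commands on the same
 * publication.  ShareUpdateExclusiveLock conflicts with itself, so a
 * second ALTER on the same publication waits for the first to finish.
 */
LockDatabaseObject(PublicationRelationId, pubform->oid, 0,
				   ShareUpdateExclusiveLock);
```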

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"This is a foot just waiting to be shot"                    (Andrew Dunstan)




Re: row filtering for logical replication

2022-04-12 Thread Amit Kapila
On Tue, Apr 12, 2022 at 6:16 PM Alvaro Herrera  wrote:
>
> On 2022-Apr-12, Amit Kapila wrote:
>
> > It still has the same problem. The table can be dropped just before
> > this message and the get_rel_name will return NULL and we don't expect
> > that.
>
> Ugh, I forgot to change the errmsg() parts to use the new variable,
> apologies.  Fixed.
>

Thanks, this will work and fix the issue. I think this looks better
than the current code; however, I am not sure the handling of
concurrently dropped tables is better (both get_rel_relkind() and
get_rel_name() can fail for that reason). I understand this won't
fail because of the protection you have in the patch, so feel free to
go ahead with this if you like this style better.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-04-12 Thread Alvaro Herrera
On 2022-Apr-12, Amit Kapila wrote:

> I mean that it fetches the tuple from the RELOID cache and then
> performs relkind and other checks similar to what we are doing. I
> think it could also have used get_rel_relkind(), but that probably
> wasn't done because it doesn't have a lock on the relation.

Ah, but that one uses a lot more fields from the pg_class tuple in the
non-error path.  We only need relkind, up until we know the error is to
be thrown.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: row filtering for logical replication

2022-04-12 Thread Alvaro Herrera
On 2022-Apr-12, Amit Kapila wrote:

> It still has the same problem. The table can be dropped just before
> this message and the get_rel_name will return NULL and we don't expect
> that.

Ugh, I forgot to change the errmsg() parts to use the new variable,
apologies.  Fixed.

> Also, is there a reason that you haven't kept the test case added by Hou-San?

None.  I put it back here.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
>From f23be23c27eb9bed7350745233f4660f4c5b326a Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 12 Apr 2022 12:59:50 +0200
Subject: [PATCH v4] fixup checking for rowfilter/collist on altering
 publication

---
 src/backend/commands/publicationcmds.c| 89 +++
 src/test/regress/expected/publication.out | 16 +++-
 src/test/regress/sql/publication.sql  |  8 ++
 3 files changed, 63 insertions(+), 50 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7aacb6b2fe..d2b9f95f6d 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -945,60 +945,57 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
 
 		foreach(lc, root_relids)
 		{
-			HeapTuple	rftuple;
 			Oid			relid = lfirst_oid(lc);
-			bool		has_column_list;
-			bool		has_row_filter;
+			char		relkind;
+			char	   *relname;
+			HeapTuple	rftuple;
+			bool		has_rowfilter;
+			bool		has_collist;
+
+			/* Beware: we don't have lock on the relations */
 
 			rftuple = SearchSysCache2(PUBLICATIONRELMAP,
 	  ObjectIdGetDatum(relid),
 	  ObjectIdGetDatum(pubform->oid));
-
-			has_row_filter
-= !heap_attisnull(rftuple, Anum_pg_publication_rel_prqual, NULL);
-
-			has_column_list
-= !heap_attisnull(rftuple, Anum_pg_publication_rel_prattrs, NULL);
-
-			if (HeapTupleIsValid(rftuple) &&
-(has_row_filter || has_column_list))
+			if (!HeapTupleIsValid(rftuple))
+continue;
+			has_rowfilter = !heap_attisnull(rftuple, Anum_pg_publication_rel_prqual, NULL);
+			has_collist = !heap_attisnull(rftuple, Anum_pg_publication_rel_prattrs, NULL);
+			if (!has_rowfilter && !has_collist)
 			{
-HeapTuple	tuple;
-
-tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
-if (HeapTupleIsValid(tuple))
-{
-	Form_pg_class relform = (Form_pg_class) GETSTRUCT(tuple);
-
-	if ((relform->relkind == RELKIND_PARTITIONED_TABLE) &&
-		has_row_filter)
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("cannot set %s for publication \"%s\"",
-		"publish_via_partition_root = false",
-		stmt->pubname),
- errdetail("The publication contains a WHERE clause for a partitioned table \"%s\" "
-		   "which is not allowed when %s is false.",
-		   NameStr(relform->relname),
-		   "publish_via_partition_root")));
-
-	if ((relform->relkind == RELKIND_PARTITIONED_TABLE) &&
-		has_column_list)
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("cannot set %s for publication \"%s\"",
-		"publish_via_partition_root = false",
-		stmt->pubname),
- errdetail("The publication contains a column list for a partitioned table \"%s\" "
-		   "which is not allowed when %s is false.",
-		   NameStr(relform->relname),
-		   "publish_via_partition_root")));
-
-	ReleaseSysCache(tuple);
-}
-
 ReleaseSysCache(rftuple);
+continue;
 			}
+
+			relkind = get_rel_relkind(relid);
+			if (relkind != RELKIND_PARTITIONED_TABLE)
+			{
+ReleaseSysCache(rftuple);
+continue;
+			}
+			relname = get_rel_name(relid);
+			if (relname == NULL)	/* table concurrently dropped */
+			{
+ReleaseSysCache(rftuple);
+continue;
+			}
+
+			if (has_rowfilter)
+ereport(ERROR,
+		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		 errmsg("cannot set parameter \"%s\" to false for publication \"%s\"",
+"publish_via_partition_root",
+stmt->pubname),
+		 errdetail("The publication contains a WHERE clause for partitioned table \"%s\" which is not allowed when \"%s\" is false.",
+   relname, "publish_via_partition_root")));
+			Assert(has_collist);
+			ereport(ERROR,
+	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+	 errmsg("cannot set parameter \"%s\" to false for publication \"%s\"",
+			"publish_via_partition_root",
+			stmt->pubname),
+	 errdetail("The publication contains a column list for partitioned table \"%s\", which is not allowed when \"%s\" is false.",
+			   relname, "publish_via_partition_root")));
 		}
 	}
 
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 8208f9fa0e..cc9c8990ae 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -588,8 +588,12 @@ UPDATE rf_tbl_abcd_part_pk SET a = 1;
 -- fail 

Re: row filtering for logical replication

2022-04-12 Thread Amit Kapila
On Tue, Apr 12, 2022 at 5:01 PM Alvaro Herrera  wrote:
>
> On 2022-Apr-12, Amit Kapila wrote:
>
> > On Tue, Apr 12, 2022 at 3:45 PM Amit Kapila  wrote:
>
> > > We don't have a lock on the relation, so if it gets dropped
> > > concurrently, it won't behave sanely. For example, get_rel_name() will
> > > return NULL which seems incorrect to me.
>
> Oh, oops ... a trap for the unwary?  Anyway, yes, we can disregard the
> entry when get_rel_name returns null.  Amended patch attached.
>
> > > I am not sure about this as well because you will instead do a RELOID
> > > cache lookup even when there is no row filter or column list.
>
> I guess my assumption is that the pg_class cache is typically more
> populated than other relcaches, but that's unsubstantiated.  I'm not
> sure if we have any way to tell which one is the more common case.
> Anyway, let's do it the way you already had it.
>
> > It seems to me that we have a similar coding pattern in 
> > ExecGrant_Relation().
>
> Not sure what you mean?
>

I mean that it fetches the tuple from the RELOID cache and then
performs relkind and other checks similar to what we are doing. I
think it could also have used get_rel_relkind(), but that probably
wasn't done because it doesn't have a lock on the relation.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-04-12 Thread Amit Kapila
On Tue, Apr 12, 2022 at 5:12 PM Alvaro Herrera  wrote:
>
> Sorry, I think I neglected to "git add" some late changes.
>

+ if (has_rowfilter)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot set parameter \"%s\" to false for publication \"%s\"",
+ "publish_via_partition_root",
+ stmt->pubname),
+ errdetail("The publication contains a WHERE clause for partitioned table \"%s\" which is not allowed when \"%s\" is false.",
+get_rel_name(relid),
+"publish_via_partition_root")));

It still has the same problem: the table can be dropped just before
this message, and then get_rel_name() will return NULL, which we don't
expect.

Also, is there a reason that you haven't kept the test case added by Hou-San?

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-04-12 Thread Alvaro Herrera
Sorry, I think I neglected to "git add" some late changes.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
>From e7569ed4c4a01f782f9326ebc9a3c9049973ef4b Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 12 Apr 2022 12:59:50 +0200
Subject: [PATCH v3] fixup checking for rowfilter/collist on altering
 publication

---
 src/backend/commands/publicationcmds.c| 91 +++
 src/test/regress/expected/publication.out |  8 +-
 2 files changed, 49 insertions(+), 50 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7aacb6b2fe..5aa5201055 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -945,60 +945,59 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
 
 		foreach(lc, root_relids)
 		{
-			HeapTuple	rftuple;
 			Oid			relid = lfirst_oid(lc);
-			bool		has_column_list;
-			bool		has_row_filter;
+			char		relkind;
+			char	   *relname;
+			HeapTuple	rftuple;
+			bool		has_rowfilter;
+			bool		has_collist;
+
+			/* Beware: we don't have lock on the relations */
 
 			rftuple = SearchSysCache2(PUBLICATIONRELMAP,
 	  ObjectIdGetDatum(relid),
 	  ObjectIdGetDatum(pubform->oid));
-
-			has_row_filter
-= !heap_attisnull(rftuple, Anum_pg_publication_rel_prqual, NULL);
-
-			has_column_list
-= !heap_attisnull(rftuple, Anum_pg_publication_rel_prattrs, NULL);
-
-			if (HeapTupleIsValid(rftuple) &&
-(has_row_filter || has_column_list))
+			if (!HeapTupleIsValid(rftuple))
+continue;
+			has_rowfilter = !heap_attisnull(rftuple, Anum_pg_publication_rel_prqual, NULL);
+			has_collist = !heap_attisnull(rftuple, Anum_pg_publication_rel_prattrs, NULL);
+			if (!has_rowfilter && !has_collist)
 			{
-HeapTuple	tuple;
-
-tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
-if (HeapTupleIsValid(tuple))
-{
-	Form_pg_class relform = (Form_pg_class) GETSTRUCT(tuple);
-
-	if ((relform->relkind == RELKIND_PARTITIONED_TABLE) &&
-		has_row_filter)
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("cannot set %s for publication \"%s\"",
-		"publish_via_partition_root = false",
-		stmt->pubname),
- errdetail("The publication contains a WHERE clause for a partitioned table \"%s\" "
-		   "which is not allowed when %s is false.",
-		   NameStr(relform->relname),
-		   "publish_via_partition_root")));
-
-	if ((relform->relkind == RELKIND_PARTITIONED_TABLE) &&
-		has_column_list)
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("cannot set %s for publication \"%s\"",
-		"publish_via_partition_root = false",
-		stmt->pubname),
- errdetail("The publication contains a column list for a partitioned table \"%s\" "
-		   "which is not allowed when %s is false.",
-		   NameStr(relform->relname),
-		   "publish_via_partition_root")));
-
-	ReleaseSysCache(tuple);
-}
-
 ReleaseSysCache(rftuple);
+continue;
 			}
+
+			relkind = get_rel_relkind(relid);
+			if (relkind != RELKIND_PARTITIONED_TABLE)
+			{
+ReleaseSysCache(rftuple);
+continue;
+			}
+			relname = get_rel_name(relid);
+			if (relname == NULL)	/* table concurrently dropped */
+			{
+ReleaseSysCache(rftuple);
+continue;
+			}
+
+			if (has_rowfilter)
+ereport(ERROR,
+		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		 errmsg("cannot set parameter \"%s\" to false for publication \"%s\"",
+"publish_via_partition_root",
+stmt->pubname),
+		 errdetail("The publication contains a WHERE clause for partitioned table \"%s\" which is not allowed when \"%s\" is false.",
+   get_rel_name(relid),
+   "publish_via_partition_root")));
+			Assert(has_collist);
+			ereport(ERROR,
+	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+	 errmsg("cannot set parameter \"%s\" to false for publication \"%s\"",
+			"publish_via_partition_root",
+			stmt->pubname),
+	 errdetail("The publication contains a column list for partitioned table \"%s\", which is not allowed when \"%s\" is false.",
+			   get_rel_name(relid),
+			   "publish_via_partition_root")));
 		}
 	}
 
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 8208f9fa0e..37cf11e785 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -588,8 +588,8 @@ UPDATE rf_tbl_abcd_part_pk SET a = 1;
 -- fail - cannot set PUBLISH_VIA_PARTITION_ROOT to false if any row filter is
 -- used for partitioned table
 ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
-ERROR:  cannot set publish_via_partition_root = false for publication "testpub6"
-DETAIL:  The publication contains a WHERE clause for a partitioned table 

Re: row filtering for logical replication

2022-04-12 Thread Alvaro Herrera
On 2022-Apr-12, Amit Kapila wrote:

> On Tue, Apr 12, 2022 at 3:45 PM Amit Kapila  wrote:

> > We don't have a lock on the relation, so if it gets dropped
> > concurrently, it won't behave sanely. For example, get_rel_name() will
> > return NULL which seems incorrect to me.

Oh, oops ... a trap for the unwary?  Anyway, yes, we can disregard the
entry when get_rel_name returns null.  Amended patch attached.

> > I am not sure about this as well because you will instead do a RELOID
> > cache lookup even when there is no row filter or column list.

I guess my assumption is that the pg_class cache is typically more
populated than other relcaches, but that's unsubstantiated.  I'm not
sure if we have any way to tell which one is the more common case.
Anyway, let's do it the way you already had it.

> It seems to me that we have a similar coding pattern in ExecGrant_Relation().

Not sure what you mean?  In that function, when the syscache lookup
returns NULL, an error is raised.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El número de instalaciones de UNIX se ha elevado a 10,
y se espera que este número aumente" (UPM, 1972)
>From 631a6d04cbe420164833dd4e88a86d0e076fd47d Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 12 Apr 2022 12:59:50 +0200
Subject: [PATCH v2] fixup checking for rowfilter/collist on altering
 publication

---
 src/backend/commands/publicationcmds.c| 91 +++
 src/test/regress/expected/publication.out |  8 +-
 2 files changed, 49 insertions(+), 50 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7aacb6b2fe..59fc39e9f2 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -945,60 +945,59 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
 
 		foreach(lc, root_relids)
 		{
-			HeapTuple	rftuple;
 			Oid			relid = lfirst_oid(lc);
-			bool		has_column_list;
-			bool		has_row_filter;
+			char		relkind;
+			char	   *relname;
+			HeapTuple	rftuple;
+			bool		has_rowfilter;
+			bool		has_collist;
+
+			/* Beware: we don't have lock on the relations */
 
 			rftuple = SearchSysCache2(PUBLICATIONRELMAP,
 	  ObjectIdGetDatum(relid),
 	  ObjectIdGetDatum(pubform->oid));
-
-			has_row_filter
-= !heap_attisnull(rftuple, Anum_pg_publication_rel_prqual, NULL);
-
-			has_column_list
-= !heap_attisnull(rftuple, Anum_pg_publication_rel_prattrs, NULL);
-
-			if (HeapTupleIsValid(rftuple) &&
-(has_row_filter || has_column_list))
+			if (!HeapTupleIsValid(rftuple))
+continue;
+			has_rowfilter = !heap_attisnull(rftuple, Anum_pg_publication_rel_prqual, NULL);
+			has_collist = !heap_attisnull(rftuple, Anum_pg_publication_rel_prattrs, NULL);
+			if (!has_rowfilter && !has_collist)
 			{
-HeapTuple	tuple;
-
-tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
-if (HeapTupleIsValid(tuple))
-{
-	Form_pg_class relform = (Form_pg_class) GETSTRUCT(tuple);
-
-	if ((relform->relkind == RELKIND_PARTITIONED_TABLE) &&
-		has_row_filter)
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("cannot set %s for publication \"%s\"",
-		"publish_via_partition_root = false",
-		stmt->pubname),
- errdetail("The publication contains a WHERE clause for a partitioned table \"%s\" "
-		   "which is not allowed when %s is false.",
-		   NameStr(relform->relname),
-		   "publish_via_partition_root")));
-
-	if ((relform->relkind == RELKIND_PARTITIONED_TABLE) &&
-		has_column_list)
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("cannot set %s for publication \"%s\"",
-		"publish_via_partition_root = false",
-		stmt->pubname),
- errdetail("The publication contains a column list for a partitioned table \"%s\" "
-		   "which is not allowed when %s is false.",
-		   NameStr(relform->relname),
-		   "publish_via_partition_root")));
-
-	ReleaseSysCache(tuple);
-}
-
 ReleaseSysCache(rftuple);
+continue;
 			}
+
+			relkind = get_rel_relkind(relid);
+			if (relkind != RELKIND_PARTITIONED_TABLE)
+			{
+ReleaseSysCache(rftuple);
+continue;
+			}
+			relname = get_rel_name(relid);
+			if (relname == NULL)	/* table concurrently dropped */
+			{
+ReleaseSysCache(rftuple);
+continue;
+			}
+
+			if (has_rowfilter)
+ereport(ERROR,
+		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		 errmsg("cannot set parameter \"%s\" to false for publication \"%s\"",
+"publish_via_partition_root",
+stmt->pubname),
+		 errdetail("The publication includes partitioned table \"%s\" with a WHERE clause, which is not allowed when \"%s\" is false.",
+   get_rel_name(relid),
+   "publish_via_partition_root")));
+			Assert(has_collist);
+			ereport(ERROR,
+		

Re: row filtering for logical replication

2022-04-12 Thread Amit Kapila
On Tue, Apr 12, 2022 at 3:45 PM Amit Kapila  wrote:
>
> On Tue, Apr 12, 2022 at 2:35 PM Alvaro Herrera  
> wrote:
> >
> > I understand that this is a minimal fix, and for that it seems OK, but I
> > think the surrounding style is rather baroque.  This code can be made
> > simpler.  Here's my take on it.
> >
>
> We don't have a lock on the relation, so if it gets dropped
> concurrently, it won't behave sanely. For example, get_rel_name() will
> return NULL which seems incorrect to me.
>

It seems to me that we have a similar coding pattern in ExecGrant_Relation().

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-04-12 Thread Amit Kapila
On Tue, Apr 12, 2022 at 2:35 PM Alvaro Herrera  wrote:
>
> I understand that this is a minimal fix, and for that it seems OK, but I
> think the surrounding style is rather baroque.  This code can be made
> simpler.  Here's my take on it.
>

We don't have a lock on the relation, so if it gets dropped
concurrently, it won't behave sanely. For example, get_rel_name() will
return NULL which seems incorrect to me.

>  I think it's also faster: we avoid
> looking up pg_publication_rel entries for rels that aren't partitioned
> tables.
>

I am not sure about this as well because you will instead do a RELOID
cache lookup even when there is no row filter or column list.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-04-12 Thread Alvaro Herrera
I understand that this is a minimal fix, and for that it seems OK, but I
think the surrounding style is rather baroque.  This code can be made
simpler.  Here's my take on it.  I think it's also faster: we avoid
looking up pg_publication_rel entries for rels that aren't partitioned
tables.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Cuando mañana llegue pelearemos segun lo que mañana exija" (Mowgli)
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7aacb6b2fe..e8ef003fe5 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -945,60 +945,42 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
 
 		foreach(lc, root_relids)
 		{
-			HeapTuple	rftuple;
 			Oid			relid = lfirst_oid(lc);
-			bool		has_column_list;
-			bool		has_row_filter;
+			char		relkind;
+			HeapTuple	rftuple;
+
+			relkind = get_rel_relkind(relid);
+			if (relkind != RELKIND_PARTITIONED_TABLE)
+continue;
 
 			rftuple = SearchSysCache2(PUBLICATIONRELMAP,
 	  ObjectIdGetDatum(relid),
 	  ObjectIdGetDatum(pubform->oid));
+			if (!HeapTupleIsValid(rftuple))
+continue;
 
-			has_row_filter
-= !heap_attisnull(rftuple, Anum_pg_publication_rel_prqual, NULL);
+			if (!heap_attisnull(rftuple, Anum_pg_publication_rel_prqual, NULL))
+ereport(ERROR,
+		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		 errmsg("cannot set %s to false for publication \"%s\"",
+"publish_via_partition_root",
+stmt->pubname),
+		 errdetail("The publication contains a WHERE clause for a partitioned table \"%s\" which is not allowed when %s is false.",
+   get_rel_name(relid),
+   "publish_via_partition_root")));
 
-			has_column_list
-= !heap_attisnull(rftuple, Anum_pg_publication_rel_prattrs, NULL);
+			if (!heap_attisnull(rftuple, Anum_pg_publication_rel_prattrs, NULL))
+ereport(ERROR,
+		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		 errmsg("cannot set %s to false for publication \"%s\"",
+"publish_via_partition_root",
+stmt->pubname),
+		 errdetail("The publication contains a column list for a partitioned table \"%s\" which is not allowed when %s is false.",
+   get_rel_name(relid),
+   "publish_via_partition_root")));
 
-			if (HeapTupleIsValid(rftuple) &&
-(has_row_filter || has_column_list))
-			{
-HeapTuple	tuple;
+			ReleaseSysCache(rftuple);
 
-tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
-if (HeapTupleIsValid(tuple))
-{
-	Form_pg_class relform = (Form_pg_class) GETSTRUCT(tuple);
-
-	if ((relform->relkind == RELKIND_PARTITIONED_TABLE) &&
-		has_row_filter)
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("cannot set %s for publication \"%s\"",
-		"publish_via_partition_root = false",
-		stmt->pubname),
- errdetail("The publication contains a WHERE clause for a partitioned table \"%s\" "
-		   "which is not allowed when %s is false.",
-		   NameStr(relform->relname),
-		   "publish_via_partition_root")));
-
-	if ((relform->relkind == RELKIND_PARTITIONED_TABLE) &&
-		has_column_list)
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("cannot set %s for publication \"%s\"",
-		"publish_via_partition_root = false",
-		stmt->pubname),
- errdetail("The publication contains a column list for a partitioned table \"%s\" "
-		   "which is not allowed when %s is false.",
-		   NameStr(relform->relname),
-		   "publish_via_partition_root")));
-
-	ReleaseSysCache(tuple);
-}
-
-ReleaseSysCache(rftuple);
-			}
 		}
 	}
 
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 8208f9fa0e..580cc5be7f 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -588,7 +588,7 @@ UPDATE rf_tbl_abcd_part_pk SET a = 1;
 -- fail - cannot set PUBLISH_VIA_PARTITION_ROOT to false if any row filter is
 -- used for partitioned table
 ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
-ERROR:  cannot set publish_via_partition_root = false for publication "testpub6"
+ERROR:  cannot set publish_via_partition_root to false for publication "testpub6"
 DETAIL:  The publication contains a WHERE clause for a partitioned table "rf_tbl_abcd_part_pk" which is not allowed when publish_via_partition_root is false.
 -- Now change the root filter to use a column "b"
 -- (which is not in the replica identity)
@@ -951,7 +951,7 @@ UPDATE rf_tbl_abcd_part_pk SET a = 1;
 -- fail - cannot set PUBLISH_VIA_PARTITION_ROOT to false if any column list is
 -- used for partitioned table
 ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
-ERROR:  cannot set publish_via_partition_root = 

Re: row filtering for logical replication

2022-04-11 Thread Peter Smith
On Tue, Apr 12, 2022 at 11:31 AM houzj.f...@fujitsu.com
 wrote:
>
> On Tuesday, April 12, 2022 8:40 AM Peter Smith  wrote:
> >
> > FYI, I was playing with row filters and partitions recently, and while doing
> > something a bit unusual I received a cache leak warning.
> >
> > Below are the steps to reproduce it:
> >
> >
> > test_pub=# CREATE TABLE parent(a int primary key) PARTITION BY RANGE(a);
> > CREATE TABLE
> >
> > test_pub=# CREATE TABLE child PARTITION OF parent DEFAULT; CREATE TABLE
> >
> > test_pub=# CREATE PUBLICATION p4 FOR TABLE parent WHERE (a < 5), child
> > WHERE (a >= 5) WITH (publish_via_partition_root=true);
> > CREATE PUBLICATION
> >
> > test_pub=# ALTER PUBLICATION p4 SET TABLE parent, child WHERE (a >= 5);
> > ALTER PUBLICATION
> >
> > test_pub=# ALTER PUBLICATION p4 SET (publish_via_partition_root = false);
> > 2022-04-11 17:37:58.426 AEST [28152] WARNING:  cache reference leak:
> > cache pg_publication_rel (49), tuple 0/12 has count 1
> > WARNING:  cache reference leak: cache pg_publication_rel (49), tuple
> > 0/12 has count 1
> > ALTER PUBLICATION
>
> Thanks for reporting.
>
> I think the reason is that we didn't invoke ReleaseSysCache when rftuple is
> valid and no filter exists. We need to release the tuple whenever the
> rftuple is valid. Attach a patch which fix this.
>

Thanks! Your patch applies cleanly, and the reported problem now seems
fixed.

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




RE: row filtering for logical replication

2022-04-11 Thread houzj.f...@fujitsu.com
On Tuesday, April 12, 2022 8:40 AM Peter Smith  wrote:
> 
> FYI, I was playing with row filters and partitions recently, and while doing
> something a bit unusual I received a cache leak warning.
> 
> Below are the steps to reproduce it:
> 
> 
> test_pub=# CREATE TABLE parent(a int primary key) PARTITION BY RANGE(a);
> CREATE TABLE
> 
> test_pub=# CREATE TABLE child PARTITION OF parent DEFAULT; CREATE TABLE
> 
> test_pub=# CREATE PUBLICATION p4 FOR TABLE parent WHERE (a < 5), child
> WHERE (a >= 5) WITH (publish_via_partition_root=true);
> CREATE PUBLICATION
> 
> test_pub=# ALTER PUBLICATION p4 SET TABLE parent, child WHERE (a >= 5);
> ALTER PUBLICATION
> 
> test_pub=# ALTER PUBLICATION p4 SET (publish_via_partition_root = false);
> 2022-04-11 17:37:58.426 AEST [28152] WARNING:  cache reference leak:
> cache pg_publication_rel (49), tuple 0/12 has count 1
> WARNING:  cache reference leak: cache pg_publication_rel (49), tuple
> 0/12 has count 1
> ALTER PUBLICATION

Thanks for reporting.

I think the reason is that we didn't invoke ReleaseSysCache when rftuple is
valid and no filter exists. We need to release the tuple whenever rftuple
is valid. Attached is a patch which fixes this.
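The leak and its fix can be shown in miniature with a standalone reference-count mock (names invented for illustration; mock_search/mock_release stand in for SearchSysCache2/ReleaseSysCache):

```c
#include <stdbool.h>

static int	pin_count = 0;		/* stands in for the syscache reference count */

static bool
mock_search(void)
{
	pin_count++;				/* SearchSysCache2 pins the tuple */
	return true;				/* pretend the lookup found a valid tuple */
}

static void
mock_release(void)
{
	pin_count--;				/* ReleaseSysCache drops the pin */
}

/*
 * Fixed flow: once the lookup succeeded, release the tuple on every
 * path -- including the one where no row filter / column list exists.
 * The bug was releasing only inside the has-filter branch, leaking a
 * pin when the tuple was valid but carried no filter.
 */
static void
process_entry(bool has_filter)
{
	if (!mock_search())
		return;					/* invalid tuple: nothing was pinned */

	if (has_filter)
	{
		/* ... relkind checks would go here ... */
	}

	mock_release();				/* always reached when the tuple was valid */
}
```

Running the entry both with and without a filter leaves the pin count at zero, i.e. no "cache reference leak" warning.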

Best regards,
Hou zj


0001-Fix-missed-ReleaseSysCache-in-AlterPublicationOption.patch
Description:  0001-Fix-missed-ReleaseSysCache-in-AlterPublicationOption.patch


Re: row filtering for logical replication

2022-04-11 Thread Peter Smith
FYI, I was playing with row filters and partitions recently, and while
doing something a bit unusual I received a cache leak warning.

Below are the steps to reproduce it:


test_pub=# CREATE TABLE parent(a int primary key) PARTITION BY RANGE(a);
CREATE TABLE

test_pub=# CREATE TABLE child PARTITION OF parent DEFAULT;
CREATE TABLE

test_pub=# CREATE PUBLICATION p4 FOR TABLE parent WHERE (a < 5), child
WHERE (a >= 5) WITH (publish_via_partition_root=true);
CREATE PUBLICATION

test_pub=# ALTER PUBLICATION p4 SET TABLE parent, child WHERE (a >= 5);
ALTER PUBLICATION

test_pub=# ALTER PUBLICATION p4 SET (publish_via_partition_root = false);
2022-04-11 17:37:58.426 AEST [28152] WARNING:  cache reference leak:
cache pg_publication_rel (49), tuple 0/12 has count 1
WARNING:  cache reference leak: cache pg_publication_rel (49), tuple
0/12 has count 1
ALTER PUBLICATION

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-03-06 Thread Amit Kapila
On Mon, Mar 7, 2022 at 12:50 AM Tomas Vondra
 wrote:
>
> On 3/3/22 21:07, Euler Taveira wrote:
> > On Thu, Mar 3, 2022, at 7:47 AM, Amit Kapila wrote:
> >> LGTM. I'll push this tomorrow unless Tomas or Euler feels otherwise.
> > Sounds good to me.
> >
>
> +1
>

Thanks, Pushed 
(https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ceb57afd3ce177e897cb4c5b44aa683fc0036782).

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-03-06 Thread Tomas Vondra
On 3/3/22 21:07, Euler Taveira wrote:
> On Thu, Mar 3, 2022, at 7:47 AM, Amit Kapila wrote:
>> LGTM. I'll push this tomorrow unless Tomas or Euler feels otherwise.
> Sounds good to me.
> 

+1

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: row filtering for logical replication

2022-03-03 Thread Euler Taveira
On Thu, Mar 3, 2022, at 7:47 AM, Amit Kapila wrote:
> LGTM. I'll push this tomorrow unless Tomas or Euler feels otherwise.
Sounds good to me.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2022-03-03 Thread Amit Kapila
On Thu, Mar 3, 2022 at 11:18 AM shiy.f...@fujitsu.com
 wrote:
>
> On Thu, Mar 3, 2022 10:40 AM Amit Kapila  wrote:
> >
> > On Wed, Mar 2, 2022 at 5:42 PM Euler Taveira  wrote:
> > >
> > > On Wed, Mar 2, 2022, at 8:45 AM, Tomas Vondra wrote:
> > >
> > > While working on the column filtering patch, which touches about the
> > > same places, I noticed two minor gaps in testing:
> > >
> > > 1) The regression tests do perform multiple ALTER PUBLICATION commands,
> > > tweaking the row filter. But there are no checks the row filter was
> > > actually modified / stored in the catalog. It might be just thrown away
> > > and no one would notice.
> > >
> > > The test that row filter was modified is available in a previous section. 
> > > The
> > > one that you modified (0001) is testing the supported objects.
> > >
> >
> > Right. But if Tomas thinks it is good to add for these ones as well
> > then I don't mind.
> >
> > > 153 ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000
> > AND e < 2000);
> > > 154 \dRp+ testpub5
> > > 155 ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
> > > 156 \dRp+ testpub5
> > > 157 -- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE
> > expression)
> > > 158 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300
> > AND e < 500);
> > > 159 \dRp+ testpub5
> > >
> > > IIRC this test was written before adding the row filter information into 
> > > the
> > > psql. We could add \d+ testpub_rf_tbl3 before and after the modification.
> > >
> >
> >
> > Agreed. We can use \d instead of \d+ as row filter is available with \d.
> >
> > > 2) There are no pg_dump tests.
> > >
> > > WFM.
> > >
> >
> > This is a miss. I feel we can add a few more.
> >
>
> Agreed that we can add some tests. Attached is a patch which fixes these two
> points.
>

LGTM. I'll push this tomorrow unless Tomas or Euler feels otherwise.

-- 
With Regards,
Amit Kapila.




RE: row filtering for logical replication

2022-03-02 Thread shiy.f...@fujitsu.com
On Thu, Mar 3, 2022 10:40 AM Amit Kapila  wrote:
> 
> On Wed, Mar 2, 2022 at 5:42 PM Euler Taveira  wrote:
> >
> > On Wed, Mar 2, 2022, at 8:45 AM, Tomas Vondra wrote:
> >
> > While working on the column filtering patch, which touches about the
> > same places, I noticed two minor gaps in testing:
> >
> > 1) The regression tests do perform multiple ALTER PUBLICATION commands,
> > tweaking the row filter. But there are no checks the row filter was
> > actually modified / stored in the catalog. It might be just thrown away
> > and no one would notice.
> >
> > The test that row filter was modified is available in a previous section. 
> > The
> > one that you modified (0001) is testing the supported objects.
> >
> 
> Right. But if Tomas thinks it is good to add for these ones as well
> then I don't mind.
> 
> > 153 ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000
> AND e < 2000);
> > 154 \dRp+ testpub5
> > 155 ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
> > 156 \dRp+ testpub5
> > 157 -- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE
> expression)
> > 158 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300
> AND e < 500);
> > 159 \dRp+ testpub5
> >
> > IIRC this test was written before adding the row filter information into the
> > psql. We could add \d+ testpub_rf_tbl3 before and after the modification.
> >
> 
> 
> Agreed. We can use \d instead of \d+ as row filter is available with \d.
> 
> > 2) There are no pg_dump tests.
> >
> > WFM.
> >
> 
> This is a miss. I feel we can add a few more.
> 

Agreed that we can add some tests. Attached is a patch which fixes these two points.

Regards,
Shi yu 


0001-Add-some-tests-for-row-filter-in-logical-replication.patch
Description:  0001-Add-some-tests-for-row-filter-in-logical-replication.patch


Re: row filtering for logical replication

2022-03-02 Thread Amit Kapila
On Wed, Mar 2, 2022 at 5:42 PM Euler Taveira  wrote:
>
> On Wed, Mar 2, 2022, at 8:45 AM, Tomas Vondra wrote:
>
> While working on the column filtering patch, which touches about the
> same places, I noticed two minor gaps in testing:
>
> 1) The regression tests do perform multiple ALTER PUBLICATION commands,
> tweaking the row filter. But there are no checks the row filter was
> actually modified / stored in the catalog. It might be just thrown away
> and no one would notice.
>
> The test that row filter was modified is available in a previous section. The
> one that you modified (0001) is testing the supported objects.
>

Right. But if Tomas thinks it is good to add for these ones as well
then I don't mind.

> 153 ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND 
> e < 2000);
> 154 \dRp+ testpub5
> 155 ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
> 156 \dRp+ testpub5
> 157 -- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE 
> expression)
> 158 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e 
> < 500);
> 159 \dRp+ testpub5
>
> IIRC this test was written before adding the row filter information into the
> psql. We could add \d+ testpub_rf_tbl3 before and after the modification.
>


Agreed. We can use \d instead of \d+ as row filter is available with \d.

> 2) There are no pg_dump tests.
>
> WFM.
>

This is a miss. I feel we can add a few more.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-03-02 Thread Euler Taveira
On Wed, Mar 2, 2022, at 8:45 AM, Tomas Vondra wrote:
> While working on the column filtering patch, which touches about the
> same places, I noticed two minor gaps in testing:
> 
> 1) The regression tests do perform multiple ALTER PUBLICATION commands,
> tweaking the row filter. But there are no checks the row filter was
> actually modified / stored in the catalog. It might be just thrown away
> and no one would notice.
The test that row filter was modified is available in a previous section. The
one that you modified (0001) is testing the supported objects.

153 ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e 
< 2000);
154 \dRp+ testpub5
155 ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
156 \dRp+ testpub5
157 -- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE 
expression)
158 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 
500);
159 \dRp+ testpub5

IIRC this test was written before adding the row filter information into the
psql. We could add \d+ testpub_rf_tbl3 before and after the modification.

> 2) There are no pg_dump tests.
WFM.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2022-03-02 Thread Tomas Vondra
Hi,

While working on the column filtering patch, which touches about the
same places, I noticed two minor gaps in testing:

1) The regression tests do perform multiple ALTER PUBLICATION commands,
tweaking the row filter. But there are no checks the row filter was
actually modified / stored in the catalog. It might be just thrown away
and no one would notice.

2) There are no pg_dump tests.


So attached are two trivial patches, addressing this. The first one adds
a couple \dRp and \d commands, to show what the catalogs contain. The
second one adds a simple pg_dump test.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From e781f840e38701c63d8b57ff36bd520f2cced6ad Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Sat, 26 Feb 2022 17:33:09 +0100
Subject: [PATCH 1/3] Verify changing WHERE condition for a publication

Commit 52e4f0cd47 added support for row filters in logical replication,
including regression tests with multiple ALTER PUBLICATION commands,
modifying the row filter. But the tests never verified that the row
filter was actually updated in the catalog. This adds a couple \d and
\dRp commands, to verify the catalog was updated.
---
 src/test/regress/expected/publication.out | 66 +++
 src/test/regress/sql/publication.sql  |  8 +++
 2 files changed, 74 insertions(+)

diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 3c382e520e4..227ce759486 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -395,15 +395,81 @@ LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (b < '2' CO...
 DETAIL:  User-defined collations are not allowed.
 -- ok - NULLIF is allowed
 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+\d+ testpub_rf_tbl1
+  Table "public.testpub_rf_tbl1"
+ Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description 
++-+---+--+-+--+--+-
+ a  | integer |   |  | | plain|  | 
+ b  | text|   |  | | extended |  | 
+Publications:
+"testpub5" WHERE (NULLIF(1, 2) = a)
+
+\dRp+ testpub5
+Publication testpub5
+  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--++-+-+-+---+--
+ regress_publication_user | f  | t   | f   | f   | f | f
+Tables:
+"public.testpub_rf_tbl1" WHERE (NULLIF(1, 2) = a)
+
 -- ok - built-in operators are allowed
 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+\d+ testpub_rf_tbl1
+  Table "public.testpub_rf_tbl1"
+ Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description 
++-+---+--+-+--+--+-
+ a  | integer |   |  | | plain|  | 
+ b  | text|   |  | | extended |  | 
+Publications:
+"testpub5" WHERE (a IS NULL)
+
+\dRp+ testpub5
+Publication testpub5
+  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--++-+-+-+---+--
+ regress_publication_user | f  | t   | f   | f   | f | f
+Tables:
+"public.testpub_rf_tbl1" WHERE (a IS NULL)
+
 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS DISTINCT FROM 5);
 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a, a + 1) < (2, 3));
 -- ok - built-in type coercions between two binary compatible datatypes are allowed
 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (b::varchar < '2');
+\d+ testpub_rf_tbl1
+  Table "public.testpub_rf_tbl1"
+ Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description 
++-+---+--+-+--+--+-
+ a  | integer |   |  | | plain|  | 
+ b  | text|   |  | | extended |  | 
+Publications:
+"testpub5" WHERE (((b)::character varying)::text < '2'::text)
+
+\dRp+ testpub5
+Publication testpub5
+  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--++-+-+-+---+--
+ regress_publication_user | f  | 

Re: row filtering for logical replication

2022-02-23 Thread Peter Smith
On Thu, Feb 24, 2022 at 1:33 PM Amit Kapila  wrote:
>
> On Thu, Feb 24, 2022 at 7:57 AM Peter Smith  wrote:
> >
> > I noticed that there was a build-farm failure on the machine 'komodoensis' 
> > [1]
> >
> > #   Failed test 'check replicated rows to tab_rowfilter_toast'
> > #   at t/028_row_filter.pl line 687.
> > #  got: ''
> > # expected: 't|1'
> > # Looks like you failed 1 test of 20.
> > [18:21:24] t/028_row_filter.pl 
> > Dubious, test returned 1 (wstat 256, 0x100)
> > Failed 1/20 subtests
> >
> > That failure looks intermittent because from the history you can see
> > the same machine already passed multiple times in this test case.
> >
> > When I investigated the test case I noticed there seems to be a
> > missing "catchup" ($node_publisher->wait_for_catchup($appname);), so
> > sometimes if the replication happens too slowly then the expected row
> > might not be found on the subscriber side.
> >
>
> Your analysis seems correct to me and it is evident from the result as
> well. Reviewing the test, it seems other similar places already have
> the catchup but it is missed after this update test.
>
> > I will post a patch to fix this shortly.
> >
>
> Thanks.
>

PSA a patch to fix the observed [1] build-farm failure.

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPv%3De9Qd1TSYo8Og6x6Abfz3b9_htwinLp4ENPgV45DACQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-Fix-BF-test-fail-caused-by-bad-timing.patch
Description: Binary data


Re: row filtering for logical replication

2022-02-23 Thread Amit Kapila
On Thu, Feb 24, 2022 at 7:57 AM Peter Smith  wrote:
>
> I noticed that there was a build-farm failure on the machine 'komodoensis' [1]
>
> #   Failed test 'check replicated rows to tab_rowfilter_toast'
> #   at t/028_row_filter.pl line 687.
> #  got: ''
> # expected: 't|1'
> # Looks like you failed 1 test of 20.
> [18:21:24] t/028_row_filter.pl 
> Dubious, test returned 1 (wstat 256, 0x100)
> Failed 1/20 subtests
>
> That failure looks intermittent because from the history you can see
> the same machine already passed multiple times in this test case.
>
> When I investigated the test case I noticed there seems to be a
> missing "catchup" ($node_publisher->wait_for_catchup($appname);), so
> sometimes if the replication happens too slowly then the expected row
> might not be found on the subscriber side.
>

Your analysis seems correct to me and it is evident from the result as
well. Reviewing the test, it seems other similar places already have
the catchup but it is missed after this update test.

> I will post a patch to fix this shortly.
>

Thanks.

-- 
With Regards,
Amit Kapila.




RE: row filtering for logical replication

2022-02-23 Thread Shinoda, Noriyoshi (PN Japan FSIP)
> You can give it for multiple tables. See below as an example:

Thank you very much. I understood.

Regards,
Noriyoshi Shinoda
-Original Message-
From: Amit Kapila  
Sent: Thursday, February 24, 2022 11:25 AM
To: Shinoda, Noriyoshi (PN Japan FSIP) 
Cc: Euler Taveira ; houzj.f...@fujitsu.com; Peter Smith 
; Alvaro Herrera ; Greg 
Nancarrow ; vignesh C ; Ajin Cherian 
; tanghy.f...@fujitsu.com; Dilip Kumar 
; Rahila Syed ; Peter Eisentraut 
; Önder Kalacı ; 
japin ; Michael Paquier ; David 
Steele ; Craig Ringer ; Amit 
Langote ; PostgreSQL Hackers 

Subject: Re: row filtering for logical replication

On Thu, Feb 24, 2022 at 7:43 AM Shinoda, Noriyoshi (PN Japan FSIP) 
 wrote:
>
> Hi,
> Thank you for developing this great feature.
> If multiple tables are specified when creating a PUBLICATION, is the
> WHERE clause condition supposed to apply to only one table?
>

You can give it for multiple tables. See below as an example:

> --- operation log ---
> postgres=> CREATE TABLE data1(c1 INT PRIMARY KEY, c2 VARCHAR(10));
> CREATE TABLE
> postgres=> CREATE TABLE data2(c1 INT PRIMARY KEY, c2 VARCHAR(10));
> CREATE TABLE
> postgres=> CREATE PUBLICATION pub1 FOR TABLE data1,data2 WHERE (c1 < 1000);
> CREATE PUBLICATION

postgres=# CREATE PUBLICATION pub_data_1 FOR TABLE data1 WHERE (c1 > 10), data2
WHERE (c1 < 1000);
CREATE PUBLICATION

--
With Regards,
Amit Kapila.


Re: row filtering for logical replication

2022-02-23 Thread Peter Smith
I noticed that there was a build-farm failure on the machine 'komodoensis' [1]

#   Failed test 'check replicated rows to tab_rowfilter_toast'
#   at t/028_row_filter.pl line 687.
#  got: ''
# expected: 't|1'
# Looks like you failed 1 test of 20.
[18:21:24] t/028_row_filter.pl 
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/20 subtests

That failure looks intermittent because from the history you can see
the same machine already passed multiple times in this test case.

When I investigated the test case I noticed there seems to be a
missing "catchup" ($node_publisher->wait_for_catchup($appname);), so
sometimes if the replication happens too slowly then the expected row
might not be found on the subscriber side.

I will post a patch to fix this shortly.
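For anyone reading the archives: the pattern that wait_for_catchup implements is a simple poll-until-caught-up loop. A simplified sketch in Python (illustrative only -- the real helper is Perl in PostgreSQL's TAP framework, and the names below are hypothetical):

```python
import time

def wait_for_catchup(get_replayed_lsn, target_lsn, timeout=180, interval=0.1):
    """Poll until the subscriber has replayed up to target_lsn, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_replayed_lsn() >= target_lsn:
            return True
        time.sleep(interval)
    raise TimeoutError("subscriber did not catch up in time")

# Without such a wait, checking the subscriber immediately after a
# publisher-side INSERT can observe missing rows simply because
# replication has not completed yet -- the intermittent failure above.
progress = iter([10, 20, 30])              # simulated replay positions
print(wait_for_catchup(lambda: next(progress), 30, timeout=5))  # True
```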

--
[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2022-02-23%2016%3A12%3A03

Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: row filtering for logical replication

2022-02-23 Thread Amit Kapila
On Thu, Feb 24, 2022 at 7:43 AM Shinoda, Noriyoshi (PN Japan FSIP)
 wrote:
>
> Hi,
> Thank you for developing this great feature.
> If multiple tables are specified when creating a PUBLICATION,
> is the WHERE clause condition supposed to apply to only one table?
>

You can give it for multiple tables. See below as an example:

> --- operation log ---
> postgres=> CREATE TABLE data1(c1 INT PRIMARY KEY, c2 VARCHAR(10));
> CREATE TABLE
> postgres=> CREATE TABLE data2(c1 INT PRIMARY KEY, c2 VARCHAR(10));
> CREATE TABLE
> postgres=> CREATE PUBLICATION pub1 FOR TABLE data1,data2 WHERE (c1 < 1000);
> CREATE PUBLICATION

postgres=# CREATE PUBLICATION pub_data_1 FOR TABLE data1 WHERE (c1 >
10), data2 WHERE (c1 < 1000);
CREATE PUBLICATION

-- 
With Regards,
Amit Kapila.




RE: row filtering for logical replication

2022-02-23 Thread Shinoda, Noriyoshi (PN Japan FSIP)
Hi, 
Thank you for developing this great feature.
If multiple tables are specified when creating a PUBLICATION,
is the WHERE clause condition supposed to apply to only one table?
I attached the operation log below.

--- operation log ---
postgres=> CREATE TABLE data1(c1 INT PRIMARY KEY, c2 VARCHAR(10));
CREATE TABLE
postgres=> CREATE TABLE data2(c1 INT PRIMARY KEY, c2 VARCHAR(10));
CREATE TABLE
postgres=> CREATE PUBLICATION pub1 FOR TABLE data1,data2 WHERE (c1 < 1000);
CREATE PUBLICATION
postgres=> \d data1
  Table "public.data1"
 Column | Type  | Collation | Nullable | Default
+---+---+--+-
 c1 | integer   |   | not null |
 c2 | character varying(10) |   |  |
Indexes:
"data1_pkey" PRIMARY KEY, btree (c1)
Publications:
"pub1"

postgres=> \d data2
  Table "public.data2"
 Column | Type  | Collation | Nullable | Default
+---+---+--+-
 c1 | integer   |   | not null |
 c2 | character varying(10) |   |  |
Indexes:
"data2_pkey" PRIMARY KEY, btree (c1)
Publications:
"pub1" WHERE (c1 < 1000)

postgres=> SELECT prrelid, prqual FROM pg_publication_rel;
 prrelid |   prqual
-+-
   16408 |
   16413 | {OPEXPR :opno 97 :opfuncid 66 :opresulttype 16 :opretset false :opcol
lid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1
:varcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 53} {CONST :con
sttype 23 :consttypmod -1 :constcollid 0 :constlen 4 :constbyval true :constisnu
ll false :location 58 :constvalue 4 [ -24 3 0 0 0 0 0 0 ]}) :location 56}
(2 rows)

Regards,
Noriyoshi Shinoda

-Original Message-
From: Amit Kapila  
Sent: Wednesday, February 23, 2022 11:06 AM
To: Euler Taveira 
Cc: houzj.f...@fujitsu.com; Peter Smith ; Alvaro Herrera 
; Greg Nancarrow ; vignesh C 
; Ajin Cherian ; 
tanghy.f...@fujitsu.com; Dilip Kumar ; Rahila Syed 
; Peter Eisentraut ; 
Önder Kalacı ; japin ; Michael 
Paquier ; David Steele ; Craig Ringer 
; Amit Langote ; PostgreSQL 
Hackers 
Subject: Re: row filtering for logical replication

On Tue, Feb 22, 2022 at 4:47 AM Euler Taveira  wrote:
>
> On Thu, Feb 17, 2022, at 3:36 AM, Amit Kapila wrote:
>
> As there is a new version, I would like to wait for a few more days 
> before committing. I am planning to commit this early next week (by
> Tuesday) unless others or I see any more things that can be improved.
>
> Amit, I don't have additional comments or suggestions. Let's move on. 
> Next topic. :-)
>

Pushed!

--
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-02-22 Thread Amit Kapila
On Tue, Feb 22, 2022 at 4:47 AM Euler Taveira  wrote:
>
> On Thu, Feb 17, 2022, at 3:36 AM, Amit Kapila wrote:
>
> As there is a new version, I would like to wait for a few more days
> before committing. I am planning to commit this early next week (by
> Tuesday) unless others or I see any more things that can be improved.
>
> Amit, I don't have additional comments or suggestions. Let's move on. Next
> topic. :-)
>

Pushed!

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-02-21 Thread Euler Taveira
On Thu, Feb 17, 2022, at 3:36 AM, Amit Kapila wrote:
> As there is a new version, I would like to wait for a few more days
> before committing. I am planning to commit this early next week (by
> Tuesday) unless others or I see any more things that can be improved.
Amit, I don't have additional comments or suggestions. Let's move on. Next
topic. :-)

> I would once like to mention the replica identity handling of the
> patch. Right now, (on HEAD) we are not checking the replica identity
> combination at DDL time, they are checked at execution time in
> CheckCmdReplicaIdentity(). This patch follows the same scheme and
> gives an error at the time of update/delete if the table publishes
> update/delete and the publication(s) has a row filter that contains
> non-replica-identity columns. We had earlier thought of handling it at
> DDL time but that won't follow the existing scheme and has a lot of
> complications as explained in emails [1][2]. Do let me know if you see
> any problem here?
IMO it is not an issue that this patch needs to solve. The conclusion of
checking the RI at the DDL time vs execution time is that:

* the current patch just follows the same pattern used in the current logical
  replication implementation;
* it is easier to check during execution time (a central point) versus a lot of
  combinations for DDL commands;
* the check during DDL time might eventually break if new subcommands are
  added;
* the execution time does not have the maintenance burden imposed by new DDL
  subcommands;
* we might change the RI check to execute at DDL time if the current
  implementation imposes a significant penalty in certain workloads.

Again, it is material for another patch.
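Schematically, the execution-time check amounts to a set-containment test performed when an UPDATE/DELETE is executed, rather than at DDL time. A toy model in Python (illustrative only; the real check lives in CheckCmdReplicaIdentity() in C, and all names here are made up):

```python
def check_cmd_replica_identity(filter_columns, replica_identity_columns,
                               publishes_update_or_delete):
    """Raise if a published UPDATE/DELETE uses a row filter with
    columns that are not part of the replica identity."""
    if not publishes_update_or_delete:
        return                        # inserts are never affected
    missing = set(filter_columns) - set(replica_identity_columns)
    if missing:
        raise ValueError(
            "row filter column(s) %s not part of the replica identity"
            % sorted(missing))

# DDL time: creating the publication succeeds regardless ...
filter_cols = {"b"}                   # e.g. WHERE (b > 99)
replica_identity = {"a"}              # e.g. PRIMARY KEY (a)

# ... the error only surfaces when an UPDATE is attempted:
try:
    check_cmd_replica_identity(filter_cols, replica_identity, True)
except ValueError as e:
    print(e)    # row filter column(s) ['b'] not part of the replica identity
```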

Thanks for taking care of a feature that has been discussed for 4 years [1].

[1] 
https://www.postgresql.org/message-id/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2022-02-17 Thread Peter Smith
On Thu, Feb 17, 2022 at 5:37 PM Amit Kapila  wrote:
...
> As there is a new version, I would like to wait for a few more days
> before committing. I am planning to commit this early next week (by
> Tuesday) unless others or I see any more things that can be improved.

I have no more review comments.

This Row Filter patch v85 LGTM.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-02-10 Thread Amit Kapila
On Thu, Feb 10, 2022 at 9:29 AM houzj.f...@fujitsu.com
 wrote:
>
>
> Attach the V80 patch which addressed the above comments and
> comments from Amit[1].
>

Thanks for the new version. Few minor/cosmetic comments:

1. Can we slightly change the below comment:
Before:
+ * To avoid fetching the publication information, we cache the publication
+ * actions and row filter validation information.

After:
To avoid fetching the publication information repeatedly, we cache the
publication actions and row filter validation information.

2.
+ /*
+ * For ordinary tables, make sure we don't copy data from child
+ * that inherits the named table.
+ */
+ if (lrel.relkind == RELKIND_RELATION)
+ appendStringInfoString(&cmd, " ONLY ");

I think we should mention the reason why we are doing so. So how about
something like: "For regular tables, make sure we don't copy data from
a child that inherits the named table as those will be copied
separately."

3.
Can we change the below comment?

Before:
+ /*
+ * Initialize the tuple slot, map and row filter that are only used
+ * when publishing inserts, updates or deletes.
+ */

After:
Initialize the tuple slot, map, and row filter. These are only used
when publishing inserts, updates, or deletes.

4.
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);

Here, you can add a comment saying: "-- Test row filters" or something
on those lines.

5.
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1
WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+  Table "public.testpub_rf_tbl1"
+ Column |  Type   | Collation | Nullable | Default | Storage  | Stats
target | Description
++-+---+--+-+--+--+-
+ a  | integer |   |  | | plain|  |
+ b  | text|   |  | | extended |  |
+Publications:
+"testpub_dplus_rf_no"
+"testpub_dplus_rf_yes" WHERE (a > 1)

I think here \d is sufficient to show row filters? I think it is
better to use table names such as testpub_rf_yes or testpub_rf_no in
this test.

6.
+# Similarly, the table filter for tab_rf_x (after the initial phase) has no
+# effect when combined with the ALL TABLES IN SCHEMA.
+# Expected: 5 initial rows + 2 new rows = 7 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_rf_x (x)
VALUES (-99), (99)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
FROM tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');

I think the comment here should say "ALL TABLES." instead of "ALL
TABLES IN SCHEMA." as there is no publication before this test which
is created with "ALL TABLES IN SCHEMA" clause.

7.
+# The subscription of the ALL TABLES IN SCHEMA publication means
there should be
+# no filtering on the tablesync COPY, so all expect all 5 will be present.

It doesn't make sense to use 'all' twice in the above comment, the
first one can be removed.

8.
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',

I think it will be good if we can add some generic comments explaining
the purpose of the tests following this. We can add "# Tests FOR TABLE
with row filter publications" before the current comment.

9. For the newly added test for tab_rowfilter_inherited, the patch has
a test case only for initial sync, can we add a test for replication
after initial sync for the same?

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-02-09 Thread Peter Smith
On Tue, Jan 25, 2022 at 2:18 PM houzj.f...@fujitsu.com
 wrote:
>
...

> > 4. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc
> >
> > - if (relation->rd_pubactions)
> > + if (relation->rd_pubdesc)
> >   {
> > - pfree(relation->rd_pubactions);
> > - relation->rd_pubactions = NULL;
> > + pfree(relation->rd_pubdesc);
> > + relation->rd_pubdesc = NULL;
> >   }
> >
> > What is the purpose of this code? Can't it all just be removed?
> > e.g. Can't you Assert that relation->rd_pubdesc is NULL at this point?
> >
> > (if it was not-null the function would have returned immediately from the 
> > top)
>
> I think it might be better to change this as a separate patch.

OK. I have made a separate thread [1] for discussing this one.

--
[1] 
https://www.postgresql.org/message-id/flat/1524753.1644453267%40sss.pgh.pa.us#1c40bbc4126daaf75b927a021526654a

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-02-08 Thread Amit Kapila
On Wed, Feb 9, 2022 at 7:07 AM Peter Smith  wrote:
>
> 2. src/backend/commands/publicationcmds.c -
> contain_mutable_or_ud_functions_checker
>
> +/* check_functions_in_node callback */
> +static bool
> +contain_mutable_or_user_functions_checker(Oid func_id, void *context)
> +{
> + return (func_volatile(func_id) != PROVOLATILE_IMMUTABLE ||
> + func_id >= FirstNormalObjectId);
> +}
>
> I was wondering why is the checking for user function and mutable
> functions combined in one function like this.  IMO it might be better
> to have 2 "checker" callback functions instead of just one  - then the
> error messages can be split too so that only the relevant one is
> displayed to the user.
>

For that, we need to invoke the checker function multiple times for a
node and/or expression. So I am not sure if it is worth it.
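To spell out the trade-off: the walker invokes the checker once per function reference it finds, so a combined predicate needs a single pass over the expression, whereas split checkers would mean testing (or walking) twice. A schematic sketch in Python (illustrative only; the real code is C using check_functions_in_node(), and the toy catalog below is made up -- FirstNormalObjectId is the OID boundary below which objects are built-in):

```python
FIRST_NORMAL_OBJECT_ID = 16384   # OIDs >= this belong to user-defined objects

# Toy stand-in for func_volatile(): oid -> 'i'mmutable / 's'table / 'v'olatile
VOLATILITY = {65: 'i', 66: 'i', 1299: 's', 20001: 'i'}

def contain_mutable_or_user_functions_checker(func_id):
    # Combined predicate, mirroring the patch: flag the function if it is
    # not immutable OR if it is user-defined.
    return (VOLATILITY.get(func_id, 'v') != 'i'
            or func_id >= FIRST_NORMAL_OBJECT_ID)

def expr_contains_disallowed(func_ids, checker):
    """Single pass over the function OIDs referenced by an expression."""
    return any(checker(f) for f in func_ids)

assert not expr_contains_disallowed(
    [65, 66], contain_mutable_or_user_functions_checker)      # immutable builtins
assert expr_contains_disallowed(
    [65, 1299], contain_mutable_or_user_functions_checker)    # non-immutable builtin
assert expr_contains_disallowed(
    [20001], contain_mutable_or_user_functions_checker)       # user-defined, though immutable
```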


-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-02-08 Thread Peter Smith
I did a review of the v79 patch. Below are my review comments:

==

1. doc/src/sgml/ref/create_publication.sgml - CREATE PUBLICATION

The commit message for v79-0001 says:

If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.


I think that the same information should also be mentioned in the PG
DOCS for CREATE PUBLICATION note about the WHERE clause.

~~~

2. src/backend/commands/publicationcmds.c -
contain_mutable_or_ud_functions_checker

+/* check_functions_in_node callback */
+static bool
+contain_mutable_or_user_functions_checker(Oid func_id, void *context)
+{
+ return (func_volatile(func_id) != PROVOLATILE_IMMUTABLE ||
+ func_id >= FirstNormalObjectId);
+}

I was wondering why is the checking for user function and mutable
functions combined in one function like this.  IMO it might be better
to have 2 "checker" callback functions instead of just one  - then the
error messages can be split too so that only the relevant one is
displayed to the user.

BEFORE
contain_mutable_or_user_functions_checker --> "User-defined or
built-in mutable functions are not allowed."

AFTER
contain_user_functions_checker --> "User-defined functions are not allowed."
contain_mutable_function_checker --> "Built-in mutable functions are
not allowed."

~~~

3. src/backend/commands/publicationcmds.c - check_simple_rowfilter_expr_walker

+ case T_Const:
+ case T_FuncExpr:
+ case T_BoolExpr:
+ case T_RelabelType:
+ case T_CollateExpr:
+ case T_CaseExpr:
+ case T_CaseTestExpr:
+ case T_ArrayExpr:
+ case T_CoalesceExpr:
+ case T_MinMaxExpr:
+ case T_XmlExpr:
+ case T_NullTest:
+ case T_BooleanTest:
+ case T_List:
+ break;

Perhaps a comment should be added here simply saying "OK, supported"
just to make it more obvious?

~~~

4. src/test/regress/sql/publication.sql - test comment

+-- fail - user-defined types disallowed

For consistency with the nearby comments it would be better to reword this one:
 "fail - user-defined types are not allowed"

~~~

5. src/test/regress/sql/publication.sql - test for \d

+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1
WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1

Actually, the \d (without "+") will also display filters but I don't
think that has been tested anywhere. So suggest updating the comment
and adding one more test

AFTER
-- test \d+  and \d  (now these display filter
information)
...
\d+ testpub_rf_tbl1
\d testpub_rf_tbl1

~~~

6. src/test/regress/sql/publication.sql - tests for partitioned table

+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a OK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;

Those comments and the way the code is arranged did not make it very
clear to me what exactly these tests are doing.

I think all the changes to the publish_via_partition_root belong BELOW
those test comments don't they?
Also the same comment "-- ok - partition does not have row filter"
appears 2 times so that can be made more clear too.

e.g. IIUC it should be changed to something a bit like this (Note - I
did not change the SQL, I only moved it a bit and changed the
comments):

AFTER (??)
-- Tests for partitioned table
ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);

-- ok - PUBLISH_VIA_PARTITION_ROOT is false
-- Here the partition does not have a row filter
-- Col "a" is in replica identity.
ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
UPDATE rf_tbl_abcd_part_pk SET a = 1;

-- ok - PUBLISH_VIA_PARTITION_ROOT is true
-- Here the partition does not have a row filter, so the root filter
will be used.
-- Col "a" is in replica identity.
ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
UPDATE rf_tbl_abcd_part_pk SET a = 1;

-- Now change the root filter to use a column "b" (which is not in the
replica identity)
ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);

-- ok - PUBLISH_VIA_PARTITION_ROOT is false
-- Here the partition does not have a row filter
-- Col "a" is in replica identity.

Re: row filtering for logical replication

2022-02-08 Thread Amit Kapila
On Tue, Feb 8, 2022 at 8:01 AM houzj.f...@fujitsu.com
 wrote:
>
> >
> > 12. src/backend/replication/pgoutput/pgoutput.c - get_rel_sync_entry
> >
> > + /*
> > + * Initialize the row filter after getting the final publish_as_relid
> > + * as we only evaluate the row filter of the relation which we publish
> > + * change as.
> > + */
> > + pgoutput_row_filter_init(data, active_publications, entry);
> >
> > The comment "which we publish change as" seems strangely worded.
> >
> > Perhaps it should be:
> > "... only evaluate the row filter of the relation which being published."
>
> Changed.
>

I don't know if this change is an improvement. If you do want to change
it, then I don't think 'which' makes sense in the following part of the
comment: "...relation which being published."

Few other comments:

1. Can we avoid sending schema change messages when the row filter
doesn't match, by moving maybe_send_schema to after the row filter checks?

2.
commit message/docs:
"The WHERE clause
only allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined collations, non-immutable built-in
functions, or references to system columns."

"user-defined types" is missing in this sentence.

3.
+ /*
+ * For all the supported nodes, check the functions and collations used in
+ * the nodes.
+ */

Again 'types' is missing in this comment.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-02-07 Thread Amit Kapila
On Mon, Feb 7, 2022 at 1:21 PM Peter Smith  wrote:
>
> 5. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (Simple?)
>
> +/*
> + * Is this a simple Node permitted within a row filter expression?
> + */
> +static bool
> +IsRowFilterSimpleExpr(Node *node)
> +{
>
> A lot has changed in this area recently and I feel that there is
> something not quite 100% right with the naming and/or logic in this
> expression validation. IMO there are several functions that seem to
> depend too much on each other in special ways...
>
> IIUC the "walker" logic now seems to be something like this:
> a) Check for special cases of the supported nodes
> b) Then check for supported (simple?) nodes (i.e.
> IsRowFilterSimpleExpr is now acting as a "catch-all" after the special
> case checks)
> c) Then check for some unsupported node embedded within a supported
> node (i.e. call expr_allowed_in_node)
> d) If any of a,b,c was bad then give an error.
>
> To achieve that logic the T_FuncExpr was added to the
> "IsRowFilterSimpleExpr". Meanwhile, other nodes like
> T_ScalarArrayOpExpr and T_NullIfExpr now are removed from
> IsRowFilterSimpleExpr - I don't quite know why these got removed
>

They are removed because those nodes need some special checks based on
which errors could be raised whereas other nodes don't need such
checks.

> but
> perhaps there is implicit knowledge that those node kinds were already
> checked by the "walker" before the IsRowFilterSimpleExpr function ever
> gets called.
>
> So, although I trust that everything is working OK,  I don't think
> IsRowFilterSimpleExpr is really just about simple nodes anymore. It is
> harder to see why some supported nodes are in there, and some
> supported nodes are not. It seems tightly entwined with the logic of
> check_simple_rowfilter_expr_walker; i.e. there seem to be assumptions
> about exactly when it will be called and what was checked before and
> what will be checked after calling it.
>
> IMO probably all the nodes we are supporting should be in the
> IsRowFilterSimpleExpr just for completeness (e.g. put T_NullIfExpr and
> T_ScalarArrayOpExpr back in there...), and maybe the function should
> be renamed (IsRowFilterAllowedNode?),
>

I am not sure that is a good idea, because then instead of just
true/false we would need to return an error message as well. But I
think we can move all the nodes handled in IsRowFilterSimpleExpr back
into check_simple_rowfilter_expr_walker() and change the handling to a
switch..case.

One more thing in this context: in the ScalarArrayOpExpr handling, we
are not checking a few parameters like hashfuncid. Can we please add a
comment explaining why some parameters are checked and others are not?

>
> ~~~
>
> 6. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (T_List)
>
> (From Amit's patch)
>
> @@ -395,6 +397,7 @@ IsRowFilterSimpleExpr(Node *node)
>   case T_NullTest:
>   case T_RelabelType:
>   case T_XmlExpr:
> + case T_List:
>   return true;
>   default:
>   return false;
>
>
> The case T_List should be moved to be alphabetical the same as all the
> other cases.
>

Hmm, I added it based on the order in which things are defined in
nodes.h: T_List is defined after T_XmlExpr there. I don't see the nodes
handled in alphabetical order in other places either, like in
check_functions_in_node(). I think the nodes that need the same
handling should be grouped together, kept in the order they are defined
in nodes.h, and otherwise all other nodes should likewise follow the
nodes.h order. That way we will be consistent.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-02-06 Thread Peter Smith
Hi - I did a review of the v77 patches merged with Amit's v77 diff patch [1].

(Maybe this is equivalent to reviewing v78)

Below are my review comments:

==

1. doc/src/sgml/ref/create_publication.sgml - CREATE PUBLICATION

+   The WHERE clause allows simple expressions that
don't have
+   user-defined functions, operators, collations, non-immutable built-in
+   functions, or references to system columns.
+  

That seems slightly ambiguous for operators and collations. It's only
the USER-DEFINED ones we don't support.

Perhaps it should be worded like:

"allows simple expressions that don't have user-defined
functions/operators/collations, non-immutable built-in functions..."

or like

"allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined collations, non-immutable
built-in functions..."

~~~

2. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication

+Oid
+GetTopMostAncestorInPublication(Oid puboid, List *ancestors)
+{
+ ListCell   *lc;
+ Oid topmost_relid = InvalidOid;
+
+ /*
+ * Find the "topmost" ancestor that is in this publication.
+ */
+ foreach(lc, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc);
+ List*apubids = GetRelationPublications(ancestor);
+ List*aschemaPubids = NIL;
+
+ if (list_member_oid(apubids, puboid))
+ topmost_relid = ancestor;
+ else
+ {
+ aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
+ if (list_member_oid(aschemaPubids, puboid))
+ topmost_relid = ancestor;
+ }
+
+ list_free(apubids);
+ list_free(aschemaPubids);
+ }
+
+ return topmost_relid;
+}

Wouldn't it be better for the aschemaPubids to be declared and freed
inside the else block?

e.g.

else
{
List *aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));

if (list_member_oid(aschemaPubids, puboid))
topmost_relid = ancestor;

list_free(aschemaPubids);
}

~~~

3. src/backend/commands/publicationcmds.c - contain_invalid_rfcolumn

+ if (pubviaroot && relation->rd_rel->relispartition)
+ {
+ publish_as_relid = GetTopMostAncestorInPublication(pubid, ancestors);
+
+ if (publish_as_relid == InvalidOid)
+ publish_as_relid = relid;
+ }

Consider using the macro code for the InvalidOid check. e.g.

if (!OidIsValid(publish_as_relid))
publish_as_relid = relid;

~~~

4. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (Tests)

+ switch (nodeTag(node))
+ {
+ case T_ArrayExpr:
+ case T_BooleanTest:
+ case T_BoolExpr:
+ case T_CaseExpr:
+ case T_CaseTestExpr:
+ case T_CoalesceExpr:
+ case T_CollateExpr:
+ case T_Const:
+ case T_FuncExpr:
+ case T_MinMaxExpr:
+ case T_NullTest:
+ case T_RelabelType:
+ case T_XmlExpr:
+ return true;
+ default:
+ return false;
+ }

I think there are several missing regression tests.

4a. There is a new message that says "User-defined collations are not
allowed." but I never saw any test case for it.

4b. There is also the RelabelType which seems to have no test case.
Amit previously provided [2] some SQL which would give an unexpected
error, so I guess that should be a new regression test case. e.g.
create table t1(c1 int, c2 varchar(100));
create publication pub1 for table t1 where (c2 < 'john');

~~~

5. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (Simple?)

+/*
+ * Is this a simple Node permitted within a row filter expression?
+ */
+static bool
+IsRowFilterSimpleExpr(Node *node)
+{

A lot has changed in this area recently and I feel that there is
something not quite 100% right with the naming and/or logic in this
expression validation. IMO there are several functions that seem to
depend too much on each other in special ways...

IIUC the "walker" logic now seems to be something like this:
a) Check for special cases of the supported nodes
b) Then check for supported (simple?) nodes (i.e.
IsRowFilterSimpleExpr is now acting as a "catch-all" after the special
case checks)
c) Then check for some unsupported node embedded within a supported
node (i.e. call expr_allowed_in_node)
d) If any of a,b,c was bad then give an error.

To achieve that logic the T_FuncExpr was added to the
"IsRowFilterSimpleExpr". Meanwhile, other nodes like
T_ScalarArrayOpExpr and T_NullIfExpr now are removed from
IsRowFilterSimpleExpr - I don't quite know why these got removed but
perhaps there is implicit knowledge that those node kinds were already
checked by the "walker" before the IsRowFilterSimpleExpr function ever
gets called.

So, although I trust that everything is working OK,  I don't think
IsRowFilterSimpleExpr is really just about simple nodes anymore. It is
harder to see why some supported nodes are in there, and some
supported nodes are not. It seems tightly entwined with the logic of
check_simple_rowfilter_expr_walker; i.e. there seem to be assumptions
about exactly when it will be called and what was checked before and
what will be checked after calling it.

IMO probably all the nodes we are supporting should be in the
IsRowFilterSimpleExpr just for completeness (e.g. put T_NullIfExpr 

Re: row filtering for logical replication

2022-02-05 Thread Amit Kapila
On Fri, Feb 4, 2022 at 2:58 PM houzj.f...@fujitsu.com
 wrote:
>
> On Thursday, February 3, 2022 11:11 PM houzj.f...@fujitsu.com 
> 
>
> Since the v76--clean-up-pgoutput-cache-invalidation.patch has been
> committed, attach a new version patch set to make the cfbot happy. Also
> addressed the above comments related to tab-complete in 0002 patch.
>

I don't like some of the error message changes in this new version. For example:

v75:
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS
$$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer,
RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR:  invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL:  User-defined operators are not allowed.

v77
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS
$$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer,
RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR:  invalid publication WHERE expression
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+DETAIL:  User-defined or mutable functions are not allowed

I think the detailed message by v75 "DETAIL:  User-defined operators
are not allowed." will be easier for users to understand. I have made
some code changes and refactoring to make this behavior like previous
without removing the additional checks you have added in v77. I have
made a few changes to comments and error messages. Attached is a
top-up patch on your v77 patch series. I suggest we can combine the
0001 and 0002 patches as well.

-- 
With Regards,
Amit Kapila.


v77_diff_amit.1.patch
Description: Binary data


Re: row filtering for logical replication

2022-02-03 Thread Andres Freund
Hi,

On 2022-02-01 13:31:36 +1100, Peter Smith wrote:
> TEST STEPS - Workload case a
> 
> 1. Run initdb pub and sub and start both postgres instances (use the nosync 
> postgresql.conf)
> 
> 2. Run psql for both instances and create tables
> CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));
> 
> 3. create the PUBLISHER on pub instance (e.g. choose from below depending on 
> filter)
> CREATE PUBLICATION pub_1 FOR TABLE test;  
> -- 100% (no filter)
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0);  -- 100% 
> allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 25); -- 75% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 50); -- 50% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 75); -- 25% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 100);-- 0% 
> allowed
> 
> 4. create the SUBSCRIBER on sub instance
> CREATE SUBSCRIPTION sync_sub CONNECTION 'host=127.0.0.1 port=5432 
> dbname=postgres application_name=sync_sub' PUBLICATION pub_1;
> 
> 5. On pub side modify the postgresql.conf on the publisher side and restart
> \q to quit psql
> edit synchronous_standby_names = 'sync_sub' 
> restart the pub instance
> 
> 6. Run psql (pub side) and perform the test run.
> \timing
> INSERT INTO test SELECT i, i::text, row_to_json(row(i)) FROM 
> generate_series(1,101)i;
> select count(*) from test;
> TRUNCATE test;
> select count(*) from test;
> repeat 6 for each test run.

I think using syncrep as the mechanism for benchmarking the decoding
side makes the picture less clear than it could be - you're measuring a lot of
things other than the decoding. E.g. the overhead of applying those changes. I
think it'd be more accurate to do something like:

/* create publications, table, etc */

-- create a slot from before the changes
SELECT pg_create_logical_replication_slot('origin', 'pgoutput');

/* the changes you're going to measure */

-- save end LSN
SELECT pg_current_wal_lsn();

-- create a slot for pg_recvlogical to consume
SELECT * FROM pg_copy_logical_replication_slot('origin', 'consume');

-- benchmark, endpos is from pg_current_wal_lsn() above
time pg_recvlogical -S consume --endpos 0/2413A720 --start -o proto_version=3 
-o publication_names=pub_1 -f /dev/null  -d postgres

-- clean up
SELECT pg_drop_replication_slot('consume');

Then repeat this with the different publications and compare the time taken
for the pg_recvlogical. That way the WAL is exactly the same, there is no
overhead of actually doing anything with the data on the other side, etc.

Greetings,

Andres Freund




Re: row filtering for logical replication

2022-02-03 Thread Ajin Cherian
On Sat, Jan 29, 2022 at 11:31 AM Andres Freund  wrote:
>
> Hi,
>
> Are there any recent performance evaluations of the overhead of row filters? I
> think it'd be good to get some numbers comparing:
>
> 1) $workload with master
> 2) $workload with patch, but no row filters
> 3) $workload with patch, row filter matching everything
> 4) $workload with patch, row filter matching few rows
>
> For workload I think it'd be worth testing:
> a) bulk COPY/INSERT into one table
> b) Many transactions doing small modifications to one table
> c) Many transactions targetting many different tables
> d) Interspersed DDL + small changes to a table
>

Here are the performance results for scenario d:

(times in seconds; all patched columns use the v74 row-filter patch, and
the percentage is the proportion of rows allowed by the filter)

Run  HEAD       no filter  0%        25%        50%        75%        100%
1    65.397639  64.414034  5.919732  20.012096  36.35911   49.412548  64.508842
2    65.641783  65.255775  5.715082  20.157575  36.957403  51.355821  65.708444
3    65.096526  64.795163  6.146072  21.130709  37.679346  49.568513  66.602145
4    65.173569  64.68      5.787197  20.784607  34.465133  55.397313  63.545337
5    65.791092  66.000412  5.642696  20.258802  36.493626  52.873252  63.511428

The performance is similar to the other scenarios.
The script used is below:

CREATE TABLE test (key int, value text, value1 text, data jsonb,
PRIMARY KEY(key, value));

CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0); -- 100% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 25); -- 75% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 50); -- 50% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 75); -- 25% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 100); -- 0% allowed

DO
$do$
BEGIN
FOR i IN 1..101 BY 4000 LOOP
Alter table test alter column value1 TYPE varchar(30);
INSERT INTO test VALUES(i,'BAH', row_to_json(row(i)));
Alter table test ALTER COLUMN value1 TYPE text;
UPDATE test SET value = 'FOO' WHERE key = i;
COMMIT;
END LOOP;
END
$do$;

regards,
Ajin Cherian
Fujitsu Australia




Re: row filtering for logical replication

2022-02-02 Thread Ajin Cherian
Hi Peter,

I just tried scenario b that Andres suggested:

For scenario b, I did some testing with row-filter-patch v74 and
various levels of filtering, from 0% to 100% of rows replicated.
The times are in seconds; I did 5 runs each.

Results:

(all patched columns use the v74 row-filter patch; the percentage is the
proportion of rows replicated)

Run  HEAD       0%         25%        50%        75%        100%
1    17.26178   12.573736  12.869635  13.742167  17.977112  17.75814
2    17.522473  12.919554  12.640879  14.202737  14.515481  16.961836
3    17.124001  12.640879  12.706631  14.220245  15.686613  17.219355
4    17.24122   12.602345  12.674566  14.219423  15.564312  17.432765
5    17.25352   12.610657  12.689842  14.210725  15.613708  17.403821

As can be seen, the performance on HEAD is similar to what the patch
achieves with all rows (100%) replicated. The performance improves
linearly as more rows are filtered out.

The test scenario used was:

1. On publisher and subscriber:
CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));

2. On publisher: (based on which scenario is being tested)
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0); -- 100% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 25); -- 75% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 50); -- 50% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 75); -- 25% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 100); -- 0% allowed

3. On the subscriber:
CREATE SUBSCRIPTION sync_sub CONNECTION 'host=127.0.0.1 port=5432
dbname=postgres application_name=sync_sub' PUBLICATION pub_1;

4. now modify the postgresql.conf on the publisher side
synchronous_standby_names = 'sync_sub' and restart.

5. The test case:

DO
$do$
BEGIN
FOR i IN 1..101 BY 10 LOOP
INSERT INTO test VALUES(i,'BAH', row_to_json(row(i)));
UPDATE test SET value = 'FOO' WHERE key = i;
IF I % 1000 = 0 THEN
COMMIT;
END IF;
END LOOP;
END
$do$;


regards,
Ajin Cherian
Fujitsu Australia

On Tue, Feb 1, 2022 at 12:07 PM Peter Smith  wrote:
>
> On Sat, Jan 29, 2022 at 11:31 AM Andres Freund  wrote:
> >
> > Hi,
> >
> > Are there any recent performance evaluations of the overhead of row 
> > filters? I
> > think it'd be good to get some numbers comparing:
> >
> > 1) $workload with master
> > 2) $workload with patch, but no row filters
> > 3) $workload with patch, row filter matching everything
> > 4) $workload with patch, row filter matching few rows
> >
> > For workload I think it'd be worth testing:
> > a) bulk COPY/INSERT into one table
> > b) Many transactions doing small modifications to one table
> > c) Many transactions targetting many different tables
> > d) Interspersed DDL + small changes to a table
> >
>
> I have gathered performance data for the workload case (a):
>
> HEAD 46743.75
> v74 no filters 46929.15
> v74 allow 100% 46926.09
> v74 allow 75% 40617.74
> v74 allow 50% 35744.17
> v74 allow 25% 29468.93
> v74 allow 0% 22540.58
>
> PSA.
>
> This was tested using patch v74 and synchronous pub/sub. There are 1M
> INSERTS for publications using differing amounts of row filtering (or
> none).
>
> Observations:
> - There seems insignificant row-filter overheads (e.g. viz no filter
> and 100% allowed versus HEAD).
> - The elapsed time decreases linearly as there is less data getting 
> replicated.
>
> I will post the results for other workload kinds (b, c, d) when I have them.
>
> --
> Kind Regards,
> Peter Smith.
> Fujitsu Australia.




Re: row filtering for logical replication

2022-02-01 Thread Amit Kapila
On Tue, Feb 1, 2022 at 4:51 PM Amit Kapila  wrote:
>
> Review Comments:
> ===
> 1.
> + else if (IsA(node, OpExpr))
> + {
> + /* OK, except user-defined operators are not allowed. */
> + if (((OpExpr *) node)->opno >= FirstNormalObjectId)
> + errdetail_msg = _("User-defined operators are not allowed.");
> + }
>
> Is it sufficient to check only the allowed operators for OpExpr? Don't
> we need to check opfuncid to ensure that the corresponding function is
> immutable? Also, what about opresulttype, opcollid, and inputcollid? I
> think we don't want to allow user-defined types or collations but as
> we are restricting the opexp to use a built-in operator, those should
> not be present in such an expression. If that is the case, then I
> think we can add a comment for the same.
>

Today, I was looking at a few other nodes supported by the patch and I
have similar questions for those as well. As an example, the patch
allows T_ScalarArrayOpExpr and the node is as follows:

typedef struct ScalarArrayOpExpr
{
Expr xpr;
Oid opno; /* PG_OPERATOR OID of the operator */
Oid opfuncid; /* PG_PROC OID of comparison function */
Oid hashfuncid; /* PG_PROC OID of hash func or InvalidOid */
Oid negfuncid; /* PG_PROC OID of negator of opfuncid function
* or InvalidOid.  See above */
bool useOr; /* true for ANY, false for ALL */
Oid inputcollid; /* OID of collation that operator should use */
List*args; /* the scalar and array operands */
int location; /* token location, or -1 if unknown */
} ScalarArrayOpExpr;

Don't we need to check pg_proc OIDs like hashfuncid to ensure that the
function is immutable, like the patch is doing for FuncExpr? Similarly,
for the ArrayExpr node, don't we need to check the array_collid to see
if it contains a user-defined collation? I think some of these might be
okay to permit, but then it is better to have some comments to explain why.

--
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-02-01 Thread Amit Kapila
On Tue, Feb 1, 2022 at 9:15 AM houzj.f...@fujitsu.com
 wrote:
>
> On Monday, January 31, 2022 9:02 PM Amit Kapila 
> >

Review Comments:
===
1.
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *) node)->opno >= FirstNormalObjectId)
+ errdetail_msg = _("User-defined operators are not allowed.");
+ }

Is it sufficient to check only the allowed operators for OpExpr? Don't
we need to check opfuncid to ensure that the corresponding function is
immutable? Also, what about opresulttype, opcollid, and inputcollid? I
think we don't want to allow user-defined types or collations but as
we are restricting the opexp to use a built-in operator, those should
not be present in such an expression. If that is the case, then I
think we can add a comment for the same.

2. Can we handle RelabelType node in
check_simple_rowfilter_expr_walker()? I think you need to check
resulttype and collation id to ensure that they are not user-defined.
There doesn't seem to be a need to check resulttypmod as that refers
to pg_attribute.atttypmod and that can't have anything unsafe. This
helps us to handle cases like the following which currently gives an
error:
create table t1(c1 int, c2 varchar(100));
create publication pub1 for table t1 where (c2 < 'john');

3. Similar to above, don't we need to consider disallowing
non-built-in collation of Var type nodes? Now, as we are only
supporting built-in types this might not be required. So, probably a
comment would suffice.

4.
A minor nitpick in tab-complete:
postgres=# Alter PUBLICATION pub1 ADD TABLE t2 WHERE ( c2 > 10)
,WHERE (

After the Where clause, it should not allow adding WHERE. This doesn't
happen for CREATE PUBLICATION case.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-31 Thread Greg Nancarrow
On Tue, Feb 1, 2022 at 2:45 PM houzj.f...@fujitsu.com <
houzj.f...@fujitsu.com> wrote:
>
> On Monday, January 31, 2022 9:02 PM Amit Kapila 
> >
> > 2.
> > + if (pubrinfo->pubrelqual)
> > + appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
> > + appendPQExpBufferStr(query, ";\n");
> >
> > Do we really need additional '()' for rwo filter expression here? See
the below
> > output from pg_dump:
> >
> > ALTER PUBLICATION pub1 ADD TABLE ONLY public.t1 WHERE ((c1 < 100));
>
> I will investigate this and change this later if needed.
>

I don't think we can make this change (i.e. remove the additional
parentheses), because then a "WHERE (TRUE)" row filter would result in
invalid pg_dump output:

e.g.   ALTER PUBLICATION pub1 ADD TABLE ONLY public.test1 WHERE TRUE;

(since currently, parentheses are required around the publication WHERE
expression)

See also the following commit, which specifically added these parentheses
and catered for WHERE TRUE:
https://www.postgresql.org/message-id/attachment/121245/0005-fixup-Publication-WHERE-condition-support-for-pg_dum.patch

Regards,
Greg Nancarrow
Fujitsu Australia


Re: row filtering for logical replication

2022-01-31 Thread Amit Kapila
On Tue, Feb 1, 2022 at 9:15 AM houzj.f...@fujitsu.com
 wrote:
>
> On Monday, January 31, 2022 9:02 PM Amit Kapila 
> >
>
> > 3.
> > + /* row filter (if any) */
> > + if (pset.sversion >= 15)
> > + {
> > + if (!PQgetisnull(result, i, 1))
> > + appendPQExpBuffer(, " WHERE %s", PQgetvalue(result, i, 1)); }
> >
> > I don't think we need this version check if while forming query we use NULL 
> > as
> > the second column in the corresponding query for v < 15.
>
> Changed.
>

But, I don't see a corresponding change in the else part of the query:
else
{
printfPQExpBuffer(,
  "SELECT pubname\n"
  "FROM pg_catalog.pg_publication p\n"
  "JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
  "WHERE pr.prrelid = '%s'\n"
  "UNION ALL\n"
  "SELECT pubname\n"
  "FROM pg_catalog.pg_publication p\n"
  "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
  "ORDER BY 1;",
  oid, oid);
}

Don't we need to do that to keep it working with previous versions?

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-31 Thread Greg Nancarrow
On Tue, Feb 1, 2022 at 2:45 PM houzj.f...@fujitsu.com
 wrote:
>
> Attach the V75 patch set which address the above, Amit's[1] and Greg's[2][3] 
> comments.
>

In the v74-0001 patch (and now in the v75-0001 patch) a change was made
in the GetTopMostAncestorInPublication() function, to get the relation
and schema publications lists (for the ancestor Oid) up-front:

+ List*apubids = GetRelationPublications(ancestor);
+ List*aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
+
+ if (list_member_oid(apubids, puboid) ||
+list_member_oid(aschemaPubids, puboid))
+   topmost_relid = ancestor;

However, it seems that this makes it less efficient in the case where a
match is found in the first list searched, since then there was
actually no reason to build the second list at all.
Instead, how about something like this:

List*apubids = GetRelationPublications(ancestor);
List*aschemaPubids = NULL;

if (list_member_oid(apubids, puboid) ||
   list_member_oid(aschemaPubids =
GetSchemaPublications(get_rel_namespace(ancestor)), puboid))
  topmost_relid = ancestor;

or, if that is considered a bit ugly due to the assignment within the
function parameters, alternatively:

List*apubids = GetRelationPublications(ancestor);
List*aschemaPubids = NULL;

if (list_member_oid(apubids, puboid))
   topmost_relid = ancestor;
else
{
   aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
   if (list_member_oid(aschemaPubids, puboid))
  topmost_relid = ancestor;
}

Regards,
Greg Nancarrow
Fujitsu Australia




RE: row filtering for logical replication

2022-01-31 Thread houzj.f...@fujitsu.com
> On Saturday, January 29, 2022 8:31 AM Andres Freund 
> wrote:
> >
> > Hi,
> >
> > Are there any recent performance evaluations of the overhead of row
> > filters? I think it'd be good to get some numbers comparing:
> 
> Thanks for looking at the patch! Will test it.
> 
> > >   case REORDER_BUFFER_CHANGE_INSERT:
> > >   {
> > > - HeapTuple   tuple = 
> > > >data.tp.newtuple->tuple;
> > > + /*
> > > +  * Schema should be sent before the logic that 
> > > replaces the
> > > +  * relation because it also sends the 
> > > ancestor's relation.
> > > +  */
> > > + maybe_send_schema(ctx, change, relation, 
> > > relentry);
> > > +
> > > + new_slot = relentry->new_slot;
> > > +
> > > + ExecClearTuple(new_slot);
> > > + 
> > > ExecStoreHeapTuple(>data.tp.newtuple->tuple,
> > > +new_slot, 
> > > false);
> >
> > Why? This isn't free, and you're doing it unconditionally. I'd bet this 
> > alone is
> > noticeable slowdown over the current state.
>
> It was intended to avoid deform the tuple twice, once in row filter execution
> ,second time in logicalrep_write_tuple. But I will test the performance
> impact of this and improve this if needed.

I removed the unnecessary ExecClearTuple here. I think the
ExecStoreHeapTuple call doesn't allocate or free any memory, and it
doesn't seem to have a noticeable impact judging from the perf
result[1]. And we need this to avoid deforming the tuple twice, so it
looks acceptable to me.

[1] 0.01% 0.00%  postgres  pgoutput.so [.] ExecStoreHeapTuple@plt

Best regards,
Hou zj


Re: row filtering for logical replication

2022-01-31 Thread Greg Nancarrow
On Mon, Jan 31, 2022 at 12:57 PM houzj.f...@fujitsu.com
 wrote:
>
> Attach the V74 patch set which did the following changes:
>

In the v74-0001 patch, I noticed the following code in get_rel_sync_entry():

+ /*
+ * Tuple slots cleanups. (Will be rebuilt later if needed).
+ */
+ oldctx = MemoryContextSwitchTo(data->cachectx);
+
+ if (entry->old_slot)
+ ExecDropSingleTupleTableSlot(entry->old_slot);
+ if (entry->new_slot)
+ ExecDropSingleTupleTableSlot(entry->new_slot);
+
+ entry->old_slot = NULL;
+ entry->new_slot = NULL;
+
+ MemoryContextSwitchTo(oldctx);

I don't believe the calls to MemoryContextSwitchTo() are required
here, because within the context switch it's just freeing memory, not
allocating it.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: row filtering for logical replication

2022-01-31 Thread Amit Kapila
On Sat, Jan 29, 2022 at 6:01 AM Andres Freund  wrote:
>
>
> > + if (has_filter)
> > + {
> > + /* Create or reset the memory context for row filters */
> > + if (entry->cache_expr_cxt == NULL)
> > + entry->cache_expr_cxt = 
> > AllocSetContextCreate(CacheMemoryContext,
> > +   
> > "Row filter expressions",
> > +   
> > ALLOCSET_DEFAULT_SIZES);
> > + else
> > + MemoryContextReset(entry->cache_expr_cxt);
>
> I see this started before this patch, but I don't think it's a great idea that
> pgoutput does a bunch of stuff in CacheMemoryContext. That makes it
> unnecessarily hard to debug leaks.
>
> Seems like all this should live somwhere below ctx->context, allocated in
> pgoutput_startup()?
>

Agreed.

> Consider what happens in a long-lived replication connection, where
> occasionally there's a transient error causing streaming to stop. At that
> point you'll just lose all knowledge of entry->cache_expr_cxt, no?
>

I think we will lose that knowledge because the walsender exits on ERROR,
but that would be true even when we allocate it in the newly allocated
context. Am I missing something?

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-31 Thread Peter Smith
On Tue, Feb 1, 2022 at 12:07 PM Peter Smith  wrote:
>
> On Sat, Jan 29, 2022 at 11:31 AM Andres Freund  wrote:
> >
> > Hi,
> >
> > Are there any recent performance evaluations of the overhead of row 
> > filters? I
> > think it'd be good to get some numbers comparing:
> >
> > 1) $workload with master
> > 2) $workload with patch, but no row filters
> > 3) $workload with patch, row filter matching everything
> > 4) $workload with patch, row filter matching few rows
> >
> > For workload I think it'd be worth testing:
> > a) bulk COPY/INSERT into one table
> > b) Many transactions doing small modifications to one table
> > c) Many transactions targeting many different tables
> > d) Interspersed DDL + small changes to a table
> >
>
> I have gathered performance data for the workload case (a):
>
> HEAD 46743.75
> v74 no filters 46929.15
> v74 allow 100% 46926.09
> v74 allow 75% 40617.74
> v74 allow 50% 35744.17
> v74 allow 25% 29468.93
> v74 allow 0% 22540.58
>
> PSA.
>
> This was tested using patch v74 and synchronous pub/sub. There are 1M
> INSERTS for publications using differing amounts of row filtering (or
> none).
>
> Observations:
> - The row-filter overhead seems insignificant (e.g. no filter and
> 100% allowed versus HEAD).
> - The elapsed time decreases linearly as there is less data getting 
> replicated.
>

FYI - attached are the test steps I used in case anyone wants to try
to reproduce these results.

--
Kind Regards,
Peter Smith.
Fujitsu Australia
TEST STEPS - Workload case a

1. Run initdb pub and sub and start both postgres instances (use the nosync 
postgresql.conf)

2. Run psql for both instances and create tables
CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));

3. create the PUBLISHER on pub instance (e.g. choose from below depending on 
filter)
CREATE PUBLICATION pub_1 FOR TABLE test;
-- 100% (no filter)
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0);-- 100% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 25);   -- 75% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 50);   -- 50% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 75);   -- 25% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 100);  -- 0% allowed

4. create the SUBSCRIBER on sub instance
CREATE SUBSCRIPTION sync_sub CONNECTION 'host=127.0.0.1 port=5432 
dbname=postgres application_name=sync_sub' PUBLICATION pub_1;

5. On pub side modify the postgresql.conf on the publisher side and restart
\q to quit psql
edit synchronous_standby_names = 'sync_sub' 
restart the pub instance

6. Run psql (pub side) and perform the test run.
\timing
INSERT INTO test SELECT i, i::text, row_to_json(row(i)) FROM 
generate_series(1,101)i;
select count(*) from test;
TRUNCATE test;
select count(*) from test;
repeat 6 for each test run.

~~

Repeat from step 1 for different filter case.




Re: row filtering for logical replication

2022-01-31 Thread Andres Freund
On 2022-01-31 14:12:38 +1100, Greg Nancarrow wrote:
> This array was only ever meant to be read-only, and visible only to
> that function.
> IMO removing "static" makes things worse because now that array gets
> initialized each call to the function, which is unnecessary.
> I think it should just be: "static const int map_changetype_pubaction[] = ..."

Yes, static const is good. static alone, not so much.




Re: row filtering for logical replication

2022-01-31 Thread Amit Kapila
On Mon, Jan 31, 2022 at 1:08 PM Amit Kapila  wrote:
>
> On Mon, Jan 31, 2022 at 7:27 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Monday, January 31, 2022 8:53 AM Peter Smith  
> > wrote:
> > >
> > > PSA v73*.
> > >
> > > (A rebase was needed due to recent changes in tab-complete.c.
> > > Otherwise, v73* is the same as v72*).
> >
> > Thanks for the rebase.
> > Attach the V74 patch set which did the following changes:
> >
>
> Few comments:
> =
>

Few more minor comments:
1.
+ if (relentry->attrmap)
+ {
+ TupleDesc tupdesc  = RelationGetDescr(relation);
+ TupleTableSlot *tmp_slot = MakeTupleTableSlot(tupdesc,
+   );
+
+ new_slot = execute_attr_map_slot(relentry->attrmap,
+ new_slot,
+ tmp_slot);

I think we don't need these additional variables tupdesc and tmp_slot.
You can directly use MakeTupleTableSlot instead of tmp_slot, which
will make this and nearby code look better.

2.
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");

Do we really need the additional '()' for the row filter expression here?
See the below output from pg_dump:

ALTER PUBLICATION pub1 ADD TABLE ONLY public.t1 WHERE ((c1 < 100));

3.
+ /* row filter (if any) */
+ if (pset.sversion >= 15)
+ {
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(, " WHERE %s", PQgetvalue(result, i, 1));
+ }

I don't think we need this version check if, while forming the query, we
use NULL as the second column in the corresponding query for v < 15.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-30 Thread Amit Kapila
On Mon, Jan 31, 2022 at 7:27 AM houzj.f...@fujitsu.com
 wrote:
>
> On Monday, January 31, 2022 8:53 AM Peter Smith  wrote:
> >
> > PSA v73*.
> >
> > (A rebase was needed due to recent changes in tab-complete.c.
> > Otherwise, v73* is the same as v72*).
>
> Thanks for the rebase.
> Attach the V74 patch set which did the following changes:
>

Few comments:
=
1.
/* Create or reset the memory context for row filters */
+ entry->cache_expr_cxt = AllocSetContextCreate(cachectx,
+   "Row filter expressions",
+   ALLOCSET_DEFAULT_SIZES);
+
In the new code, we are no longer resetting it here, so we can
probably remove "or reset" from the above comment.

2. You have changed some of the interfaces to pass memory context.
Isn't it better to pass "PGOutputData *" and then use the required
memory context. That will keep the interfaces consistent and we do
something similar in ExecPrepareExpr.

3.
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(MemoryContext cachectx, List *publications,
+ RelationSyncEntry *entry)

In the above comment, "the first time" doesn't seem to fit well after
your changes because that part is now taken care of by the caller.


-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-30 Thread Greg Nancarrow
On Mon, Jan 31, 2022 at 12:57 PM houzj.f...@fujitsu.com
 wrote:
>
> Attach the V74 patch set which did the following changes:
>

Hi,

I tested psql and pg_dump after application of this patch, from the
following perspectives:
- "\dRp+" and "\d " (added by the patch, for PostgreSQL
15) show row filters associated with publications and specified
tables, respectively.
- psql is able to connect to the same or older server version
- pg_dump is able to dump from the same or older server version
- dumps can be loaded into newer server versions than that of pg_dump
- PostgreSQL v9 doesn't support publications
- Only PostgreSQL v15 supports row filters (via the patch)

So specifically I tested the following versions (built from the stable
branch): 9.2, 9.6, 10, 11, 12, 13, 14 and 15 and used the following
publication definitions:

create table test1(i int primary key);
create table test2(i int primary key, j text);
create schema myschema;
create table myschema.test3(i int primary key, j text, k text);
create publication pub1 for all tables;
create publication pub2 for table test1 [ where (i > 100); ]
create publication pub3 for table test1 [ where (i > 50), test2 where
(i > 100), myschema.test3 where (i > 200) ] with (publish = 'insert,
update');

(note that for v9, only the above tables and schemas can be defined,
as publications are not supported, and only the row filter "where"
clauses can be defined on v15)

I tested:
- v15 psql connecting to same and older versions, and using "\dRp+"
and "\d " commands
- v15 pg_dump, dumping the above definitions from the same or older
server versions
- Loading dumps from older or same (v15) server version into a v15 server.

I did not detect any issues.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: row filtering for logical replication

2022-01-30 Thread Greg Nancarrow
On Mon, Jan 31, 2022 at 1:12 PM houzj.f...@fujitsu.com
 wrote:
>
> > > +   /*
> > > +* We need this map to avoid relying on ReorderBufferChangeType
> > enums
> > > +* having specific values.
> > > +*/
> > > +   static int map_changetype_pubaction[] = {
> > > +   [REORDER_BUFFER_CHANGE_INSERT] = PUBACTION_INSERT,
> > > +   [REORDER_BUFFER_CHANGE_UPDATE] = PUBACTION_UPDATE,
> > > +   [REORDER_BUFFER_CHANGE_DELETE] = PUBACTION_DELETE
> > > +   };
> >
> > Why is this "static"? Function-local statics only really make sense for 
> > variables
> > that are changed and should survive between calls to a function.
>
> Removed the "static" label.
>

This array was only ever meant to be read-only, and visible only to
that function.
IMO removing "static" makes things worse because now that array gets
initialized each call to the function, which is unnecessary.
I think it should just be: "static const int map_changetype_pubaction[] = ..."

Regards,
Greg Nancarrow
Fujitsu Australia




RE: row filtering for logical replication

2022-01-30 Thread houzj.f...@fujitsu.com
On Saturday, January 29, 2022 8:31 AM Andres Freund  wrote:
> 
> Hi,
> 
> Are there any recent performance evaluations of the overhead of row filters? I
> think it'd be good to get some numbers comparing:

Thanks for looking at the patch! Will test it.

> 1) $workload with master
> 2) $workload with patch, but no row filters
> 3) $workload with patch, row filter matching everything
> 4) $workload with patch, row filter matching few rows
> 
> For workload I think it'd be worth testing:
> a) bulk COPY/INSERT into one table
> b) Many transactions doing small modifications to one table
> c) Many transactions targeting many different tables
> d) Interspersed DDL + small changes to a table
> > +/*
> > + * Initialize for row filter expression execution.
> > + */
> > +static ExprState *
> > +pgoutput_row_filter_init_expr(Node *rfnode) {
> > +   ExprState  *exprstate;
> > +   Expr   *expr;
> > +
> > +   /*
> > +* This is the same code as ExecPrepareExpr() but that is not used
> because
> > +* we want to cache the expression. There should probably be another
> > +* function in the executor to handle the execution outside a normal
> Plan
> > +* tree context.
> > +*/
> > +   expr = expression_planner((Expr *) rfnode);
> > +   exprstate = ExecInitExpr(expr, NULL);
> > +
> > +   return exprstate;
> > +}
> 
> In what memory context does this run? Are we taking care to deal with leaks?
> I'm pretty sure the planner relies on cleanup via memory contexts.

It was running under entry->cache_expr_cxt.

> > +   memset(entry->exprstate, 0, sizeof(entry->exprstate));
> > +
> > +   schemaId = get_rel_namespace(entry->publish_as_relid);
> > +   schemaPubids = GetSchemaPublications(schemaId);
> 
> Isn't this stuff that we've already queried before? If we re-fetch a lot of
> information it's not clear to me that it's actually a good idea to defer 
> building
> the row filter.
> 
> 
> > +   am_partition = get_rel_relispartition(entry->publish_as_relid);
> 
> All this stuff likely can cause some memory "leakage" if you run it in a 
> long-lived
> memory context.
> 
> 
> > +   /*
> > +* Find if there are any row filters for this relation. If there are,
> > +* then prepare the necessary ExprState and cache it in
> > +* entry->exprstate. To build an expression state, we need to ensure
> > +* the following:
...
> > +*
> > +* ALL TABLES IN SCHEMA implies "don't use row filter expression" if
> > +* the schema is the same as the table schema.
> > +*/
> > +   foreach(lc, data->publications)
...
> > +   else if (list_member_oid(schemaPubids, pub->oid))
> > +   {
> > +   /*
> > +* If the publication is FOR ALL TABLES IN SCHEMA and
> it overlaps
> > +* with the current relation in the same schema then 
> > this
> is also
> > +* treated same as if this table has no row filters 
> > (even if
> for
> > +* other publications it does).
> > +*/
> > +   pub_no_filter = true;
> 
> Isn't this basically O(schemas * publications)?

Moved the row filter initialization code to get_rel_sync_entry.

> 
> > +   if (has_filter)
> > +   {
> > +   /* Create or reset the memory context for row filters */
> > +   if (entry->cache_expr_cxt == NULL)
> > +   entry->cache_expr_cxt =
> AllocSetContextCreate(CacheMemoryContext,
> > +
> "Row filter expressions",
> > +
> ALLOCSET_DEFAULT_SIZES);
> > +   else
> > +   MemoryContextReset(entry->cache_expr_cxt);
> 
> I see this started before this patch, but I don't think it's a great idea that
> pgoutput does a bunch of stuff in CacheMemoryContext. That makes it
> unnecessarily hard to debug leaks.
> 
> Seems like all this should live somewhere below ctx->context, allocated in
> pgoutput_startup()?
> 
> Consider what happens in a long-lived replication connection, where
> occasionally there's a transient error causing streaming to stop. At that 
> point
> you'll just lose all knowledge of entry->cache_expr_cxt, no?
> 
> 
> > +
> > +/* Initialize the slot for storing new and old tuples */ static void
> > +init_tuple_slot(Relation relation, RelationSyncEntry *entry) {
> > +   MemoryContext   oldctx;
> > +   TupleDesc   oldtupdesc;
> > +   TupleDesc   newtupdesc;
> > +
> > +   oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> > +
> > +   /*
> > +* Create tuple table slots. Create a copy of the TupleDesc as it needs 
> > to
> > +* live as long as the cache remains.
> > +*/
> > +   oldtupdesc = CreateTupleDescCopy(RelationGetDescr(relation));
> > +   newtupdesc = CreateTupleDescCopy(RelationGetDescr(relation));
> > +
> > +   entry->old_slot = MakeSingleTupleTableSlot(oldtupdesc,
> );
> > +   entry->new_slot = MakeSingleTupleTableSlot(newtupdesc,
> > +);
> > +
> > +   

Re: row filtering for logical replication

2022-01-29 Thread Alvaro Herrera
On 2022-Jan-28, Andres Freund wrote:

> > +   foreach(lc, data->publications)
> > +   {
> > +   Publication *pub = lfirst(lc);

...

> Isn't this basically O(schemas * publications)?

Yeah, there are various places in the logical replication code that seem
pretty careless about this kind of thing -- most of it seems to assume
that there are going to be few publications, so it just looks things up
over and over with abandon, and I saw at least one place where it looped
up an inheritance hierarchy for partitioning doing indexscans at each
level(*).  I think a lot more thought is going to be required to fix
these things in a thorough manner -- a log.repl.-specific caching
mechanism, I imagine.

(*) Before 025b920a3d45, psql was forced to seqscan pg_publication_rel
for one of the describe.c queries, and nobody seems to have noticed.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
And a voice from the chaos spoke to me and said
"Smile and be happy, it could be worse."
And I smiled. And I was happy.
And it was worse.




Re: row filtering for logical replication

2022-01-28 Thread Andres Freund
Hi,

Are there any recent performance evaluations of the overhead of row filters? I
think it'd be good to get some numbers comparing:

1) $workload with master
2) $workload with patch, but no row filters
3) $workload with patch, row filter matching everything
4) $workload with patch, row filter matching few rows

For workload I think it'd be worth testing:
a) bulk COPY/INSERT into one table
b) Many transactions doing small modifications to one table
c) Many transactions targeting many different tables
d) Interspersed DDL + small changes to a table


> +/*
> + * Initialize for row filter expression execution.
> + */
> +static ExprState *
> +pgoutput_row_filter_init_expr(Node *rfnode)
> +{
> + ExprState  *exprstate;
> + Expr   *expr;
> +
> + /*
> +  * This is the same code as ExecPrepareExpr() but that is not used 
> because
> +  * we want to cache the expression. There should probably be another
> +  * function in the executor to handle the execution outside a normal 
> Plan
> +  * tree context.
> +  */
> + expr = expression_planner((Expr *) rfnode);
> + exprstate = ExecInitExpr(expr, NULL);
> +
> + return exprstate;
> +}

In what memory context does this run? Are we taking care to deal with leaks?
I'm pretty sure the planner relies on cleanup via memory contexts.


> + memset(entry->exprstate, 0, sizeof(entry->exprstate));
> +
> + schemaId = get_rel_namespace(entry->publish_as_relid);
> + schemaPubids = GetSchemaPublications(schemaId);

Isn't this stuff that we've already queried before? If we re-fetch a lot of
information it's not clear to me that it's actually a good idea to defer
building the row filter.


> + am_partition = get_rel_relispartition(entry->publish_as_relid);

All this stuff likely can cause some memory "leakage" if you run it in a
long-lived memory context.


> + /*
> +  * Find if there are any row filters for this relation. If there are,
> +  * then prepare the necessary ExprState and cache it in
> +  * entry->exprstate. To build an expression state, we need to ensure
> +  * the following:
> +  *
> +  * All publication-table mappings must be checked.
> +  *
> +  * If the relation is a partition and pubviaroot is true, use the row
> +  * filter of the topmost partitioned table instead of the row filter of
> +  * its own partition.
> +  *
> +  * Multiple publications might have multiple row filters for this
> +  * relation. Since row filter usage depends on the DML operation, there
> +  * are multiple lists (one for each operation) to which row filters
> +  * will be appended.
> +  *
> +  * FOR ALL TABLES implies "don't use row filter expression" so it takes
> +  * precedence.
> +  *
> +  * ALL TABLES IN SCHEMA implies "don't use row filter expression" if
> +  * the schema is the same as the table schema.
> +  */
> + foreach(lc, data->publications)
> + {
> + Publication *pub = lfirst(lc);
> + HeapTuple   rftuple = NULL;
> + Datum   rfdatum = 0;
> + boolpub_no_filter = false;
> +
> + if (pub->alltables)
> + {
> + /*
> +  * If the publication is FOR ALL TABLES then it is 
> treated the
> +  * same as if this table has no row filters (even if 
> for other
> +  * publications it does).
> +  */
> + pub_no_filter = true;
> + }
> + else if (list_member_oid(schemaPubids, pub->oid))
> + {
> + /*
> +  * If the publication is FOR ALL TABLES IN SCHEMA and 
> it overlaps
> +  * with the current relation in the same schema then 
> this is also
> +  * treated same as if this table has no row filters 
> (even if for
> +  * other publications it does).
> +  */
> + pub_no_filter = true;

Isn't this basically O(schemas * publications)?




> + if (has_filter)
> + {
> + /* Create or reset the memory context for row filters */
> + if (entry->cache_expr_cxt == NULL)
> + entry->cache_expr_cxt = 
> AllocSetContextCreate(CacheMemoryContext,
> + 
>   "Row filter expressions",
> + 
>   ALLOCSET_DEFAULT_SIZES);
> + else
> + MemoryContextReset(entry->cache_expr_cxt);

I see this started before this patch, but I don't think it's a great idea that
pgoutput does a bunch of stuff in CacheMemoryContext. That makes it
unnecessarily hard to debug leaks.

Seems like all this should 

Re: row filtering for logical replication

2022-01-28 Thread Alvaro Herrera
I just pushed a change to tab-complete because of a comment in the
column-list patch series.  I checked and your v72-0002 does not
conflict, but it doesn't fully work either; AFAICT you'll have to change
it so that the WHERE clause appears in the COMPLETE_WITH(",") line I
just added.  As far as I tested it, with that change the completion
works fine.


Unrelated to these two patches:

Frankly I would prefer that these completions offer a ";" in addition to
the "," and "WHERE".  But we have no precedent for doing that (offering
to end the command) anywhere in the completion rules, so I think it
would be a larger change that would merit more discussion.

And while we're talking of larger changes, I would love it if other
commands such as DROP TABLE offered a "," completion after a table name,
so that a command can be tab-completed to drop multiple tables.  (Same
with other commands that process multiple comma-separated objects, of
course.)

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"On the other flipper, one wrong move and we're Fatal Exceptions"
(T.U.X.: Term Unit X  - http://www.thelinuxreview.com/TUX/)




Re: row filtering for logical replication

2022-01-28 Thread Greg Nancarrow
On Fri, Jan 28, 2022 at 2:26 PM houzj.f...@fujitsu.com
 wrote:
>
> Attach the V72 patch set which did the above changes.
>

Thanks for updating the patch set.
One thing I noticed, in the patch commit comment it says:

Psql commands \dRp+ and \d will display any row filters.

However, "\d" by itself doesn't show any row filter information, so I
think it should say:

Psql commands "\dRp+" and "\d " will display any row filters.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: row filtering for logical replication

2022-01-26 Thread Peter Smith
Here are some review comments for v71-0001

~~~

1. Commit Message - database

"...that don't satisfy this WHERE clause will be filtered out. This allows a
database or set of tables to be partially replicated. The row filter is
per table. A new row filter can be added simply by specifying a WHERE..."

I don't know what extra information is conveyed by saying "a
database". Isn't it sufficient to just say "This allows a set of
tables to be partially replicated." ?

~~~

2. Commit message - OR'ed

The row filters are applied before publishing the changes. If the
subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so
that rows satisfying any of the expressions will be replicated.

Shouldn't that say:
"with different filters," --> "with different filters (for the same
publish operation),"

~~~

3. Commit message - typo

This means all the other filters become redundant if (a) one of the
publications have no filter at all, (b) one of the publications was
created using FOR ALL TABLES, (c) one of the publications was created
using FOR ALL TABLES IN SCHEMA and the table belongs to that same schema.

Typo:
"have no filter" --> "has no filter"

~~~

4. Commit message - psql \d+

"Psql commands \dRp+ and \d+ will display any row filters."

Actually, just "\d" (without +) will also display row filters. You do
not need to say "\d+"

~~~

5. src/backend/executor/execReplication.c - CheckCmdReplicaIdentity

+ RelationBuildPublicationDesc(rel, );
+ if (!pubdesc.rf_valid_for_update && cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is
not part of the replica identity.",
+get_attname(RelationGetRelid(rel),
+pubdesc.invalid_rfcol_update,
+false;
+ else if (!pubdesc.rf_valid_for_delete && cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is
not part of the replica identity.",
+get_attname(RelationGetRelid(rel),
+pubdesc.invalid_rfcol_delete,
+false;


IMO those conditions should be reversed because (a) it's more optimal
to test the other way around, and (b) for consistency with other code
in this function.

BEFORE
+ if (!pubdesc.rf_valid_for_update && cmd == CMD_UPDATE)
...
+ else if (!pubdesc.rf_valid_for_delete && cmd == CMD_DELETE)
AFTER
+ if (cmd == CMD_UPDATE && !pubdesc.rf_valid_for_update)
...
+ else if (cmd == CMD_DELETE && !pubdesc.rf_valid_for_delete)

~~~

6. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

+ /*
+ * Unchanged toasted replica identity columns are only logged in the
+ * old tuple, copy this over to the new tuple. The changed (or WAL
+ * Logged) toast values are always assembled in memory and set as
+ * VARTAG_INDIRECT. See ReorderBufferToastReplace.
+ */

Something seems not quite right with the comma in that first sentence.
Maybe a period is better?

BEFORE
Unchanged toasted replica identity columns are only logged in the old
tuple, copy this over to the new tuple.
AFTER
Unchanged toasted replica identity columns are only logged in the old
tuple. Copy this over to the new tuple.

~~~

7. src/test/subscription/t/028_row_filter.pl - COPYRIGHT

This TAP file should have a copyright comment that is consistent with
all the other TAP files.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-01-26 Thread Greg Nancarrow
On Thu, Jan 27, 2022 at 4:59 PM Peter Smith  wrote:
>
> On Thu, Jan 27, 2022 at 9:40 AM Greg Nancarrow 
wrote:
> >
> > On Wed, Jan 26, 2022 at 2:08 PM houzj.f...@fujitsu.com
> >  wrote:
> > >
> > > There was a miss in the posted patch which didn't initialize the
parameter in
> > > RelationBuildPublicationDesc, sorry for that. Attach the correct
patch this time.
> > >
> >
> > A few comments for the v71-0001 patch:
> ...
> > (2) check_simple_rowfilter_expr_walker
> >
> > In the function header:
> > (i) "etc" should be "etc."
> > (ii)
> > Is
> >
> > + * - (Var Op Const) Bool (Var Op Const)
> >
> >meant to be:
> >
> > + * - (Var Op Const) Logical-Op (Var Op Const)
> >
> > ?
> >
> > It's not clear what "Bool" means here.
>
> The comment is only intended as a generic example of the kinds of
> acceptable expression format.
>
> The names in the comment used are roughly equivalent to the Node* tag
names.
>
> This particular example is for an expression with AND/OR/NOT, which is
> handled by a BoolExpr.
>
> There is no such animal as LogicalOp, so rather than change like your
> suggestion I feel if this comment is going to change then it would be
> better to change to be "boolop" (because the BoolExpr struct has a
> boolop member). e.g.
>
> BEFORE
> + * - (Var Op Const) Bool (Var Op Const)
> AFTER
> + * - (Var Op Const) boolop (Var Op Const)
>

My use of "LogicalOp" was just indicating that the use of "Bool" in that
line was probably meant to mean "Logical Operator", and these are
documented in "9.1 Logical Operators" here:
https://www.postgresql.org/docs/14/functions-logical.html
(PostgreSQL docs don't refer to AND/OR etc. as boolean operators)

Perhaps, to make it clear, the change for the example compound expression
could simply be:

+ * - (Var Op Const) AND/OR (Var Op Const)

or at least say something like "- where boolop is AND/OR".

Regards,
Greg Nancarrow
Fujitsu Australia


Re: row filtering for logical replication

2022-01-26 Thread Peter Smith
On Thu, Jan 27, 2022 at 9:40 AM Greg Nancarrow  wrote:
>
> On Wed, Jan 26, 2022 at 2:08 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > There was a miss in the posted patch which didn't initialize the parameter 
> > in
> > RelationBuildPublicationDesc, sorry for that. Attach the correct patch this 
> > time.
> >
>
> A few comments for the v71-0001 patch:
...
> (2) check_simple_rowfilter_expr_walker
>
> In the function header:
> (i) "etc" should be "etc."
> (ii)
> Is
>
> + * - (Var Op Const) Bool (Var Op Const)
>
>meant to be:
>
> + * - (Var Op Const) Logical-Op (Var Op Const)
>
> ?
>
> It's not clear what "Bool" means here.

The comment is only intended as a generic example of the kinds of
acceptable expression format.

The names in the comment used are roughly equivalent to the Node* tag names.

This particular example is for an expression with AND/OR/NOT, which is
handled by a BoolExpr.

There is no such animal as LogicalOp, so rather than change like your
suggestion I feel if this comment is going to change then it would be
better to change to be "boolop" (because the BoolExpr struct has a
boolop member). e.g.

BEFORE
+ * - (Var Op Const) Bool (Var Op Const)
AFTER
+ * - (Var Op Const) boolop (Var Op Const)

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: row filtering for logical replication

2022-01-26 Thread Greg Nancarrow
On Wed, Jan 26, 2022 at 2:08 PM houzj.f...@fujitsu.com
 wrote:
>
> There was a miss in the posted patch which didn't initialize the parameter in
> RelationBuildPublicationDesc, sorry for that. Attach the correct patch this 
> time.
>

I have some additional doc update suggestions for the v71-0001 patch:


(1) Patch commit comment

BEFORE:
row filter evaluates to NULL, it returns false. The WHERE clause only
AFTER:
row filter evaluates to NULL, it is regarded as "false". The WHERE clause only


doc/src/sgml/catalogs.sgml

(2) ALTER PUBLICATION

BEFORE:
+  expression returns
false or null will
AFTER:
+  expression
evaluates to false or null will


doc/src/sgml/ref/alter_subscription.sgml

(3) ALTER SUBSCRIPTION

BEFORE:
+  filter WHERE clause had been modified.
AFTER:
+  filter WHERE clause has since been modified.


doc/src/sgml/ref/create_publication.sgml

(4) CREATE PUBLICATION

BEFORE:
+  which the expression returns
+  false or null will not be published. Note that parentheses are required
AFTER:
+  which the expression evaluates
+  to false or null will not be published. Note that parentheses
are required


doc/src/sgml/ref/create_subscription.sgml

(5) CREATE SUBSCRIPTION

BEFORE:
+   returns false or null will not be published. If the subscription has several
AFTER:
+   evaluates to false or null will not be published. If the
subscription has several


Regards,
Greg Nancarrow
Fujitsu Australia




Re: row filtering for logical replication

2022-01-26 Thread Greg Nancarrow
On Wed, Jan 26, 2022 at 2:08 PM houzj.f...@fujitsu.com
 wrote:
>
> There was a miss in the posted patch which didn't initialize the parameter in
> RelationBuildPublicationDesc, sorry for that. Attach the correct patch this 
> time.
>

A few comments for the v71-0001 patch:

doc/src/sgml/catalogs.sgml

(1)

+
+ 
+  
+  prqual pg_node_tree
+  
+  Expression tree (in nodeToString()
+  representation) for the relation's qualifying condition. Null if
+  there is no qualifying condition.
+ 

"qualifying condition" sounds a bit vague here.
Wouldn't it be better to say "publication qualifying condition"?


src/backend/commands/publicationcmds.c

(2) check_simple_rowfilter_expr_walker

In the function header:
(i) "etc" should be "etc."
(ii)
Is

+ * - (Var Op Const) Bool (Var Op Const)

   meant to be:

+ * - (Var Op Const) Logical-Op (Var Op Const)

?

It's not clear what "Bool" means here.

(3) check_simple_rowfilter_expr_walker
We should say "Built-in functions" instead of "System-functions":

+   * User-defined functions are not allowed. System-functions that are

Regards,
Greg Nancarrow
Fujitsu Australia




Re: row filtering for logical replication

2022-01-26 Thread Peter Smith
On Tue, Jan 25, 2022 at 2:18 PM houzj.f...@fujitsu.com
 wrote:
>
> On Monday, January 24, 2022 4:38 PM Peter Smith 
> >
...
> > 5. src/include/catalog/pg_publication.h - typedef struct PublicationDesc
> >
> > +typedef struct PublicationDesc
> > +{
> > + /*
> > + * true if the columns referenced in row filters which are used for
> > +UPDATE
> > + * or DELETE are part of the replica identity, or the publication
> > +actions
> > + * do not include UPDATE or DELETE.
> > + */
> > + bool rf_valid_for_update;
> > + bool rf_valid_for_delete;
> > +
> > + AttrNumber invalid_rfcol_update;
> > + AttrNumber invalid_rfcol_delete;
> > +
> > + PublicationActions pubactions;
> > +} PublicationDesc;
> > +
> >
> > I did not see any point really for the pairs of booleans and AttNumbers.
> > AFAIK both of them shared exactly the same validation logic so I think you 
> > can
> > get by using fewer members here.
>
> the pairs of booleans are intended to fix the problem[2] reported earlier.
> [2] 
> https://www.postgresql.org/message-id/OS0PR01MB611367BB85115707CDB2F40CFB5A9%40OS0PR01MB6113.jpnprd01.prod.outlook.com
> >

OK. Thanks for the info.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-01-26 Thread Amit Kapila
On Wed, Jan 26, 2022 at 8:37 AM houzj.f...@fujitsu.com
 wrote:
>
> On Monday, January 24, 2022 4:38 PM Peter Smith  wrote:
> >
> >
> > 3. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc
> >
> > +RelationBuildPublicationDesc(Relation relation)
> >  {
> >   List*puboids;
> >   ListCell   *lc;
> >   MemoryContext oldcxt;
> >   Oid schemaid;
> > - PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
> > + List*ancestors = NIL;
> > + Oid relid = RelationGetRelid(relation);
> > + AttrNumber invalid_rfcolnum = InvalidAttrNumber;
> > + PublicationDesc *pubdesc = palloc0(sizeof(PublicationDesc));
> > + PublicationActions *pubactions = &pubdesc->pubactions;
> > +
> > + pubdesc->rf_valid_for_update = true;
> > + pubdesc->rf_valid_for_delete = true;
> >
> > IMO it would be better to change the "sense" of those variables.
> > e.g.
> >
> > "rf_valid_for_update" --> "rf_invalid_for_update"
> > "rf_valid_for_delete" --> "rf_invalid_for_delete"
> >
> > That way they have the same 'sense' as the AttrNumbers so it all reads 
> > better to
> > me.
> >
> > Also, it means no special assignment is needed because the palloc0 will set
> > them correctly
>
> Think again, I am not sure it's better to have an invalid_... flag.
> It seems more natural to have a valid_... flag.
>

Can't we do without these valid_ flags? AFAICS, if we check for
"invalid_" attributes, it should serve our purpose because those can
have some attribute number only when the row filter contains some
column that is not part of RI. A few possible optimizations in
RelationBuildPublicationDesc:

a. It calls contain_invalid_rfcolumn with pubid and then does a cache
lookup to again find a publication that its only caller already has
access to, so can't we pass the same?
b. In RelationBuildPublicationDesc(), we call
GetRelationPublications() to get the list of publications and then
process those publications. I think if none of the publications has a
row filter and the relation has a replica identity, then we don't need
to build the descriptor at all. If we do this optimization inside
RelationBuildPublicationDesc, we may want to rename the function to
CheckAndBuildRelationPublicationDesc or something like that?

-- 
With Regards,
Amit Kapila.




RE: row filtering for logical replication

2022-01-25 Thread houzj.f...@fujitsu.com
On Monday, January 24, 2022 4:38 PM Peter Smith  wrote:
> 
> Thanks for all the patches!
> 
> Here are my review comments for v69-0001
> 
> ~~~
> 
> 1. src/backend/executor/execReplication.c  CheckCmdReplicaIdentity call to
> RelationBuildPublicationDesc
> 
> + /*
> + * It is only safe to execute UPDATE/DELETE when all columns, referenced in
> + * the row filters from publications which the relation is in, are valid -
> + * i.e. when all referenced columns are part of REPLICA IDENTITY or the
> + * table does not publish UPDATES or DELETES.
> + */
> + pubdesc = RelationBuildPublicationDesc(rel);
> 
> This code is leaky because it never frees the palloc-ed memory for the pubdesc.
> 
> IMO change the RelationBuildPublicationDesc to pass in the
> PublicationDesc* from the call stack then we can eliminate the palloc and risk of
> leaks.
> 
> ~~~
> 
> 2. src/include/utils/relcache.h - RelationBuildPublicationDesc
> 
> +struct PublicationDesc;
> +extern struct PublicationDesc *RelationBuildPublicationDesc(Relation
> +relation);
> 
> (Same as the previous comment #1). Suggest to change the function signature
> to be void and pass the PublicationDesc* from stack instead of palloc-ing it
> within the function

Changed in V71.

> 
> 3. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc
> 
> +RelationBuildPublicationDesc(Relation relation)
>  {
>   List*puboids;
>   ListCell   *lc;
>   MemoryContext oldcxt;
>   Oid schemaid;
> - PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
> + List*ancestors = NIL;
> + Oid relid = RelationGetRelid(relation);
> + AttrNumber invalid_rfcolnum = InvalidAttrNumber;
> + PublicationDesc *pubdesc = palloc0(sizeof(PublicationDesc));
> + PublicationActions *pubactions = &pubdesc->pubactions;
> +
> + pubdesc->rf_valid_for_update = true;
> + pubdesc->rf_valid_for_delete = true;
> 
> IMO it would be better to change the "sense" of those variables.
> e.g.
> 
> "rf_valid_for_update" --> "rf_invalid_for_update"
> "rf_valid_for_delete" --> "rf_invalid_for_delete"
> 
> That way they have the same 'sense' as the AttrNumbers so it all reads better 
> to
> me.
> 
> Also, it means no special assignment is needed because the palloc0 will set
> them correctly

Thinking about it again, I am not sure it's better to have an invalid_... flag.
It seems more natural to have a valid_... flag.

Best regards,
Hou zj


RE: row filtering for logical replication

2022-01-25 Thread houzj.f...@fujitsu.com
On Monday, January 24, 2022 5:36 AM Peter Smith 
> 
> FYI - I noticed the cfbot is reporting a failed test case [1] for the latest 
> v69 patch
> set.
> 
> [21:09:32.183] # Failed test 'check replicated inserts on subscriber'
> [21:09:32.183] # at t/025_rep_changes_for_schema.pl line 202.
> [21:09:32.183] # got: '21|1|2139062143'
> [21:09:32.183] # expected: '21|1|21'
> [21:09:32.183] # Looks like you failed 1 test of 13.
> [21:09:32.183] [21:08:49] t/025_rep_changes_for_schema.pl 

The test passed for the latest v70 patch set. I will keep an eye on the cfbot,
and if the error happens again in the future, I will continue to investigate
it.

Best regards,
Hou zj


Re: row filtering for logical replication

2022-01-24 Thread Peter Smith
A couple more comments for the v69-0001 TAP tests.

~~~

1. src/test/subscription/t/027_row_filter.pl

+# The subscription of the ALL TABLES IN SCHEMA publication means
there should be
+# no filtering on the tablesync COPY, so all expect all 5 will be present.
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
FROM schema_rf_x.tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x
should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have
not effect.
+# Expected:
+# tab_rf_x   :  5 initial rows + 2 new rows = 7 rows
+# tab_rf_partition   :  1 initial row  + 1 new row  = 2 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO
schema_rf_x.tab_rf_x (x) VALUES (-99), (99)");
+$node_publisher->safe_psql('postgres', "INSERT INTO
schema_rf_x.tab_rf_partitioned (x) VALUES (5), (25)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
FROM schema_rf_x.tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM
public.tab_rf_partition");
+is($result, qq(20
+25), 'check table tab_rf_partition should be filtered');

That comment ("Similarly, normal filtering after the initial phase
will also have not effect.") seems no good:
- it is too vague for the tab_rf_x tablesync
- it seems completely wrong for the tab_rf_partition table (because
that filter is working fine)

I'm not sure exactly what the comment should say, but possibly
something like this (??):

BEFORE:
Similarly, normal filtering after the initial phase will also have not effect.
AFTER:
Similarly, the table filter for tab_rf_x (after the initial phase) has
no effect when combined with the ALL TABLES IN SCHEMA. Meanwhile, the
filter for the tab_rf_partition does work because that partition
belongs to a different schema (and publish_via_partition_root =
false).

~~~

2. src/test/subscription/t/027_row_filter.pl

Here is a 2nd place with the same broken comment:

+# The subscription of the FOR ALL TABLES publication means there should be no
+# filtering on the tablesync COPY, so all expect all 5 will be present.
+my $result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
FROM tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x
should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have
not effect.
+# Expected: 5 initial rows + 2 new rows = 7 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_rf_x (x)
VALUES (-99), (99)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
FROM tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');

Here I also think the comment maybe should just say something like:

BEFORE:
Similarly, normal filtering after the initial phase will also have not effect.
AFTER:
Similarly, the table filter for tab_rf_x (after the initial phase) has
no effect when combined with the ALL TABLES IN SCHEMA.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-01-24 Thread Amit Kapila
On Mon, Jan 24, 2022 at 1:19 PM Greg Nancarrow  wrote:
>
> On Mon, Jan 24, 2022 at 5:09 PM Amit Kapila  wrote:
> > But that is not what I am seeing in Logs with a test case where the
> > row filter column has NULL values. Could you please try that see what
> > is printed in LOG?
> >
> > You can change the code to make the elevel as LOG to get the results
> > easily. The test case I tried is as follows:
> > Node-1:
> > postgres=# create table t1(c1 int, c2 int);
> > CREATE TABLE
> > postgres=# create publication pub for table t1 WHERE (c1 > 10);
> > CREATE PUBLICATION
> >
> > Node-2:
> > postgres=# create table t1(c1 int, c2 int);
> > CREATE TABLE
> > postgres=# create subscription sub connection 'dbname=postgres' publication 
> > pub;
> > NOTICE:  created replication slot "sub" on publisher
> > CREATE SUBSCRIPTION
> >
> > After this on publisher-node, I see the LOG as "LOG:  row filter
> > evaluates to false (isnull: true)". I have verified that in the code
> > as well (in slot_deform_heap_tuple), we set the value as 0 for isnull
> > which matches above observation.
> >
>
> There are obviously multiple code paths under which a column can end up as 
> NULL.
> Doing one NULL-column test case, and finding here that
> "DatumGetBool(ret)" is "false" when "isnull" is true, doesn't prove it
> will be like that for ALL possible cases.
>

Sure, I just wanted to see the particular test which leads to failure
so that I or others can know (or debug) why in some cases it behaves
differently. Anyway, for others, the below test can show the results:

CREATE TABLE tab_rowfilter_1 (a int primary key, b text);
alter table tab_rowfilter_1 replica identity full ;
INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600');
CREATE PUBLICATION pub FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b
<> 'filtered');

UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600;

So, we can change this DEBUG log.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-24 Thread Peter Smith
Thanks for all the patches!

Here are my review comments for v69-0001

~~~

1. src/backend/executor/execReplication.c  CheckCmdReplicaIdentity
call to RelationBuildPublicationDesc

+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns, referenced in
+ * the row filters from publications which the relation is in, are valid -
+ * i.e. when all referenced columns are part of REPLICA IDENTITY or the
+ * table does not publish UPDATES or DELETES.
+ */
+ pubdesc = RelationBuildPublicationDesc(rel);

This code is leaky because it never frees the palloc-ed memory for the pubdesc.

IMO change the RelationBuildPublicationDesc to pass in the
PublicationDesc* from the call stack then we can eliminate the palloc and
risk of leaks.

~~~

2. src/include/utils/relcache.h - RelationBuildPublicationDesc

+struct PublicationDesc;
+extern struct PublicationDesc *RelationBuildPublicationDesc(Relation relation);

(Same as the previous comment #1). Suggest to change the function
signature to be void and pass the PublicationDesc* from stack instead
of palloc-ing it within the function

~~~

3. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc

+RelationBuildPublicationDesc(Relation relation)
 {
  List*puboids;
  ListCell   *lc;
  MemoryContext oldcxt;
  Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ List*ancestors = NIL;
+ Oid relid = RelationGetRelid(relation);
+ AttrNumber invalid_rfcolnum = InvalidAttrNumber;
+ PublicationDesc *pubdesc = palloc0(sizeof(PublicationDesc));
+ PublicationActions *pubactions = &pubdesc->pubactions;
+
+ pubdesc->rf_valid_for_update = true;
+ pubdesc->rf_valid_for_delete = true;

IMO it would be better to change the "sense" of those variables.
e.g.

"rf_valid_for_update" --> "rf_invalid_for_update"
"rf_valid_for_delete" --> "rf_invalid_for_delete"

That way they have the same 'sense' as the AttrNumbers so it all reads
better to me.

Also, it means no special assignment is needed because the palloc0
will set them correctly

~~~

4. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc

- if (relation->rd_pubactions)
+ if (relation->rd_pubdesc)
  {
- pfree(relation->rd_pubactions);
- relation->rd_pubactions = NULL;
+ pfree(relation->rd_pubdesc);
+ relation->rd_pubdesc = NULL;
  }

What is the purpose of this code? Can't it all just be removed?
e.g. Can't you Assert that relation->rd_pubdesc is NULL at this point?

(if it was not-null the function would have returned immediately from the top)

~~~

5. src/include/catalog/pg_publication.h - typedef struct PublicationDesc

+typedef struct PublicationDesc
+{
+ /*
+ * true if the columns referenced in row filters which are used for UPDATE
+ * or DELETE are part of the replica identity, or the publication actions
+ * do not include UPDATE or DELETE.
+ */
+ bool rf_valid_for_update;
+ bool rf_valid_for_delete;
+
+ AttrNumber invalid_rfcol_update;
+ AttrNumber invalid_rfcol_delete;
+
+ PublicationActions pubactions;
+} PublicationDesc;
+

I did not really see the point of the pairs of booleans and AttrNumbers.
AFAIK both of them share exactly the same validation logic, so I think
you can get by with fewer members here.

e.g. (here I also reversed the sense of the bool flag, as per my suggestion #3)

typedef struct PublicationDesc
{
 /*
 * true if the columns referenced in row filters which are used for UPDATE
 * or DELETE are part of the replica identity, or the publication actions
 * do not include UPDATE or DELETE.
 */
 bool rf_invalid_for_upd_del;
 AttrNumber invalid_rfcol_upd_del;

 PublicationActions pubactions;
} PublicationDesc;

~~~

6. src/tools/pgindent/typedefs.list

Missing the new typedef PublicationDesc

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-01-23 Thread Greg Nancarrow
On Mon, Jan 24, 2022 at 5:09 PM Amit Kapila  wrote:
>
> On Mon, Jan 24, 2022 at 10:29 AM Greg Nancarrow  wrote:
> >
> > On Mon, Jan 24, 2022 at 2:47 PM Amit Kapila  wrote:
> > >
> > > > (3) pgoutput_row_filter_exec_expr
> > > > pgoutput_row_filter_exec_expr() returns false if "isnull" is true,
> > > > otherwise (if "isnull" is false) returns the value of "ret"
> > > > (true/false).
> > > > So the following elog needs to be changed (Peter Smith previously
> > > > pointed this out, but it didn't get completely changed):
> > > >
> > > > BEFORE:
> > > > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> > > > + DatumGetBool(ret) ? "true" : "false",
> > > > + isnull ? "true" : "false");
> > > > AFTER:
> > > > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> > > > + isnull ? "false" : DatumGetBool(ret) ? "true" : "false",
> > > > + isnull ? "true" : "false");
> > > >
> > >
> > > Do you see any problem with the current? I find the current one easy
> > > to understand.
> > >
> >
> > Yes, I see a problem.
> >
>
> I tried by inserting NULL value in a column having row filter and the
> result it shows is:
>
>  LOG:  row filter evaluates to false (isnull: true)
>
> This is what is expected.
>
> >
> > But regression tests fail when that code change is made (indicating
> > that there are cases when "isnull" is true but the function returns
> > true instead of false).
> >
>
> But that is not what I am seeing in Logs with a test case where the
> row filter column has NULL values. Could you please try that see what
> is printed in LOG?
>
> You can change the code to make the elevel as LOG to get the results
> easily. The test case I tried is as follows:
> Node-1:
> postgres=# create table t1(c1 int, c2 int);
> CREATE TABLE
> postgres=# create publication pub for table t1 WHERE (c1 > 10);
> CREATE PUBLICATION
>
> Node-2:
> postgres=# create table t1(c1 int, c2 int);
> CREATE TABLE
> postgres=# create subscription sub connection 'dbname=postgres' publication 
> pub;
> NOTICE:  created replication slot "sub" on publisher
> CREATE SUBSCRIPTION
>
> After this on publisher-node, I see the LOG as "LOG:  row filter
> evaluates to false (isnull: true)". I have verified that in the code
> as well (in slot_deform_heap_tuple), we set the value as 0 for isnull
> which matches above observation.
>

There are obviously multiple code paths under which a column can end up as NULL.
Doing one NULL-column test case, and finding here that
"DatumGetBool(ret)" is "false" when "isnull" is true, doesn't prove it
will be like that for ALL possible cases.
As I pointed out, the function is meant to always return false when
"isnull" is true, so if the current logging code is correct (always
logging "DatumGetBool(ret)" as the function return value), then to
match the code to the current logging, we should be able to return
"DatumGetBool(ret)" if "isnull" is true, instead of returning false as
it currently does.
But as I said, when I try that then I get a test failure (make
check-world), proving that there is a case where "DatumGetBool(ret)"
is true when "isnull" is true, and thus showing that the current
logging is not correct because in that case the current log output
would show the return value is true, which won't match the actual
function return value of false.
(I also added some extra logging for this isnull==true test failure
case and found that ret==1)

Regards,
Greg Nancarrow
Fujitsu Australia




Re: row filtering for logical replication

2022-01-23 Thread Amit Kapila
On Mon, Jan 24, 2022 at 10:29 AM Greg Nancarrow  wrote:
>
> On Mon, Jan 24, 2022 at 2:47 PM Amit Kapila  wrote:
> >
> > > (3) pgoutput_row_filter_exec_expr
> > > pgoutput_row_filter_exec_expr() returns false if "isnull" is true,
> > > otherwise (if "isnull" is false) returns the value of "ret"
> > > (true/false).
> > > So the following elog needs to be changed (Peter Smith previously
> > > pointed this out, but it didn't get completely changed):
> > >
> > > BEFORE:
> > > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> > > + DatumGetBool(ret) ? "true" : "false",
> > > + isnull ? "true" : "false");
> > > AFTER:
> > > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> > > + isnull ? "false" : DatumGetBool(ret) ? "true" : "false",
> > > + isnull ? "true" : "false");
> > >
> >
> > Do you see any problem with the current? I find the current one easy
> > to understand.
> >
>
> Yes, I see a problem.
>

I tried inserting a NULL value into a column that has a row filter, and
the result is:

 LOG:  row filter evaluates to false (isnull: true)

This is what is expected.

>
> But regression tests fail when that code change is made (indicating
> that there are cases when "isnull" is true but the function returns
> true instead of false).
>

But that is not what I am seeing in the logs with a test case where the
row filter column has NULL values. Could you please try that and see
what is printed in the LOG?

You can change the code to make the elevel as LOG to get the results
easily. The test case I tried is as follows:
Node-1:
postgres=# create table t1(c1 int, c2 int);
CREATE TABLE
postgres=# create publication pub for table t1 WHERE (c1 > 10);
CREATE PUBLICATION

Node-2:
postgres=# create table t1(c1 int, c2 int);
CREATE TABLE
postgres=# create subscription sub connection 'dbname=postgres' publication pub;
NOTICE:  created replication slot "sub" on publisher
CREATE SUBSCRIPTION

After this on publisher-node, I see the LOG as "LOG:  row filter
evaluates to false (isnull: true)". I have verified that in the code
as well (in slot_deform_heap_tuple), we set the value as 0 for isnull
which matches above observation.


-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-23 Thread Peter Smith
On Mon, Jan 24, 2022 at 4:53 PM Peter Smith  wrote:
>
> On Fri, Jan 21, 2022 at 2:04 PM Amit Kapila  wrote:
> >
> > On Thu, Jan 20, 2022 at 7:56 PM Alvaro Herrera  
> > wrote:
> > >
> > > > >  Maybe this was meant to be "validate RF
> > > > > expressions" and return, perhaps, a bitmapset of all invalid columns
> > > > > referenced?
> > > >
> > > > Currently, we stop as soon as we find the first invalid column.
> > >
> > > That seems quite strange.  (And above you say "gather as much info as
> > > possible", so why stop at the first one?)
> > >
> >
> > Because that is an error case, so, there doesn't seem to be any
> > benefit in proceeding further. However, we can build all the required
> > information by processing all publications (aka gather all
> > information) and then later give an error if that idea appeals to you
> > more.
> >
> > > > >  (What is an invalid column in the first place?)
> > > >
> > > > A column that is referenced in the row filter but is not part of
> > > > Replica Identity.
> > >
> > > I do wonder how do these invalid columns reach the table definition in
> > > the first place.  Shouldn't these be detected at DDL time and prohibited
> > > from getting into the definition?
> > >
> >
> > As mentioned by Peter E [1], there are two ways to deal with this: (a)
> > The current approach is that the user can set the replica identity
> > freely, and we decide later based on that what we can replicate (e.g.,
> > no updates). If we follow the same approach for this patch, we don't
> > restrict what columns are part of the row filter, but we check what
> > actions we can replicate based on the row filter. This is what is
> > currently followed in the patch. (b) Add restrictions during DDL which
> > is not as straightforward as it looks.
>
> FYI - I also wanted to highlight that doing the replica identity
> validation at update/delete time is not only following the "current
> approach", as mentioned above, but this is also consistent with the
> *documented* behaviour in PG docs (See [1] since PG v10),
>
> 
> If a table without a replica identity is added to a publication that
> replicates UPDATE or DELETE operations then subsequent UPDATE or
> DELETE operations will cause an error on the publisher.
> 
>
> Specifically,
>
> It does *not* say that the RI validation error will happen when a
> table is added to the publication at CREATE/ALTER PUBLICATION time.
>
> It says that *subsequent* "UPDATE or DELETE operations will cause an error".
>
> ~~
>
> The point is that it is one thing to decide to change something that
> was never officially documented, but to change already *documented*
> behaviour is much more radical and has the potential to upset some
> users.
>
> --

(Sorry, fixed the broken link of the previous post)

[1] https://www.postgresql.org/docs/current/logical-replication-publication.html

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-01-23 Thread Peter Smith
On Fri, Jan 21, 2022 at 2:04 PM Amit Kapila  wrote:
>
> On Thu, Jan 20, 2022 at 7:56 PM Alvaro Herrera  
> wrote:
> >
> > > >  Maybe this was meant to be "validate RF
> > > > expressions" and return, perhaps, a bitmapset of all invalid columns
> > > > referenced?
> > >
> > > Currently, we stop as soon as we find the first invalid column.
> >
> > That seems quite strange.  (And above you say "gather as much info as
> > possible", so why stop at the first one?)
> >
>
> Because that is an error case, so, there doesn't seem to be any
> benefit in proceeding further. However, we can build all the required
> information by processing all publications (aka gather all
> information) and then later give an error if that idea appeals to you
> more.
>
> > > >  (What is an invalid column in the first place?)
> > >
> > > A column that is referenced in the row filter but is not part of
> > > Replica Identity.
> >
> > I do wonder how do these invalid columns reach the table definition in
> > the first place.  Shouldn't these be detected at DDL time and prohibited
> > from getting into the definition?
> >
>
> As mentioned by Peter E [1], there are two ways to deal with this: (a)
> The current approach is that the user can set the replica identity
> freely, and we decide later based on that what we can replicate (e.g.,
> no updates). If we follow the same approach for this patch, we don't
> restrict what columns are part of the row filter, but we check what
> actions we can replicate based on the row filter. This is what is
> currently followed in the patch. (b) Add restrictions during DDL which
> is not as straightforward as it looks.

FYI - I also wanted to highlight that doing the replica identity
validation at update/delete time is not only following the "current
approach", as mentioned above, but this is also consistent with the
*documented* behaviour in PG docs (See [1] since PG v10),


If a table without a replica identity is added to a publication that
replicates UPDATE or DELETE operations then subsequent UPDATE or
DELETE operations will cause an error on the publisher.


Specifically,

It does *not* say that the RI validation error will happen when a
table is added to the publication at CREATE/ALTER PUBLICATION time.

It says that *subsequent* "UPDATE or DELETE operations will cause an error".

~~

The point is that it is one thing to decide to change something that
was never officially documented, but to change already *documented*
behaviour is much more radical and has the potential to upset some
users.

--
[1] https://www.postgresql.org/docs/devel/logical-replication-publication.

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-01-23 Thread Greg Nancarrow
On Mon, Jan 24, 2022 at 8:36 AM Peter Smith  wrote:
>
> FYI - I noticed the cfbot is reporting a failed test case [1] for the
> latest v69 patch set.
>
> [21:09:32.183] # Failed test 'check replicated inserts on subscriber'
> [21:09:32.183] # at t/025_rep_changes_for_schema.pl line 202.
> [21:09:32.183] # got: '21|1|2139062143'
> [21:09:32.183] # expected: '21|1|21'
> [21:09:32.183] # Looks like you failed 1 test of 13.
> [21:09:32.183] [21:08:49] t/025_rep_changes_for_schema.pl 
>
> --
> [1] https://cirrus-ci.com/task/6280873841524736?logs=test_world#L3970
>

2139062143 is 0x7F7F7F7F, so it looks like a value from uninitialized
memory (debug build) has been copied into the column, or something
similar involving uninitialized memory.
The problem is occurring on FreeBSD.
I tried using similar build flags as that test environment, but
couldn't reproduce the issue.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: row filtering for logical replication

2022-01-23 Thread Greg Nancarrow
On Mon, Jan 24, 2022 at 2:47 PM Amit Kapila  wrote:
>
> > (3) pgoutput_row_filter_exec_expr
> > pgoutput_row_filter_exec_expr() returns false if "isnull" is true,
> > otherwise (if "isnull" is false) returns the value of "ret"
> > (true/false).
> > So the following elog needs to be changed (Peter Smith previously
> > pointed this out, but it didn't get completely changed):
> >
> > BEFORE:
> > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> > + DatumGetBool(ret) ? "true" : "false",
> > + isnull ? "true" : "false");
> > AFTER:
> > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> > + isnull ? "false" : DatumGetBool(ret) ? "true" : "false",
> > + isnull ? "true" : "false");
> >
>
> Do you see any problem with the current? I find the current one easy
> to understand.
>

Yes, I see a problem. The logging doesn't match what the function code
actually returns when "isnull" is true.
When "isnull" is true, the function always returns false, not the
value of "ret".
For the current logging code to be correct, and match the function
return value, we should be able to change:

  if (isnull)
return false;

to:

  if (isnull)
return ret;

But regression tests fail when that code change is made (indicating
that there are cases when "isnull" is true but the function returns
true instead of false).
So the current logging code is NOT correct, and needs to be updated as
I indicated.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: row filtering for logical replication

2022-01-23 Thread Amit Kapila
On Fri, Jan 21, 2022 at 2:56 PM Greg Nancarrow  wrote:
>
> On Thu, Jan 20, 2022 at 12:12 PM houzj.f...@fujitsu.com
>  wrote:
>
> (3) pgoutput_row_filter_exec_expr
> pgoutput_row_filter_exec_expr() returns false if "isnull" is true,
> otherwise (if "isnull" is false) returns the value of "ret"
> (true/false).
> So the following elog needs to be changed (Peter Smith previously
> pointed this out, but it didn't get completely changed):
>
> BEFORE:
> + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> + DatumGetBool(ret) ? "true" : "false",
> + isnull ? "true" : "false");
> AFTER:
> + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> + isnull ? "false" : DatumGetBool(ret) ? "true" : "false",
> + isnull ? "true" : "false");
>

Do you see any problem with the current? I find the current one easy
to understand.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-23 Thread Amit Kapila
On Sat, Jan 22, 2022 at 8:45 PM Alvaro Herrera  wrote:
>
> On 2022-Jan-22, Amit Kapila wrote:
>
> > CREATE TABLE parent (a int primary key, b int not null, c varchar)
> > PARTITION BY RANGE(a);
> > CREATE TABLE child PARTITION OF parent FOR VALUES FROM (0) TO (250);
> > CREATE UNIQUE INDEX b_index on child(b);
> > ALTER TABLE child REPLICA IDENTITY using INDEX b_index;
> >
> > In this, the parent table's replica identity is the primary
> > key(default) and the child table's replica identity is the b_index.
>
> Why is the partition's replica identity different from its parent's?
> Does that even make sense?
>

The parent's RI doesn't matter, as we always use the child's RI, so one
may decide not to have an RI for the parent. Also, when replicating,
the user might have set up a partitioned table on the publisher side
and non-partitioned tables on the subscriber side, in which case there
could also be different RI keys on the child tables.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-23 Thread Peter Smith
FYI - I noticed the cfbot is reporting a failed test case [1] for the
latest v69 patch set.

[21:09:32.183] # Failed test 'check replicated inserts on subscriber'
[21:09:32.183] # at t/025_rep_changes_for_schema.pl line 202.
[21:09:32.183] # got: '21|1|2139062143'
[21:09:32.183] # expected: '21|1|21'
[21:09:32.183] # Looks like you failed 1 test of 13.
[21:09:32.183] [21:08:49] t/025_rep_changes_for_schema.pl 

--
[1] https://cirrus-ci.com/task/6280873841524736?logs=test_world#L3970

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2022-01-22 Thread Alvaro Herrera
On 2022-Jan-22, Amit Kapila wrote:

> CREATE TABLE parent (a int primary key, b int not null, c varchar)
> PARTITION BY RANGE(a);
> CREATE TABLE child PARTITION OF parent FOR VALUES FROM (0) TO (250);
> CREATE UNIQUE INDEX b_index on child(b);
> ALTER TABLE child REPLICA IDENTITY using INDEX b_index;
> 
> In this, the parent table's replica identity is the primary
> key(default) and the child table's replica identity is the b_index.

Why is the partition's replica identity different from its parent's?
Does that even make sense?

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"La verdad no siempre es bonita, pero el hambre de ella sí"




Re: row filtering for logical replication

2022-01-21 Thread Amit Kapila
On Fri, Jan 21, 2022 at 8:19 PM Alvaro Herrera  wrote:
>
> If ATTACH PARTITION or CREATE TABLE .. PARTITION OF don't let you
> specify replica identity, I suspect it's because both partitioning and
> logical replication were developed in parallel, and neither gave too
> much thought to the other.
>

I think the reason the CREATE TABLE .. syntax doesn't have a way to
specify the RI is that we need an index for the RI. Consider the
example below:

CREATE TABLE parent (a int primary key, b int not null, c varchar)
PARTITION BY RANGE(a);
CREATE TABLE child PARTITION OF parent FOR VALUES FROM (0) TO (250);
CREATE UNIQUE INDEX b_index on child(b);
ALTER TABLE child REPLICA IDENTITY using INDEX b_index;


In this, the parent table's replica identity is the primary key (the
default) and the child table's replica identity is b_index. I think, if
we want, we can come up with some syntax to combine these steps and
allow specifying the replica identity during the second step
(Create ... Partition), but I am not sure we have a convincing reason
for this feature per se.

>
> I suspect that a better way to attack this problem is to let ALTER TABLE
> ... ATTACH PARTITION and CREATE TABLE .. PARTITION OF specify a replica
> identity as necessary.
>
> My suggestion is to avoid painting us into a corner from which it will
> be impossible to get out later.
>

Apart from the above reason, here we are just following the current
model of how update/delete behave w.r.t. the RI. Now, I think in the
future we can also think of lifting some of the RI restrictions for
filters if we find a good way to have column values that are not in the
WAL. We have discussed this previously in this thread and thought that
it is sensible to have an RI restriction for updates/deletes, as the
patch is doing for the first version.

I am not against inventing some new syntax for the row/column filter
patches, but there doesn't seem to be a very convincing reason for it,
and there is a good chance that we won't be able to accomplish that
for the current version.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-21 Thread Alvaro Herrera
On 2022-Jan-21, houzj.f...@fujitsu.com wrote:

> Personally, I'm a little hesitant to put the check at DDL level, because
> adding check at DDLs like ATTACH PARTITION/CREATE PARTITION OF ( [1]
> explained why we need to check these DDLs) looks a bit restrictive and
> user might also complain about that. Put the check in
> CheckCmdReplicaIdentity seems more acceptable because it is consistent
> with the existing behavior which has few complaints from users AFAIK.

I think logical replication is currently so limited that there are very
few people who can put it to real use.  So I suggest we should not treat
the small number of complaints about the current behavior as very
meaningful, because it just means that not a lot of people are using
logical replication in the first place.  But once these new
functionalities are introduced, it will start to become actually useful,
and it is then that users will exercise it and notice weird behavior.

If ATTACH PARTITION or CREATE TABLE .. PARTITION OF don't let you
specify replica identity, I suspect it's because both partitioning and
logical replication were developed in parallel, and neither gave too
much thought to the other.  So these syntax corners went unnoticed.

I suspect that a better way to attack this problem is to let ALTER TABLE
... ATTACH PARTITION and CREATE TABLE .. PARTITION OF specify a replica
identity as necessary.

My suggestion is to avoid painting us into a corner from which it will
be impossible to get out later.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"La espina, desde que nace, ya pincha" (Proverbio africano)




Re: row filtering for logical replication

2022-01-21 Thread Dilip Kumar
On Thu, Jan 20, 2022 at 4:54 PM Amit Kapila  wrote:

> + /*
> + * Unchanged toasted replica identity columns are only detoasted in the
> + * old tuple, copy this over to the new tuple.
> + */
> + if (att->attlen == -1 &&
> + VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i]) &&
> + !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))
> + {
> + if (tmp_new_slot == new_slot)
> + {
> + tmp_new_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
> + ExecClearTuple(tmp_new_slot);
> + ExecCopySlot(tmp_new_slot, new_slot);
> + }
> +
> + tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
> + tmp_new_slot->tts_isnull[i] = old_slot->tts_isnull[i];
> + }
> + }
>
>
> What is the need to assign new_slot to tmp_new_slot at the beginning
> of this part of the code? Can't we do this when we found some
> attribute that needs to be copied from the old tuple?
>
> The other part which is not clear to me by looking at this code and
> comments is how do we ensure that we cover all cases where the new
> tuple doesn't have values?
>

IMHO, the only case we are trying to handle is when the toasted attribute
is not modified in the new tuple.  If you look at the update WAL record,
the new tuple is written to the WAL exactly as it is inserted into the
heap page.  That means if an attribute is external it can only be in the
VARATT_IS_EXTERNAL_ONDISK format, so I don't think we need to worry about
any intermediate format that we use for in-memory tuples.  Sometimes in
the reorder buffer we also use the INDIRECT format, which internally can
store the ON DISK format, but we don't need to worry about that case
either, because it only occurs when we have the complete toast tuple as
part of the WAL and we recreate the tuple in memory in the reorder
buffer; so even if an ON DISK format is nested inside an INDIRECT format,
we have the complete tuple.
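
To make the copy-over loop from the quoted patch easier to follow, here
is a minimal Python sketch of the same idea; the names are made up, and
the real code operates on TupleTableSlots and VARATT_IS_EXTERNAL_ONDISK
rather than Python tuples.

```python
# Hedged sketch: for an UPDATE, a replica identity column whose value is an
# unchanged on-disk toast pointer is detoasted only in the old tuple, so it
# must be copied into the new tuple before the row filter is evaluated.
# Each attribute is modeled as (value, is_ondisk_toast_pointer).

def fill_unchanged_toast(old_tuple, new_tuple):
    merged = None  # copy lazily, only if something actually needs fixing
    for i, ((old_val, old_ondisk), (new_val, new_ondisk)) in enumerate(
            zip(old_tuple, new_tuple)):
        # if the column is null in either tuple, nothing to do
        if old_val is None or new_val is None:
            continue
        # unchanged toasted column: a pointer in the new tuple but a
        # detoasted value in the old one -- take the old value
        if new_ondisk and not old_ondisk:
            if merged is None:
                merged = list(new_tuple)
            merged[i] = (old_val, old_ondisk)
    return new_tuple if merged is None else merged

old = [("big detoasted value", False), (42, False)]
new = [("toast pointer", True), (43, False)]
print(fill_unchanged_toast(old, new))  # column 0 comes from the old tuple
```

The lazy copy mirrors the patch's handling of tmp_new_slot: the new slot
is only duplicated once the first attribute actually needs fixing.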

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


RE: row filtering for logical replication

2022-01-21 Thread houzj.f...@fujitsu.com
On Thu, Jan 20, 2022 7:25 PM Amit Kapila  wrote:
> 
> On Thu, Jan 20, 2022 at 6:42 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > Attach the V68 patch set which addressed the above comments and changes.
> >
> 
> Few comments and suggestions:
> ==
> 1.
> /*
> + * For updates, if both the new tuple and old tuple are not null, then both
> + * of them need to be checked against the row filter.
> + */
> + tmp_new_slot = new_slot;
> + slot_getallattrs(new_slot);
> + slot_getallattrs(old_slot);
> +
> 
> Isn't it better to add assert like
> Assert(map_changetype_pubaction[*action] == PUBACTION_UPDATE); before
> the above code? I have tried to change this part of the code in the
> attached top-up patch.
> 
> 2.
> + /*
> + * For updates, if both the new tuple and old tuple are not null, then both
> + * of them need to be checked against the row filter.
> + */
> + tmp_new_slot = new_slot;
> + slot_getallattrs(new_slot);
> + slot_getallattrs(old_slot);
> +
> + /*
> + * The new tuple might not have all the replica identity columns, in which
> + * case it needs to be copied over from the old tuple.
> + */
> + for (i = 0; i < desc->natts; i++)
> + {
> + Form_pg_attribute att = TupleDescAttr(desc, i);
> +
> + /*
> + * if the column in the new tuple or old tuple is null, nothing to do
> + */
> + if (tmp_new_slot->tts_isnull[i] || old_slot->tts_isnull[i])
> + continue;
> +
> + /*
> + * Unchanged toasted replica identity columns are only detoasted in the
> + * old tuple, copy this over to the new tuple.
> + */
> + if (att->attlen == -1 &&
> + VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i]) &&
> + !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))
> + {
> + if (tmp_new_slot == new_slot)
> + {
> + tmp_new_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
> + ExecClearTuple(tmp_new_slot);
> + ExecCopySlot(tmp_new_slot, new_slot);
> + }
> +
> + tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
> + tmp_new_slot->tts_isnull[i] = old_slot->tts_isnull[i];
> + }
> + }
> 
> 
> What is the need to assign new_slot to tmp_new_slot at the beginning
> of this part of the code? Can't we do this when we found some
> attribute that needs to be copied from the old tuple?

Thanks for the comments, Changed.

> The other part which is not clear to me by looking at this code and
> comments is how do we ensure that we cover all cases where the new
> tuple doesn't have values?

I will do some research about this and respond soon.

Best regards,
Hou zj


Re: row filtering for logical replication

2022-01-21 Thread Greg Nancarrow
On Thu, Jan 20, 2022 at 12:12 PM houzj.f...@fujitsu.com
 wrote:
>
> Attach the V68 patch set which addressed the above comments and changes.
> The version patch also fix the error message mentioned by Greg[1]
>

Some review comments for the v68 patch, mostly nitpicking:

(1) Commit message
Minor suggested updates:

BEFORE:
Allow specifying row filter for logical replication of tables.
AFTER:
Allow specifying row filters for logical replication of tables.

BEFORE:
If you choose to do the initial table synchronization, only data that
satisfies the row filters is pulled by the subscriber.
AFTER:
If you choose to do the initial table synchronization, only data that
satisfies the row filters is copied to the subscriber.


src/backend/executor/execReplication.c

(2)

BEFORE:
+ * table does not publish UPDATES or DELETES.
AFTER:
+ * table does not publish UPDATEs or DELETEs.


src/backend/replication/pgoutput/pgoutput.c

(3) pgoutput_row_filter_exec_expr
pgoutput_row_filter_exec_expr() returns false if "isnull" is true,
otherwise (if "isnull" is false) returns the value of "ret"
(true/false).
So the following elog needs to be changed (Peter Smith previously
pointed this out, but it didn't get completely changed):

BEFORE:
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
AFTER:
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ isnull ? "false" : DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");


(4) pgoutput_row_filter_init

BEFORE:
+  * we don't know yet if there is/isn't any row filters for this relation.
AFTER:
+  * we don't know yet if there are/aren't any row filters for this relation.

BEFORE:
+  * necessary at all. This avoids us to consume memory and spend CPU cycles
+  * when we don't need to.
AFTER:
+  * necessary at all. So this allows us to avoid unnecessary memory
+  * consumption and CPU cycles.

(5) pgoutput_row_filter

BEFORE:
+ * evaluates the row filter for that tuple and return.
AFTER:
+ * evaluate the row filter for that tuple and return.


Regards,
Greg Nancarrow
Fujitsu Australia




RE: row filtering for logical replication

2022-01-20 Thread houzj.f...@fujitsu.com
On Thu, Jan 20, 2022 10:26 PM Alvaro Herrera  wrote:
> 
> On 2022-Jan-20, Amit Kapila wrote:
> 
> > It returns an invalid column referenced in an RF if any but if not
> > then it helps to form pubactions which is anyway required at a later
> > point in the caller. The idea is that when we are already traversing
> > publications we should store/gather as much info as possible.
> 
> I think this design isn't quite awesome.
> 
> > I think probably the API name is misleading, maybe we should name it
> > something like ValidateAndFetchPubInfo, ValidateAndRememberPubInfo, or
> > something along these lines?
> 
> Maybe RelationBuildReplicationPublicationDesc or just
> RelationBuildPublicationDesc are good names for a routine that fill in
> the publication aspect of the relcache entry, as a parallel to
> RelationBuildPartitionDesc.
> 
> > >  Maybe this was meant to be "validate RF
> > > expressions" and return, perhaps, a bitmapset of all invalid columns
> > > referenced?
> >
> > Currently, we stop as soon as we find the first invalid column.
> 
> That seems quite strange.  (And above you say "gather as much info as
> possible", so why stop at the first one?)
> 
> > >  (What is an invalid column in the first place?)
> >
> > A column that is referenced in the row filter but is not part of
> > Replica Identity.
> 
> I do wonder how do these invalid columns reach the table definition in
> the first place.  Shouldn't these be detected at DDL time and prohibited
> from getting into the definition?

Personally, I'm a little hesitant to put the check at the DDL level,
because adding checks to DDLs like ATTACH PARTITION/CREATE TABLE ...
PARTITION OF ([1] explains why we need to check these DDLs) looks a bit
restrictive, and users might also complain about that. Putting the check
in CheckCmdReplicaIdentity seems more acceptable because it is consistent
with the existing behavior, which has drawn few complaints from users AFAIK.

[1] 
https://www.postgresql.org/message-id/CAA4eK1%2Bm45Xyzx7AUY9TyFnB6CZ7_%2B_uooPb7WHSpp7UE%3DYmKg%40mail.gmail.com

Best regards,
Hou zj


RE: row filtering for logical replication

2022-01-20 Thread houzj.f...@fujitsu.com
On Thursday, January 20, 2022 9:14 PM Alvaro Herrera  
wrote:
> 
> I was skimming this and the changes in CheckCmdReplicaIdentity caught my
> attention.  "Is this code running at the publisher side or the subscriber 
> side?" I
> wondered -- because the new error messages being added look intended to
> be thrown at the publisher side; but the existing error messages appear
> intended for the subscriber side.  Apparently there is one caller at the
> publisher side (CheckValidResultRel) and three callers at the subscriber side.
> I'm not fully convinced that this is a problem, but I think it's not great to 
> have it
> that way.  Maybe it's okay with the current coding, but after this patch adds
> this new errors it is definitely weird.  Maybe it should split in two 
> routines, and
> document more explicitly which is one is for which side.

I think the existing CheckCmdReplicaIdentity is intended to run at the
publisher side. Although CheckCmdReplicaIdentity is invoked in
ExecSimpleRelationInsert, which is at the subscriber side, I think that's
because the subscriber could be a publisher as well, in which case it
needs to check the RI.

So, the new error messages in the patch are consistent with the existing
error messages (all for the publisher side).

Best regards,
Hou zj


Re: row filtering for logical replication

2022-01-20 Thread Amit Kapila
On Thu, Jan 20, 2022 at 7:56 PM Alvaro Herrera  wrote:
>
> > >  Maybe this was meant to be "validate RF
> > > expressions" and return, perhaps, a bitmapset of all invalid columns
> > > referenced?
> >
> > Currently, we stop as soon as we find the first invalid column.
>
> That seems quite strange.  (And above you say "gather as much info as
> possible", so why stop at the first one?)
>

Because that is an error case, there doesn't seem to be any benefit in
proceeding further. However, we can build all the required information
by processing all publications (aka gathering all the information) and
then give an error later, if that idea appeals to you more.

> > >  (What is an invalid column in the first place?)
> >
> > A column that is referenced in the row filter but is not part of
> > Replica Identity.
>
> I do wonder how do these invalid columns reach the table definition in
> the first place.  Shouldn't these be detected at DDL time and prohibited
> from getting into the definition?
>

As mentioned by Peter E [1], there are two ways to deal with this: (a)
The current approach is that the user can set the replica identity
freely, and we decide later based on that what we can replicate (e.g.,
no updates). If we follow the same approach for this patch, we don't
restrict what columns are part of the row filter, but we check what
actions we can replicate based on the row filter. This is what is
currently followed in the patch. (b) Add restrictions during DDL which
is not as straightforward as it looks.

For approach (b), we need to restrict quite a few DDLs like DROP
INDEX/DROP PRIMARY/ALTER REPLICA IDENTITY/ATTACH PARTITION/CREATE
TABLE PARTITION OF/ALTER PUBLICATION SET(publish='update')/ALTER
PUBLICATION SET(publish_via_root), etc.

We need to deal with partitioned table cases because newly added
partitions automatically become part of a publication if any of their
ancestor tables is part of that publication. Now consider the case
where the user needs to use CREATE TABLE ... PARTITION OF. The problem is
that the user cannot specify the replica identity using an index when
creating the table, so we can't validate it, and that will lead to errors
during replication if the parent table is published with a row filter.
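
As a concrete sketch of that sequence (illustrative names, assuming the
row-filter syntax from this patch set):

```sql
-- The parent is published with a row filter referencing column b.
CREATE TABLE parent (a int PRIMARY KEY, b int NOT NULL)
    PARTITION BY RANGE (a);
CREATE PUBLICATION pub FOR TABLE parent WHERE (b > 10);

-- The new partition automatically becomes part of the publication, but
-- CREATE TABLE ... PARTITION OF offers no way to pick an index-based
-- replica identity here, so a DDL-time check has nothing to validate yet.
CREATE TABLE child PARTITION OF parent FOR VALUES FROM (0) TO (100);
```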

[1] - 
https://www.postgresql.org/message-id/2d6c8b74-bdef-767b-bdb6-29705985ed9c%40enterprisedb.com

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-20 Thread Greg Nancarrow
On Fri, Jan 21, 2022 at 12:13 AM Alvaro Herrera 
wrote:
>
> And while wondering about that, I stumbled upon
> GetRelationPublicationActions(), which has a very weird API that it
> always returns a palloc'ed block -- but without saying so.  And
> therefore, its only caller leaks that memory.  Maybe not critical, but
> it looks ugly.  I mean, if we're always going to do a memcpy, why not
> use a caller-supplied stack-allocated memory?  Sounds like it'd be
> simpler.
>

+1
This issue exists on HEAD (i.e. was not introduced by the row filtering
patch) and was already discussed on another thread ([1]) on which I posted
a patch to correct the issue along the same lines that you're suggesting.

[1]
https://postgr.es/m/CAJcOf-d0%3DvQx1Pzbf%2BLVarywejJFS5W%2BM6uR%2B2d0oeEJ2VQ%2BEw%40mail.gmail.com


Regards,
Greg Nancarrow
Fujitsu Australia


Re: row filtering for logical replication

2022-01-20 Thread Alvaro Herrera
On 2022-Jan-20, Amit Kapila wrote:

> It returns an invalid column referenced in an RF if any but if not
> then it helps to form pubactions which is anyway required at a later
> point in the caller. The idea is that when we are already traversing
> publications we should store/gather as much info as possible.

I think this design isn't quite awesome.

> I think probably the API name is misleading, maybe we should name it
> something like ValidateAndFetchPubInfo, ValidateAndRememberPubInfo, or
> something along these lines?

Maybe RelationBuildReplicationPublicationDesc or just
RelationBuildPublicationDesc are good names for a routine that fill in
the publication aspect of the relcache entry, as a parallel to
RelationBuildPartitionDesc.

> >  Maybe this was meant to be "validate RF
> > expressions" and return, perhaps, a bitmapset of all invalid columns
> > referenced?
> 
> Currently, we stop as soon as we find the first invalid column.

That seems quite strange.  (And above you say "gather as much info as
possible", so why stop at the first one?)

> >  (What is an invalid column in the first place?)
> 
> A column that is referenced in the row filter but is not part of
> Replica Identity.

I do wonder how these invalid columns reach the table definition in
the first place.  Shouldn't they be detected at DDL time and prohibited
from getting into the definition?

... so if I do
  ADD TABLE foobar WHERE (col_not_in_replident = 42)
then I should get an error immediately, rather than be forced to
construct a relcache entry with "invalid" data in it.  Likewise if I
change the replica identity to one that causes one of these to be
invalid.  Isn't this the same approach we discussed for column
filtering?

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
Voy a acabar con todos los humanos / con los humanos yo acabaré
voy a acabar con todos (bis) / con todos los humanos acabaré ¡acabaré! (Bender)




Re: row filtering for logical replication

2022-01-20 Thread Amit Kapila
On Thu, Jan 20, 2022 at 6:43 PM Alvaro Herrera  wrote:
>
> And the actual reason I was looking at this code, is that I had stumbled
> upon the new GetRelationPublicationInfo() function, which has an even
> weirder API:
>
> >  * Get the publication information for the given relation.
> >  *
> >  * Traverse all the publications which the relation is in to get the
> >  * publication actions and validate the row filter expressions for such
> >  * publications if any. We consider the row filter expression as invalid if 
> > it
> >  * references any column which is not part of REPLICA IDENTITY.
> >  *
> >  * To avoid fetching the publication information, we cache the publication
> >  * actions and row filter validation information.
> >  *
> >  * Returns the column number of an invalid column referenced in a row filter
> >  * expression if any, InvalidAttrNumber otherwise.
> >  */
> > AttrNumber
> > GetRelationPublicationInfo(Relation relation, bool validate_rowfilter)
>
> "Returns *an* invalid column referenced in a RF if any"?  That sounds
> very strange.  And exactly what info is it getting, given that there is
> no actual returned info?
>

It returns an invalid column referenced in an RF, if any; otherwise, it
helps to form the pubactions, which are anyway required at a later
point in the caller. The idea is that when we are already traversing the
publications we should store/gather as much info as possible. I think
probably the API name is misleading; maybe we should name it something
like ValidateAndFetchPubInfo, ValidateAndRememberPubInfo, or something
along those lines?

>  Maybe this was meant to be "validate RF
> expressions" and return, perhaps, a bitmapset of all invalid columns
> referenced?
>

Currently, we stop as soon as we find the first invalid column.

>  (What is an invalid column in the first place?)
>

A column that is referenced in the row filter but is not part of
Replica Identity.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2022-01-20 Thread Alvaro Herrera
I was skimming this and the changes in CheckCmdReplicaIdentity caught my
attention.  "Is this code running at the publisher side or the subscriber
side?" I wondered -- because the new error messages being added look
intended to be thrown at the publisher side, but the existing error
messages appear intended for the subscriber side.  Apparently there is
one caller at the publisher side (CheckValidResultRel) and three callers
at the subscriber side.  I'm not fully convinced that this is a problem,
but I think it's not great to have it that way.  Maybe it's okay with
the current coding, but after this patch adds these new errors it is
definitely weird.  Maybe the function should be split in two routines,
documenting more explicitly which one is for which side.

And while wondering about that, I stumbled upon
GetRelationPublicationActions(), which has a very weird API in that it
always returns a palloc'ed block -- but without saying so.  And
therefore, its only caller leaks that memory.  Maybe not critical, but
it looks ugly.  I mean, if we're always going to do a memcpy, why not
use caller-supplied stack-allocated memory?  That sounds like it'd be
simpler.

And the actual reason I was looking at this code, is that I had stumbled
upon the new GetRelationPublicationInfo() function, which has an even
weirder API:

>  * Get the publication information for the given relation.
>  *
>  * Traverse all the publications which the relation is in to get the
>  * publication actions and validate the row filter expressions for such
>  * publications if any. We consider the row filter expression as invalid if it
>  * references any column which is not part of REPLICA IDENTITY.
>  *
>  * To avoid fetching the publication information, we cache the publication
>  * actions and row filter validation information.
>  *
>  * Returns the column number of an invalid column referenced in a row filter
>  * expression if any, InvalidAttrNumber otherwise.
>  */
> AttrNumber
> GetRelationPublicationInfo(Relation relation, bool validate_rowfilter)

"Returns *an* invalid column referenced in a RF if any"?  That sounds
very strange.  And exactly what info is it getting, given that there is
no actual returned info?  Maybe this was meant to be "validate RF
expressions" and return, perhaps, a bitmapset of all invalid columns
referenced?  (What is an invalid column in the first place?)


In many function comments you see things like "Check, if foo is bar" or
"Returns true, if blah".  Those commas need to be removed.

Thanks

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)




Re: row filtering for logical replication

2022-01-20 Thread Amit Kapila
On Thu, Jan 20, 2022 at 5:03 PM tanghy.f...@fujitsu.com
 wrote:
>
> On Thu, Jan 20, 2022 9:13 AM houzj.f...@fujitsu.com  
> wrote:
> > Attach the V68 patch set which addressed the above comments and changes.
> > The version patch also fix the error message mentioned by Greg[1]
> >
>
> I saw a problem about this patch, which is related to Replica Identity check.
>
> For example:
> -- publisher --
> create table tbl (a int);
> create publication pub for table tbl where (a>10) with (publish='delete');
> insert into tbl values (1);
> update tbl set a=a+1;
>
> postgres=# update tbl set a=a+1;
> ERROR:  cannot update table "tbl"
> DETAIL:  Column "a" used in the publication WHERE expression is not part of 
> the replica identity.
>
> I think it shouldn't report the error because the publication didn't publish 
> UPDATES.
>

Right, I also don't see any reason why an error should be thrown in
this case. The problem here is that the patch doesn't maintain any
correspondence between the pubactions and the RI column validation for
a particular publication. I think we need to do that and cache that
information, unless the publication publishes both updates and deletes,
in which case it is okay to directly return the invalid column in the
row filter as we are doing now.
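
A rough sketch of that per-publication correspondence, in Python for
brevity; all names are hypothetical and this shows only the shape of the
check, not the patch's actual API:

```python
# Hedged sketch: a row filter referencing a column outside the replica
# identity should only block UPDATE/DELETE, and only when the publication
# actually publishes that action.  Publications are modeled as dicts.

def check_command(cmd, publications, replident_cols):
    for pub in publications:
        bad = pub["rowfilter_cols"] - replident_cols
        if bad and cmd in ("update", "delete") and cmd in pub["publish"]:
            raise ValueError(
                "cannot %s: row-filter column(s) %s are not part of the "
                "replica identity" % (cmd, sorted(bad)))

pubs = [{"publish": {"delete"}, "rowfilter_cols": {"a"}}]
check_command("update", pubs, replident_cols=set())  # OK: updates not published
try:
    check_command("delete", pubs, replident_cols=set())
except ValueError as err:
    print(err)  # rejected: "a" is filtered on but not in the replica identity
```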

-- 
With Regards,
Amit Kapila.




RE: row filtering for logical replication

2022-01-20 Thread tanghy.f...@fujitsu.com
On Thu, Jan 20, 2022 9:13 AM houzj.f...@fujitsu.com  
wrote:
> Attach the V68 patch set which addressed the above comments and changes.
> The version patch also fix the error message mentioned by Greg[1]
> 

I saw a problem with this patch, related to the replica identity check.

For example:
-- publisher --
create table tbl (a int);
create publication pub for table tbl where (a>10) with (publish='delete');
insert into tbl values (1);
update tbl set a=a+1;

postgres=# update tbl set a=a+1;
ERROR:  cannot update table "tbl"
DETAIL:  Column "a" used in the publication WHERE expression is not part of the 
replica identity.

I think it shouldn't report the error because the publication didn't publish 
UPDATES.
Thoughts?

Regards,
Tang


Re: row filtering for logical replication

2022-01-20 Thread Amit Kapila
On Thu, Jan 20, 2022 at 6:42 AM houzj.f...@fujitsu.com
 wrote:
>
> Attach the V68 patch set which addressed the above comments and changes.
>

Few comments and suggestions:
==
1.
/*
+ * For updates, if both the new tuple and old tuple are not null, then both
+ * of them need to be checked against the row filter.
+ */
+ tmp_new_slot = new_slot;
+ slot_getallattrs(new_slot);
+ slot_getallattrs(old_slot);
+

Isn't it better to add assert like
Assert(map_changetype_pubaction[*action] == PUBACTION_UPDATE); before
the above code? I have tried to change this part of the code in the
attached top-up patch.

2.
+ /*
+ * For updates, if both the new tuple and old tuple are not null, then both
+ * of them need to be checked against the row filter.
+ */
+ tmp_new_slot = new_slot;
+ slot_getallattrs(new_slot);
+ slot_getallattrs(old_slot);
+
+ /*
+ * The new tuple might not have all the replica identity columns, in which
+ * case it needs to be copied over from the old tuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /*
+ * if the column in the new tuple or old tuple is null, nothing to do
+ */
+ if (tmp_new_slot->tts_isnull[i] || old_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are only detoasted in the
+ * old tuple, copy this over to the new tuple.
+ */
+ if (att->attlen == -1 &&
+ VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i]) &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))
+ {
+ if (tmp_new_slot == new_slot)
+ {
+ tmp_new_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+ ExecClearTuple(tmp_new_slot);
+ ExecCopySlot(tmp_new_slot, new_slot);
+ }
+
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ tmp_new_slot->tts_isnull[i] = old_slot->tts_isnull[i];
+ }
+ }


What is the need to assign new_slot to tmp_new_slot at the beginning
of this part of the code? Can't we do this when we find an attribute
that needs to be copied from the old tuple?

The other part which is not clear to me from this code and the
comments is how we ensure that we cover all cases where the new
tuple doesn't have values.

Attached top-up patch with some minor changes. Kindly review.

-- 
With Regards,
Amit Kapila.


v68_changes_amit_1.patch
Description: Binary data


Re: row filtering for logical replication

2022-01-19 Thread Peter Smith
On Thu, Jan 20, 2022 at 2:29 PM Amit Kapila  wrote:
>
> On Thu, Jan 20, 2022 at 7:51 AM Peter Smith  wrote:
> >
> > Here are some review comments for v68-0001.
> >
> >
> > 3. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter_init
> >
> > + /* If no filter found, clean up the memory and return */
> > + if (!has_filter)
> > + {
> > + if (entry->cache_expr_cxt != NULL)
> > + MemoryContextDelete(entry->cache_expr_cxt);
> > +
> > + entry->exprstate_valid = true;
> > + return;
> > + }
> >
> > IMO this should be refactored to have if/else, so the function has
> > just a single point of return and a single point where the
> > exprstate_valid is set. e.g.
> >
> > if (!has_filter)
> > {
> > /* If no filter found, clean up the memory and return */
> > ...
> > }
> > else
> > {
> > /* Create or reset the memory context for row filters */
> > ...
> > /*
> > * Now all the filters for all pubactions are known. Combine them when
> > * their pubactions are same.
> >  ...
> > }
> >
> > entry->exprstate_valid = true;
> >
>
This part of the code changed in v68, which makes the current code
more suitable, as we don't need to deal with the memory context in the
if part. I am not sure it is good to add an else block here, but I
think that is just a matter of personal preference.
>

Sorry, my mistake - I quoted the v67 source instead of the v68 source.

There is no else needed now at all. My suggestion becomes, e.g.:

if (has_filter)
{
// deal with mem ctx ...
// combine filters ...
}
entry->exprstate_valid = true;
return;

--
Kind Regards,
Peter Smith
Fujitsu Australia



