Hi Shlok. Here are my review comments for patch v10-0003

======
General.

1.
The patch has lots of conditions like:
if (att->attgenerated && (att->attgenerated !=
ATTRIBUTE_GENERATED_STORED || !include_generated_columns))
 continue;

IMO these are hard to read. Although more verbose, please consider if
all those (for the sake of readability) would be better re-written
like below :

if (att->generated)
{
  if (!include_generated_columns)
    continue;

  if (att->attgenerated != ATTRIBUTE_GENERATED_STORED)
    continue;
}

======
contrib/test_decoding/test_decoding.c

tuple_to_stringinfo:

NITPICK = refactored the code and comments a bit here to make it easier
NITPICK - there is no need to mention "virtual". Instead, say we only
support STORED

======
src/backend/catalog/pg_publication.c

publication_translate_columns:

NITPICK - introduced variable 'att' to simplify this code

~

2.
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot use virtual generated column \"%s\" in publication
column list",
+    colname));

Is it better to avoid referring to "virtual" at all? Instead, consider
rearranging the wording to say something like "generated column \"%s\"
is not STORED so cannot be used in a publication column list"

~~~

pg_get_publication_tables:

NITPICK - split the condition code for readability

======
src/backend/replication/logical/relation.c

3. logicalrep_rel_open

+ if (attr->attgenerated && attr->attgenerated != ATTRIBUTE_GENERATED_STORED)
+ continue;
+

Isn't this missing some code to say "entry->attrmap->attnums[i] =
-1;", same as all the other nearby code is doing?

~~~

4.
I felt all the "generated column" logic should be kept together, so
this new condition should be combined with the other generated column
condition, like:

if (attr->attgenerated)
{
  /* comment... */
  if (!MySubscription->includegencols)
  {
    entry->attrmap->attnums[i] = -1;
    continue;
  }

  /* comment... */
  if (attr->attgenerated != ATTRIBUTE_GENERATED_STORED)
  {
    entry->attrmap->attnums[i] = -1;
    continue;
  }
}

======
src/backend/replication/logical/tablesync.c

5.
+ if (gencols_allowed)
+ {
+ /* Replication of generated cols is supported, but not VIRTUAL cols. */
+ appendStringInfo(&cmd, " AND a.attgenerated != 'v'");
+ }

Is it better here to use the ATTRIBUTE_GENERATED_VIRTUAL macro instead
of the hardwired 'v'? (Maybe add another TODO comment to revisit
this).

Alternatively, consider if it is more future-proof to rearrange so it
just says what *is* supported instead of what *isn't* supported:
e.g. "AND a.attgenerated IN ('', 's')"

======
src/test/subscription/t/011_generated.pl

NITPICK - some comments are missing the word "stored"
NITPICK - sometimes "col" should be plural "cols"
NITPICK = typo "GNERATED"

======

6.
In a previous review [1, comment #3] I mentioned that there should be
some docs updates on the "Logical Replication Message Formats" section
53.9. So, I expected patch 0001 would make some changes and then patch
0003 would have to update it again to say something about "STORED".
But all that is missing from the v10* patches.

======

99.
See also my nitpicks diff which is a top-up patch addressing all the
nitpick comments mentioned above. Please apply all of these that you
agree with.

======
[1] 
https://www.postgresql.org/message-id/CAHut%2BPvQ8CLq-JysTTeRj4u5SC9vTVcx3AgwTHcPUEOh-UnKcQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

On Mon, Jun 24, 2024 at 10:56 PM Shlok Kyal <shlok.kyal....@gmail.com> wrote:
>
> On Fri, 21 Jun 2024 at 09:03, Peter Smith <smithpb2...@gmail.com> wrote:
> >
> > Hi, here are some review comments for patch v9-0002.
> >
> > ======
> > src/backend/replication/logical/relation.c
> >
> > 1. logicalrep_rel_open
> >
> > - if (attr->attisdropped)
> > + if (attr->attisdropped ||
> > + (!MySubscription->includegencols && attr->attgenerated))
> >
> > You replied to my question from the previous review [1, #2] as follows:
> > In case 'include_generated_columns' is 'true'. column list in
> > remoterel will have an entry for generated columns. So, in this case
> > if we skip every attr->attgenerated, we will get a missing column
> > error.
> >
> > ~
> >
> > TBH, the reason seems very subtle to me. Perhaps that
> > "(!MySubscription->includegencols && attr->attgenerated))" condition
> > should be coded as a separate "if", so then you can include a comment
> > similar to your answer, to explain it.
> Fixed
>
> > ======
> > src/backend/replication/logical/tablesync.c
> >
> > make_copy_attnamelist:
> >
> > NITPICK - punctuation in function comment
> > NITPICK - add/reword some more comments
> > NITPICK - rearrange comments to be closer to the code they are commenting
> Applied the changes
>
> > ~
> >
> > 2. make_copy_attnamelist.
> >
> > + /*
> > + * Construct column list for COPY.
> > + */
> > + for (int i = 0; i < rel->remoterel.natts; i++)
> > + {
> > + if(!gencollist[i])
> > + attnamelist = lappend(attnamelist,
> > +   makeString(rel->remoterel.attnames[i]));
> > + }
> >
> > IIUC isn't this assuming that the attribute number (aka column order)
> > is the same on the subscriber side (e.g. gencollist idx) and on the
> > publisher side (e.g. remoterel.attnames[i]).  AFAIK logical
> > replication does not require this ordering must be match like that,
> > therefore I am suspicious this new logic is accidentally imposing new
> > unwanted assumptions/restrictions. I had asked the same question
> > before [1-#4] about this code, but there was no reply.
> >
> > Ideally, there would be more test cases for when the columns
> > (including the generated ones) are all in different orders on the
> > pub/sub tables.
> 'gencollist' is set according to the remoterel
> +           gencollist[attnum] = true;
> where attnum is the attribute number of the corresponding column on remote 
> rel.
>
> I have also added the tests to confirm the behaviour
>
> > ~~~
> >
> > 3. General - varnames.
> >
> > It would help with understanding if the 'attgenlist' variables in all
> > these functions are re-named to make it very clear that this is
> > referring to the *remote* (publisher-side) table genlist, not the
> > subscriber table genlist.
> Fixed
>
> > ~~~
> >
> > 4.
> > + int i = 0;
> > +
> >   appendStringInfoString(&cmd, "COPY (SELECT ");
> > - for (int i = 0; i < lrel.natts; i++)
> > + foreach_ptr(ListCell, l, attnamelist)
> >   {
> > - appendStringInfoString(&cmd, quote_identifier(lrel.attnames[i]));
> > - if (i < lrel.natts - 1)
> > + appendStringInfoString(&cmd, quote_identifier(strVal(l)));
> > + if (i < attnamelist->length - 1)
> >   appendStringInfoString(&cmd, ", ");
> > + i++;
> >   }
> >
> > 4a.
> > I think the purpose of the new macros is to avoid using ListCell, and
> > also 'l' is an unhelpful variable name. Shouldn't this code be more
> > like:
> > foreach_node(String, att_name, attnamelist)
> >
> > ~
> >
> > 4b.
> > The code can be far simpler if you just put the comma (", ") always
> > up-front except the *first* iteration, so you can avoid checking the
> > list length every time. For example:
> >
> > if (i++)
> >   appendStringInfoString(&cmd, ", ");
> Fixed
>
> > ======
> > src/test/subscription/t/011_generated.pl
> >
> > 5. General.
> >
> > Hmm. This patch 0002 included many formatting changes to tables tab2
> > and tab3 and subscriptions sub2 and sub3 but they do not belong here.
> > The proper formatting for those needs to be done back in patch 0001
> > where they were introduced. Patch 0002 should just concentrate only on
> > the new stuff for patch 0002.
> Fixed
>
> > ~
> >
> > 6. CREATE TABLES would be better in pairs
> >
> > IMO it will be better if the matching CREATE TABLE for pub and sub are
> > kept together, instead of separating them by doing all pub then all
> > sub. I previously made the same comment for patch 0001, so maybe it
> > will be addressed next time...
> Fixed
>
> > ~
> >
> > 7. SELECT *
> >
> > +$result =
> > +  $node_subscriber->safe_psql('postgres', "SELECT * FROM tab4 ORDER BY a");
> >
> > It will be prudent to do explicit "SELECT a,b,c" instead of "SELECT
> > *", for readability and so there are no surprises.
> Fixed
>
> > ======
> >
> > 99.
> > Please also refer to my attached nitpicks diff for numerous cosmetic
> > changes, and apply if you agree.
> Applied the changes.
>
> > ======
> > [1] 
> > https://www.postgresql.org/message-id/CAHut%2BPtAsEc3PEB1KUk1kFF5tcCrDCCTcbboougO29vP1B4E2Q%40mail.gmail.com
>
> I have attached a v10 patch to address the comments:
> v10-0001 - Not Modified
> v10-0002 - Support replication of generated columns during initial sync.
> v10-0003 - Fix behaviour for Virtual Generated Columns.
>
> Thanks and Regards,
> Shlok Kyal
diff --git a/contrib/test_decoding/test_decoding.c 
b/contrib/test_decoding/test_decoding.c
index abce99a..db584c8 100644
--- a/contrib/test_decoding/test_decoding.c
+++ b/contrib/test_decoding/test_decoding.c
@@ -557,14 +557,19 @@ tuple_to_stringinfo(StringInfo s, TupleDesc tupdesc, 
HeapTuple tuple,
                if (attr->attisdropped)
                        continue;
 
-               /*
-                * Don't print virtual generated column. Don't print stored
-                * generated column if 'include_generated_columns' is false.
-                *
-                * TODO: can use ATTRIBUTE_GENERATED_VIRTUAL to simpilfy
-                */
-               if (attr->attgenerated && (attr->attgenerated != 
ATTRIBUTE_GENERATED_STORED || !include_generated_columns))
-                       continue;
+               if (attr->attgenerated)
+               {
+                       /*
+                        * Don't print generated columns when
+                        * 'include_generated_columns' is false.
+                        */
+                       if (!include_generated_columns)
+                               continue;
+
+                       /* Don't print generated columns unless they are 
STORED. */
+                       if (attr->attgenerated != ATTRIBUTE_GENERATED_STORED)
+                               continue;
+               }
 
                /*
                 * Don't print system columns, oid will already have been 
printed if
diff --git a/src/backend/catalog/pg_publication.c 
b/src/backend/catalog/pg_publication.c
index e5e5aef..935ba81 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -521,6 +521,7 @@ publication_translate_columns(Relation targetrel, List 
*columns,
        {
                char       *colname = strVal(lfirst(lc));
                AttrNumber      attnum = 
get_attnum(RelationGetRelid(targetrel), colname);
+               Form_pg_attribute att;
 
                if (attnum == InvalidAttrNumber)
                        ereport(ERROR,
@@ -534,11 +535,8 @@ publication_translate_columns(Relation targetrel, List 
*columns,
                                        errmsg("cannot use system column \"%s\" 
in publication column list",
                                                   colname));
 
-               /*
-                * TODO: simplify the expression
-                */
-               if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated &&
-                       TupleDescAttr(tupdesc, attnum - 1)->attgenerated != 
ATTRIBUTE_GENERATED_STORED)
+               att = TupleDescAttr(tupdesc, attnum - 1);
+               if (att->attgenerated && att->attgenerated != 
ATTRIBUTE_GENERATED_STORED)
                        ereport(ERROR,
                                        
errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
                                        errmsg("cannot use virtual generated 
column \"%s\" in publication column list",
@@ -1236,7 +1234,10 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
                        {
                                Form_pg_attribute att = TupleDescAttr(desc, i);
 
-                               if (att->attisdropped || (att->attgenerated && 
att->attgenerated != ATTRIBUTE_GENERATED_STORED))
+                               if (att->attisdropped)
+                                       continue;
+
+                               if (att->attgenerated && att->attgenerated != 
ATTRIBUTE_GENERATED_STORED)
                                        continue;
 
                                attnums[nattnums++] = att->attnum;
diff --git a/src/test/subscription/t/011_generated.pl 
b/src/test/subscription/t/011_generated.pl
index 361d9f3..edcb149 100644
--- a/src/test/subscription/t/011_generated.pl
+++ b/src/test/subscription/t/011_generated.pl
@@ -39,7 +39,7 @@ $node_subscriber->safe_psql('postgres',
        "CREATE TABLE tab2 (a int, b int)"
 );
 
-# publisher-side tab3 has stored generated col 'b' but subscriber-side tab2 
has DIFFERENT COMPUTATION generated col 'b'.
+# publisher-side tab3 has stored generated col 'b' but subscriber-side tab2 
has DIFFERENT COMPUTATION stored generated col 'b'.
 $node_publisher->safe_psql('postgres',
        "CREATE TABLE tab3 (a int, b int GENERATED ALWAYS AS (a + 10) STORED)"
 );
@@ -48,7 +48,7 @@ $node_subscriber->safe_psql('postgres',
        "CREATE TABLE tab3 (a int, b int GENERATED ALWAYS AS (a + 20) STORED)"
 );
 
-# tab4: publisher-side stored generated col 'b' and 'c' --> subscriber-side 
non-generated col 'b', and stored generated-col 'c'
+# tab4: publisher-side stored generated cols 'b' and 'c' --> subscriber-side 
non-generated col 'b', and stored generated col 'c'
 $node_publisher->safe_psql('postgres',
        "CREATE TABLE tab4 (a int , b int GENERATED ALWAYS AS (a * 2) STORED, c 
int GENERATED ALWAYS AS (a * 2) STORED)"
 );
@@ -63,7 +63,7 @@ $node_publisher->safe_psql('postgres', "CREATE TABLE tab5 (a 
int, b int)");
 $node_subscriber->safe_psql('postgres',
        "CREATE TABLE tab5 (a int, b int GENERATED ALWAYS AS (a * 22) STORED)");
 
-# tab6: publisher-side stored generated col 'b' and 'c' --> subscriber-side 
non-generated col 'b', and stored generated-col 'c'
+# tab6: publisher-side stored generated cols 'b' and 'c' --> subscriber-side 
non-generated col 'b', and stored generated-col 'c'
 # columns on subscriber in different order
 $node_publisher->safe_psql('postgres',
        "CREATE TABLE tab6 (a int, b int GENERATED ALWAYS AS (a * 2) STORED, c 
int GENERATED ALWAYS AS (a * 2) STORED)");
@@ -184,7 +184,7 @@ $node_publisher->safe_psql('postgres', "INSERT INTO tab6 
VALUES (4), (5)");
 
 $node_publisher->wait_for_catchup('sub6');
 
-# stored gen-col 'b' and 'c' in publisher replicating to NOT gen-col 'b' and 
gen-col 'c' on subscriber
+# stored gen-cols 'b' and 'c' in publisher replicating to NOT gen-col 'b' and 
stored gen-col 'c' on subscriber
 # order of column is different on subscriber
 $result =
   $node_subscriber->safe_psql('postgres', "SELECT a, b, c FROM tab6 ORDER BY 
a");

Reply via email to