On Fri, Nov 1, 2024 at 7:10 AM Peter Smith <smithpb2...@gmail.com> wrote: > > > ====== > doc/src/sgml/protocol.sgml > > 3. > <para> > - Next, one of the following submessages appears for each column: > + Next, one of the following submessages appears for each column > (except generated columns): > > Hmm. Now that generated column replication is supported is this change > still required? >
How about changing it to: "Next, one of the following submessages appears for each published column:"? This is because the column may not be sent because either it is not in the column list or a generated one (with publish_generated_columns as false for respective publication). > ====== > doc/src/sgml/ref/create_publication.sgml > > 4. > + > + <varlistentry > id="sql-createpublication-params-with-publish-generated-columns"> > + <term><literal>publish_generated_columns</literal> > (<type>boolean</type>)</term> > + <listitem> > + <para> > + Specifies whether the generated columns present in the tables > + associated with the publication should be replicated. > + The default is <literal>false</literal>. > + </para> > + </listitem> > + </varlistentry> > + > > I know that the subsequent DOCS patch V1-0002 will explain more about > this, but as a stand-alone patch 0001 maybe you need to clarify that a > publication column list will override this 'publish_generated_columns' > parameter. > It is better to leave it to 0002 patch. But note in that patch, we should add some reference link for the column_list behavior in the create publication page as well. > ====== > src/backend/catalog/pg_publication.c > > > pub_getallcol_bitmapset: > > 6. > +/* > + * Return a column list bitmap for the specified table. > + * > + * Generated columns are included if pubgencols is true. > + * > + * If mcxt isn't NULL, build the bitmapset in that context. > + */ > +Bitmapset * > +pub_getallcol_bitmapset(Relation relation, bool pubgencols, > + MemoryContext mcxt) > > IIUC this is a BMS of the table columns to be published. The function > comment seems confusing to me when it says "column list bitmap" > because IIUC this function is not really anything to do with a > publication "column list", which is an entirely different thing. > We can probably name it pub_form_cols_map() and change the comments accordingly. > ====== > src/backend/replication/logical/proto.c > > 7. > static void logicalrep_write_attrs(StringInfo out, Relation rel, > - Bitmapset *columns); > + Bitmapset *columns, bool pubgencols); > static void logicalrep_write_tuple(StringInfo out, Relation rel, > TupleTableSlot *slot, > - bool binary, Bitmapset *columns); > + bool binary, Bitmapset *columns, > + bool pubgencols); > > The meaning of all these new 'pubgencols' are ambiguous. e.g. Are they > (a) The value of the CREATE PUBLICATION 'publish_generate_columns' > parameter, or does it mean (b) Just some generated column is being > published (maybe via column list or maybe not). > > I think it means (a) but, if true, that could be made much more clear > by changing all of these names to 'pubgencols_option' or something > similar. Actually, now I have doubts about that also -- I think this > might be magically assigned to false if no generated columns exist in > the table. Anyway, please do whatever you can to disambiguate this. > To make it clear we can name this parameter as include_gencols. Similarly, change the name of RelationSyncEntry's new member. > ~~~ > > 9. > bool > logicalrep_should_publish_column(Form_pg_attribute att, Bitmapset *columns, > bool pubgencols) > { > if (att->attisdropped) > return false; > > /* > * Skip publishing generated columns if they are not included in the > * column list or if the option is not specified. > */ > if (!columns && !pubgencols && att->attgenerated) > return false; > > /* > * Check if a column is covered by a column list. > */ > if (columns && !bms_is_member(att->attnum, columns)) > return false; > > return true; > } > > Same as mentioned before in my previous v46-0001 review comments, I > feel that the conditionals of this function are over-complicated and > that there are more 'return' points than necessary. The alternative > code below looks simpler to me. > > SUGGESTION > bool > logicalrep_should_publish_column(Form_pg_attribute att, Bitmapset *columns, > bool pubgencols_option) > { > if (att->attisdropped) > return false; > > if (columns) > { > /* > * Has a column list: > * Publish only cols named in that list. > */ > return bms_is_member(att->attnum, columns); > } > else > { > /* > * Has no column list: > * Publish generated cols only if 'publish_generated_cols' is true. > * Publish all non-generated cols. > */ > return att->attgenerated ? pubgencols_option : true; > } > } > Fair enough but do we need else in the above code? > ====== > src/backend/replication/pgoutput/pgoutput.c > > 10. > + /* Include publishing generated columns */ > + bool pubgencols; > + > > There is similar ambiguity here with this field-name as was mentioned > about for other 'pbgencols' function params. I had initially thought > that this this just caries around same value as the publication option > 'publish_generated_columns' but now (after looking at function > check_and_init_gencol) I think that might not be the case because I > saw it can be assigned false (overriding the publication option?). > > Anyway, the comment needs to be made much clearer about what is the > true meaning of this field. Or, rename it if there is a better name. > As suggested above, we can name it as include_gencols. > > send_relation_and_attrs: > > 12. > - if (!logicalrep_should_publish_column(att, columns)) > + if (!logicalrep_should_publish_column(att, columns, relentry->pubgencols)) > continue; > It seemed a bit strange/inconsistent that 'columns' was assigned to a > local var, but 'pubgencols' was not, given they are both fields of the > same struct. Maybe removing this 'columns' var would be consistent > with other code in this patch. > I think the other way would be better. I mean take another local variable for this function. We don't need to always do the same in such cases. > ~~~ > > 13. > check_and_init_gencol: > > nit - missing periods for comments. > > ~~~ > > 14. > + /* There is no generated columns to be published * > > /There is no generated columns/There are no generated columns/ > > ~~~ > > 15. > + foreach(lc, publications) > + { > + Publication *pub = lfirst(lc); > > AFAIK this can be re-written using a different macro to avoid needing > the 'lc' var. > > ~~~ > > pgoutput_column_list_init: > > 16. > + bool collistpubexist = false; > > This seemed like not a very good name, How about 'found_pub_with_collist'; > > ~~~ > > 17. > bool pub_no_list = true; > > nit - Not caused by this patch, but it's closely related; In passing > we should declare this variable at a lower scope, and rename it to > 'isnull' which is more in keeping with the comments around it. > Moving to local scope is okay but doing more than that in this patch is not advisable even if your suggestion is a good idea which I am not sure. > ~ > > 20b. > There is a GENERAL problem that applies for lots of comments of this > patch (including this comment) because the new publication option is > referred to inconsistently in many different ways: > > e.g. > - the generated columns option. > - if the option is not specified > - publish_generated_columns option. > - the pubgencols option > - 'publish_generated_columns' option > > All these references should be made the same. My personal preference > is the last one ('publish_generated_columns' option). > I have responded with a better name for other places. Here, the proposed name seems okay to me. -- With Regards, Amit Kapila.