On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > > > > > I suggest taking a couple of steps back from the minutiae of the > > patch, and spending some hard effort thinking about how the thing > > would be controlled in a useful fashion (that is, a real design for > > the filtering that was mentioned at the very outset), and about the > > security issues, and about how we could get to a committable patch. > > > > Agreed. I'll try to summarize the discussion we have till now on this > and share my thoughts on the same in a separate email. >
The idea to control what could be replicated is to introduce a new publication option 'ddl' along with current options 'publish' and 'publish_via_partition_root'. The values of this new option could be 'table', 'function', 'all', etc. Here 'all' enables the replication of all supported DDL commands. Example usage for this would be: Example: Create a new publication with all ddl replication enabled: CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); Enable table ddl replication for an existing Publication: ALTER PUBLICATION pub2 SET (ddl = 'table'); This is what seems to have been discussed but I think we can even extend it to support based on operations/commands, say one would like to publish only 'create' and 'drop' of tables. Then we can extend the existing publish option to have values like 'create', 'alter', and 'drop'. Another thing we are considering related to this is at what level these additional options should be specified. We have three variants FOR TABLE, FOR ALL TABLES, and FOR TABLES IN SCHEMA that enables replication. Now, for the sake of simplicity, this new option is discussed to be provided only with FOR ALL TABLES variant but I think we can provide it with other variants with some additional restrictions like with FOR TABLE, we can only specify 'alter' and 'drop' for publish option. Now, though possible, it brings additional complexity to support it with variants other than FOR ALL TABLES because then we need to ensure additional filtering and possible modification of the content we have to send to downstream. So, we can even decide to first support it only FOR ALL TABLES variant. The other point to consider for publish option 'ddl = table' is whether we need to allow replicating dependent objects like say some user-defined type is used in the table. I guess the difficulty here would be to identify which dependents we want to allow. I think in the first version we should allow to replicate only some of the objects instead of everything. For example, can we consider only allowing tables and indexes in the first version? Then extend it in a phased manner? AFAICR, we have discussed two things related to security. (a) ownership of objects created via DDL replication. We have discussed providing an option at subscription level to allow objects to have the same ownership (as it has on the publisher) after apply to the subscriber. If that option is not enabled the objects will be owned by the subscription owner. (b) Allow use of functions replicated to be used even if they don't use schema qualify objects. Currently, we override the search_path in apply worker to an empty string to ensure that apply worker doesn't execute arbitrary expressions as it works with the privileges of the subscription owner which would be a superuser. We have discussed providing a search_path as an option at the subscription level or a GUC to allow apply workers to use a specified search_path. Do you have anything else in mind? -- With Regards, Amit Kapila.