On Tue, Jun 21, 2022 at 5:49 PM houzj.f...@fujitsu.com
<houzj.f...@fujitsu.com> wrote:
>
> On Monday, June 20, 2022 11:32 AM houzj.f...@fujitsu.com
> <houzj.f...@fujitsu.com> wrote:
> >
> > On Saturday, June 18, 2022 3:38 AM Zheng Li <zhengl...@gmail.com> wrote:
> > > On Wed, Jun 15, 2022 at 12:00 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > >
> > > > On Wed, Jun 15, 2022 at 5:44 AM Zheng Li <zhengl...@gmail.com> wrote:
> > > > >
> > > > > While I agree that the deparser is needed to handle the potential
> > > > > syntax differences between the pub/sub, I think it's only relevant
> > > > > for the use cases where only a subset of tables in the database are
> > > > > replicated. For other use cases where all tables, functions and
> > > > > other objects need to be replicated (for example, creating a
> > > > > logical replica for a major version upgrade), there won't be any
> > > > > syntax difference to handle and the schemas are supposed to match
> > > > > exactly between the pub/sub. In other words, the user seeks to
> > > > > create an identical replica of the source database and the DDLs
> > > > > should be replicated as is in this case.
> > > > >
> > > > I think even for database-level replication we can't assume that
> > > > source and target will always have the same data, in which case
> > > > "Create Table As .." and "Alter Table .." kinds of statements can't
> > > > be replicated as is, because that can lead to different results.
> > >
> > > "Create Table As .." is already handled by setting the skipData flag
> > > of the statement parse tree before replay:
> > >
> > > /*
> > >  * Force skipping data population to avoid data inconsistency.
> > >  * Data should be replicated from the publisher instead.
> > >  */
> > > castmt->into->skipData = true;
> > >
> > > "Alter Table .." that rewrites with volatile expressions can also be
> > > handled without any syntax change, by enabling table rewrite
> > > replication and converting the rewrite inserts to updates. ZJ's patch
> > > introduced this solution. I've also adopted this approach in my latest
> > > patch 0012-Support-replication-of-ALTER-TABLE-commands-that-rew.patch
> > >
> > > > The other point is that tomorrow we can extend the database-level
> > > > option/syntax to exclude a few objects (something like [1]) as well,
> > > > in which case we again need to filter at the publisher level.
> > >
> > > I think for such cases it's not full database replication and we could
> > > treat it as table-level DDL replication, i.e. use the deparser format.
> >
> > Hi,
> >
> > Here are some points in my mind about the two approaches discussed here.
> >
> > 1) search_path vs schema qualify
> >
> > Again, I still think it will bring more flexibility and security to
> > schema-qualify the objects in the DDL command, as mentioned before[1].
> >
> > Besides, a schema-qualified DDL is also more appropriate for other use
> > cases (e.g. table-level replication), as the schema can be different
> > between pub/sub and it's easy to cause unexpected and undetectable
> > failures if we just log the search_path.
> >
> > It makes more sense to me to have the same style of WAL log (schema
> > qualified) for both database-level and table-level replication, as it
> > will bring more flexibility.
> >
> > > "Create Table As .." is already handled by setting the skipData flag
> > > of the statement parse tree before replay:
> >
> > 2) About the handling of CREATE TABLE AS:
> >
> > I think it's not an appropriate approach to set the skipData flag on the
> > subscriber, as it cannot handle the EXECUTE command in CTAS:
> >
> > CREATE TABLE q5_prep_results AS EXECUTE q5(200, 'DTAAAA');
> >
> > The prepared statement is a temporary object which we don't replicate,
> > so if you directly execute the original SQL on the subscriber it will
> > fail even if you set skipData.
> >
> > I think it's difficult to make this work, as you need to handle the
> > create/drop of this prepared statement. And even if we extended the
> > subscriber's code to make it work, it doesn't seem like a standard and
> > elegant approach.
> >
> > > "Alter Table .." that rewrites with volatile expressions can also be
> > > handled without any syntax change, by enabling table rewrite
> > > replication and converting the rewrite inserts to updates. ZJ's patch
> > > introduced this solution.
> >
> > 3) About the handling of ALTER TABLE rewrites:
> >
> > The approach I proposed before is based on the event trigger + deparser
> > approach. We were able to improve that approach because we don't need to
> > replicate the rewrite in many cases. For example, we don't need to
> > replicate the rewrite DML if there is no volatile/mutable function. We
> > should check and filter these cases at the publisher (e.g. via the
> > deparser) instead of checking that at the subscriber.
> >
> > Besides, as discussed, we need to give a warning or error for the cases
> > where the DDL contains a volatile function which would be executed[2].
> > We should check this at the publisher as well (via the deparser).
> >
> > > I think for such cases it's not full database replication and we could
> > > treat it as table-level DDL replication, i.e. use the deparser format.
> >
> > 4) I think the point could be that we should make the WAL log format
> > extendable so that we can extend it to support more useful features
> > (table filters/schema maps/DDL filters). If we just WAL-log the original
> > SQL, it seems difficult to extend it in the future.
>
> Attached is the new version of the patch set, which adds support for the
> CREATE/DROP/ALTER SEQUENCE and CREATE/DROP SCHEMA DDL commands, provided
> by Ajin Cherian off list.
>
> The new version of the patch will also check a function's volatility[1]
> in ALTER TABLE commands. If any function to be executed is volatile, we
> report an ERROR. Whether a WARNING would be better here is still under
> consideration.
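
To make sure I understand the proposed publisher-side volatility check,
here is a rough sketch of the kind of test I assume it implies. This is
only an illustration: contain_volatile_functions() is an existing planner
helper, but the wrapper function and the list of rewrite expressions it
walks are hypothetical placeholders for whatever the deparser actually
collects.

/*
 * Illustrative sketch only: decide on the publisher whether an ALTER TABLE
 * rewrite must be replicated row by row.  If no rewritten expression
 * contains a volatile function, replaying the deparsed DDL on the
 * subscriber produces the same rows, so the rewrite DML can be skipped.
 */
#include "postgres.h"
#include "nodes/pg_list.h"
#include "optimizer/optimizer.h"    /* contain_volatile_functions() */

static bool
rewrite_needs_row_replication(List *rewrite_exprs)  /* hypothetical input */
{
    ListCell   *lc;

    foreach(lc, rewrite_exprs)
    {
        Node       *expr = (Node *) lfirst(lc);

        /* volatile expressions may evaluate differently on each node */
        if (contain_volatile_functions(expr))
            return true;
    }

    /* deterministic rewrite: replaying the deparsed command is enough */
    return false;
}

If something along these lines is what the patch does, the remaining
question is indeed whether hitting a volatile expression should be an
ERROR or only a WARNING.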
Few comments:
1) I found a null pointer access while trying to use the ddl feature:

#0  0x000055e9deb2904b in EventTriggerTableInitWrite (real_create=0x55e9e0eb0288, address=...)
    at event_trigger.c:989
#1  0x000055e9deb15745 in create_ctas_internal (attrList=0x55e9e0eb0140, into=0x55e9e0e86710)
    at createas.c:154
#2  0x000055e9deb16181 in intorel_startup (self=0x55e9e0e86d00, operation=1, typeinfo=0x55e9e0ea99d0)
    at createas.c:526
#3  0x000055e9debdcfdc in standard_ExecutorRun (queryDesc=0x55e9e0e8f240,
    direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:352
#4  0x000055e9debdcf0b in ExecutorRun (queryDesc=0x55e9e0e8f240,
    direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:307
#5  0x000055e9deb15cd2 in ExecCreateTableAs (pstate=0x55e9e0e86830, stmt=0x55e9e0e717e8,
    params=0x0, queryEnv=0x0, qc=0x7fff45517190) at createas.c:346
#6  0x000055e9dee3202a in ProcessUtilitySlow (pstate=0x55e9e0e86830, pstmt=0x55e9e0e70b18,
    queryString=0x55e9e0e483e0 "CREATE TEMP TABLE pg_temp_5.pg_temp_24961_2 AS SELECT mv.ctid AS tid, newdata.*::pg_temp_5.pg_temp_24961 AS newdata FROM public.sro_index_mv mv FULL JOIN pg_temp_5.pg_temp_24961 newdata ON (newdata.c "...,
    context=PROCESS_UTILITY_QUERY, params=0x0, queryEnv=0x0,
    dest=0x55e9df2e7640 <spi_printtupDR>, qc=0x7fff45517190) at utility.c:1669
#7  0x000055e9dee30b5d in standard_ProcessUtility (pstmt=0x55e9e0e70b18,
    queryString=0x55e9e0e483e0 "CREATE TEMP TABLE pg_temp_5.pg_temp_24961_2 AS SELECT mv.ctid AS tid, newdata.*::pg_temp_5.pg_temp_24961 AS newdata FROM public.sro_index_mv mv FULL JOIN pg_temp_5.pg_temp_24961 newdata ON (newdata.c "...,
    readOnlyTree=true, context=PROCESS_UTILITY_QUERY, params=0x0, queryEnv=0x0,
    dest=0x55e9df2e7640 <spi_printtupDR>, qc=0x7fff45517190) at utility.c:1074
#8  0x000055e9dee2fac7 in ProcessUtility (pstmt=0x55e9e0e19538,
    queryString=0x55e9e0e483e0 "CREATE TEMP TABLE pg_temp_5.pg_temp_24961_2 AS SELECT mv.ctid AS tid, newdata.*::pg_temp_5.pg_temp_24961 AS newdata FROM public.sro_index_mv mv FULL JOIN pg_temp_5.pg_temp_24961 newdata ON (newdata.c "...,
    readOnlyTree=true, context=PROCESS_UTILITY_QUERY, params=0x0, queryEnv=0x0,
    dest=0x55e9df2e7640 <spi_printtupDR>, qc=0x7fff45517190) at utility.c:530
#9  0x000055e9dec3c5f1 in _SPI_execute_plan (plan=0x7fff45517230, options=0x7fff45517200,
    snapshot=0x0, crosscheck_snapshot=0x0, fire_triggers=true) at spi.c:2693
#10 0x000055e9dec38ea8 in SPI_execute (
    src=0x55e9e0e483e0 "CREATE TEMP TABLE pg_temp_5.pg_temp_24961_2 AS SELECT mv.ctid AS tid, newdata.*::pg_temp_5.pg_temp_24961 AS newdata FROM public.sro_index_mv mv FULL JOIN pg_temp_5.pg_temp_24961 newdata ON (newdata.c "...,
    read_only=false, tcount=0) at spi.c:618
#11 0x000055e9dec38efd in SPI_exec (
    src=0x55e9e0e483e0 "CREATE TEMP TABLE pg_temp_5.pg_temp_24961_2 AS SELECT mv.ctid AS tid, newdata.*::pg_temp_5.pg_temp_24961 AS newdata FROM public.sro_index_mv mv FULL JOIN pg_temp_5.pg_temp_24961 newdata ON (newdata.c "...,
    tcount=0) at spi.c:630
#12 0x000055e9deb5360a in refresh_by_match_merge (matviewOid=24961, tempOid=24966,
    relowner=24909, save_sec_context=0) at matview.c:795
#13 0x000055e9deb528ca in ExecRefreshMatView (stmt=0x55e9e0d3e670,
    queryString=0x55e9e0d3dc18 "REFRESH MATERIALIZED VIEW CONCURRENTLY sro_index_mv;",
    params=0x0, qc=0x7fff45517d40) at matview.c:317

currentCommand seems to be null here:

+    /*
+     * Also do nothing if our state isn't set up, which it won't be if there
+     * weren't any relevant event triggers at the start of the current DDL
+     * command.  This test might therefore seem optional, but it's
+     * *necessary*, because EventTriggerCommonSetup might find triggers that
+     * didn't exist at the time the command started.
+     */
+    if (!currentEventTriggerState)
+        return;
+
+    command = currentEventTriggerState->currentCommand;
+
+    runlist = EventTriggerCommonSetup(command->parsetree,
+                                      EVT_TableInitWrite,
+                                      "table_init_write",
+                                      &trigdata);
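
Perhaps a NULL check on currentCommand would avoid the crash. As an
untested sketch of what I mean (only the if/return block is new relative
to the quoted hunk, and whether simply returning is the right behaviour
when the command was never collected needs verification):

    command = currentEventTriggerState->currentCommand;

    /*
     * currentCommand can apparently be NULL here (see the REFRESH
     * MATERIALIZED VIEW CONCURRENTLY backtrace above), so bail out
     * instead of dereferencing a NULL pointer.
     */
    if (command == NULL)
        return;

    runlist = EventTriggerCommonSetup(command->parsetree,
                                      EVT_TableInitWrite,
                                      "table_init_write",
                                      &trigdata);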
2) Missing copyright information:

diff --git a/src/include/tcop/ddl_deparse.h b/src/include/tcop/ddl_deparse.h
new file mode 100644
index 0000000..4f3c55f
--- /dev/null
+++ b/src/include/tcop/ddl_deparse.h
@@ -0,0 +1,12 @@
+#ifndef DDL_DEPARSE_H
+#define DDL_DEPARSE_H
+
+#include "tcop/deparse_utility.h"

3) This line removal is not required:

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87aa571..8aa636c 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11884,5 +11884,10 @@
   proname => 'brin_minmax_multi_summary_send', provolatile => 's',
   prorettype => 'bytea', proargtypes => 'pg_brin_minmax_multi_summary',
   prosrc => 'brin_minmax_multi_summary_send' },
-
+{ oid => '4642', descr => 'ddl deparse',
+  proname => 'ddl_deparse_to_json', prorettype => 'text',

4) We could add a few comments here:

+typedef enum
+{
+    SpecTypename,
+    SpecOperatorname,
+    SpecDottedName,
+    SpecString,
+    SpecNumber,
+    SpecStringLiteral,
+    SpecIdentifier,
+    SpecRole
+} convSpecifier;
+
+typedef enum
+{
+    tv_absent,
+    tv_true,
+    tv_false
+} trivalue;

5) Missing function header:

+static void fmtstr_error_callback(void *arg);
+char *ddl_deparse_json_to_string(char *jsonb);
+
+static trivalue
+find_bool_in_jsonbcontainer(JsonbContainer *container, char *keyname)
+{
+    JsonbValue  key;
+    JsonbValue *value;
+    trivalue    result;

6) This can be removed:

+/*
+removed feature
+    case AT_AddOidsRecurse:
+    case AT_AddOids:
+        tmp = new_objtree_VA("SET WITH OIDS", 1,
+                             "type", ObjTypeString, "set with oids");
+        subcmds = lappend(subcmds, new_object_object(tmp));
+        break;
+*/

Regards,
Vignesh