Hi Brandon,

Kristian Nielsen via commits <comm...@lists.mariadb.org> writes:

> Check thd->transaction->all.modified_non_trans_table at the end of an
> rpl_group_info::SPECULATE_OPTIMISTIC event group, and stop replication
> with an error if a non-transactional table was modified.

What do you think of this RFC patch?

This is something that always bothered me for optimistic and aggressive
parallel replication. The code detects some properties of the event groups
(notably the "transactional" status) during execution on the master, and
marks it in the GTID event in the binlog. This is arguably the wrong place,
since the information might not be accurate on the slave (if the master and
slave tables differ eg. in storage engines). But it is tricky to get the
information on the slave, as it only becomes available after parsing events
and opening tables.

While ideally the data on master and slave should be identical, we do
support to some extent replicating with difference in data in various ways;
and in fact this is one of the primary benefits of having logical
replication. And it is really not good that users have to know on their own
initiative that optimistic parallel replication cannot work with different
storage engines on master and slave (unlike non-parallel replication), and
they might get silently wrong data on the slave.

I guess I always thought the slave had to somehow fall back to
SPECULATE_WAIT for any non-transactional table found during
SPECULATE_OPTIMISTIC, which is not trivial to do. But I think this pragmatic
solution of simply detecting the problem and stopping with an error to
inform the user, is a good compromise; and it seems easy to implement, as in
this patch.

 - Kristian.

>
> This helps the user avoid a problem when replicating from InnoDB
> tables on the master to MyISAM/Aria tables on the slave, for example
> if --default-storage-engine is configured differently on the two
> servers. Since the transactional state of an event group is detected
> on the master, optimistic and aggressive parallel replication cannot
> support replicating from InnoDB to MyISAM, and this can lead to
> strange errors or silent incorrect data on slave.
>
> Detecting the problem and giving an error helps detect this problem
> for users.
>
> Signed-off-by: Kristian Nielsen <kniel...@knielsen-hq.org>
> ---
>  .../suite/rpl/r/rpl_parallel_nontrans.result  | 55 +++++++++++++++++++
>  .../suite/rpl/t/rpl_parallel_nontrans.test    | 54 ++++++++++++++++++
>  sql/rpl_parallel.cc                           | 29 ++++++++++
>  sql/rpl_rli.cc                                |  1 +
>  sql/rpl_rli.h                                 |  6 ++
>  5 files changed, 145 insertions(+)
>  create mode 100644 mysql-test/suite/rpl/r/rpl_parallel_nontrans.result
>  create mode 100644 mysql-test/suite/rpl/t/rpl_parallel_nontrans.test
>
> diff --git a/mysql-test/suite/rpl/r/rpl_parallel_nontrans.result 
> b/mysql-test/suite/rpl/r/rpl_parallel_nontrans.result
> new file mode 100644
> index 00000000000..4f8ab2cd6bd
> --- /dev/null
> +++ b/mysql-test/suite/rpl/r/rpl_parallel_nontrans.result
> @@ -0,0 +1,55 @@
> +include/master-slave.inc
> +[connection master]
> +connection slave;
> +include/stop_slave.inc
> +SET STATEMENT sql_log_bin=0 FOR
> +ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
> +SET STATEMENT sql_log_bin=0 FOR
> +CALL mtr.add_suppression("Slave SQL: Modifying non-transactional 
> table\\(s\\) in GTID .* not allowed in optimistic/aggressive parallel 
> replication mode");
> +SET @old_default_engine= @@GLOBAL.default_storage_engine;
> +SET GLOBAL default_storage_engine=MyISAM;
> +SET @old_parallel_threads= @@GLOBAL.slave_parallel_threads;
> +SET GLOBAL slave_parallel_threads= 4;
> +SET @old_parallel_mode= @@GLOBAL.slave_parallel_mode;
> +SET GLOBAL slave_parallel_mode= optimistic;
> +include/start_slave.inc
> +connection master;
> +SET default_storage_engine=InnoDB;
> +CREATE TABLE t1 (a INT PRIMARY KEY, b INT);
> +SHOW CREATE TABLE t1;
> +Table        t1
> +Create Table CREATE TABLE `t1` (
> +  `a` int(11) NOT NULL,
> +  `b` int(11) DEFAULT NULL,
> +  PRIMARY KEY (`a`)
> +) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci
> +INSERT INTO t1 VALUES (1, 0);
> +INSERT INTO t1 VALUES (2, 1), (3, 1), (4, 1);
> +INSERT INTO t1 VALUES (10, 2);
> +INSERT INTO t1 VALUES (11, 2);
> +UPDATE t1 SET b=3 WHERE a=1;
> +DELETE FROM t1 WHERE a=3;
> +UPDATE t1 SET b=4 WHERE a=10;
> +include/save_master_gtid.inc
> +connection slave;
> +include/wait_for_slave_sql_error.inc [errno=0]
> +include/stop_slave_io.inc
> +SET GLOBAL slave_parallel_mode= conservative;
> +include/start_slave.inc
> +include/sync_with_master_gtid.inc
> +SHOW CREATE TABLE t1;
> +Table        t1
> +Create Table CREATE TABLE `t1` (
> +  `a` int(11) NOT NULL,
> +  `b` int(11) DEFAULT NULL,
> +  PRIMARY KEY (`a`)
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci
> +connection slave;
> +include/stop_slave.inc
> +SET GLOBAL default_storage_engine= @old_default_engine;
> +SET GLOBAL slave_parallel_threads= @old_parallel_threads;
> +SET GLOBAL slave_parallel_mode= @old_parallel_mode;
> +include/start_slave.inc
> +connection master;
> +DROP TABLE t1;
> +include/rpl_end.inc
> diff --git a/mysql-test/suite/rpl/t/rpl_parallel_nontrans.test 
> b/mysql-test/suite/rpl/t/rpl_parallel_nontrans.test
> new file mode 100644
> index 00000000000..511823a62b8
> --- /dev/null
> +++ b/mysql-test/suite/rpl/t/rpl_parallel_nontrans.test
> @@ -0,0 +1,54 @@
> +--source include/have_innodb.inc
> +--source include/have_binlog_format_row.inc
> +--source include/master-slave.inc
> +
> +--connection slave
> +--source include/stop_slave.inc
> +SET STATEMENT sql_log_bin=0 FOR
> +  ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
> +SET STATEMENT sql_log_bin=0 FOR
> +  CALL mtr.add_suppression("Slave SQL: Modifying non-transactional 
> table\\(s\\) in GTID .* not allowed in optimistic/aggressive parallel 
> replication mode");
> +
> +SET @old_default_engine= @@GLOBAL.default_storage_engine;
> +SET GLOBAL default_storage_engine=MyISAM;
> +SET @old_parallel_threads= @@GLOBAL.slave_parallel_threads;
> +SET GLOBAL slave_parallel_threads= 4;
> +SET @old_parallel_mode= @@GLOBAL.slave_parallel_mode;
> +SET GLOBAL slave_parallel_mode= optimistic;
> +--source include/start_slave.inc
> +
> +--connection master
> +SET default_storage_engine=InnoDB;
> +CREATE TABLE t1 (a INT PRIMARY KEY, b INT);
> +query_vertical SHOW CREATE TABLE t1;
> +
> +INSERT INTO t1 VALUES (1, 0);
> +INSERT INTO t1 VALUES (2, 1), (3, 1), (4, 1);
> +INSERT INTO t1 VALUES (10, 2);
> +INSERT INTO t1 VALUES (11, 2);
> +UPDATE t1 SET b=3 WHERE a=1;
> +DELETE FROM t1 WHERE a=3;
> +UPDATE t1 SET b=4 WHERE a=10;
> +
> +--source include/save_master_gtid.inc
> +--connection slave
> +--let $slave_sql_errno= 0
> +--source include/wait_for_slave_sql_error.inc
> +
> +--source include/stop_slave_io.inc
> +SET GLOBAL slave_parallel_mode= conservative;
> +--source include/start_slave.inc
> +--source include/sync_with_master_gtid.inc
> +query_vertical SHOW CREATE TABLE t1;
> +
> +--connection slave
> +--source include/stop_slave.inc
> +SET GLOBAL default_storage_engine= @old_default_engine;
> +SET GLOBAL slave_parallel_threads= @old_parallel_threads;
> +SET GLOBAL slave_parallel_mode= @old_parallel_mode;
> +--source include/start_slave.inc
> +
> +--connection master
> +DROP TABLE t1;
> +
> +--source include/rpl_end.inc
> diff --git a/sql/rpl_parallel.cc b/sql/rpl_parallel.cc
> index 70b80e8884f..07564039315 100644
> --- a/sql/rpl_parallel.cc
> +++ b/sql/rpl_parallel.cc
> @@ -1549,6 +1549,9 @@ handle_rpl_parallel_thread(void *arg)
>        if (likely(!rgi->worker_error) && !skip_event_group)
>        {
>          ++rgi->retry_event_count;
> +        if (group_ending)
> +          rgi->modified_nontrans_table=
> +            thd->transaction->all.modified_non_trans_table;
>  #ifndef DBUG_OFF
>          err= 0;
>          DBUG_EXECUTE_IF("rpl_parallel_simulate_temp_err_xid",
> @@ -1612,8 +1615,34 @@ handle_rpl_parallel_thread(void *arg)
>        }
>        if (end_of_group)
>        {
> +        rpl_group_info::enum_speculation speculation= rgi->speculation;
> +
>          in_event_group= false;
>          finish_event_group(rpt, event_gtid_sub_id, entry, rgi);
> +        if (unlikely(rgi->modified_nontrans_table) &&
> +            speculation == rpl_group_info::SPECULATE_OPTIMISTIC)
> +        {
> +          /*
> +            We give this error here after finish_event_group() to try to let
> +            the current event group complete before stopping replication with
> +            an error, to have a better chance of the replication position
> +            being still valid.
> +          */
> +          sql_print_error("Slave SQL: Modifying non-transactional table(s) "
> +                          "in GTID %u-%u-%llu not allowed in "
> +                          "optimistic/aggressive parallel replication mode. "
> +                          "Either ensure tables on slave use a transactional 
> "
> +                          "storage engine (eg. InnoDB) whenever the master "
> +                          "does, or disable optimistic/aggressive parallel "
> +                          "replication mode", rgi->current_gtid.domain_id,
> +                          rgi->current_gtid.server_id,
> +                          rgi->current_gtid.seq_no);
> +          if (!rgi->worker_error)
> +          {
> +            slave_output_error_info(rgi, thd);
> +            signal_error_to_sql_driver_thread(thd, rgi, 1);
> +          }
> +        }
>          rpt->loc_free_rgi(rgi);
>          thd->rgi_slave= group_rgi= rgi= NULL;
>          skip_event_group= false;
> diff --git a/sql/rpl_rli.cc b/sql/rpl_rli.cc
> index c16980e57c9..e60a0064748 100644
> --- a/sql/rpl_rli.cc
> +++ b/sql/rpl_rli.cc
> @@ -2151,6 +2151,7 @@ rpl_group_info::reinit(Relay_log_info *rli)
>    row_stmt_start_timestamp= 0;
>    long_find_row_note_printed= false;
>    did_mark_start_commit= false;
> +  modified_nontrans_table= false;
>    gtid_ev_flags2= 0;
>    gtid_ev_flags_extra= 0;
>    gtid_ev_sa_seq_no= 0;
> diff --git a/sql/rpl_rli.h b/sql/rpl_rli.h
> index cf991584573..b5c805a6569 100644
> --- a/sql/rpl_rli.h
> +++ b/sql/rpl_rli.h
> @@ -786,6 +786,12 @@ struct rpl_group_info
>      counting one event group twice.
>    */
>    bool did_mark_start_commit;
> +  /*
> +    Copy of thd->transaction->all.modified_non_trans_table at the end of
> +    an event group. Used to catch problems where table definitions are
> +    non-transactional on the slave, breaking optimistic mode.
> +  */
> +  bool modified_nontrans_table;
>    /* Copy of flags2 from GTID event. */
>    uchar gtid_ev_flags2;
>    /* Copy of flags3 from GTID event. */
_______________________________________________
developers mailing list -- developers@lists.mariadb.org
To unsubscribe send an email to developers-le...@lists.mariadb.org

Reply via email to