Howdy Monty! I took to review it as well having thinking (along with Sanja) that this bug may be critical for the ES release.
Let me sum up the patch purpose as the following. The 2 issues of > - Annotate rows could be written after row events or in the next GTID > event. > - See rpl_row_mixing_engines > - Annotate_rows was not always written to binary log in case of error > with a transactional table (rolled back) but not transactinal table > was updated. > - See sp_trans_log, binlog_row_mix_innodb_myisam are caused by the fact that the Annotate event is not always written in the NON-transactional cache while the current statement modifies some of non-transactional tables along with the transactional ones. Your patch makes sure to that. I would agree with this approach, but it is not perfect as the relocated Annotate event detaches from a transactional table map. Like in a test that gets updated: include/show_binlog_events.inc Log_name Pos Event_type Server_id End_log_pos Info master-bin.000001 # Gtid # # BEGIN GTID #-#-# +master-bin.000001 # Annotate_rows # # UPDATE tt_4, nt_4 SET tt_4.info= "new text 210 --> 2", nt_4.info= "new text 210 --> 2" where nt_4.trans_id = tt_4.trans_id and tt_4.trans_id = 1 master-bin.000001 # Table_map # # table_id: # (test.nt_4) master-bin.000001 # Update_rows_v1 # # table_id: # flags: STMT_END_F master-bin.000001 # Query # # COMMIT master-bin.000001 # Gtid # # BEGIN GTID #-#-# -master-bin.000001 # Annotate_rows # # UPDATE tt_4, nt_4 SET tt_4.info= "new text 210 --> 2", nt_4.info= "new text 210 --> 2" where nt_4.trans_id = tt_4.trans_id and tt_4.trans_id = 1 master-bin.000001 # Table_map # # table_id: # (test.tt_4) master-bin.000001 # Update_rows_v1 # # table_id: # flags: STMT_END_F master-bin.000001 # Query # # use `test`; INSERT INTO tt_1(trans_id, stmt_id) VALUES (210, 4) As you see in the patched version the 2nd group of rows events is not annotated. Sure with just a single multi-table query one can deduce relations, but it'd be rather uncomfortable in case of the multi-statement transaction. I would consider to duplicate (rather that relocate) the Annotate event in such cases. I was not able to confirm the 3rd complain of the commit message: > - Annotate_rows could be written multiple times for same event. > - See rpl_non_direct_mixed_mixing_engines Now to the patch itself, I think we could optimize your method of finding what is the right cache for the Annotate. That can be computed within the current single loop, like the following @@ -6328,9 +6337,17 @@ bool THD::binlog_write_table_maps() } if (table->file->row_logging) { - if (binlog_write_table_map(table, with_annotate)) + /* + When the table list has a non-transactional table wait for + its loop to place the Annotate there. + */ + bool do_annot= with_annotate ? + (stmt_modifies_non_trans_table ? !table->file->row_logging_has_trans : + true) : false; + if (binlog_write_table_map(table, do_annot)) DBUG_RETURN(1); - with_annotate= 0; + if (do_annot) + with_annotate= 0; In the modified loop I refer to a new `bool THD::stmt_modifies_non_trans_table` which is naturally (cheaply) computed by `THD::decide_logging_format`. Even though there could be a succession of its calls we still can remember any discovered fact of write-locked NON-transaction table. I'll do this way @@ -6257,7 +6257,16 @@ void THD::binlog_prepare_for_row_logging() { if (table->query_id == query_id && table->current_lock == F_WRLCK) table->file->prepare_for_row_logging(); - } + }; + /* + Remember any discovered intent to modify a non-transactional table + while possibly proceeding through a succession of decide_logging_format() + and until table maps have been written to caches. + */ + if (!binlog_table_maps) + stmt_modifies_non_trans_table= stmt_modifies_non_trans_table || + lex->stmt_accessed_table(LEX::STMT_WRITES_NON_TRANS_TABLE); + DBUG_VOID_RETURN; } And when the flag is UP that means `THD::binlog_write_table_maps()` 's loop must (asserted) eventually go through such table and at that time the former `with_annotate` is really `true`. The new bool resets along with `THD::binlog_table_maps` @@ -3187,6 +3193,7 @@ class THD: public THD_count, /* this must be first */ void reset_binlog_for_next_statement() { binlog_table_maps= 0; + stmt_modifies_non_trans_table= 0; } And this three parts comprise a solution that produces the same updates to the test results of your patch. I can commit it to a github branch, please let me know if you're interested in this approach. If you do I think the new flag `bool THD::stmt_modifies_non_trans_table` could be better positioned in `binlog_cache_mngr` class naturally in order to avoid THD swell. Cheers, Andrei _______________________________________________ developers mailing list -- developers@lists.mariadb.org To unsubscribe send an email to developers-le...@lists.mariadb.org