Re: [Maria-developers] c80f867b0df: MDEV-23844 Atomic DROP TABLE
Hi! On Wed, May 12, 2021 at 3:18 PM Sergei Golubchik wrote: > > Hi, Michael! > > On May 12, Michael Widenius wrote: > > revision-id: c80f867b0df (mariadb-10.6.0-78-gc80f867b0df) > > parent(s): eebebe8ef75 > > author: Michael Widenius > > committer: Michael Widenius > > timestamp: 2021-05-04 21:09:11 +0300 > > message: > > > > MDEV-23844 Atomic DROP TABLE > ... > > diff --git a/sql/sql_view.cc b/sql/sql_view.cc > > index e6d726b30d7..ff8aa55adef 100644 > > --- a/sql/sql_view.cc > > +++ b/sql/sql_view.cc > > @@ -1861,8 +1868,18 @@ bool mysql_drop_view(THD *thd, TABLE_LIST *views, > > enum_drop_mode drop_mode) > > not_exists_count++; > >continue; > > } > > +if (!first_table++) > > yuck. so your "first_table" actually means "not_first_view", > because when it's FALSE, it's the first view. > > why not to rename it to "not_first_view" ? I have renamed it to view_count. I prefer to use a counter as it is easer to set and test at the same time. Also, when debugging it also helps to see where one is. > > +{ > > + if (ddl_log_drop_view_init(thd, _log_state)) > > +DBUG_RETURN(TRUE); > > what about a crash here? the first entry is written, the second is not. The init creates the bases for a set of views to drop. We do init once and then we only need to do ddl_log_drop_view for the rest. Most ddl_log operations has only one call. Only those that can have multiple operations, like drop or rename has an init call. > > +} > > +if (ddl_log_drop_view(thd, _log_state, , >db, > > + >table_name)) > > + DBUG_RETURN(TRUE); > > diff --git a/sql/log.cc b/sql/log.cc > > index 6e0a9706ec8..dba999054e4 100644 > > --- a/sql/log.cc > > +++ b/sql/log.cc > > @@ -10471,16 +10483,16 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const > > char *last_log_name, > >bool last_gtid_standalone= false; > >bool last_gtid_valid= false; > > #endif > > + DBUG_ENTER("TC_LOG_BINLOG::recover"); > > > >if (! fdle->is_valid() || > > - (do_xa && my_hash_init(key_memory_binlog_recover_exec, , > > - _charset_bin, TC_LOG_PAGE_SIZE/3, 0, > > - sizeof(my_xid), 0, 0, MYF(0 > > + (my_hash_init(key_memory_binlog_recover_exec, , > > +_charset_bin, TC_LOG_PAGE_SIZE/3, 0, > > +sizeof(my_xid), 0, 0, MYF(0 > > goto err1; > > > > - if (do_xa) > > -init_alloc_root(key_memory_binlog_recover_exec, _root, > > -TC_LOG_PAGE_SIZE, TC_LOG_PAGE_SIZE, MYF(0)); > > + init_alloc_root(key_memory_binlog_recover_exec, _root, > > + TC_LOG_PAGE_SIZE, TC_LOG_PAGE_SIZE, MYF(0)); > > I don't think these two changes are correct. do_xa is TRUE on crash recovery > and FALSE on gtid state recovery on 5.5->10.0 upgade or if gtid state > file was deleted. The problem is that do_xa is not always true on crash recovery. For example if the following is not true in do_binlog_recovery: if (ev->flags & LOG_EVENT_BINLOG_IN_USE_F) It will execute binglog recovery later with do_xa_recovery == false trough the following code inTC_LOG_BINLOG::open(const char *opt_name) if (!binlog_state_recover_done) { binlog_state_recover_done= true; if (do_binlog_recovery(opt_bin_logname, false)) DBUG_RETURN(1); } Here is a trace: #0 MYSQL_BIN_LOG::recover (this=0x2783d80 , linfo=0x7fff8590, last_log_name=0x7fff8100 "./master-bin.01", first_log=0x7fff8410, fdle=0x35685a8, do_xa=false) at /my/maria-10.6/sql/log.cc:10474 #1 0x00dc4c10 in MYSQL_BIN_LOG::do_binlog_recovery (this=0x2783d80 , opt_name=0x319d098 "master-bin", do_xa_recovery=false) at /my/maria-10.6/sql/log.cc:10779 #2 0x00db0c90 in MYSQL_BIN_LOG::open (this=0x2783d80 , log_name=0x319d098 "master-bin", new_name=0x0, next_log_number=0, io_cache_type_arg=WRITE_CACHE, max_size_arg=1073741824, null_created_arg=false, need_mutex=true) at /my/maria-10.6/sql/log.cc:3617 #3 0x007c0d24 in init_server_components () at We also have to be able to collect xa events if someone has deleted the gtid state file for ddl_recovery to work. > There is no point to initialize the xid hash or mem_root, > as it's not a crash recovery and there can be no incomplete transactions > or ddl statements. > Yes, I've seen the commit comment: > > > - TC_LOG_BINLOG::recover() was changed to always collect Xid's from the > > binary log and always call ddl_log_close_binlogged_events(). This was > > needed to be able to collect DROP TABLE events with embedded Xid's > > (used by ddl log). > > but I cannot understand how it applies here. The issue is that because of ddl log, ::recover must ALWAYS collect binlog 'xid's to be able to do ddl log recovery at all. You can verify this by reversing the patch and do: mtr atomic.create_view This will fail with a different result that shows that ddl recovery was reverting things wrongly because it
Re: [Maria-developers] c80f867b0df: MDEV-23844 Atomic DROP TABLE
Hi, Michael! On May 12, Michael Widenius wrote: > revision-id: c80f867b0df (mariadb-10.6.0-78-gc80f867b0df) > parent(s): eebebe8ef75 > author: Michael Widenius > committer: Michael Widenius > timestamp: 2021-05-04 21:09:11 +0300 > message: > > MDEV-23844 Atomic DROP TABLE ... > diff --git a/sql/sql_view.cc b/sql/sql_view.cc > index e6d726b30d7..ff8aa55adef 100644 > --- a/sql/sql_view.cc > +++ b/sql/sql_view.cc > @@ -1861,8 +1868,18 @@ bool mysql_drop_view(THD *thd, TABLE_LIST *views, > enum_drop_mode drop_mode) > not_exists_count++; >continue; > } > +if (!first_table++) yuck. so your "first_table" actually means "not_first_view", because when it's FALSE, it's the first view. why not to rename it to "not_first_view" ? > +{ > + if (ddl_log_drop_view_init(thd, _log_state)) > +DBUG_RETURN(TRUE); what about a crash here? the first entry is written, the second is not. > +} > +if (ddl_log_drop_view(thd, _log_state, , >db, > + >table_name)) > + DBUG_RETURN(TRUE); > +debug_crash_here("ddl_log_drop_before_delete_view"); > if (unlikely(mysql_file_delete(key_file_frm, path, MYF(MY_WME >delete_error= TRUE; > +debug_crash_here("ddl_log_drop_after_delete_view"); > > some_views_deleted= TRUE; > > diff --git a/sql/log.cc b/sql/log.cc > index 6e0a9706ec8..dba999054e4 100644 > --- a/sql/log.cc > +++ b/sql/log.cc > @@ -10471,16 +10483,16 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const > char *last_log_name, >bool last_gtid_standalone= false; >bool last_gtid_valid= false; > #endif > + DBUG_ENTER("TC_LOG_BINLOG::recover"); > >if (! fdle->is_valid() || > - (do_xa && my_hash_init(key_memory_binlog_recover_exec, , > - _charset_bin, TC_LOG_PAGE_SIZE/3, 0, > - sizeof(my_xid), 0, 0, MYF(0 > + (my_hash_init(key_memory_binlog_recover_exec, , > +_charset_bin, TC_LOG_PAGE_SIZE/3, 0, > +sizeof(my_xid), 0, 0, MYF(0 > goto err1; > > - if (do_xa) > -init_alloc_root(key_memory_binlog_recover_exec, _root, > -TC_LOG_PAGE_SIZE, TC_LOG_PAGE_SIZE, MYF(0)); > + init_alloc_root(key_memory_binlog_recover_exec, _root, > + TC_LOG_PAGE_SIZE, TC_LOG_PAGE_SIZE, MYF(0)); I don't think these two changes are correct. do_xa is TRUE on crash recovery and FALSE on gtid state recovery on 5.5->10.0 upgade or if gtid state file was deleted. There is no point to initialize the xid hash or mem_root, as it's not a crash recovery and there can be no incomplete transactions or ddl statements. Yes, I've seen the commit comment: > - TC_LOG_BINLOG::recover() was changed to always collect Xid's from the > binary log and always call ddl_log_close_binlogged_events(). This was > needed to be able to collect DROP TABLE events with embedded Xid's > (used by ddl log). but I cannot understand how it applies here. >fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F; // abort on the first error > > diff --git a/sql/sql_table.cc b/sql/sql_table.cc > index 7a88ffdfd1b..0a83396ae14 100644 > --- a/sql/sql_table.cc > +++ b/sql/sql_table.cc > @@ -1363,6 +1373,17 @@ int mysql_rm_table_no_locks(THD *thd, TABLE_LIST > *tables, bool if_exists, > > thd->replication_flags= 0; > was_view= table_type == TABLE_TYPE_VIEW; > + > +if (!first_table++) first_table -> not_first_table > +{ > + LEX_CSTRING comment= {comment_start, (size_t) comment_len}; > + if (ddl_log_drop_table_init(thd, _log_state, )) > + { > +error= 1; > +goto err; > + } > +} > + > if ((table_type == TABLE_TYPE_UNKNOWN) || (was_view && !drop_view) || > (table_type != TABLE_TYPE_SEQUENCE && drop_sequence)) > { > @@ -1565,6 +1611,7 @@ int mysql_rm_table_no_locks(THD *thd, TABLE_LIST > *tables, bool if_exists, > table_name.str, (uint)table_name.length); >mysql_audit_drop_table(thd, table); > } > +ddl_log_update_phase(_log_state, DDL_DROP_PHASE_COLLECT); better call it DDL_DROP_PHASE_BINLOG. It would've saved me ~10 minutes reading the code back and forth, as ""DDL_DROP_PHASE_BINLOG" immediately explains what this phase is for. > > if (!dont_log_query && > (!error || table_dropped || non_existing_table_error(error))) > diff --git a/sql/ddl_log.h b/sql/ddl_log.h > index 316e6708f22..b843404fc3c 100644 > --- a/sql/ddl_log.h > +++ b/sql/ddl_log.h > @@ -97,6 +101,14 @@ enum enum_ddl_log_rename_table_phase { >DDL_RENAME_PHASE_TABLE, > }; > > +enum enum_ddl_log_drop_table_phase { > + DDL_DROP_PHASE_TABLE=0, > + DDL_DROP_PHASE_TRIGGER, no phase for deleting data from stat tables? > + DDL_DROP_PHASE_COLLECT, > + DDL_DROP_PHASE_RESET, /* Reset found list of dropped tables */ this one is unused (also unused in further patches, I've checked) > + DDL_DROP_PHASE_END > +};