Hi Sergei, thanks for looking into this. On Wed, Aug 28, 2019 at 7:54 PM Sergei Golubchik <s...@mariadb.org> wrote:
> Hi, Robert! > > On Aug 28, Robert Bindar wrote: > > revision-id: 5ba0b11c5d1 (mariadb-10.4.7-35-g5ba0b11c5d1) > > parent(s): 4a490d1a993 > > author: Robert Bindar <rob...@mariadb.org> > > committer: Robert Bindar <rob...@mariadb.org> > > timestamp: 2019-08-27 11:38:52 +0300 > > message: > > > > MDEV-20257 Fix crash in Grant_table_base::init_read_record > > > > The bug shows up when a 10.4+ server starts on a <10.4 datadir > > with a crashed mysql.user table. > > Grant_tables::open_and_lock tries to open the list of requested > > tables (host, db, global_priv,..) and because global_priv doesn't > > exist (pre-10.4 datadir), it falls back to opening mysql.user. > > The first open_tables call for [host, db, global_priv,..] works > > just fine. The second open_tables call (trying to open mysql.user > > alone) sees the crashed user table and performs 3 steps: > > - closes all the open tables > > - attempts a table fix for mysql.user (succeeds) > > - opens the tables initially passed as arguments > > In an ideal world, the first step should only close the tables > > passed as arguments (i.e. mysql.user), but for some reasons > > close_tables_for_reopen makes a close_thread_tables call which > > closes everything in THD::open_tables (nevertheless, this is > > probably the intended behavior). > > This side effect causes all the tables opened in the first > > open_tables call to now be closed and Grant_table_base::init_read_record > > tries to read from a released table. > > > > diff --git a/mysql-test/main/mdev20257.test > b/mysql-test/main/mdev20257.test > > new file mode 100644 > > index 00000000000..8506292d57c > > --- /dev/null > > +++ b/mysql-test/main/mdev20257.test > > @@ -0,0 +1,51 @@ > > +--source include/not_embedded.inc > > +--echo # > > +--echo # MDEV 20257 Server crashes in Grant_table_base::init_read_record > > +--echo # upon crash-upgrade > > +--echo # > > + > > +let $MYSQLD_DATADIR= `select @@datadir`; > > + > > +rename table mysql.global_priv to mysql.global_priv_bak; > > +rename table mysql.user to mysql.user_bak; > > +rename table mysql.db to mysql.db_bak; > > +rename table mysql.proxies_priv to mysql.proxies_priv_bak; > > +rename table mysql.roles_mapping to mysql.roles_mapping_bak; > > + > > +--source include/shutdown_mysqld.inc > > + > > +# Bring in a crashed user table > > +# Ideally we should've copied only the crashed mysql.user, but this > > +# would make mysqld crash in some other place before hitting the > > +# crash spot described in MDEV-20257 which is what we're trying to > > +# test against. > > I don't understand that. Where does it crash then? > What difference does it make that you replace all tables - are they all > corrupted? > Only mysql.user is corrupted. Assuming we take out the fix in acl, if we don't replace the db table with the 10.2 version provided by Elena, the server crashes a bit earlier in lock_tables (called from Grant_tables::open_and_lock) for reasons I don't yet understand, my guess is it has something to do with the aria/myisam change between 10.2 and 10.4. The host table needs to be copied because the reported crash happened when acl_load tried to read the host table and in 10.4 it is not there anymore. And for proxies_priv and roles_mapping, it is enough if we just hide them (otherwise the same Aria crash happens). > > +--copy_file std_data/mdev20257/user.frm $MYSQLD_DATADIR/mysql/user.frm > > +--copy_file std_data/mdev20257/user.MYD $MYSQLD_DATADIR/mysql/user.MYD > > +--copy_file std_data/mdev20257/user.MYI $MYSQLD_DATADIR/mysql/user.MYI > > + > > +--copy_file std_data/mdev20257/host.frm $MYSQLD_DATADIR/mysql/host.frm > > +--copy_file std_data/mdev20257/host.MYD $MYSQLD_DATADIR/mysql/host.MYD > > +--copy_file std_data/mdev20257/host.MYI $MYSQLD_DATADIR/mysql/host.MYI > > + > > +--copy_file std_data/mdev20257/db.frm $MYSQLD_DATADIR/mysql/db.frm > > +--copy_file std_data/mdev20257/db.MYD $MYSQLD_DATADIR/mysql/db.MYD > > +--copy_file std_data/mdev20257/db.MYI $MYSQLD_DATADIR/mysql/db.MYI > > +--copy_file std_data/mdev20257/db.opt $MYSQLD_DATADIR/mysql/db.opt > > + > > +--source include/start_mysqld.inc > > + > > +call mtr.add_suppression("Table \'\..mysql\.user\' is marked as crashed > and should be repaired"); > > +call mtr.add_suppression("Checking table: \'\..mysql\.user\'"); > > +call mtr.add_suppression("mysql.user: 1 client is using or hasn't > closed the table properly"); > > +call mtr.add_suppression("Missing system table mysql..*; please run > mysql_upgrade to create it"); > > + > > +# Cleanup > > +drop table mysql.user; > > +drop table mysql.db; > > +drop table mysql.host; > > +rename table mysql.global_priv_bak to mysql.global_priv; > > +rename table mysql.user_bak to mysql.user; > > +rename table mysql.db_bak to mysql.db; > > +rename table mysql.proxies_priv_bak to mysql.proxies_priv; > > +rename table mysql.roles_mapping_bak to mysql.roles_mapping; > > + > > diff --git a/mysql-test/std_data/mdev20257/db.opt > b/mysql-test/std_data/mdev20257/db.opt > > --- /dev/null > > +++ b/mysql-test/std_data/mdev20257/db.opt > > @@ -0,0 +1,2 @@ > > +default-character-set=latin1 > > +default-collation=latin1_swedish_ci > > why is that? > I agree we don't need db.opt at all. > > > diff --git a/sql/sql_acl.cc b/sql/sql_acl.cc > > index 847d2bd777b..e028d084703 100644 > > --- a/sql/sql_acl.cc > > +++ b/sql/sql_acl.cc > > - DBUG_RETURN(res); > > + first = build_open_list(tables, which_tables, lock_type, true); > > + > > + uint counter; > > + if (int rv= really_open(thd, first, &counter)) > > + DBUG_RETURN(rv); > > + > > + TABLE *user_table= tables[USER_TABLE].table; > > + if ((which_tables & Table_user) && !user_table) > > + { > > + close_thread_tables(thd); > > + first = build_open_list(tables, which_tables, lock_type, false); > > Hmm... Here you always close/reopen all tables, just to account for the > case when the user table is found corrupted. > > I'd argue that a corrupted user table happens rarely (compared to the > non-corrupted), so it makes sense to optimize for the normal use case. > > That is, only open the user table as before. After it's opened you > check if other tables are still opened - if they aren't you can close > and reopen everything. This way you only close/reopen all tables if the > user table was corrupted. > > Makes sense? > > It makes sense. Is there any other way to check if a table is opened except for going through thd->open_tables and see if the table is there? Touching TABLE_LIST::table that we have in open_and_lock might not be a good idea if the table instance is already released to table cache. > > + if (int rv= really_open(thd, first, &counter)) > > + DBUG_RETURN(rv); > > + > > + p_user_table= &m_user_table_tabular; > > + user_table= tables[USER_TABLE + 1].table; > > + } > > Regards, > Sergei > VP of MariaDB Server Engineering > and secur...@mariadb.org >
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp