Thanks for the review, Kristian! I always appreciate learning the history of why some things exist as they are :)
And thanks for the reference to the comment explaining why the master thread_id is logged. Reading the whole comment really solidified it. I'll incorporate your feedback. Brandon On Mon, Jan 29, 2024 at 2:56 PM Kristian Nielsen <kniel...@knielsen-hq.org> wrote: > Oh, and another thing (which Monty pointed out in the MDEV-7850): > > When GTID is binlogged on the slave (--log-slave-updates), the thread id > should be preserved from the original event (the original master's > thread_id). See this comment in log_event_server.cc: > > "To avoid this, we log the thread's thread id EXCEPT for the SQL > slave thread for which we log the original (master's) thread id." > > So this needs to be changed, so that the thread id gets set from the user > query > on the original master and then propagates through the levels of the > replication topology. > > This way, the value will be consistent with the Query_event (historically, > there used to be a BEGIN query event in place of the GTID event, which > provided the thread id for row events; MDEV-7850 is the desire to have that > back). > > - Kristian. > > Kristian Nielsen via developers <developers@lists.mariadb.org> writes: > > >> commit c37b2087b4abe576f1b0391c8d379dba6299dcb5 > (origin/bb-11.4-MDEV-7850) > >> Author: Brandon Nesterenko <brandon.nestere...@mariadb.com> > >> Date: Mon Jul 10 18:53:19 2023 +0300 > >> > >> MDEV-7850: Extend GTID Binlog Events with Thread Id > >> > >> This patch augments Gtid_log_event with the user thread-id. > >> In particular that compensates for the loss of this info in > >> Rows_log_events. > > > > Here's my review of this patch (which was already pushed, but there are > some > > misunderstandings that should be fixed; Monty asked me about it as he had > > been told that I had reviewed the patch, but that is not the case (until > > now)): > > > >> diff --git a/sql/log_event.cc b/sql/log_event.cc > >> index 41ddc038594..f167284ad1d 100644 > >> --- a/sql/log_event.cc > >> +++ b/sql/log_event.cc > > > >> @@ -1901,11 +1904,12 @@ Query_log_event::begin_event(String *packet, > ulong ev_offset, > >> /* > >> Currently we only need to replace GTID event. > >> The length of GTID differs depending on whether it contains commit > id. > >> + And/or thread id. > >> */ > >> - DBUG_ASSERT(data_len == LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN || > >> - data_len == LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2); > >> - if (data_len != LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN && > >> - data_len != LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2) > >> + DBUG_ASSERT(data_len >= LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN && > >> + data_len <= LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2 + > 9); > >> + if (data_len < LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN || > >> + data_len > LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2 + 9) > > > > This change of the asserts/checks doesn't make any sense. > > They were there originally, because we are hand-crafting a BEGIN query > > event, and there were only two cases of the length, so two fixed-sized > > events were crafted by two different code paths. And the asserts are > there > > to safe-guard if the length would change. > > > > But now this check was already wrong, as more optional info has been > added > > to the GTID event without updating the code for > > Query_log_event::begin_event() (XID, flags_extra, extra_engines). > > > > But the new code in this patch is generic and can handle any length GTID > > event, so these checks no longer make any sense, and should just be > removed > > (rather than attempt to correct them). > > > >> @@ -1926,9 +1930,19 @@ Query_log_event::begin_event(String *packet, > ulong ev_offset, > >> } > >> else > >> { > >> - DBUG_ASSERT(data_len == LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + > 2); > >> - /* Put in an empty time_zone_str to take up the extra 2 bytes. */ > >> - int2store(q + Q_STATUS_VARS_LEN_OFFSET, 2); > > > >> + DBUG_ASSERT(data_len <= LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + > 11); > > > > This assert should not be there. The length can be larger than 11, and > the > > code below works fine for arbitrary length. > > > >> + DBUG_ASSERT(data_len >= LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + > 2); > >> + > >> + /* Put in an empty time_zone_str to take up the extra 2 plus the > number of > >> + dummy_bytes. */ > >> + size_t dummy_bytes= > >> + data_len - (LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2); > >> + > >> + int2store(q + Q_STATUS_VARS_LEN_OFFSET, dummy_bytes + 2); > >> + for (size_t i= 0; i < dummy_bytes; i++) > >> + q[Q_DATA_OFFSET + i]= Q_DUMMY; > >> + q+= dummy_bytes; > >> + > >> q[Q_DATA_OFFSET]= Q_TIME_ZONE_CODE; > >> q[Q_DATA_OFFSET+1]= 0; /* Zero length for empty > time_zone_str */ > >> q[Q_DATA_OFFSET+2]= 0; /* Zero terminator for > empty db */ > > > > This new code with Q_DUMMY is generic and independent of the length of > the > > GTID event. > > > > That's good! > > > > But it needs a comment explaining (that the Q_DUMMY is unknown to the old > > slave, which will make the old slave just skip the rest of the user vars, > > which are just there to fill up the extra bytes). > > > > And now the code that adds the dummy empty time_zone_str is redundant, > and > > should just be removed, as should the code for the case of data_len == > > LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN. Since now any extra bytes in the > > GTID can just be replaced by Q_DUMMY. > > > > This way, the code will be fixed to work for all the possible GTID > lengths, > > and be shorter and simpler (and a lot less confusing). > > > > > >> @@ -1584,6 +1584,9 @@ Query_log_event::Query_log_event(const uchar > *buf, uint event_len, > >> sa_seq_no = uint8korr(pos); > >> pos+= 8; > >> } > >> + } > >> + case Q_DUMMY: > >> + { > >> break; > >> } > >> default: > >> /* That's why you must write status vars in growing order of > code */ > >> DBUG_PRINT("info",("Query_log_event has unknown status vars > (first has code: %u), skipping the rest of them", (uint) *(pos-1))); > >> pos= (const uchar*) end; // Break loop > > > > There's no point in having this code. Q_DUMMY is supposed to be higher > than > > any known status var type, so it will be ignored anyway in the `default` > case. > > And besides, server that knows about Q_DUMMY can never receive it in the > > first place. > > > > (But if this is really wanted, then don't delete the `break` statement in > > the prior switch case, that's likely to introduce a bug the next time a > case > > is added. And assign Q_DUMMY the next free value, not the value 255). > > > > > >> diff --git a/sql/log_event.h b/sql/log_event.h > >> index 473506fafed..d324642ad5e 100644 > >> --- a/sql/log_event.h > >> +++ b/sql/log_event.h > >> @@ -324,6 +324,7 @@ class String; > >> > >> #define Q_HRNOW 128 > >> #define Q_XID 129 > >> +#define Q_DUMMY 255 > > > > This needs a good comment, explaining how Q_DUMMY is a code that will be > > unknown to any server version, and can be used to make the slave skip > this > > (and any following) user vars. > > > > > > diff --git a/mysql-test/suite/rpl/t/rpl_gtid_thread_id.test > > b/mysql-test/suite/rpl/t/rpl_gtid_thread_id.test > > new file mode 100644 > > index 00000000000..b8ed1865da9 > > --- /dev/null > > +++ b/mysql-test/suite/rpl/t/rpl_gtid_thread_id.test > > @@ -0,0 +1,125 @@ > > +# > > +# Verify that GTID log events are written into the binary log along > with the > > +# id of the thread which executed the transaction. On the primary, this > is the > > <snip> > > > > Disclaimer: I did not review this testcase. > > > > - Kristian. > > _______________________________________________ > > developers mailing list -- developers@lists.mariadb.org > > To unsubscribe send an email to developers-le...@lists.mariadb.org >
_______________________________________________ developers mailing list -- developers@lists.mariadb.org To unsubscribe send an email to developers-le...@lists.mariadb.org