[MariaDB developers] Re: Review of MDEV-7850: Extend GTID Binlog Events with Thread Id

Brandon Nesterenko via developers Tue, 30 Jan 2024 13:14:22 -0800

Thanks for the review, Kristian!

I always appreciate learning the history of why some things exist as they
are :)


And thanks for the reference to the comment explaining why the master
thread_id
is logged. Reading the whole comment really solidified it. I'll incorporate
your
feedback.

Brandon

On Mon, Jan 29, 2024 at 2:56 PM Kristian Nielsen <kniel...@knielsen-hq.org>
wrote:

> Oh, and another thing (which Monty pointed out in the MDEV-7850):
>
> When GTID is binlogged on the slave (--log-slave-updates), the thread id
> should be preserved from the original event (the original master's
> thread_id). See this comment in log_event_server.cc:
>
>     "To avoid this, we log the thread's thread id EXCEPT for the SQL
>     slave thread for which we log the original (master's) thread id."
>
> So this needs to be changed, so that the thread id gets set from the user
> query
> on the original master and then propagates through the levels of the
> replication topology.
>
> This way, the value will be consistent with the Query_event (historically,
> there used to be a BEGIN query event in place of the GTID event, which
> provided the thread id for row events; MDEV-7850 is the desire to have that
> back).
>
>  - Kristian.
>
> Kristian Nielsen via developers <developers@lists.mariadb.org> writes:
>
> >> commit c37b2087b4abe576f1b0391c8d379dba6299dcb5
> (origin/bb-11.4-MDEV-7850)
> >> Author: Brandon Nesterenko <brandon.nestere...@mariadb.com>
> >> Date:   Mon Jul 10 18:53:19 2023 +0300
> >>
> >>     MDEV-7850: Extend GTID Binlog Events with Thread Id
> >>
> >>     This patch augments Gtid_log_event with the user thread-id.
> >>     In particular that compensates for the loss of this info in
> >>     Rows_log_events.
> >
> > Here's my review of this patch (which was already pushed, but there are
> some
> > misunderstandings that should be fixed; Monty asked me about it as he had
> > been told that I had reviewed the patch, but that is not the case (until
> > now)):
> >
> >> diff --git a/sql/log_event.cc b/sql/log_event.cc
> >> index 41ddc038594..f167284ad1d 100644
> >> --- a/sql/log_event.cc
> >> +++ b/sql/log_event.cc
> >
> >> @@ -1901,11 +1904,12 @@ Query_log_event::begin_event(String *packet,
> ulong ev_offset,
> >>    /*
> >>      Currently we only need to replace GTID event.
> >>      The length of GTID differs depending on whether it contains commit
> id.
> >> +    And/or thread id.
> >>    */
> >> -  DBUG_ASSERT(data_len == LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN ||
> >> -              data_len == LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2);
> >> -  if (data_len != LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN &&
> >> -      data_len != LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2)
> >> +  DBUG_ASSERT(data_len >= LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN &&
> >> +              data_len <= LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2 +
> 9);
> >> +  if (data_len < LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN ||
> >> +      data_len > LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2 + 9)
> >
> > This change of the asserts/checks doesn't make any sense.
> > They were there originally, because we are hand-crafting a BEGIN query
> > event, and there were only two cases of the length, so two fixed-sized
> > events were crafted by two different code paths. And the asserts are
> there
> > to safe-guard if the length would change.
> >
> > But now this check was already wrong, as more optional info has been
> added
> > to the GTID event without updating the code for
> > Query_log_event::begin_event() (XID, flags_extra, extra_engines).
> >
> > But the new code in this patch is generic and can handle any length GTID
> > event, so these checks no longer make any sense, and should just be
> removed
> > (rather than attempt to correct them).
> >
> >> @@ -1926,9 +1930,19 @@ Query_log_event::begin_event(String *packet,
> ulong ev_offset,
> >>    }
> >>    else
> >>    {
> >> -    DBUG_ASSERT(data_len == LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN +
> 2);
> >> -    /* Put in an empty time_zone_str to take up the extra 2 bytes. */
> >> -    int2store(q + Q_STATUS_VARS_LEN_OFFSET, 2);
> >
> >> +    DBUG_ASSERT(data_len <= LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN +
> 11);
> >
> > This assert should not be there. The length can be larger than 11, and
> the
> > code below works fine for arbitrary length.
> >
> >> +    DBUG_ASSERT(data_len >= LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN +
> 2);
> >> +
> >> +    /* Put in an empty time_zone_str to take up the extra 2 plus the
> number of
> >> +       dummy_bytes. */
> >> +    size_t dummy_bytes=
> >> +        data_len - (LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN + 2);
> >> +
> >> +    int2store(q + Q_STATUS_VARS_LEN_OFFSET, dummy_bytes + 2);
> >> +    for (size_t i= 0; i < dummy_bytes; i++)
> >> +      q[Q_DATA_OFFSET + i]= Q_DUMMY;
> >> +    q+= dummy_bytes;
> >> +
> >>      q[Q_DATA_OFFSET]= Q_TIME_ZONE_CODE;
> >>      q[Q_DATA_OFFSET+1]= 0;           /* Zero length for empty
> time_zone_str */
> >>      q[Q_DATA_OFFSET+2]= 0;                  /* Zero terminator for
> empty db */
> >
> > This new code with Q_DUMMY is generic and independent of the length of
> the
> > GTID event.
> >
> > That's good!
> >
> > But it needs a comment explaining (that the Q_DUMMY is unknown to the old
> > slave, which will make the old slave just skip the rest of the user vars,
> > which are just there to fill up the extra bytes).
> >
> > And now the code that adds the dummy empty time_zone_str is redundant,
> and
> > should just be removed, as should the code for the case of data_len ==
> > LOG_EVENT_HEADER_LEN + GTID_HEADER_LEN. Since now any extra bytes in the
> > GTID can just be replaced by Q_DUMMY.
> >
> > This way, the code will be fixed to work for all the possible GTID
> lengths,
> > and be shorter and simpler (and a lot less confusing).
> >
> >
> >> @@ -1584,6 +1584,9 @@ Query_log_event::Query_log_event(const uchar
> *buf, uint event_len,
> >>          sa_seq_no = uint8korr(pos);
> >>          pos+= 8;
> >>        }
> >> +    }
> >> +    case Q_DUMMY:
> >> +    {
> >>        break;
> >>      }
> >>      default:
> >>        /* That's why you must write status vars in growing order of
> code */
> >>        DBUG_PRINT("info",("Query_log_event has unknown status vars
> (first has code: %u), skipping the rest of them", (uint) *(pos-1)));
> >>        pos= (const uchar*) end;                         // Break loop
> >
> > There's no point in having this code. Q_DUMMY is supposed to be higher
> than
> > any known status var type, so it will be ignored anyway in the `default`
> case.
> > And besides, server that knows about Q_DUMMY can never receive it in the
> > first place.
> >
> > (But if this is really wanted, then don't delete the `break` statement in
> > the prior switch case, that's likely to introduce a bug the next time a
> case
> > is added. And assign Q_DUMMY the next free value, not the value 255).
> >
> >
> >> diff --git a/sql/log_event.h b/sql/log_event.h
> >> index 473506fafed..d324642ad5e 100644
> >> --- a/sql/log_event.h
> >> +++ b/sql/log_event.h
> >> @@ -324,6 +324,7 @@ class String;
> >>
> >>  #define Q_HRNOW 128
> >>  #define Q_XID   129
> >> +#define Q_DUMMY 255
> >
> > This needs a good comment, explaining how Q_DUMMY is a code that will be
> > unknown to any server version, and can be used to make the slave skip
> this
> > (and any following) user vars.
> >
> >
> > diff --git a/mysql-test/suite/rpl/t/rpl_gtid_thread_id.test
> > b/mysql-test/suite/rpl/t/rpl_gtid_thread_id.test
> > new file mode 100644
> > index 00000000000..b8ed1865da9
> > --- /dev/null
> > +++ b/mysql-test/suite/rpl/t/rpl_gtid_thread_id.test
> > @@ -0,0 +1,125 @@
> > +#
> > +#   Verify that GTID log events are written into the binary log along
> with the
> > +# id of the thread which executed the transaction. On the primary, this
> is the
> > <snip>
> >
> > Disclaimer: I did not review this testcase.
> >
> >  - Kristian.
> > _______________________________________________
> > developers mailing list -- developers@lists.mariadb.org
> > To unsubscribe send an email to developers-le...@lists.mariadb.org
>

_______________________________________________
developers mailing list -- developers@lists.mariadb.org
To unsubscribe send an email to developers-le...@lists.mariadb.org

[MariaDB developers] Re: Review of MDEV-7850: Extend GTID Binlog Events with Thread Id

Reply via email to