[MariaDB developers] Re: Review of cdebe6e: Semi-sync: Refactor...

Kristian Nielsen via developers Wed, 14 Jan 2026 03:17:11 -0800

Brandon Nesterenko <[email protected]> writes:

> Thanks for undertaking this work! It is good to see GTID support being
> added in semi-sync. This email only entails reviewing the refactor. I'll
> follow-up with another email to review the GTID additions.


You're welcome! A couple general points:

1. With these patches, I tried to do the minimal amount of changes to the
logic and structure of the Semi-sync code. It is 3k lines of diff, and I
wanted to make it as easy as possible to see that the functionality of
semi-sync is unchanged, as an attempt to avoid introducing more bugs into
the semi-sync.

I think there are many improvements that could be made to the semi-sync
code, but I deliberately tried to avoid this these patches, for the above
reason. But could definitely be added on top.

2. In this approach, I tried to follow a "standard best-practices" approach
to use class inheritance and virfual functions and so to separate the
file/pos and gtid code path "cleanly" - as compared to just adding
appropriate conditionals / if-statements. That is the reason for use of the
different classes, for the C++ lambda's, etc. For this case it ended up
relatively clean, but still somewhat convoluted in places. Personally I tend
to lean towards keeping things simple, and I would be happy with using a
conditional-based approach instead (or in selected places).

> 1. When do you envision this to go live? I think it is too big/late to go
> into
>    12.3 at this point, given it wasn't in the preview.

Ok, sure. Less work for me this way :-)

I don't have much interest in the semi-sync, but it was the last thing
missing to make the binlog-in-engine support of existing features complete.
Now with this patch, that should be covered as well, if anyone wants it.

> +  /*
> +    The repl_semisync_master object needs to be constructed early, as it
> can be
> +    called into from eg. ha_commit_trans(). However the initialization of
> +    semi-sync happens only later.
>
> Do we want to keep this pattern? Another option would be to split the logic

No (but kept in this patch series due to reason (1) above).

This pattern was used to have the semisync_master object be statically
allocated, which is always awkward in C++. I agree with your idea that it
could instead just be constructed when the server state is ready (at the
place where currently init_object() is called). This will probably require
some NULL checks in places where it currently calls into the semisync_master
object before it is initialized.

> +  */
> +  if (binlog_engine_used)
> +    repl_semisync_master= new Repl_semi_sync_master_gtid();
>
> This may be confusing to see when going through history, as it would be more
> natural to see this in the follow-up patch. Perhaps add a bit to the git
> commit
> message explaining why you added the GTID-subclass instantiation to this
> commit.

Sure. (In fact this is kind of dead code in the patches as they currently
are, as semi-sync cannot be enabled when --binlog-storage-engine, this needs
to be done in a follow-up patch).

> -            if (repl_semisync_master.report_binlog_update(thd, thd,
> -                                                         log_file_name,
> offset))
> +            if (repl_semisync_master->report_binlog_update(thd, thd,
> +                                                           log_file_name,
> offset,
> +                                                           commit_gtid))
>
>
> This ties into a general concern I have with Trans_binlog_info (described
> later
> in my review, but perhaps instead of passing all log_file_name, offset, and
> commit_gtid into all these functions; there could be some factory which
> creates
> some correct sub-class of only a GTID- or offset- based transaction
> identifier.
> When in GTID mode, I don't think log_file_name and offset are needed, so if
> we
> can abstract that away as well, I think it would make the interface for
> Repl_semi_sync_trx_info/Trans_binlog_info a bit clearer.

Actually, we do always have all of filename, offset, and GTID available, in
either of the modes. The comparison/lookups of transactions uses only
file/pos respectively GTID, but the other part is used in informational
messages.

And my understanding was that you had some task where having the last GTID
acked by the semi-sync slave would be desirable; this way the GTID is
available also when using file/pos (eg. against an old slave).

So having both GTID- and offset-based parameters here is deliberate.

> diff --git a/sql/sql_repl.cc b/sql/sql_repl.cc
> index b192dd644c0..ac14b24ca99 100644
> --- a/sql/sql_repl.cc
> +++ b/sql/sql_repl.cc
>
> @@ -2356,37 +2365,48 @@ send_event_to_slave(binlog_send_info *info,
> Log_event_type event_type,
>
>                Or the until condition is specified as SQL_BEFORE_GTIDS
>              */
> -            info->gtid_skip_group = (flags2 &
> Gtid_log_event::FL_STANDALONE ?
> -                                GTID_SKIP_STANDALONE :
> GTID_SKIP_TRANSACTION);
> +            info->gtid_skip_group = true;
>            }
>          }
>        }
>      }
>    }
>
> -  /*
> -    Skip event group if we have not yet reached the correct slave GTID
> position.
> +  if (info->in_event_group)
> +  {
> +    if (info->standalone)
> +    {
> +      if (!Log_event::is_part_of_group(event_type))
> +      {
> +        info->in_event_group= false;
> +        end_of_group_event= true;
> +      }
> +    }
> +    else
> +    {
> +      if (event_type == XID_EVENT || event_type == XA_PREPARE_LOG_EVENT ||
> +          (event_type == QUERY_EVENT && /* QUERY_COMPRESSED_EVENT would
> never be commmit or rollback */
> +           Query_log_event::peek_is_commit_rollback((uchar*) packet->ptr()
> +
> +                                                    ev_offset,
> +                                                    len - ev_offset,
> +                                                    current_checksum_alg)))
> +      {
> +        info->in_event_group= false;
> +        end_of_group_event= true;
> +      }
> +    }
>
> Perhaps make an inline function to encapsulate all this, as the result of
> each
> check through the various blocks is the same.

Not sure what exactly you mean here. Do you mean a function that would be
used here and in is_until_reached() which has similar checks? Like this:

  bool is_end_of_group_event(bool standalone, Log_event_type event_type,
      String *packet, my_off_t ev_offset, size_t len,
      enum_binlog_checksum_alg current_checksum_alg)
  {
    if (standalone)
      return Log_event::is_part_of_group(event_type);
    else
      return
         (event_type == XID_EVENT || event_type == XA_PREPARE_LOG_EVENT ||
          (event_type == QUERY_EVENT && /* QUERY_COMPRESSED_EVENT would never 
be commmit or rollback */
           Query_log_event::peek_is_commit_rollback((uchar*) packet->ptr()
                                                    ev_offset,
                                                    len - ev_offset,
                                                    current_checksum_alg)));
  }

If so, sure, sounds fine.

> diff --git a/sql/semisync_master.h b/sql/semisync_master.h
> index b1b8c9633e1..ae48ec85bf6 100644
> --- a/sql/semisync_master.h
> +++ b/sql/semisync_master.h

> @@ -32,11 +32,44 @@ struct Tranx_node {
>    char              log_name[FN_REFLEN];
>    bool              thd_valid;             /* thd is valid for signalling */
>    my_off_t          log_pos;
> +  rpl_gtid          gtid;
>    THD               *thd;                   /* The thread awaiting an ACK */
>    struct Tranx_node *next;            /* the next node in the sorted list */
>    struct Tranx_node *hash_next;    /* the next node during hash collision
> */
>  };
>
> +
> +/*
> +  Encapsulate a reference to a binlog position.
> +*/
> +struct Repl_semi_sync_trx_info {
> +  rpl_gtid gtid;
> +  const char *log_file;
> +  my_off_t log_pos;
> +
> +  Repl_semi_sync_trx_info();
> +  Repl_semi_sync_trx_info(const char *file_name, my_off_t log_pos);
> +  Repl_semi_sync_trx_info(const char *file_name, my_off_t log_pos,
> +                          const rpl_gtid *gtid);
> +};
> +
> +
> +/*
> +  Structure to save transaction log filename and position
> +*/
> +struct Trans_binlog_info : public Repl_semi_sync_trx_info {
> +  char file_name_buf[FN_REFLEN];
> +
> +  Trans_binlog_info();
> +  Trans_binlog_info(const char *file_name, my_off_t log_pos);
> +  Trans_binlog_info(const char *file_name, my_off_t log_pos, const
> rpl_gtid *gtid);
> +  void set(const Repl_semi_sync_trx_info *inf);
> +  void set(const char *log_file, my_off_t log_pos, const rpl_gtid *gtid);
> +  void clear();
> +  bool is_clear() { return log_pos == 0; }
> +};
>
> The above hierarchy of Repl_semi_sync_trx_info and Trans_binlog_info isn't
> really clear. It takes a fair bit of investigation to understand the purpose
> of each class, and I'm still not entirely sure. It looks like

Yes, these are a bit ackward, and the purpose is basically reason (1) above,
to keep the existing structure of the code as much as possible.

The struct Repl_semi_sync_trx_info replaces the explicit (log_file, log_pos)
function arguments that appear in numerous places in the old code. Maybe I
could have instead added an extra rpl_gtid *gtid argument to all these
functions, but it seems cleaner to have a single Repl_semi_sync_trx_info *inf
argument passed around instead (though maybe the name for the type could be
better).

Note that the Repl_semi_sync_trx_info, just like the old log_file function
argument, is just a char *, the underlying memory of the string is owned by
the caller, who must control the lifetime (again this is to follow the logic
of the existing code).

The Trans_binlog_info in contrast is used to store the file/pos (and now
also the GTID) with explicit buffer and ownership of the memory for the
filename. This was already used for THD::semisync_info. I added to use it to
replace the old char m_reply_file_name[FN_REFLEN] and similar . I moved the
declaration to make it available in semisync_master.h.

So the reason for the two different structures is to handle the different
ownership requirements of the memory for the file name. I made
Trans_binlog_info inherit from Repl_semi_sync_trx_info just to avoid having
to copy each field when calling a function (which is what the old code was
effectively doing).

In summary: Repl_semi_sync_trx_info is for passing the info between
functions. The Trans_binlog_info is for storing the info in object.

>  2. Would it be worth abstracting away the way a transaction is binlogged
> from
>     here as well? (Mentioned in my previous comment).

I'm not sure what you're asking here.

>    /* Enable the object to enable semi-sync replication inside the master.
> */
> -  int enable_master();
> +  virtual int enable_master() = 0;
> +protected:
> +  template <typename F> int
> +  enable_master(F tranx_alloc);
>
> Why is this a template? Both implementations give lambdas with the same
> signatures. I'd think this would be clearer if the type was a
> function-pointer, or whatever C++ equivalent you want.

Yeah, this is an example of reason (2) above. It's not pretty, is it?

My understanding is that this is how you do function objects in C++. The
structures generated for the two lambdas are not the same, as their capture
is different. Then there is std::function() and std::bind() and stuff like
that, which is cool and clever and all, but in this case would probably only
make things even more needlessly complex.

Yes, a C-style function pointer would be fine here (it doesn't even need any
capture pointer in this case), or even just a single if()-statement inside
enable_master().

But do we even need the class Active_tranx? Couldn't it simply be folded
into the class Repl_semi_sync_master (and same with the subclasses)?

There seems to be a one-to-one relation between the two object. Except that
the m_active_tranxs is allocated and de-allocated when semi-sync is
enabled/disabled, but that doesn't seem necessary. If the two classes were
folded into one, this issue of allocating the right sub-class of
Active_tranx in the common code in Repl_semi_sync_master::enable_master()
would not even exist, things would just happen naturally in the appropriate
constructor.

>    while (entry != NULL)
>    {
> -    if (!compare(entry, log_file_name, log_file_pos))
> +    if (!compare(entry, inf))
> +    {
> +      if (inf->gtid.seq_no == 0)
> +        inf->gtid= entry->gtid;
>
> Overwriting the Repl_semi_sync_trx_info::gtid should be documented at the
> function-level (also probably at the definition in Repl_semi_sync_trx_info
> to help explain its life-cycle. It is very hidden here.

Yeah, this is ugly, agree. In fact I removed it in the following patch (as
you also remarked in another place, the separation between the patches isn't
perfect).

 - Kristian.
_______________________________________________
developers mailing list -- [email protected]
To unsubscribe send an email to [email protected]

[MariaDB developers] Re: Review of cdebe6e: Semi-sync: Refactor...

Reply via email to