Brandon Nesterenko <notificati...@github.com> writes:

> MDEV-36290: ALTER TABLE with multi-master can cause data loss

>   https://github.com/MariaDB/server/pull/3967

Hi Brandon,

Here is my initial review of the MDEV-36290 draft patch.

Overall, it is clear that this is not something that's appropriate for 10.6.
It implements a new way to apply row events, based on column names (if
available) rather than column number. It is something for next dev version,
12.x.

> commit de27bfbafc22601d53ddc0567401fcb26d28a91c
> Author: Brandon Nesterenko <brandon.nestere...@mariadb.com>
> Date:   Thu Apr 3 12:37:09 2025 -0600
> 
>     MDEV-36290: ALTER TABLE with multi-master can cause data loss

Please do not use the term "data loss" in the description (I know it is not
you who introduced the term ;-).

The issue here is conflicting queries run in parallel against multiple
masters in a multi-master setup. In general, the support of multi-master
in asynchronous replication requires the user to ensure that no conflicting
queries are run, or to ensure that such conflicts will not cause undesirable
behaviour. The behaviour addressed here is just one of many ways that
conflicting ALTER TABLE could cause different data on slave and master.
Using the term "data loss" can cause misunderstanding about what the problem
is and what the change does.

Of course, every case that is handled more correctly is in principle better.

In this case, we introduce the facility to configure
(--binlog_row_metadata=FULL) replication so that it is possible to replicate
between tables on master and slave with different order of the columns. Not
sure how useful such an option is in practice, seems to be very few users
that will really need this, and such need has to be weighed against the
increase in complexity in the code and potential for bugs.

With that said, the approach in the patch generally looks reasonable. A few
initial detail comments below, as you mentioned this is still a draft patch.

As you also mentioned to me earlier, the logic in the patch is not trivial.
I did not yet try to understand in detail all parts of the logic and the
different corner cases.


> +  unsigned char *get_optional_metadata_str()
> +  {
> +    return m_optional_metadata;
> +  }
> +
> +  unsigned int get_optional_metadata_len()
> +  {
> +    return m_optional_metadata_len;
> +  }
> +

Generally I do not like such do-nothing accessor functions. They just add
abstraction and reduce grepability without doing anything useful. That's a
style preference of course.

> +  /*
> +    Maps column index from master to slave. This is determined using the 
> field
> +    names (provied by optional metadata when the master is configured with
> +    binlog_row_metadata=FULL).
> +  */
> +  std::map<uint, uint> master_to_slave_index_map;
> +
> +  /*
> +    When using field names to map from master->slave columns, this keeps 
> track
> +    of column indices on the master which aren't present on the slave.
> +
> +    It is used to skip columns when checking type-conversions and unpacking
> +    row data.
> +  */
> +  std::set<uint> master_unmatched_cols;

Is using std::map/std::set going to be expensive here, this is operations
that will happen on every row-based event (when --binlog-row-metadata=FULL)?
The std:: functions can imply a lot of memory allocations that are not very
visible, and we usually try to avoid them in performance-relevant server
code.

master_to_slave_index_map is just a lookup table of numbers 0..#columns-1,
right? So it doesn't need to be an associative array, a normal array is
sufficient, and doesn't need to be re-allocated each event.

And master_unmatched_cols is again just a set of numbers 0..#columns-1,
so a bitmap could be used.

 - Kristian.
_______________________________________________
developers mailing list -- developers@lists.mariadb.org
To unsubscribe send an email to developers-le...@lists.mariadb.org

Reply via email to