On 3/27/23 21:42, Ilya Maximets wrote:
> Currently, database schema conversion in case of clustered database
> produces a transaction record with both new schema and converted
> database data. So, the sequence of events is following:
>
>   1. Get the new schema.
>   2. Convert the database to a new schema.
>   3. Translate the newly converted database into JSON.
>   4. Write the schema + data JSON to the storage.
>   5. Destroy converted version of a database.
>   6. Read schema + data JSON from the storage and parse.
>   7. Create a new database from a parsed database data.
>   8. Replace current database with the new one.
>
> Most of these steps are very computationally expensive. Also,
> conversion to/from JSON is much more expensive than direct database
> conversion with ovsdb_convert() that can make use of shallow data
> copies.
>
> Instead of doing all that, let's make use of previously introduced
> ability to not write the converted data into the storage. The process
> will look like this then:
>
>   1. Get the new schema.
>   2. Convert the database to a new schema
>      (to verify that it is possible).
>   3. Write the schema to the storage.
>   4. Destroy converted version of a database.
>   5. Read the new schema from the storage and parse.
>   6. Convert the database to a new schema.
>   7. Replace current database with the new one.
>
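To make sure I read the two sequences correctly, here is a toy model of them. This is only a sketch under my own assumptions: `convert()`, `old_flow()` and `new_flow()` are hypothetical names, plain dicts stand in for the database and the schema, and nothing here is the real ovsdb code or file format. It only shows that both flows end with the same database while the new one stores a much smaller record.

```python
import json


def convert(db, schema):
    """Toy stand-in for a direct conversion: keep only columns in the new schema."""
    return {
        table: [{col: row.get(col) for col in schema[table]}
                for row in db.get(table, [])]
        for table in schema
    }


def old_flow(db, new_schema):
    """Old sequence: the stored record carries the schema and the converted data."""
    converted = convert(db, new_schema)                              # step 2
    record = json.dumps({"schema": new_schema, "data": converted})   # steps 3-4
    del converted                                                    # step 5
    parsed = json.loads(record)                                      # step 6
    return parsed["data"], len(record)                               # steps 7-8


def new_flow(db, new_schema):
    """New sequence: the stored record carries the schema only."""
    convert(db, new_schema)                            # step 2 (verification only)
    record = json.dumps({"schema": new_schema})        # step 3
    parsed = json.loads(record)                        # step 5
    return convert(db, parsed["schema"]), len(record)  # steps 6-7


db = {"Port": [{"name": "p1", "mtu": 1500, "legacy": 1},
               {"name": "p2", "mtu": 9000, "legacy": 0}]}
new_schema = {"Port": ["name", "mtu"]}

old_db, old_size = old_flow(db, new_schema)
new_db, new_size = new_flow(db, new_schema)
```

In this model the JSON round trip touches only the schema in the new flow, at the cost of running the conversion twice.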
Smart! One minor comment below. Otherwise, LGTM.

Acked-by: Dumitru Ceara <dce...@redhat.com>

> Most of the operations here are performed on the small schema object,
> instead of the actual database data. Two remaining data operations
> (actual conversion) are noticeably faster than conversion to/from
> JSON due to reference counting and shallow data copies.
>
> Steps 4-6 can be optimized later to not convert twice on the
> process that initiates the conversion.
>
> The change results in following performance improvements in conversion
> of OVN_Southbound database schema from version 20.23.0 to 20.27.0
> (measured on a single-server RAFT cluster with no clients):
>
>          |           Before            |            After
>          +---------+-------------------+---------+-------------------
>  DB size |  Total  | Max poll interval |  Total  | Max poll interval
> ---------+---------+-------------------+---------+-------------------
>   542 MB | 47 sec. |      26 sec.      | 15 sec. |      10 sec.
>   225 MB | 19 sec. |      10 sec.      |  6 sec. |     4.5 sec.
>
> 542 MB database had 19.5 M atoms, 225 MB database had 7.5 M atoms.
>
> Overall performance improvement is about 3x.
>
> Also, note that before this change database conversion basically
> doubles the database file on disk. Now it only writes a small
> schema JSON.
>
> Since the change requires backward-incompatible database file format
> changes, documentation is updated on how to perform an upgrade.
> Handled the same way as we did for the previous incompatible format
> change in 2.15 (column diffs).
>
> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
> ---
>  Documentation/ref/ovsdb.7.rst | 63 +++++++++++++++++++++++++++++++++++
>  NEWS                          | 10 ++++++
>  ovsdb/ovsdb-server.c          |  7 ++++
>  ovsdb/ovsdb.c                 | 34 +++++++++++++++++++
>  ovsdb/ovsdb.h                 |  3 ++
>  ovsdb/trigger.c               | 11 ++++--
>  6 files changed, 125 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/ref/ovsdb.7.rst b/Documentation/ref/ovsdb.7.rst
> index 980ba29e7..84b153d24 100644
> --- a/Documentation/ref/ovsdb.7.rst
> +++ b/Documentation/ref/ovsdb.7.rst
> @@ -213,6 +213,12 @@ Open vSwitch 2.6 introduced support for the active-backup service model.
>     `Upgrading from version 2.14 and earlier to 2.15 and later`_ and
>     `Downgrading from version 2.15 and later to 2.14 and earlier`_.
>
> +   Another change happened in version 3.2. To upgrade/downgrade the
> +   ``ovsdb-server`` processes across this version follow the instructions
> +   described under
> +   `Upgrading from version 3.1 and earlier to 3.2 and later`_ and
> +   `Downgrading from version 3.2 and later to 3.1 and earlier`_.
> +
>  Clustered Database Service Model
>  --------------------------------
>
> @@ -287,6 +293,12 @@ schema, which is covered later under `Upgrading or Downgrading a Database`_.)
>     `Upgrading from version 2.14 and earlier to 2.15 and later`_ and
>     `Downgrading from version 2.15 and later to 2.14 and earlier`_.
>
> +   Another change happened in version 3.2. To upgrade/downgrade the
> +   ``ovsdb-server`` processes across this version follow the instructions
> +   described under
> +   `Upgrading from version 3.1 and earlier to 3.2 and later`_ and
> +   `Downgrading from version 3.2 and later to 3.1 and earlier`_.
> +
>  Clustered OVSDB does not support the OVSDB "ephemeral columns" feature.
>  ``ovsdb-tool`` and ``ovsdb-client`` change ephemeral columns into persistent
>  ones when they work with schemas for clustered databases. Future versions of
> @@ -341,6 +353,57 @@ For all service models it's required to:
>
>  3. Downgrade and restart ``ovsdb-server`` processes.
>
> +Upgrading from version 3.1 and earlier to 3.2 and later
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +There is another change of a database file format in version 3.2 that doesn't
> +allow older versions of ``ovsdb-server`` to read the database file modified by
> +the ``ovsdb-server`` version 3.2 or later. This also affects runtime
> +communications between servers in **cluster** service models. To upgrade the
> +``ovsdb-server`` processes from one version of Open vSwitch (3.1 or earlier) to
> +another (3.2 or higher) instructions below should be followed. (This is
> +different from upgrading a database schema, which is covered later under
> +`Upgrading or Downgrading a Database`_.)
> +
> +In case of **standalone** or **active-backup** service model no special
> +handling during upgrade is required.
> +
> +For the **cluster** service model recommended upgrade strategy is following:
> +
> +1. Upgrade processes one at a time. Each ``ovsdb-server`` process after
> +   upgrade should be started with ``--disable-file-no-data-conversion`` command
> +   line argument.
> +
> +2. When all ``ovsdb-server`` processes upgraded, use ``ovs-appctl`` to invoke
> +   ``ovsdb/file/no-data-conversion-enable`` command on each of them or restart
> +   all ``ovsdb-server`` processes one at a time without
> +   ``--disable-file-no-data-conversion`` command line option.
> +
> +Downgrading from version 3.2 and later to 3.1 and earlier
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Similar to upgrading covered under `Upgrading from version 3.1 and earlier to
> +3.2 and later`_, downgrading from the ``ovsdb-server`` version 3.2 and later
> +to 3.1 and earlier requires additional steps. (This is different from
> +upgrading a database schema, which is covered later under
> +`Upgrading or Downgrading a Database`_.)
> +
> +For all service models it's required to:
> +
> +1. Compact all database files via ``ovsdb-server/compact`` command with
> +   ``ovs-appctl`` utility. This should be done for each involved
> +   ``ovsdb-server`` process separately (single process for **standalone**
> +   service model, all involved processes for **active-backup** and **cluster**
> +   service models).
> +
> +2. Stop all ``ovsdb-server`` processes. Make sure that no database schema
> +   conversion operations were performed between steps 1 and 2. For
> +   **standalone** and **active-backup** service models, the database compaction
> +   can be performed after stopping all the processes instead with the
> +   ``ovsdb-tool compact`` command.
> +
> +3. Downgrade and restart ``ovsdb-server`` processes.
> +
>  Understanding Cluster Consistency
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> diff --git a/NEWS b/NEWS
> index 8771ee618..cf9df6106 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -1,5 +1,15 @@
>  Post-v3.1.0
>  --------------------
> +   - OVSDB:
> +     * Changed format in which ovsdb schema conversion operations are stored in
> +       clustered database files. Such operations now may not contain the data,
> +       only the new schema. This allows to significantly improve the schema

Nit: I'd rephrase this:

Such operations are now allowed to contain the bare schema (without data).
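For reference, this is how I read the recommended **cluster** upgrade order from the documentation hunk, written as a small helper that only builds the list of steps and runs nothing. It is a sketch under assumptions: `cluster_upgrade_plan()` and the target names are hypothetical, and I picked the `ovs-appctl` variant of step 2 (the alternative in the text is a second round of restarts without the option).

```python
def cluster_upgrade_plan(targets):
    """Return the ordered upgrade steps for a cluster as human-readable strings.

    `targets` are hypothetical ovs-appctl targets, one per ovsdb-server process.
    """
    steps = []
    # Step 1: upgrade one process at a time, starting each upgraded process
    # with the new format disabled.
    for target in targets:
        steps.append(
            f"upgrade {target}, then start it with "
            "--disable-file-no-data-conversion"
        )
    # Step 2: only after every process is upgraded, enable the new format
    # on each of them.
    for target in targets:
        steps.append(
            f"ovs-appctl -t {target} ovsdb/file/no-data-conversion-enable"
        )
    return steps


plan = cluster_upgrade_plan(["sb1", "sb2", "sb3"])
for step in plan:
    print(step)
```

The point of the two separate loops is that no server is allowed to write a schema-only conversion record while an older server is still part of the cluster.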