On 7/1/22 01:34, Ilya Maximets wrote: > Conversion of the database data into JSON object, serialization > and destruction of that object are the most heavy operations > during the database compaction. If these operations are moved > to a separate thread, the main thread can continue processing > database requests in the meantime. >
Hi Ilya, > With this change, the compaction is split in 3 phases: > > 1. Initialization: > - Create a copy of the database. > - Remember current database index. > - Start a separate thread to convert a copy of the database > into serialized JSON object. > > 2. Wait: > - Continue normal operation until compaction thread is done. > - Meanwhile, compaction thread: > * Convert database copy to JSON. > * Serialize resulted JSON. > * Destroy original JSON object. > > 3. Finish: > - Destroy the database copy. > - Take the snapshot created by the thread. > - Write on disk. > The approach sounds good to me. > The key for this schema to be fast is the ability to create > a shallow copy of the database. This doesn't take too much > time allowing the thread to do most of work. > > Database copy is created and destroyed only by the main thread, > so there is no need for synchronization. > > Such solution allows to reduce the time main thread is blocked > by compaction by 80-90%. For example, in ovn-heater tests > with 120 node density-heavy scenario, where compaction normally > takes 5-6 seconds at the end of a test, measured compaction > times was all below 1 second with the change applied. Also, > note that these measured times are the sum of phases 1 and 3, > so actual poll intervals are about half a second in this case. Nice! > > Only implemented for raft storage for now. The implementation > for standalone databases can be added later by using a file > offset as a database index and copying newly added changes > from the old file to a new one during ovsdb_log_replace(). > > Reported-at: https://bugzilla.redhat.com/2069108 > Signed-off-by: Ilya Maximets <[email protected]> > --- > ovsdb/ovsdb-server.c | 18 +++++- > ovsdb/ovsdb.c | 143 +++++++++++++++++++++++++++++++++++++++---- > ovsdb/ovsdb.h | 24 ++++++++ > ovsdb/raft.c | 8 ++- > ovsdb/raft.h | 3 +- > ovsdb/row.c | 17 +++++ > ovsdb/row.h | 1 + > ovsdb/storage.c | 11 ++-- > ovsdb/storage.h | 3 +- > 9 files changed, 204 insertions(+), 24 deletions(-) > > diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c > index 5549b4e3a..eae2f6679 100644 > --- a/ovsdb/ovsdb-server.c > +++ b/ovsdb/ovsdb-server.c > @@ -252,7 +252,9 @@ main_loop(struct server_config *config, > remove_db(config, node, > xasprintf("removing database %s because storage " > "disconnected permanently", node->name)); > - } else if (ovsdb_storage_should_snapshot(db->db->storage)) { > + } else if (!ovsdb_snapshot_in_progress(db->db) > + && (ovsdb_storage_should_snapshot(db->db->storage) || > + ovsdb_snapshot_ready(db->db))) { > log_and_free_error(ovsdb_snapshot(db->db, trim_memory)); > } > } > @@ -287,6 +289,7 @@ main_loop(struct server_config *config, > ovsdb_trigger_wait(db->db, time_msec()); > ovsdb_storage_wait(db->db->storage); > ovsdb_storage_read_wait(db->db->storage); > + ovsdb_snapshot_wait(db->db); > } > if (run_process) { > process_wait(run_process); > @@ -1552,11 +1555,20 @@ ovsdb_server_compact(struct unixctl_conn *conn, int > argc, > ? !strcmp(node->name, db_name) > : node->name[0] != '_') { > if (db->db) { > + struct ovsdb_error *error = NULL; > + > VLOG_INFO("compacting %s database by user request", > node->name); > > - struct ovsdb_error *error = ovsdb_snapshot(db->db, > - trim_memory); > + error = ovsdb_snapshot(db->db, trim_memory); > + if (!error && ovsdb_snapshot_in_progress(db->db)) { > + while (ovsdb_snapshot_in_progress(db->db)) { > + ovsdb_snapshot_wait(db->db); > + poll_block(); > + } > + error = ovsdb_snapshot(db->db, trim_memory); > + } > + > if (error) { > char *s = ovsdb_error_to_string(error); > ds_put_format(&reply, "%s\n", s); > diff --git a/ovsdb/ovsdb.c b/ovsdb/ovsdb.c > index 91b4a01af..8cbefbe3d 100644 > --- a/ovsdb/ovsdb.c > +++ b/ovsdb/ovsdb.c > @@ -25,9 +25,13 @@ > #include "file.h" > #include "monitor.h" > #include "openvswitch/json.h" > +#include "openvswitch/poll-loop.h" > +#include "ovs-thread.h" > #include "ovsdb-error.h" > #include "ovsdb-parser.h" > #include "ovsdb-types.h" > +#include "row.h" > +#include "seq.h" > #include "simap.h" > #include "storage.h" > #include "table.h" > @@ -461,6 +465,21 @@ ovsdb_destroy(struct ovsdb *db) > if (db) { > struct shash_node *node; > > + /* Need to wait for compaction thread to finish the work. */ > + while (ovsdb_snapshot_in_progress(db)) { > + ovsdb_snapshot_wait(db); > + poll_block(); > + } > + if (ovsdb_snapshot_ready(db)) { > + struct ovsdb_error *error = ovsdb_snapshot(db, false); > + > + if (error) { > + char *s = ovsdb_error_to_string_free(error); > + VLOG_INFO("%s: %s", db->name, s); > + free(s); Nit: this is very similar to log_and_free_error(). For now the rest of the patch looks good to me. I still want to test it some more though. Thanks, Dumitru _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
