On 7/1/22 01:34, Ilya Maximets wrote:
> Conversion of the database data into JSON object, serialization
> and destruction of that object are the most heavy operations
> during the database compaction.  If these operations are moved
> to a separate thread, the main thread can continue processing
> database requests in the meantime.
> 

Hi Ilya,

> With this change, the compaction is split in 3 phases:
> 
> 1. Initialization:
>    - Create a copy of the database.
>    - Remember current database index.
>    - Start a separate thread to convert a copy of the database
>      into serialized JSON object.
> 
> 2. Wait:
>    - Continue normal operation until compaction thread is done.
>    - Meanwhile, compaction thread:
>      * Convert database copy to JSON.
>      * Serialize resulted JSON.
>      * Destroy original JSON object.
> 
> 3. Finish:
>    - Destroy the database copy.
>    - Take the snapshot created by the thread.
>    - Write on disk.
> 

The approach sounds good to me.

> The key for this schema to be fast is the ability to create
> a shallow copy of the database.  This doesn't take too much
> time allowing the thread to do most of work.
> 
> Database copy is created and destroyed only by the main thread,
> so there is no need for synchronization.
> 
> Such solution allows to reduce the time main thread is blocked
> by compaction by 80-90%.  For example, in ovn-heater tests
> with 120 node density-heavy scenario, where compaction normally
> takes 5-6 seconds at the end of a test, measured compaction
> times was all below 1 second with the change applied.  Also,
> note that these measured times are the sum of phases 1 and 3,
> so actual poll intervals are about half a second in this case.

Nice!

> 
> Only implemented for raft storage for now.  The implementation
> for standalone databases can be added later by using a file
> offset as a database index and copying newly added changes
> from the old file to a new one during ovsdb_log_replace().
> 
> Reported-at: https://bugzilla.redhat.com/2069108
> Signed-off-by: Ilya Maximets <[email protected]>
> ---
>  ovsdb/ovsdb-server.c |  18 +++++-
>  ovsdb/ovsdb.c        | 143 +++++++++++++++++++++++++++++++++++++++----
>  ovsdb/ovsdb.h        |  24 ++++++++
>  ovsdb/raft.c         |   8 ++-
>  ovsdb/raft.h         |   3 +-
>  ovsdb/row.c          |  17 +++++
>  ovsdb/row.h          |   1 +
>  ovsdb/storage.c      |  11 ++--
>  ovsdb/storage.h      |   3 +-
>  9 files changed, 204 insertions(+), 24 deletions(-)
> 
> diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
> index 5549b4e3a..eae2f6679 100644
> --- a/ovsdb/ovsdb-server.c
> +++ b/ovsdb/ovsdb-server.c
> @@ -252,7 +252,9 @@ main_loop(struct server_config *config,
>                  remove_db(config, node,
>                            xasprintf("removing database %s because storage "
>                                      "disconnected permanently", node->name));
> -            } else if (ovsdb_storage_should_snapshot(db->db->storage)) {
> +            } else if (!ovsdb_snapshot_in_progress(db->db)
> +                       && (ovsdb_storage_should_snapshot(db->db->storage) ||
> +                           ovsdb_snapshot_ready(db->db))) {
>                  log_and_free_error(ovsdb_snapshot(db->db, trim_memory));
>              }
>          }
> @@ -287,6 +289,7 @@ main_loop(struct server_config *config,
>              ovsdb_trigger_wait(db->db, time_msec());
>              ovsdb_storage_wait(db->db->storage);
>              ovsdb_storage_read_wait(db->db->storage);
> +            ovsdb_snapshot_wait(db->db);
>          }
>          if (run_process) {
>              process_wait(run_process);
> @@ -1552,11 +1555,20 @@ ovsdb_server_compact(struct unixctl_conn *conn, int 
> argc,
>              ? !strcmp(node->name, db_name)
>              : node->name[0] != '_') {
>              if (db->db) {
> +                struct ovsdb_error *error = NULL;
> +
>                  VLOG_INFO("compacting %s database by user request",
>                            node->name);
>  
> -                struct ovsdb_error *error = ovsdb_snapshot(db->db,
> -                                                           trim_memory);
> +                error = ovsdb_snapshot(db->db, trim_memory);
> +                if (!error && ovsdb_snapshot_in_progress(db->db)) {
> +                    while (ovsdb_snapshot_in_progress(db->db)) {
> +                        ovsdb_snapshot_wait(db->db);
> +                        poll_block();
> +                    }
> +                    error = ovsdb_snapshot(db->db, trim_memory);
> +                }
> +
>                  if (error) {
>                      char *s = ovsdb_error_to_string(error);
>                      ds_put_format(&reply, "%s\n", s);
> diff --git a/ovsdb/ovsdb.c b/ovsdb/ovsdb.c
> index 91b4a01af..8cbefbe3d 100644
> --- a/ovsdb/ovsdb.c
> +++ b/ovsdb/ovsdb.c
> @@ -25,9 +25,13 @@
>  #include "file.h"
>  #include "monitor.h"
>  #include "openvswitch/json.h"
> +#include "openvswitch/poll-loop.h"
> +#include "ovs-thread.h"
>  #include "ovsdb-error.h"
>  #include "ovsdb-parser.h"
>  #include "ovsdb-types.h"
> +#include "row.h"
> +#include "seq.h"
>  #include "simap.h"
>  #include "storage.h"
>  #include "table.h"
> @@ -461,6 +465,21 @@ ovsdb_destroy(struct ovsdb *db)
>      if (db) {
>          struct shash_node *node;
>  
> +        /* Need to wait for compaction thread to finish the work. */
> +        while (ovsdb_snapshot_in_progress(db)) {
> +            ovsdb_snapshot_wait(db);
> +            poll_block();
> +        }
> +        if (ovsdb_snapshot_ready(db)) {
> +            struct ovsdb_error *error = ovsdb_snapshot(db, false);
> +
> +            if (error) {
> +                char *s = ovsdb_error_to_string_free(error);
> +                VLOG_INFO("%s: %s", db->name, s);
> +                free(s);

Nit: this is very similar to log_and_free_error().

For now the rest of the patch looks good to me.  I still want to test it
some more though.

Thanks,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to