On Thu, 2022-12-15 at 19:04 +0100, Frode Nordahl wrote:
> Hello, Dan,
>
> On Thu, Dec 15, 2022 at 6:57 PM Dan Williams <[email protected]> wrote:
> >
> > Signed-off-by: Dan Williams <[email protected]>
> > ---
> >  utilities/ovs-lib.in | 13 +++++++++----
> >  1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> > index 13477a6a9e991..36e63312b6cba 100644
> > --- a/utilities/ovs-lib.in
> > +++ b/utilities/ovs-lib.in
> > @@ -475,11 +475,16 @@ upgrade_db () {
> >  }
> >
> >  upgrade_cluster () {
> > -    local DB_SCHEMA=$1 DB_SERVER=$2
> > +    local DB_SCHEMA=$1 DB_SERVER=$2 TIMEOUT_SECONDS=$3
> >      local schema_name=$(ovsdb-tool schema-name $1) || return 1
> >
> > -    action "Waiting for $schema_name to come up" ovsdb-client -t 30 wait "$DB_SERVER" "$schema_name" connected || return $?
> > -    local db_version=$(ovsdb-client -t 10 get-schema-version "$DB_SERVER" "$schema_name") || return $?
> > +    timeout_arg=30
> > +    if [ -n "$TIMEOUT_SECONDS" ]; then
> > +      timeout_arg="$TIMEOUT_SECONDS"
> > +    fi
> > +
> > +    action "Waiting for $schema_name to come up" ovsdb-client -t $timeout_arg wait "$DB_SERVER" "$schema_name" connected || return $?
> > +    local db_version=$(ovsdb-client -t $timeout_arg get-schema-version "$DB_SERVER" "$schema_name") || return $?
> >      local target_version=$(ovsdb-tool schema-version "$DB_SCHEMA") || return $?
> >
> >      if ovsdb-tool compare-versions "$db_version" == "$target_version"; then
> > @@ -487,7 +492,7 @@ upgrade_cluster () {
> >      elif ovsdb-tool compare-versions "$db_version" ">" "$target_version"; then
> >          log_warning_msg "Database $schema_name has newer schema version ($db_version) than our local schema ($target_version), possibly an upgrade is partially complete?"
> >      else
> > -        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t 30 convert "$DB_SERVER" "$DB_SCHEMA"
> > +        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t $timeout_arg convert "$DB_SERVER" "$DB_SCHEMA"
> >      fi
> >  }
> >
> > --
> > 2.38.1
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
> Thank you for proposing this patch!
>
> We've been seeing reports of schema upgrades failing in the too and
> have waited for some way to reproduce and see if this would be a fix.
>
> Are you seeing this with clustered databases, and could your problem
> be related to the election timer? If it is, raising the client side
> timer alone could be problematic.

Yes, clustered. A 450MB LB-heavy database generated by ovn-kubernetes
with OVN 22.06 (which lacks DP groups for SB load balancers) was being
upgraded from OVN 22.06 -> 22.09 (but using the same OVS 2.17 version,
so no change to ovsdb-server). When the ovsdb-servers got restarted as
part of the ovn-kube container upgrade, they took longer to read+parse
the database than the 30s upgrade_cluster() timer, so the container
failed and was put in CrashLoopBackOff by Kubernetes. ovn-ctl was never
able to update the schema, and thus 22.09 ovn-northd was never able to
reduce the DB size by rewriting it to use DP groups for SB LBs and
recover.

This patch is really a workaround and needs a corresponding ovn-ctl
patch to accept a timeout for the NB/SB DB start functions that our
OpenShift container scripts would pass.

The real fix is, like Ilya suggests, "reduce the size of the DB", as
we've found that to be the most effective scale strategy for OpenShift
and ovn-kubernetes. And I think that strategy has paid off tremendously
over the last 2 years we've been working on OVN & ovsdb-server scale.
Huge credit to Ilya and the OVN team for making that happen...

Dan

>
> I recently raised a discussion about this on the list to figure out
> possible paths forward [0][1].
>
> 0:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> 1: https://bugs.launchpad.net/bugs/1999605
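
For illustration only, the optional-timeout handling in the patch above
could also be expressed with a parameter-expansion default. This is a
minimal sketch under stated assumptions, not the patch itself:
resolve_timeout and upgrade_wait_cmd are hypothetical names that do not
exist in ovs-lib, and the sketch only prints the ovsdb-client command
line instead of running it.

```shell
# Hypothetical helper (not in ovs-lib): pick the timeout that would be
# handed to "ovsdb-client -t". Falls back to 30 seconds when the
# argument is unset or empty, matching the default in the patch.
resolve_timeout () {
    echo "${1:-30}"
}

# Hypothetical helper (not in ovs-lib): build, but do not run, the
# "wait for the database to come up" command for a given server,
# schema name and optional timeout.
upgrade_wait_cmd () {
    db_server=$1
    schema_name=$2
    timeout_arg=$(resolve_timeout "$3")
    echo "ovsdb-client -t $timeout_arg wait $db_server $schema_name connected"
}
```

A caller that omits the third argument keeps the existing 30s
behaviour, while a container script could pass a larger value for big
databases. The socket path and schema name a caller supplies are its
own; nothing here is taken from ovn-ctl.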
