On Wed, Aug 5, 2020 at 3:04 PM Dumitru Ceara <[email protected]> wrote:
>
> On 8/5/20 11:58 PM, Han Zhou wrote:
> >
> >
> > On Wed, Aug 5, 2020 at 12:48 PM Dumitru Ceara <[email protected]> wrote:
> >>
> >> On 8/5/20 7:48 PM, Han Zhou wrote:
> >> >
> >> >
> >> > On Wed, Aug 5, 2020 at 8:28 AM Dumitru Ceara <[email protected]> wrote:
> >> >>
> >> >> Every time a follower has to install a snapshot received from the
> >> >> leader, it should also replace the data in memory. Right now this only
> >> >> happens when snapshots are installed that also change the schema.
> >> >>
> >> >> This can lead to inconsistent DB data on follower nodes and the snapshot
> >> >> may fail to get applied.
> >> >>
> >> >> CC: Han Zhou <[email protected]>
> >> >> Fixes: bda1f6b60588 ("ovsdb-server: Don't disconnect clients after raft install_snapshot.")
> >> >> Signed-off-by: Dumitru Ceara <[email protected]>
> >> >
> >> > Thanks Dumitru! This is a great finding, and sorry for my mistake.
> >> > This patch looks good to me. Just one minor comment below on the test
> >> > case. Otherwise:
> >> >
> >> > Acked-by: Han Zhou <[email protected]>
> >> >
> >>
> >> Thanks Han for the review! I fixed the test case as you suggested and
> >> sent v2.
> >>
> >> I was wondering if this is also the root cause for the issue you
> >> reported a while back during the OVN meeting.
> >> In my scenario, if a
> >> follower ends up in this situation, and if the DB gets compacted online
> >> afterwards, the DB file also becomes inconsistent and in some cases
> >> (after the DB server is restarted) all write transactions from clients
> >> are rejected with "ovsdb-error: inconsistent data".
> >>
> >
> > Yes, I believe it is the root cause. I thought this patch was exactly
> > for that issue. Is it also for something else?
> >
>
> This patch is for the issue I described above: inconsistent DB on
> follower followed by online compacting of the DB which corrupts the DB
> file too. I wasn't sure if this was also what you were hitting in your
> deployment, I just wanted to check if there are any other known
> potential issues we need to investigate.
>

OK, I think it should be the same issue. There are no other potential
issues related to "inconsistent data" discovered so far.

> >> Related to that I also sent the following patch to make the ovsdb-server
> >> storage state available via appctl commands:
> >>
> >> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
> >>
> >
> > I will take a look.
> >
> > Thanks,
> > Han
> >
>
> Thanks!
> Dumitru
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
