On 8/5/20 11:58 PM, Han Zhou wrote:
>
>
> On Wed, Aug 5, 2020 at 12:48 PM Dumitru Ceara <dce...@redhat.com> wrote:
>>
>> On 8/5/20 7:48 PM, Han Zhou wrote:
>> >
>> >
>> > On Wed, Aug 5, 2020 at 8:28 AM Dumitru Ceara <dce...@redhat.com> wrote:
>> >>
>> >> Every time a follower has to install a snapshot received from the
>> >> leader, it should also replace the data in memory. Right now this only
>> >> happens when snapshots are installed that also change the schema.
>> >>
>> >> This can lead to inconsistent DB data on follower nodes and the snapshot
>> >> may fail to get applied.
>> >>
>> >> CC: Han Zhou <hz...@ovn.org>
>> >> Fixes: bda1f6b60588 ("ovsdb-server: Don't disconnect clients after raft install_snapshot.")
>> >> Signed-off-by: Dumitru Ceara <dce...@redhat.com>
>> >
>> > Thanks Dumitru! This is a great finding, and sorry for my mistake.
>> > This patch looks good to me. Just one minor comment below on the test
>> > case. Otherwise:
>> >
>> > Acked-by: Han Zhou <hz...@ovn.org>
>> >
>>
>> Thanks Han for the review! I fixed the test case as you suggested and
>> sent v2.
>>
>> I was wondering if this is also the root cause for the issue you
>> reported a while back during the OVN meeting. In my scenario, if a
>> follower ends up in this situation, and if the DB gets compacted online
>> afterwards, the DB file also becomes inconsistent and in some cases
>> (after the DB server is restarted) all write transactions from clients
>> are rejected with "ovsdb-error: inconsistent data".
>>
> Yes, I believe it is the root cause. I thought this patch was exactly
> for that issue. Is it also for something else?
>
This patch is for the issue I described above: an inconsistent DB on a
follower, followed by online compaction of the DB, which corrupts the DB
file too. I wasn't sure if this was also what you were hitting in your
deployment; I just wanted to check if there are any other known potential
issues we need to investigate.

>> Related to that I also sent the following patch to make the ovsdb-server
>> storage state available via appctl commands:
>>
>> https://patchwork.ozlabs.org/project/openvswitch/patch/1596467128-13004-1-git-send-email-dce...@redhat.com/
>>
>
> I will take a look.
>
> Thanks,
> Han
>

Thanks!
Dumitru