On Mon, Mar 18, 2019 at 2:49 PM Ben Pfaff <[email protected]> wrote:
>
> On Fri, Mar 15, 2019 at 04:17:35PM -0700, Han Zhou wrote:
> > From: Han Zhou <[email protected]>
> >
> > When an update is requested from a follower, the leader sends an
> > AppendRequest to all followers and waits until it has received an
> > AppendReply from a majority. It then updates the commit index, and the
> > new entry is regarded as committed in the Raft log. However, followers
> > (including the one that initiated the request) are not notified of this
> > commit until the next heartbeat (ping timeout) if there are no other
> > pending requests. This results in long latency for updates made through
> > followers, especially when a batch of updates is requested through the
> > same follower.
> >
> > $ time for i in `seq 1 100`; do ovn-nbctl ls-add ls$i; done
> >
> > real    0m34.154s
> > user    0m0.083s
> > sys     0m0.250s
> >
> > This patch solves the problem by sending a heartbeat as soon as the
> > commit index is updated in the leader. It also avoids unnecessary
> > heartbeats by resetting the ping timer whenever an AppendRequest is
> > broadcast. With this patch, performance improves by more than 50 times
> > in the same test:
> >
> > $ time for i in `seq 1 100`; do ovn-nbctl ls-add ls$i; done
> >
> > real    0m0.564s
> > user    0m0.080s
> > sys     0m0.199s
> >
> > Some sleeps are added to the torture test cases because of the improved
> > performance; otherwise the tests would all be skipped.
> >
> > Signed-off-by: Han Zhou <[email protected]>
> > ---
> >
> > Notes:
> >     v1->v2: adjust torture test case so that it passes without overloading the CPU.
>
> With this patch, on my laptop, running test 2525 seems to always skip
> it, with results similar to the following:
>
>     ## ------------------------------- ##
>     ## openvswitch 2.11.90 test suite. ##
>     ## ------------------------------- ##
>     2525: OVSDB 3-server torture test - kill/restart leader skipped (ovsdb-cluster.at:198)
>
>     ## ------------- ##
>     ## Test results. ##
>     ## ------------- ##
>
>     0 tests were successful.
>     1 test was skipped.
>     make[3]: Leaving directory '/home/blp/nicira/ovs/_build'
>     make[2]: Leaving directory '/home/blp/nicira/ovs/_build'
>     make[1]: Leaving directory '/home/blp/nicira/ovs/_build'
>
>     real    0m9.194s
>     user    0m3.693s
>     sys     0m1.658s
>     blp@sigill:~/nicira/ovs/_build(0)$

Sorry to hear that :(. It was pretty stable on my laptop; maybe my
laptop is slower than yours :). I just sent v3 to make the test case
more stable. I reduced the interval of the checking loop so that it
detects phase changes and triggers the operations as soon as possible.
I ran all torture tests with -j1, -j5, and -j10, and all cases passed
without being skipped. I hope it is stable on your laptop, too. Could
you try again?
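
For anyone following along, here is a toy model of the idea in the patch
description above. It is only a sketch, not the code in the patch: the
class, the method names, and the 1000 ms ping interval are all made up
for illustration, and it assumes each follower replies at most once per
entry.

```python
from dataclasses import dataclass, field

PING_INTERVAL_MS = 1000  # hypothetical heartbeat interval, not the OVS value


@dataclass
class ToyLeader:
    """Toy model of a Raft leader; not the ovsdb/raft.c implementation."""
    n_servers: int                        # cluster size, including the leader
    now_ms: int = 0                       # simulated clock
    log_len: int = 0                      # index of the last appended entry
    commit_index: int = 0
    ping_deadline_ms: int = PING_INTERVAL_MS
    acks: dict = field(default_factory=dict)  # entry index -> copies stored
    sent: list = field(default_factory=list)  # (time_ms, kind, commit_index)

    def _broadcast(self, kind):
        # Every broadcast carries the current commit index, so it also
        # serves as a heartbeat: push the ping deadline back.
        self.sent.append((self.now_ms, kind, self.commit_index))
        self.ping_deadline_ms = self.now_ms + PING_INTERVAL_MS

    def append(self):
        # A new entry is appended locally and sent to all followers.
        self.log_len += 1
        self.acks[self.log_len] = 1  # the leader's own copy
        self._broadcast("append_request")
        return self.log_len

    def on_append_reply(self, index):
        # Assumes each follower replies at most once per entry.
        self.acks[index] += 1
        majority = self.n_servers // 2 + 1
        if index > self.commit_index and self.acks[index] >= majority:
            self.commit_index = index
            # Notify followers right away instead of waiting for the
            # next ping timeout.
            self._broadcast("heartbeat")

    def tick(self, elapsed_ms):
        # Advance the simulated clock; send a periodic heartbeat if due.
        self.now_ms += elapsed_ms
        if self.now_ms >= self.ping_deadline_ms:
            self._broadcast("heartbeat")
```

In this model, with three servers, the first AppendReply already forms a
majority together with the leader, so the commit index goes out
immediately, and the next periodic heartbeat is only due one full ping
interval after that.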

Thanks,
Han
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev