On 5/28/20 7:06 PM, Ilya Maximets wrote:
> On 5/23/20 8:36 PM, Han Zhou wrote:
>>
>> On Sat, May 23, 2020 at 10:34 AM Ilya Maximets <[email protected]> wrote:
>>>
>>> Snapshots are huge. In some cases we could receive several outdated
>>> append replies from the remote server. This could happen in high
>>> scale cases if the remote server is overloaded and not able to process
>>> all the raft requests in time. As an action to each outdated append
>>> reply we're sending full database snapshot. While remote server is
>>> already overloaded those snapshots will stuck in jsonrpc backlog for
>>> a long time making it grow up to few GB. Since remote server wasn't
>>> able to timely process incoming messages it will likely not able to
>>> process snapshots leading to the same situation with low chances to
>>> recover. Remote server will likely stuck in 'candidate' state, other
>>> servers will grow their memory consumption due to growing jsonrpc
>>> backlogs:
>>
>> Hi Ilya, this patch LGTM. Just not not clear about this last part of the
>> commit message. Why would remote server stuck in 'candidate' state if there
>> are pending messages from leader for it to handle? If the follower was busy
>> processing older messages, it wouldn't have had a chance to see election
>> timer timeout without receiving heartbeat from leader, so it shouldn't try
>> to start voting, right? Otherwise:
>>
>> Acked-by: Han Zhou <[email protected]>
>
> Thanks! Applied to master.

As agreed during the OVN weekly IRC meeting, I also backported this fix to
branch-2.13.

Best regards, Ilya Maximets.
