On 5/28/20 7:06 PM, Ilya Maximets wrote:
> On 5/23/20 8:36 PM, Han Zhou wrote:
>>
>>
>> On Sat, May 23, 2020 at 10:34 AM Ilya Maximets <[email protected] 
>> <mailto:[email protected]>> wrote:
>>>
>>> Snapshots are huge.  In some cases we could receive several outdated
>>> append replies from the remote server.  This could happen in high
>>> scale cases if the remote server is overloaded and not able to process
>>> all the raft requests in time.  In response to each outdated append
>>> reply we send a full database snapshot.  While the remote server is
>>> already overloaded, those snapshots get stuck in the jsonrpc backlog
>>> for a long time, making it grow up to a few GB.  Since the remote
>>> server wasn't able to process incoming messages in time, it will
>>> likely not be able to process the snapshots either, leading to the
>>> same situation with low chances to recover.  The remote server will
>>> likely get stuck in 'candidate' state, and other servers will grow
>>> their memory consumption due to growing jsonrpc backlogs:
>>
>> Hi Ilya, this patch LGTM. I'm just not clear about the last part of the
>> commit message. Why would the remote server be stuck in 'candidate' state if
>> there are pending messages from the leader for it to handle? If the follower
>> was busy processing older messages, it wouldn't have had a chance to see the
>> election timer expire without receiving a heartbeat from the leader, so it
>> shouldn't try to start voting, right? Otherwise:
>>
>> Acked-by: Han Zhou <[email protected] <mailto:[email protected]>>
> 
> Thanks!  Applied to master.

As agreed during the OVN weekly IRC meeting, I also backported this fix
to branch-2.13.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
