Yakov Zhdanov created IGNITE-5056:
-------------------------------------

             Summary: Implement communication backpressure control
                 Key: IGNITE-5056
                 URL: https://issues.apache.org/jira/browse/IGNITE-5056
             Project: Ignite
          Issue Type: Improvement
            Reporter: Yakov Zhdanov
            Assignee: Yakov Zhdanov
            Priority: Critical
             Fix For: 2.1


Problem
Currently backpressure control relies on semaphore on sending side that ensures 
that sending queue cannot be  overflown and a special counter on receiving side 
that stops reading from the socket when unprocessed message count outgrows 
limit config parameter.

In some scenarios it may lead to a distributed deadlock. E.g. we send many 
async jobs to remote nodes which in turn do sync cache operations. If task 
master node is a backup or primary for some cache updates and has already 
scheduled too many job requests for send it will not be able to respond to 
cache requests thus remote jobs would never complete.

Solution
Reading from socket should never stop

Design notes
* add IgniteConfiguration.maxAsyncRequests and propagate it via node attributes 
to all nodes of the cluster. All nodes may have different value (however this 
is unlikely).
* add a flag to GridIoMessage.async. If flag is false then sender node assumed 
to synchronously wait for response and does not wait otherwise.
* all sent async messages should be tracked on sender node on per-receiver 
basis.
* all received async messages should be tracked on receiver nodes
* nodes should add flag to communication acks on whether they can more async 
messages or not 
* sender should never exceed IgniteConfiguration.maxAsyncRequests async 
requests per node
* if IgniteConfiguration.maxAsyncRequests is exceeded or node sets flag in 
communication ack then all async messages become sync
* above means:
** next compute job from the task is sent to the node only after response for 
some previous comes
** next dht update request (for primary sync or full async) is sent, but node 
doesn't send response to near node unless it has not received response for 
former operation from remote backup or for this operation
** next cache operation becomes sync - we force user code to wait on operation 
future.  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to