[ 
https://issues.apache.org/jira/browse/IGNITE-22294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-22294:
---------------------------------------
    Epic Link: IGNITE-21188

> Add backpressure to MessagingService
> ------------------------------------
>
>                 Key: IGNITE-22294
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22294
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3, network
>
> Currently, when send()/transfer()/weakSend() is invoked, a sending task is 
> added to the Netty queue. If tasks are added faster than they are 
> executed, the queue grows without bound, potentially consuming all available 
> heap.
> We could do one of the following when the queue overflows:
>  # Return a failed future. The callers will have to deal with this new type 
> of exception. Also, this behavior would contradict the current motto 'if a 
> send() is called, either the message will be eventually sent (and delivered), 
> or the logical connection will be closed (to never be established again 
> between nodes with the same ephemeral IDs)'.
>  # Block the method until enough capacity is freed up in the queue. This 
> does not align well with the asynchronous API, and it also creates an 
> implicit queue of 'waiting' requests that still consumes heap: if the user 
> keeps spawning new threads, they can exhaust the heap this way as well. 
> Also, blocking a thread is dangerous (we should never block our 'own' 
> internal sends). Only sends produced by user code (from the embedded 
> client?) should be throttled.
>  # Forcibly close the connection so that this node treats the target node as 
> 'left forever' and does not allow the connection to be re-established; in 
> that case, all send futures will be correctly completed with the 
> corresponding exception, but one of the nodes will have to be restarted.
> None of the approaches seems ideal. A design is needed.
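
To make the first option concrete, here is a minimal sketch of a bounded send path that returns a failed future once a fixed number of sends are queued. It is only an illustration under stated assumptions, not the MessagingService implementation: the names BoundedSender, SendQueueOverflowException, maxQueuedSends and availableCapacity are hypothetical, and a plain java.util.concurrent.Executor stands in for the Netty queue.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch of option 1; not part of the Ignite code base. */
class BoundedSender {
    /** Hypothetical exception type telling the caller that the queue is full. */
    static class SendQueueOverflowException extends RuntimeException {
        SendQueueOverflowException(String message) {
            super(message);
        }
    }

    /** One permit per send that is queued or currently executing. */
    private final Semaphore permits;

    /** Stand-in for the Netty event loop queue (an assumption of this sketch). */
    private final Executor ioExecutor;

    BoundedSender(int maxQueuedSends, Executor ioExecutor) {
        this.permits = new Semaphore(maxQueuedSends);
        this.ioExecutor = ioExecutor;
    }

    /** Queues the write task, or fails fast without blocking if the queue is full. */
    CompletableFuture<Void> send(Runnable writeTask) {
        if (!permits.tryAcquire()) {
            return CompletableFuture.failedFuture(
                    new SendQueueOverflowException("Outbound queue is full"));
        }

        CompletableFuture<Void> result = new CompletableFuture<>();
        try {
            ioExecutor.execute(() -> {
                Throwable failure = null;
                try {
                    writeTask.run();
                } catch (Throwable t) {
                    failure = t;
                }
                // Free the slot before completing the future, so that a caller
                // reacting to completion already sees the released capacity.
                permits.release();
                if (failure == null) {
                    result.complete(null);
                } else {
                    result.completeExceptionally(failure);
                }
            });
        } catch (RuntimeException e) {
            // The executor rejected the task, so it will never run: give the slot back.
            permits.release();
            result.completeExceptionally(e);
        }
        return result;
    }

    /** Number of sends that can still be queued right now. */
    int availableCapacity() {
        return permits.availablePermits();
    }
}
```

The sketch bounds memory use and never blocks the caller, but it shows the cost named in the description: every caller has to handle the new exception type, and a rejected send breaks the 'sent or connection closed' guarantee.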



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
