[ 
https://issues.apache.org/jira/browse/IGNITE-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Scherbakov updated IGNITE-20081:
---------------------------------------
    Labels: ignite-3 ignite3_performance  (was: ignite-3)

> Implement "weakSend" properly, add "weakInvoke"
> -----------------------------------------------
>
>                 Key: IGNITE-20081
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20081
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Priority: Major
>              Labels: ignite-3, ignite3_performance
>
> The original idea: some components, such as RAFT, can tolerate lost messages. 
> Strict message delivery guarantees may not be a good fit for such 
> components.
> However, the current implementation of "weakSend" is just a wrapper around 
> "send" that doesn't return a future. This API must be redesigned and properly 
> implemented.
> h3. API
>  * {{CompletableFuture<Void> weakSend(ClusterNode recipient, NetworkMessage msg, long timeout);}}
>  * {{CompletableFuture<NetworkMessage> weakInvoke(ClusterNode recipient, NetworkMessage msg, long timeout);}}
> Futures are completed in two cases:
>  * an ack or response has been received
>  * the timeout has been exceeded
> This means that a huge timeout is probably a bad idea for such messages.
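
A minimal Java sketch of the completion rule above, not the Ignite implementation. The returned future completes when the ack arrives, or fails with a TimeoutException once the timeout elapses. The class name WeakFutureSketch, the helper weakSendFuture, and the millisecond unit are assumptions made for illustration; the ticket does not state the unit of "timeout".

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the Ignite API.
final class WeakFutureSketch {
    /**
     * Returns a future that completes when {@code ack} completes, or fails
     * with a TimeoutException after {@code timeoutMillis} (assumed unit).
     */
    static CompletableFuture<Void> weakSendFuture(CompletableFuture<Void> ack, long timeoutMillis) {
        // copy() keeps the timeout from completing the caller's ack future.
        return ack.copy().orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

Because the caller always waits for at most the timeout, a short timeout bounds how long a lost message can keep a future pending.
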
> h3. Implementation
>  * with a stable and fast connection, weak communication should work the same 
> way as regular communication from the client's standpoint;
>  * if the message queue for the given connection is full, we may/should:
>  ** remove from the existing queue all weak messages that have definitely not 
> been sent;
>  ** reject new weak messages;
>  ** maybe throttle, but this is out of scope;
>  * alternatively, if the connection breaks, we may start removing weak messages 
> from the queue and/or rejecting new ones.
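
One possible reading of the full-queue policy above, as a hedged Java sketch. The class OutboundQueueSketch, its Entry type, and the capacity parameter are hypothetical. The sketch assumes the queue holds only messages that have not been written to the connection yet, rejects new weak messages when full, and drops queued weak messages to make room for a regular one. A real implementation would also have to fail the futures of dropped messages.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical per-connection queue of messages that have not been sent yet.
final class OutboundQueueSketch {
    /** A queued message; "weak" marks messages that are allowed to be lost. */
    static final class Entry {
        final String payload;
        final boolean weak;

        Entry(String payload, boolean weak) {
            this.payload = payload;
            this.weak = weak;
        }
    }

    private final Deque<Entry> unsent = new ArrayDeque<>();
    private final int capacity;

    OutboundQueueSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Returns false if the message was rejected (only weak messages can be). */
    synchronized boolean offer(String payload, boolean weak) {
        if (unsent.size() >= capacity) {
            if (weak) {
                return false; // queue is full: reject the new weak message
            }
            unsent.removeIf(e -> e.weak); // drop unsent weak messages
        }
        // Regular messages are always enqueued, even above capacity.
        unsent.addLast(new Entry(payload, weak));
        return true;
    }

    synchronized int size() {
        return unsent.size();
    }
}
```
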
> Weak send and weak invoke may behave differently.
> For example, "weakSend" requires an ack, so it has to be marked with a "message 
> number" in the recovery descriptor.
> "weakInvoke", however, doesn't need an ack; it only requires a response (it 
> already has a "correlationId"), so not re-sending it after a reconnect shouldn't 
> break the recovery protocol. It doesn't need a "message number" in the 
> recovery descriptor, so we can save some resources by reducing the number of 
> acks.
> One more important thing:
>  * when an invoke future fails with a timeout exception, we must clean up the 
> corresponding correlation ID from the map;
>  * when we receive a "node left" event for some node, we should complete all 
> futures returned for that node with some "NodeLeftException", and clean up all 
> its correlation IDs from the map as well.
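
The two cleanup rules above can be sketched in Java as follows. This is an illustration under assumptions, not the actual Ignite code: PendingInvokesSketch, its methods, the String node IDs and payloads, and the millisecond timeout are all hypothetical, and NodeLeftException is defined locally because the ticket only asks for "some" exception of that name.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical registry of in-flight invokes, keyed by correlation ID.
final class PendingInvokesSketch {
    /** Hypothetical exception type for futures whose recipient left. */
    static final class NodeLeftException extends RuntimeException {
        NodeLeftException(String nodeId) {
            super("Node left: " + nodeId);
        }
    }

    private static final class Pending {
        final String nodeId;
        final CompletableFuture<String> response;

        Pending(String nodeId, CompletableFuture<String> response) {
            this.nodeId = nodeId;
            this.response = response;
        }
    }

    private final Map<Long, Pending> byCorrelationId = new ConcurrentHashMap<>();

    /** Registers an invoke; the returned future completes after the map entry is removed. */
    CompletableFuture<String> register(long correlationId, String nodeId, long timeoutMillis) {
        CompletableFuture<String> response = new CompletableFuture<String>()
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
        byCorrelationId.put(correlationId, new Pending(nodeId, response));
        // Runs on response, timeout, or node-left alike, so no entry leaks.
        return response.whenComplete((result, error) -> byCorrelationId.remove(correlationId));
    }

    /** Completes the matching future, if it is still pending. */
    void onResponse(long correlationId, String payload) {
        Pending pending = byCorrelationId.get(correlationId);
        if (pending != null) {
            pending.response.complete(payload);
        }
    }

    /** Fails every pending invoke addressed to the node that left. */
    void onNodeLeft(String nodeId) {
        byCorrelationId.forEach((id, pending) -> {
            if (pending.nodeId.equals(nodeId)) {
                pending.response.completeExceptionally(new NodeLeftException(nodeId));
            }
        });
    }

    int pendingCount() {
        return byCorrelationId.size();
    }
}
```

Routing all three outcomes through a single whenComplete callback keeps the cleanup in one place. Callers that use join() see the timeout or NodeLeftException as the cause of a CompletionException.
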
> h3. Integration
> Integration will be done separately. All we need for now is a set of unit tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
