[ 
https://issues.apache.org/jira/browse/IGNITE-18772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Stanilovsky updated IGNITE-18772:
----------------------------------------
    Fix Version/s: 3.2
                       (was: 3.1)

> Design mechanisms for messaging consistency
> -------------------------------------------
>
>                 Key: IGNITE-18772
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18772
>             Project: Ignite
>          Issue Type: New Feature
>          Components: networking
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.2
>
>
> We have a use case where node A asks node B to notify it when some event
> occurs on node B. This requires two round trips: the first (A invokes B)
> installs an event listener on B, and the second (B makes a strong send
> to A) notifies A about the event.
> To account for possible topology instability, the code at node A subscribes
> to onDisappeared(B), and the code at node B likewise subscribes to
> onDisappeared(A).
> A timeout might be set on the invocation future on node A.
> The possible outcomes are:
>  # If B is not in the topology for A, the invoke future fails right away (B
> knows nothing about the invocation because no request is sent)
>  # If A loses sight of B before the invoke response is delivered to A, the
> invoke future fails at A, and B eventually deregisters the listener
>  # If the invocation succeeds, but the nodes lose sight of each other before
> the event happens, node A stops waiting and node B deregisters the listener
>  # If the invocation succeeds and the event happens while the nodes see each
> other, the callback is delivered from B to A (with best-effort guarantees,
> retrying until it is delivered, it times out, or the nodes lose sight of
> each other)
> The outcome must be consistent between nodes A and B. That is, it must not
> happen that one node acts as if the other node has disappeared while the
> other node acts as if the first node is still available. For this, the
> following should hold:
>  # Relation 'X sees Y' must be symmetric (in an eventual sense)
>  # If node X currently does not see node Y, it cannot accept messages from Y
> We could use the following invariant: if a node has disappeared from the
> topology, it cannot appear there again with the same identity (IGNITE-18712
> might help on the physical topology level).
> Things that should be carefully considered:
>  # Nodes might have different views of the topology: 'X sees Y' might not be 
> symmetric at some points in time
>  # Messaging with the described consistency guarantees might be useful over
> both physical and logical topologies. Perhaps we need an abstraction of a
> 'topology' that allows checking whether a node is visible and subscribing
> to its joined/left events?
>  # How do we deal with non-transient failures (like an NPE), as opposed to
> failures caused by node disappearance? Do we just keep retrying until the
> timeout is triggered, or do we crash the node if some unexpected failure
> occurs, or...?
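
To make the proposed rules in the description above more concrete, here is a minimal Java sketch. It is not taken from the Ignite codebase: TopologyView, InMemoryTopologyView and GuardedInbox are hypothetical names, and node identities are plain strings for brevity. The sketch only illustrates two of the ideas from the description: a node that has left the topology cannot rejoin with the same identity, and a receiver refuses messages from a sender it does not currently see.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical abstraction over a physical or logical topology (not an Ignite API). */
interface TopologyView {
    /** Returns true if the local node currently sees the given node. */
    boolean isVisible(String nodeId);

    /** Registers a listener that is called with the ID of each node that leaves. */
    void onDisappeared(Consumer<String> listener);
}

/** In-memory view: an identity that has left can never join again. */
final class InMemoryTopologyView implements TopologyView {
    private final Set<String> visible = new HashSet<>();
    private final Set<String> departed = new HashSet<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();

    /** Returns false if this identity is already present or has already left. */
    synchronized boolean join(String nodeId) {
        if (departed.contains(nodeId)) {
            return false;
        }
        return visible.add(nodeId);
    }

    /** Removes the node, remembers its identity as departed, notifies listeners. */
    synchronized void leave(String nodeId) {
        if (visible.remove(nodeId)) {
            departed.add(nodeId);
            for (Consumer<String> listener : new ArrayList<>(listeners)) {
                listener.accept(nodeId);
            }
        }
    }

    @Override
    public synchronized boolean isVisible(String nodeId) {
        return visible.contains(nodeId);
    }

    @Override
    public synchronized void onDisappeared(Consumer<String> listener) {
        listeners.add(listener);
    }
}

/** Receiving side: rejects messages from senders the local view does not see. */
final class GuardedInbox {
    private final TopologyView topology;
    private final List<String> accepted = new ArrayList<>();

    GuardedInbox(TopologyView topology) {
        this.topology = topology;
    }

    /** Returns true if the message was accepted, false if the sender is not visible. */
    synchronized boolean accept(String senderId, String payload) {
        if (!topology.isVisible(senderId)) {
            return false;
        }
        accepted.add(payload);
        return true;
    }

    /** Returns a snapshot of the accepted payloads in arrival order. */
    synchronized List<String> accepted() {
        return new ArrayList<>(accepted);
    }
}
```

In this sketch, node A would wrap its incoming callback handling in a GuardedInbox and fail the pending invocation from an onDisappeared listener, while node B would use its own view to deregister the event listener when A leaves. The sketch does not address the eventual symmetry of 'X sees Y' between the two views, which remains the open design question.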



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
