[jira] [Updated] (IGNITE-18772) Design mechanisms for messaging consistency

Roman Puchkovskiy (Jira) Fri, 10 Feb 2023 08:29:04 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-18772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Roman Puchkovskiy updated IGNITE-18772:
---------------------------------------
    Description: 
We have a use case where node A asks node B to notify node A when some event on
node B occurs. This requires two round trips: the first round trip (A invokes B)
installs an event listener on B, and the second round trip (B makes a strong
send to A) notifies A about the event.

To account for possible topology instability, the code at node A subscribes to
onDisappeared(B), and the code at node B likewise subscribes to onDisappeared(A).

A timeout might be installed on the invocation future on node A.

Outcomes are:
 # If B is not in the topology for A, the invoke future fails right away (B
knows nothing about the invocation, there is no request)
 # If A loses sight of B before the invoke response is delivered to A, the
invoke future fails at A, and B eventually deregisters the listener
 # If the invocation succeeds, but the nodes lose sight of each other before the
event happens, node A stops waiting and node B deregisters the listener
 # The same happens if the callback times out
 # If the invocation succeeds and the event happens while the nodes see each
other, the callback is delivered from B to A (with best-effort guarantees: it is
retried until it is delivered, times out, or the nodes lose sight of each other)
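
The outcomes above, seen from node A, can be sketched as follows. This is only
an illustration under assumptions: NotificationRequester, NodeDisappearedException,
requestNotification, onAppeared and onCallback are hypothetical names, not
existing Ignite APIs, and the actual sending of the invoke request is left out.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of node A's side; these names are not Ignite APIs. */
class NotificationRequester {
    /** Fails a future when the peer is not, or is no longer, visible. */
    static class NodeDisappearedException extends RuntimeException {
        NodeDisappearedException(String nodeId) {
            super("Node is not in the topology: " + nodeId);
        }
    }

    private final Set<String> visibleNodes = ConcurrentHashMap.newKeySet();
    private final Map<String, Set<CompletableFuture<String>>> pending = new ConcurrentHashMap<>();

    void onAppeared(String nodeId) {
        visibleNodes.add(nodeId);
    }

    /** Topology callback: everything still waiting for this node fails. */
    void onDisappeared(String nodeId) {
        visibleNodes.remove(nodeId);
        Set<CompletableFuture<String>> futures = pending.remove(nodeId);
        if (futures != null) {
            futures.forEach(f -> f.completeExceptionally(new NodeDisappearedException(nodeId)));
        }
    }

    CompletableFuture<String> requestNotification(String nodeId, long timeoutMillis) {
        // Outcome 1: the peer is not in our topology, so fail right away.
        if (!visibleNodes.contains(nodeId)) {
            return CompletableFuture.failedFuture(new NodeDisappearedException(nodeId));
        }
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.computeIfAbsent(nodeId, id -> ConcurrentHashMap.newKeySet()).add(future);
        // A real implementation would send the invoke request to the peer here.
        // Outcome 4: the future fails with TimeoutException if nothing arrives in time.
        future.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
        // The peer may have vanished between the check above and the registration.
        if (!visibleNodes.contains(nodeId)) {
            future.completeExceptionally(new NodeDisappearedException(nodeId));
        }
        return future;
    }

    /** Outcome 5: the callback from the peer completes the waiting futures. */
    void onCallback(String nodeId, String event) {
        if (!visibleNodes.contains(nodeId)) {
            return; // Messages from a node we do not see are not accepted.
        }
        Set<CompletableFuture<String>> futures = pending.remove(nodeId);
        if (futures != null) {
            futures.forEach(f -> f.complete(event));
        }
    }
}
```

The sketch keys pending futures by node ID only; a real design would likely
need a request ID so that several independent subscriptions to the same node
can be told apart.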

The outcome must be consistent between nodes A and B. That is, it must not
happen that one node acts as if the other node has disappeared while the other
node acts as if the first node is still available.
 # The relation 'X sees Y' must be symmetric (in an eventual sense)
 # If node X currently does not see node Y, X must not accept messages from Y
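
On node B's side, the second rule and the listener deregistration could look
roughly like this. Again a sketch under assumptions: ListenerHost,
handleSubscribe and fireEvent are hypothetical names, and listeners are treated
as one-shot, which the description does not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of node B's side; these names are not Ignite APIs. */
class ListenerHost {
    private final Set<String> visibleNodes = ConcurrentHashMap.newKeySet();
    private final Set<String> subscribers = ConcurrentHashMap.newKeySet();

    void onAppeared(String nodeId) {
        visibleNodes.add(nodeId);
    }

    /** Topology callback: a node that left no longer has a listener here. */
    void onDisappeared(String nodeId) {
        visibleNodes.remove(nodeId);
        subscribers.remove(nodeId);
    }

    /** Returns false when the request is rejected because the sender is not visible. */
    boolean handleSubscribe(String senderId) {
        if (!visibleNodes.contains(senderId)) {
            return false;
        }
        subscribers.add(senderId);
        // The sender may have vanished between the check and the registration.
        if (!visibleNodes.contains(senderId)) {
            subscribers.remove(senderId);
            return false;
        }
        return true;
    }

    /** Returns the nodes to call back; each listener fires at most once (assumption). */
    List<String> fireEvent() {
        List<String> targets = new ArrayList<>();
        for (String nodeId : subscribers) {
            if (subscribers.remove(nodeId)) {
                targets.add(nodeId);
            }
        }
        return targets;
    }
}
```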

We could use the following invariant: if a node has disappeared from the 
topology, it cannot appear there again with the same identity (IGNITE-18712
might help on the physical topology level).
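
One possible way to enforce such an invariant locally is to remember the
identities that have left and refuse to admit them again. The class below is a
hypothetical illustration (OneShotMembership, tryJoin and leave are made-up
names); it does not describe what IGNITE-18712 actually provides.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: an identity that has left is never admitted again. */
class OneShotMembership {
    private final Set<String> visible = new HashSet<>();
    // Grows without bound in this sketch; a real design would need a pruning strategy.
    private final Set<String> departed = new HashSet<>();

    /** Returns true if the node was admitted, false if this identity has already left. */
    synchronized boolean tryJoin(String nodeId) {
        if (departed.contains(nodeId)) {
            return false;
        }
        visible.add(nodeId);
        return true;
    }

    /** Marks the identity as departed, but only if it was visible. */
    synchronized void leave(String nodeId) {
        if (visible.remove(nodeId)) {
            departed.add(nodeId);
        }
    }

    synchronized boolean isVisible(String nodeId) {
        return visible.contains(nodeId);
    }
}
```

With this rule, a node that comes back after a restart has to present a new
identity, so 'onDisappeared' for the old identity is final and both sides can
safely drop any state tied to it.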

Things that should be carefully considered:
 # Nodes might have different views of the topology: 'X sees Y' might not be 
symmetric at some points in time
 # Messaging with the described consistency guarantees might be useful over both
physical and logical topologies. Probably we need to hide a 'topology' behind an
abstraction that allows checking whether a node is visible and subscribing to
its joined/left events?
 # How do we deal with non-transient failures (like an NPE), as opposed to
failures caused by node disappearance? Do we just keep retrying until the
timeout is triggered, or do we crash the node if some unexpected failure occurs,
or...?
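
For the second point, the abstraction could be as small as the interface below.
This is a sketch, not a proposal for concrete signatures: TopologyView,
TopologyListener and InMemoryTopologyView are hypothetical names, and the
in-memory implementation exists only to show how either a physical or a logical
topology could feed the same interface.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical abstraction over a physical or logical topology. */
interface TopologyView {
    boolean isVisible(String nodeId);

    void subscribe(TopologyListener listener);
}

/** Hypothetical listener for joined/left events. */
interface TopologyListener {
    void onJoined(String nodeId);

    void onLeft(String nodeId);
}

/** Trivial in-memory implementation, only to illustrate the contract. */
class InMemoryTopologyView implements TopologyView {
    private final Set<String> visible = ConcurrentHashMap.newKeySet();
    private final List<TopologyListener> listeners = new CopyOnWriteArrayList<>();

    @Override
    public boolean isVisible(String nodeId) {
        return visible.contains(nodeId);
    }

    @Override
    public void subscribe(TopologyListener listener) {
        listeners.add(listener);
    }

    /** Called by whatever drives the topology; notifies only on an actual change. */
    void nodeJoined(String nodeId) {
        if (visible.add(nodeId)) {
            listeners.forEach(l -> l.onJoined(nodeId));
        }
    }

    void nodeLeft(String nodeId) {
        if (visible.remove(nodeId)) {
            listeners.forEach(l -> l.onLeft(nodeId));
        }
    }
}
```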

  was:
We have a use case where node A asks node B to notify node A when some event on
node B occurs. This requires two round trips: the first round trip (A invokes B)
installs an event listener on B, and the second round trip (B makes a strong
send to A) notifies A about the event.

To account for possible topology instability, the code at node A subscribes to
onDisappeared(B), and the code at node B likewise subscribes to onDisappeared(A).

A timeout might be installed on node A.

Outcomes are:
 # If B is not in the topology for A, the invoke future fails right away (B
knows nothing about the invocation, there is no request)
 # If A loses sight of B before the invoke response is delivered to A, the
invoke future fails at A, and B eventually deregisters the listener
 # If the invocation succeeds, but the nodes lose sight of each other before the
event happens, node A stops waiting and node B deregisters the listener
 # The same happens if the callback times out
 # If the invocation succeeds and the event happens while the nodes see each
other, the callback is delivered from B to A (with best-effort guarantees: it is
retried until it is delivered, times out, or the nodes lose sight of each other)

The outcome must be consistent between nodes A and B. That is, it must not
happen that one node acts as if the other node has disappeared while the other
node acts as if the first node is still available.
 # The relation 'X sees Y' must be symmetric (in an eventual sense)
 # If node X currently does not see node Y, X must not accept messages from Y

We could use the following invariant: if a node has disappeared from the 
topology, it cannot appear there again with the same identity (IGNITE-18712
might help on the physical topology level).

Things that should be carefully considered:
 # Nodes might have different views of the topology: 'X sees Y' might not be 
symmetric at some points in time
 # Messaging with the described consistency guarantees might be useful over both
physical and logical topologies. Probably we need to hide a 'topology' behind an
abstraction that allows checking whether a node is visible and subscribing to
its joined/left events?
 # How do we deal with non-transient failures (like an NPE), as opposed to
failures caused by node disappearance? Do we just keep retrying until the
timeout is triggered, or do we crash the node if some unexpected failure occurs,
or...?


> Design mechanisms for messaging consistency
> -------------------------------------------
>
>                 Key: IGNITE-18772
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18772
>             Project: Ignite
>          Issue Type: New Feature
>          Components: networking
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
