+architecture

On Thu, Aug 17, 2017 at 3:24 PM, Anoukh Jayawardena <[email protected]> wrote:

> Hi All,
>
> This is a high-level overview of the two-node minimum high availability (HA)
> deployment feature for the Stream Processor (SP). The implementation will
> adopt an active-passive approach with periodic state persistence. The
> process flow of this feature is as follows.
>
> *Prerequisites*
>
>    - Two SP workers: one acts as the Active worker and the other as the
>    Passive worker. Both nodes must have the same Siddhi Applications
>    deployed.
>    - A specified RDBMS or file location for periodic persistence of
>    Siddhi App states.
>    - A running ZooKeeper service or RDBMS instance for coordination between
>    the two nodes (carbon-coordination [1] will be used for distributed
>    coordination).
>    - A Siddhi client that publishes events to both the Active and Passive
>    nodes in a synced manner: a message is published to the Passive node
>    once the Active node acknowledges receiving it.
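
The synced publishing described in the last prerequisite could look roughly like the sketch below. This is only an illustration, not the Siddhi client API: `publish_synced` and the `Send` callables are hypothetical names, and each callable is assumed to return True once its node has acknowledged the event.

```python
from typing import Callable

# Hypothetical transport: sends one event to a node and returns True
# once that node acknowledges receipt. Not part of any Siddhi API.
Send = Callable[[str], bool]


def publish_synced(event: str, send_active: Send, send_passive: Send) -> bool:
    """Publish to the Active node first; forward the event to the Passive
    node only after the Active node has acknowledged it."""
    if not send_active(event):
        # No acknowledgement from the Active node, so the Passive node
        # is not sent this event and the two nodes do not diverge.
        return False
    send_passive(event)
    return True
```

How the client should react to a missing acknowledgement (retry, or treat the Active node as down) is not covered by the overview, so the sketch only reports the failure to the caller.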
>
>
> [image: 2 Node HA Overview.jpg]
>
>
> *Process*
>
>    - Both nodes receive and process events, but only the Active
>    node publishes the output events. This keeps both nodes in
>    sync.
>    - The Active node periodically persists the Siddhi App states so that
>    the state can be retrieved in a failover scenario.
>    - A user-defined “Live State Sync” option determines how
>    state is synced between the Active and Passive nodes in a failover
>    scenario (explained below).
>    - The implementation works as follows in the different system states:
>
>
>    1. When a new node starts up while the Active node is available
>       - "Live State Sync Enabled" - The new node detects that the Active
>       node is available, registers itself as the Passive node, and calls
>       the Active node to borrow its current state. While the state is being
>       borrowed, event processing is paused on both nodes so that data is
>       not lost. After that, both nodes process events in sync.
>       - "Live State Sync Disabled" - The new node detects that the Active
>       node is available, registers itself as the Passive node, and
>       accesses the database to get the last persisted state. This option
>       may lead to some data loss, since the database will not contain a
>       real-time persisted state.
>    2. When a new node starts up while the Active node is unavailable
>       - The new node detects that the Active node is unavailable,
>       registers itself as the Active node, and accesses the database to get
>       the last persisted state. (For a fresh restart, an API should be
>       called beforehand to clean the DB.)
>    3. When the Active node goes down
>       - The Passive node detects that the Active node is unavailable,
>       switches to the Active role, and starts publishing the output events.
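
The three system states above can be summarised as a small decision function. The sketch below is only a reading of the list, not the actual implementation; `Role`, `StateSource`, `on_startup` and `on_active_lost` are hypothetical names, and the coordination calls that would detect the Active node are left out.

```python
from enum import Enum
from typing import Tuple


class Role(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"


class StateSource(Enum):
    ACTIVE_NODE = "live state borrowed from the Active node"
    DATABASE = "last persisted state from the database"


def on_startup(active_available: bool, live_state_sync: bool) -> Tuple[Role, StateSource]:
    """Decide the role of a starting node and where it loads its state from."""
    if not active_available:
        # Case 2: no Active node, so this node becomes Active.
        return Role.ACTIVE, StateSource.DATABASE
    if live_state_sync:
        # Case 1, Live State Sync enabled: borrow the current state.
        return Role.PASSIVE, StateSource.ACTIVE_NODE
    # Case 1, Live State Sync disabled: fall back to the persisted state.
    return Role.PASSIVE, StateSource.DATABASE


def on_active_lost(current_role: Role) -> Role:
    """Case 3: the surviving node takes over and starts publishing output."""
    return Role.ACTIVE
```

For example, `on_startup(active_available=True, live_state_sync=False)` yields the Passive role with the database as the state source, which is the case where some data loss is possible.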
>
>
> *Data loss minimizing strategies*
>
>
>    1. When the Active node goes down, the Passive node does not know which
>    event the Active node published last. The Passive node might therefore
>    start publishing from a later event. For example, suppose both nodes
>    have processed 5 messages but the Active node published only 2 before
>    failing. The Passive node will start publishing from the 6th message
>    onwards, since it does not know what has not been published, so
>    messages 3 to 5 are never published.
>
> As a solution, a queue is implemented in the Passive node. The Active node
> knows the last event it published. The Passive node periodically pings
> the Active node to get the last published event and dequeues its buffer
> accordingly, so that events are not lost but might be duplicated. (This
> might be a problem in live state sync.)
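
A minimal sketch of the Passive-side queue described above follows. It assumes that output events carry a monotonically increasing sequence number that both nodes agree on, which the overview does not specify; `PassiveBuffer` and its method names are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class PassiveBuffer:
    """Holds output events the Passive node has produced but not published."""

    def __init__(self) -> None:
        # (sequence number, output event), oldest first
        self._pending: Deque[Tuple[int, str]] = deque()

    def on_output(self, seq: int, event: str) -> None:
        """Buffer an output event instead of publishing it."""
        self._pending.append((seq, event))

    def on_last_published(self, last_seq: int) -> None:
        """Called after each ping: drop events the Active node has published."""
        while self._pending and self._pending[0][0] <= last_seq:
            self._pending.popleft()

    def drain_on_failover(self) -> List[str]:
        """Return everything still buffered, to be published after takeover."""
        events = [event for _, event in self._pending]
        self._pending.clear()
        return events
```

In the 5-message example, if the last ping reported message 2, the buffer still holds messages 3 to 5 at failover and they are published rather than lost. If the last ping was older than the Active node's final publish, some already-published events remain in the buffer, which is where the possible duplication comes from.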
>
>
> [1] https://github.com/wso2/carbon-coordination
>
>
> Thank You
> --
> *Anoukh Jayawardena*
> *Software Engineer*
>
> *WSO2 Lanka (Private) Limited: http://wso2.com
> <http://wso2.com/>lean.enterprise.middle-ware*
>
>
> *phone: (+94) 77 99 28932*
> <https://wso2.com/signature>
>



_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
