Thanks Ilya,

   1. "So all nodes will know when node A begins hosting that partition as
   primary" - how is that consensus achieved? Will it result in a partition
   map exchange and a new topology version?
   2. What I actually meant is that it is impossible to know when node A is
   fully caught up to node B unless you stop all writes to node B while
   node A is catching up. So how does Ignite know that it is safe to make A
   primary again?


On Mon, Sep 17, 2018 at 8:48 AM Ilya Kasnacheev <[email protected]>
wrote:

> Hello!
>
> Apache Ignite is NOT "eventually consistent", if that is what you are
> asking. Apache Ignite is strongly consistent. It has a discovery ring (or
> a discovery star with ZooKeeper) which allows messages to be sent to and
> acknowledged by all nodes.
>
> So all nodes will know when node A begins hosting that partition as
> primary.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Mon, Sep 17, 2018 at 15:45, eugene miretsky <[email protected]>:
>
>> How is "finish syncing" defined? Since it is a distributed system, there
>> is no way to guarantee that node A is 100% caught up to node B. In Kafka
>> there is a replica.lag.time.max.ms setting; is there something similar in
>> Ignite?
>>
>>
>>
>> On Mon, Sep 17, 2018 at 8:37 AM Ilya Kasnacheev <
>> [email protected]> wrote:
>>
>>> Hello!
>>>
>>> Node A will have two choices: either drop the partition completely and
>>> re-download it from B, or replicate only the recent changes it missed.
>>> Ignite chooses between the two internally.
>>> Node A will only become primary again when it finishes syncing that
>>> partition.
>>>
>>> Regards,
>>> --
>>> Ilya Kasnacheev
>>>
>>>
>>> Fri, Sep 14, 2018 at 22:23, eugene miretsky <[email protected]
>>> >:
>>>
>>>> What is the process when a node goes down and then restarts?
>>>>
>>>> Say backups = 1. We have node A that is primary for some key, and node
>>>> B that is the backup.
>>>>
>>>> Node A goes down and then restarts after 5 min. What are the steps?
>>>> 1) Node A is servicing all traffic for key X
>>>> 2) Node A goes down
>>>> 3) Node B starts serving all traffic for key X (I guess the clients
>>>> detect the failover and start calling node B)
>>>> 4) Node A comes back up
>>>> 5) WAL replication is initiated
>>>>
>>>> What happens next? When does node A become the primary again? How are
>>>> in-flight updates handled?
>>>>
>>>>

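The sequence discussed in this thread (A is primary, A fails, B takes over, A rejoins, A syncs, A becomes primary again) can be sketched as a small toy model. This is only an illustration under stated assumptions, not Ignite's implementation: the class names, the `topology_version` counter, and the choice to always do a full copy from the current owner are hypothetical, and the state names MOVING and OWNING are borrowed loosely. The sketch also shows one way the question about in-flight writes could be answered: if the rejoining replica receives live writes while it copies the history, writes to B need not be stopped.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    state: str = "OWNING"          # "OWNING" (in sync) or "MOVING" (catching up)
    online: bool = True
    data: dict = field(default_factory=dict)


class Partition:
    """Toy model of one partition with a preferred primary and one backup."""

    def __init__(self, preferred, backup):
        self.preferred = preferred
        self.backup = backup
        self.topology_version = 1   # hypothetical counter, bumped on every change

    def replicas(self):
        return [self.preferred, self.backup]

    def primary(self):
        # The preferred node is primary only while it is online and in sync;
        # otherwise the first online, in-sync replica serves as primary.
        for replica in self.replicas():
            if replica.online and replica.state == "OWNING":
                return replica
        raise RuntimeError("no in-sync replica available")

    def put(self, key, value):
        # Assumption: live writes reach every online replica, including one
        # that is still catching up, so it does not fall further behind.
        for replica in self.replicas():
            if replica.online:
                replica.data[key] = value

    def node_left(self, replica):
        replica.online = False
        self.topology_version += 1

    def node_joined(self, replica):
        replica.online = True
        replica.state = "MOVING"    # not eligible to be primary yet
        self.topology_version += 1

    def finish_rebalance(self, replica):
        # Full-copy variant: take everything from the current owner, then
        # announce the ownership change as a new topology version.
        supplier = self.primary()
        replica.data = dict(supplier.data)
        replica.state = "OWNING"
        self.topology_version += 1


if __name__ == "__main__":
    a, b = Replica("A"), Replica("B")
    part = Partition(preferred=a, backup=b)
    part.put("x", 1)
    part.node_left(a)               # B takes over
    part.put("x", 2)                # update that A misses
    part.node_joined(a)             # A is back but still MOVING
    part.put("y", 3)                # in-flight write reaches both A and B
    part.finish_rebalance(a)        # A is OWNING again and becomes primary
    print(part.primary().name, a.data, part.topology_version)
```

In this model A never serves as primary while it is in the MOVING state, and the switch back to A happens only at the single point where the change is announced to everyone as a new topology version.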