Re: [DISCUSS] Data loss handling improvements

2020-05-12 Thread Ivan Pavlukhin
I have not had a chance to read this thread carefully, but the following discussion sounds very similar and might be useful [1]. [1] http://apache-ignite-developers.2346864.n4.nabble.com/Partition-Loss-Policies-issues-td37304.html Best regards, Ivan Pavlukhin Thu, May 7, 2020 at 11:36, Alexei

Re: [DISCUSS] Data loss handling improvements

2020-05-07 Thread Alexei Scherbakov
Yes, it will work this way. Thu, May 7, 2020 at 10:43, Anton Vinogradov: > Seems I got the vision, thanks. > There should be only 2 ways to reset lost partition: to gain an owner from > resurrected first or to remove ex-owner from baseline (partition will be > rearranged). > And we should make

Re: [DISCUSS] Data loss handling improvements

2020-05-07 Thread Anton Vinogradov
I think I get the idea now, thanks. There should be only two ways to reset a lost partition: either first regain an owner by bringing a failed node back, or remove the ex-owner from the baseline (the partition will then be rearranged). And we should make a decision for every lost partition before calling the reset. On Wed, May 6, 2020 at
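
The rule discussed in this thread (a reset is possible only when every lost partition has at least one owner in the topology) can be illustrated with a small stand-alone model. This is only a sketch: it does not use the Ignite API, the class and method names are hypothetical, and it assumes each lost partition is tracked together with the set of nodes that owned it before the loss. The second path, removing an ex-owner from the baseline, is not modeled here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not Ignite code: checks whether a "reset lost partitions"
// request could be accepted under the rule discussed in the thread.
public class LostPartitionResetCheck {
    /**
     * Returns the lost partitions that have no ex-owner among the nodes
     * currently alive in the topology.
     */
    public static Set<Integer> partitionsWithoutOwner(
            Map<Integer, Set<String>> exOwnersByLostPartition,
            Set<String> aliveNodes) {
        Set<Integer> blocked = new TreeSet<>();
        for (Map.Entry<Integer, Set<String>> e : exOwnersByLostPartition.entrySet()) {
            // No ex-owner of this partition has returned to the topology.
            if (Collections.disjoint(e.getValue(), aliveNodes))
                blocked.add(e.getKey());
        }
        return blocked;
    }

    /** A reset is allowed only if every lost partition has at least one owner. */
    public static boolean canReset(
            Map<Integer, Set<String>> exOwnersByLostPartition,
            Set<String> aliveNodes) {
        return partitionsWithoutOwner(exOwnersByLostPartition, aliveNodes).isEmpty();
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> lost = new HashMap<>();
        lost.put(0, Set.of("node2", "node3")); // partition 0 was owned by node2 and node3
        lost.put(1, Set.of("node3"));          // partition 1 was owned by node3 only

        // Only node2 has come back: partition 1 still has no owner.
        System.out.println(partitionsWithoutOwner(lost, Set.of("node1", "node2"))); // prints [1]
        System.out.println(canReset(lost, Set.of("node1", "node2")));               // prints false
    }
}
```

In this model, returning only node2 leaves partition 1 without an owner, so the reset stays blocked until node3 is also back.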

Re: [DISCUSS] Data loss handling improvements

2020-05-06 Thread Alexei Scherbakov
Wed, May 6, 2020 at 12:54, Anton Vinogradov: > Alexei, > > 1,2,4,5 - looks good to me, no objections here. > > >> 3. Lost state is impossible to reset if a topology doesn't have at least > >> one owner for each lost partition. > > Do you mean that, according to your example, where > >> a node2

Re: [DISCUSS] Data loss handling improvements

2020-05-06 Thread Anton Vinogradov
Alexei, 1, 2, 4 and 5 look good to me, no objections here. >> 3. Lost state is impossible to reset if a topology doesn't have at least >> one owner for each lost partition. Do you mean that, according to your example, where >> a node2 has left, soon a node3 has left. If the node2 is returned to >>

[DISCUSS] Data loss handling improvements

2020-05-06 Thread Alexei Scherbakov
Folks, I've almost finished a patch bringing some improvements to the data loss handling code, and I wish to discuss proposed changes with the community before submitting. *The issue* During the grid's lifetime, it's possible to get into a situation when some data nodes have failed or