[jira] [Comment Edited] (IGNITE-9913) Prevent data updates blocking in case of backup BLT server node leave

2019-12-12 Thread Alexei Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994406#comment-16994406
 ] 

Alexei Scherbakov edited comment on IGNITE-9913 at 12/12/19 9:27 AM:
-

[~avinogradov]

1. I've come to the conclusion that calculating the rebalanced state on the 
coordinator is the most robust way to tell that the grid is rebalanced. Let's keep it.
2. I've left two comments in your PR regarding the change.
3. OK.



> Prevent data updates blocking in case of backup BLT server node leave
> -
>
> Key: IGNITE-9913
> URL: https://issues.apache.org/jira/browse/IGNITE-9913
> Project: Ignite
>  Issue Type: Improvement
>  Components: general
>Reporter: Ivan Rakov
>Assignee: Anton Vinogradov
>Priority: Major
> Attachments: 9913_yardstick.png, master_yardstick.png
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> An Ignite cluster performs a distributed partition map exchange (PME) when any
> server node leaves or joins the topology.
> Distributed PME blocks all updates and may take a long time. If all
> partitions are assigned according to the baseline topology and a server node
> leaves, there's no actual need to perform distributed PME: every cluster node
> is able to recalculate new affinity assignments and partition states locally.
> If we implement such a lightweight PME and handle mapping and lock requests
> on the new topology version correctly, updates won't be stopped (except updates
> to partitions that lost their primary copy).
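
A minimal sketch of the idea in the description above, as a toy model rather than Ignite code: if every node holds the same assignment and applies the same deterministic rule when a baseline node leaves, each node can derive the new assignment locally, and only partitions whose primary was on the left node need to wait. The class name, the method names and the convention that index 0 of an owner list is the primary are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: all names are hypothetical and are not Ignite APIs. */
class LocalAffinityRecalc {
    /**
     * Returns a new assignment with the left node dropped from every owner list.
     * Assumption: index 0 of each list is the primary, the rest are backups,
     * so the first remaining backup becomes the new primary.
     */
    static Map<Integer, List<String>> onNodeLeft(Map<Integer, List<String>> assignment, String leftNode) {
        Map<Integer, List<String>> next = new LinkedHashMap<>();

        for (Map.Entry<Integer, List<String>> e : assignment.entrySet()) {
            List<String> owners = new ArrayList<>(e.getValue());

            owners.remove(leftNode);

            next.put(e.getKey(), owners);
        }

        return next;
    }

    /** Partitions that lost their primary copy; only updates to these have to wait. */
    static List<Integer> primaryChanged(Map<Integer, List<String>> assignment, String leftNode) {
        List<Integer> res = new ArrayList<>();

        for (Map.Entry<Integer, List<String>> e : assignment.entrySet()) {
            List<String> owners = e.getValue();

            if (!owners.isEmpty() && owners.get(0).equals(leftNode))
                res.add(e.getKey());
        }

        return res;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> assignment = new LinkedHashMap<>();

        assignment.put(0, Arrays.asList("A", "B"));
        assignment.put(1, Arrays.asList("B", "C"));
        assignment.put(2, Arrays.asList("C", "A"));

        System.out.println("New assignment: " + onNodeLeft(assignment, "B"));
        System.out.println("Primary changed: " + primaryChanged(assignment, "B"));
    }
}
```

Because the function depends only on the previous assignment and the id of the left node, two nodes that start from the same state compute the same result without exchanging messages. The real implementation also has to deal with partition states and in-flight mapping and lock requests, which this sketch leaves out.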



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (IGNITE-9913) Prevent data updates blocking in case of backup BLT server node leave

2019-10-12 Thread Alexei Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950019#comment-16950019
 ] 

Alexei Scherbakov edited comment on IGNITE-9913 at 10/12/19 12:14 PM:
--

[~avinogradov]

I've reviewed the changes. They seem to follow the architecture we discussed privately.
I left comments in the PR.
Besides that, you should definitely add more tests.
Important scenarios might be:
1. A baseline node leaves under tx load while rebalancing is in progress 
(rebalancing is caused by other nodes joining).
2. Owners leave one by one under tx load until a subset of partitions has a 
single owner.
3. Owners leave one by one under tx load until a subset of partitions has 
no owner. Validate partition loss.

All tests should check partition integrity: see 
org.apache.ignite.testframework.junits.common.GridCommonAbstractTest#assertPartitionsSame

Do you plan to implement non-blocking mapping for transactions not 
affected by the topology change in the same ticket?

Let me know if a personal discussion is required.
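
To make scenarios 2 and 3 concrete, here is a rough sketch of their shape as a self-contained toy model. It does not use the Ignite test framework: OwnerLeaveScenario and its methods are hypothetical names, and partitionsSame() only imitates the kind of check that assertPartitionsSame is referred to for above (all surviving copies of a partition hold the same data).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: hypothetical names, not the Ignite test framework. */
class OwnerLeaveScenario {
    /** Partition id -> (owner node id -> that node's copy of the partition data). */
    private final Map<Integer, Map<String, List<Integer>>> copies = new TreeMap<>();

    void addCopy(int part, String node, List<Integer> data) {
        copies.computeIfAbsent(part, p -> new TreeMap<>()).put(node, new ArrayList<>(data));
    }

    /** Simulates a node leave: every copy held by that node disappears. */
    void nodeLeft(String node) {
        for (Map<String, List<Integer>> owners : copies.values())
            owners.remove(node);
    }

    /** Partitions with exactly one remaining owner (scenario 2). */
    List<Integer> singleOwnerPartitions() {
        return partitionsWithOwnerCount(1);
    }

    /** Partitions with no remaining owner, i.e. lost partitions (scenario 3). */
    List<Integer> lostPartitions() {
        return partitionsWithOwnerCount(0);
    }

    private List<Integer> partitionsWithOwnerCount(int cnt) {
        List<Integer> res = new ArrayList<>();

        for (Map.Entry<Integer, Map<String, List<Integer>>> e : copies.entrySet()) {
            if (e.getValue().size() == cnt)
                res.add(e.getKey());
        }

        return res;
    }

    /** Integrity check: all surviving copies of each partition must be equal. */
    boolean partitionsSame() {
        for (Map<String, List<Integer>> owners : copies.values()) {
            List<Integer> first = null;

            for (List<Integer> copy : owners.values()) {
                if (first == null)
                    first = copy;
                else if (!first.equals(copy))
                    return false;
            }
        }

        return true;
    }

    public static void main(String[] args) {
        OwnerLeaveScenario s = new OwnerLeaveScenario();

        s.addCopy(0, "A", Arrays.asList(1, 2));
        s.addCopy(0, "B", Arrays.asList(1, 2));
        s.addCopy(1, "B", Arrays.asList(3));
        s.addCopy(1, "C", Arrays.asList(3));

        s.nodeLeft("A");
        System.out.println("Single owner: " + s.singleOwnerPartitions() + ", same: " + s.partitionsSame());

        s.nodeLeft("B");
        System.out.println("Lost: " + s.lostPartitions());
    }
}
```

A real test would run transactional load in parallel with the node stops and would call the framework's own integrity check after each step; the sketch only shows the order of events and which condition ends each scenario.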








> Prevent data updates blocking in case of backup BLT server node leave
> -
>
> Key: IGNITE-9913
> URL: https://issues.apache.org/jira/browse/IGNITE-9913
> Project: Ignite
>  Issue Type: Improvement
>  Components: general
>Reporter: Ivan Rakov
>Assignee: Anton Vinogradov
>Priority: Major
> Fix For: 2.8
>
> Attachments: 9913_yardstick.png, master_yardstick.png
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h





[jira] [Comment Edited] (IGNITE-9913) Prevent data updates blocking in case of backup BLT server node leave

2019-05-24 Thread Amelchev Nikita (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847456#comment-16847456
 ] 

Amelchev Nikita edited comment on IGNITE-9913 at 5/24/19 2:14 PM:
--

I am investigating the question of MOVING partitions: should we allow lightweight 
PME if the cluster has such partitions or not? 
[Dev-list 
discussion.|http://apache-ignite-developers.2346864.n4.nabble.com/Lightweight-version-of-partitions-map-exchange-td41551.html]



> Prevent data updates blocking in case of backup BLT server node leave
> -
>
> Key: IGNITE-9913
> URL: https://issues.apache.org/jira/browse/IGNITE-9913
> Project: Ignite
>  Issue Type: Improvement
>  Components: general
>Reporter: Ivan Rakov
>Assignee: Amelchev Nikita
>Priority: Major
> Fix For: 2.8
>
> Attachments: 9913_yardstick.png, master_yardstick.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h


