[
https://issues.apache.org/jira/browse/IGNITE-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176971#comment-15176971
]
Valentin Kulichenko commented on IGNITE-1267:
---------------------------------------------
Removing the check completely is wrong, because in that case a job can be
stolen by (or failed over to) a node that is not included in the cluster group
specified when the task or closure was executed.
I think the best fix is to include the cluster group, along with the list of
nodes, in the task session. For reference, see any method in {{IgniteCompute}}
and the {{TC_SUBGRID}} context key. Once the session carries the cluster group,
all such checks can be fixed properly.
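To illustrate the difference between the two checks, here is a minimal, self-contained sketch. It is not Ignite code: {{Node}}, {{Session}}, {{groupFilter}}, {{canStealStatic}} and {{canStealWithGroup}} are hypothetical names made up for this example, and the real session and SPI classes look different. It only shows why a check against a static node list rejects a late joiner, while a check against the cluster group predicate still rejects nodes outside the group.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

public class StealCheckSketch {
    /** Hypothetical stand-in for a cluster node: an id plus one attribute. */
    static final class Node {
        final String id;
        final String role;

        Node(String id, String role) {
            this.id = id;
            this.role = role;
        }
    }

    /** Hypothetical stand-in for the data carried by a task session. */
    static final class Session {
        /** Node ids captured once, when the task was executed. */
        final Set<String> topologyIds;

        /** Proposed addition: the predicate that defines the cluster group. */
        final Predicate<Node> groupFilter;

        Session(Set<String> topologyIds, Predicate<Node> groupFilter) {
            this.topologyIds = topologyIds;
            this.groupFilter = groupFilter;
        }
    }

    /** Current behaviour as described in the issue: only the static snapshot counts. */
    static boolean canStealStatic(Session ses, Node thief) {
        return ses.topologyIds.contains(thief.id);
    }

    /** Proposed behaviour: any node that satisfies the group predicate may steal. */
    static boolean canStealWithGroup(Session ses, Node thief) {
        return ses.groupFilter.test(thief);
    }

    public static void main(String[] args) {
        // The task was started on a group of "worker" nodes; n1 and n2 were alive then.
        Session ses = new Session(
            new HashSet<>(Arrays.asList("n1", "n2")),
            n -> "worker".equals(n.role));

        Node lateWorker = new Node("n3", "worker"); // joined after task start
        Node outsider = new Node("n4", "client");   // never part of the group

        System.out.println("static, late worker: " + canStealStatic(ses, lateWorker));
        System.out.println("group,  late worker: " + canStealWithGroup(ses, lateWorker));
        System.out.println("group,  outsider:    " + canStealWithGroup(ses, outsider));
    }
}
```

In this sketch the late worker fails the static check but passes the predicate check, and the outsider fails both, which is the behaviour the check is supposed to preserve.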
> JobStealingCollisionSpi never sends jobs to a node that joined after task was
> executed
> --------------------------------------------------------------------------------------
>
> Key: IGNITE-1267
> URL: https://issues.apache.org/jira/browse/IGNITE-1267
> Project: Ignite
> Issue Type: Bug
> Components: compute
> Affects Versions: 1.1.4
> Reporter: Valentin Kulichenko
> Labels: user-request
>
> Corresponding user thread (contains a detailed description of the scenario
> that doesn't work):
> http://apache-ignite-users.70518.x6.nabble.com/Dynamic-ComputeTask-distribution-with-new-nodes-td997.html
> Essentially, {{JobStealingCollisionSpi}} always skips jobs that are not in the
> task topology (see line 713). The task topology is static and is created when
> the task is executed, so a newly joined node can't steal jobs. I think it
> should be able to do so if it satisfies the initial cluster group predicate.