Hi Ryan,

Ignite provides JobStealingCollisionSpi and JobStealingFailoverSpi for this. See the CollisionSpi [1] and FailoverSpi [2] documentation. However, there is an unresolved issue [3] that prevents jobs from being stolen by a newly joined node.
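
For reference, here is a minimal sketch of how the two SPIs could be wired together in Java. The class name and the threshold values are placeholders I picked for illustration, not recommendations, and I am assuming the same configuration is applied on every node that should take part in stealing:

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.collision.jobstealing.JobStealingCollisionSpi;
import org.apache.ignite.spi.failover.jobstealing.JobStealingFailoverSpi;

// Hypothetical node launcher; the class name is illustrative only.
public class JobStealingNode {
    public static void main(String[] args) {
        JobStealingCollisionSpi collisionSpi = new JobStealingCollisionSpi();

        // Placeholder values: tune them for your own workload.
        collisionSpi.setActiveJobsThreshold(4);     // jobs allowed to run in parallel on this node
        collisionSpi.setWaitJobsThreshold(0);       // waiting jobs above this count may be stolen by other nodes
        collisionSpi.setMaximumStealingAttempts(5); // how many times a single job may be stolen
        collisionSpi.setStealingEnabled(true);      // allow this node to steal from others

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCollisionSpi(collisionSpi);
        // The job stealing collision SPI is meant to be paired with the matching failover SPI.
        cfg.setFailoverSpi(new JobStealingFailoverSpi());

        // Starts the node; it keeps running until it is stopped.
        Ignition.start(cfg);
    }
}
```

Keep in mind that, because of [3], a node started this way after the task has begun may still not receive any stolen jobs.
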
[1] https://apacheignite.readme.io/docs/job-scheduling
[2] https://apacheignite.readme.io/docs/fault-tolerance
[3] https://issues.apache.org/jira/browse/IGNITE-1267

On Tue, May 16, 2017 at 11:04 PM, Ryan Ripken <[email protected]> wrote:

> In the GridGain days I was able to add nodes to an already started task.
> Is there a way to do that in Ignite? I occasionally have nodes disconnect
> or crash in a custom native library. I'd like to be able to restart those
> failed nodes or add additional nodes if the compute isn't progressing as
> quickly as initially hoped.
>
> If it's not possible to add nodes to an already started task, are there
> patterns or tricks that can be used to accomplish something similar?
>
> It seems like one trick might be to take the original task (100 jobs) and
> turn it into many more tasks (10) with fewer jobs (10) per task. The new
> tasks aren't started all at once but staggered over time so that if
> additional nodes join (after task 1 has already started) they can
> contribute to the later tasks.
>
> It's a crude solution but it seems like it would have to work. Before I
> refactor my tasks and jobs to try it out I'm wondering if someone has a
> better suggestion. Is there a better way to accomplish something similar?
>
> Thanks!

--
Best regards,
Andrey V. Mashenkov
