Thanks for pointing me to the ComputeJobMasterLeaveAware interface.
I've implemented it the following way:

1) Created a TaskInfo object that holds the task parameters object, status,
result, task class name, etc.
2) During node selection in the map method, the taskInfo is saved to the
cache with the master node id, the assigned node id and status "running"
(the CPU adaptive load balancer is used).
3) The task executes on some node and returns its results; any exceptions
are wrapped into taskInfo's task result field.
4) The initiator node receives a callback from the task future object and
updates the cache with the taskInfo changed to the finished or failed state.
5) If the master node dies, no callbacks are issued, of course, and the
taskInfo remains in the running state.
6) When a client API request for task status/result arrives, I use this code
to check whether the initiator node is still alive (only taskInfos in the
running state are checked):

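            // The master (initiator) node is gone, so no callback will arrive: resubmit.
            if (!Engine.isNodeAlive(taskInfo.masterNodeId)) {
                // Unchecked cast: the stored class name is assumed to be a Task implementation.
                @SuppressWarnings("unchecked")
                Class<? extends Task<Object, TaskResult>> taskClass =
                    (Class<? extends Task<Object, TaskResult>>) Class.forName(taskInfo.taskClass);
                taskInfo = Engine.addTask(taskClass, taskInfo.params);
            }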

7) The updated taskInfo goes back to the cache with running status.
8) All finished or failed taskInfos are removed from the cache on the first
client API request or after 1 hour without access (I will probably keep only
the time-based expiration).
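
The steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual implementation: a plain
ConcurrentHashMap stands in for the distributed cache, node liveness is passed
in as a Predicate instead of calling Engine.isNodeAlive, and the names
TaskRecoverySketch, submit, complete and status are hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names are part of the Ignite API.
final class TaskRecoverySketch {
    enum Status { RUNNING, FINISHED, FAILED }

    /** Immutable task record with the fields listed in step 1. */
    static final class TaskInfo {
        final String taskId;
        final String taskClass;
        final Object params;
        final UUID masterNodeId;
        final Status status;
        final Object result;

        TaskInfo(String taskId, String taskClass, Object params,
                 UUID masterNodeId, Status status, Object result) {
            this.taskId = taskId;
            this.taskClass = taskClass;
            this.params = params;
            this.masterNodeId = masterNodeId;
            this.status = status;
            this.result = result;
        }

        /** Returns a copy with a new status and result (or wrapped exception). */
        TaskInfo with(Status newStatus, Object newResult) {
            return new TaskInfo(taskId, taskClass, params, masterNodeId, newStatus, newResult);
        }
    }

    /** Stand-in for the distributed cache (assumption: a local map). */
    static final Map<String, TaskInfo> CACHE = new ConcurrentHashMap<>();

    /** Steps 2 and 7: store the task as RUNNING under the given master node. */
    static TaskInfo submit(String taskId, String taskClass, Object params, UUID masterNodeId) {
        TaskInfo info = new TaskInfo(taskId, taskClass, params, masterNodeId, Status.RUNNING, null);
        CACHE.put(taskId, info);
        return info;
    }

    /** Steps 3 and 4: future callback on the initiator node; an error is kept as the result. */
    static void complete(String taskId, Object result, Throwable error) {
        CACHE.computeIfPresent(taskId, (id, info) -> error == null
            ? info.with(Status.FINISHED, result)
            : info.with(Status.FAILED, error));
    }

    /** Steps 5, 6 and 8: handle a client status request. */
    static TaskInfo status(String taskId, Predicate<UUID> isNodeAlive, UUID newMasterNodeId) {
        TaskInfo info = CACHE.get(taskId);
        if (info == null) {
            return null;
        }
        if (info.status != Status.RUNNING) {
            // Step 8: finished or failed entries are dropped on the first request.
            CACHE.remove(taskId);
            return info;
        }
        if (!isNodeAlive.test(info.masterNodeId)) {
            // Steps 5 and 6: the master died, so resubmit under a new master node.
            info = submit(taskId, info.taskClass, info.params, newMasterNodeId);
        }
        return info;
    }

    public static void main(String[] args) {
        UUID deadMaster = UUID.randomUUID();
        UUID newMaster = UUID.randomUUID();
        submit("demo", "com.example.DemoTask", "params", deadMaster);
        // Pretend every node is dead, so the running task is resubmitted.
        TaskInfo info = status("demo", nodeId -> false, newMaster);
        System.out.println(info.status + " on master " + info.masterNodeId);
    }
}
```

One design point in this sketch: TaskInfo is immutable and every state change
writes a fresh copy to the map, which mirrors putting an updated value back
into a cache rather than mutating a shared object.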



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Async-workers-tp4813p4914.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
