Hello, I can't seem to get reconcileTasks() working and was wondering whether I am using it incorrectly or hitting a problem. Here's what's happening:

1. There's one Mesos (0.18) master, one slave, and one framework, all running on Ubuntu 12.04.
2. Mesos master and slave come up fine (using ZooKeeper, but that isn't relevant here, I'd think).
3. My framework registers and gets offers.
4. Two tasks are launched; both start running fine on the single available slave.
5. I restart my framework. During restart my framework knows that it had previously launched two tasks that were last known to be in running state. Therefore, upon getting the registered() callback, it calls driver.reconcileTasks() for the two tasks. In actuality, the tasks are still running fine. I see this in the Mesos master logs:

   I0417 12:26:27.207361 27301 master.cpp:2154] Performing task state reconciliation for framework MyFramework

   But there are no other logs about reconciliation.
6. My framework gets no callback about the status of the tasks that it requested reconciliation on.

At this point, I am not sure if the lack of a callback for status update is due to
a) the fact that my framework asked for reconciliation on running state, which Mesos also knows to be true, therefore, no status update, or
b) the reconcile not working (hopefully this; reason (a) would be problematic).

So, I then proceed to another test:

7. Kill my framework and Mesos master.
8. Then, kill the slave (as an aside, this seems to have killed the tasks as well).
9. Restart Mesos master.
10. Restart my framework. Now, again the reconciliation is requested.
11. Still no callback. At this time, Mesos master doesn't know about the slave because it hasn't returned since the master restarted. What is the expected behavior for reconciliation under these circumstances?
12. Restarted slave.
13. Killed and restarted my framework.
14. Still no callback for reconciliation.

Given these results, I can't see how reconciliation is working at all. I did try this with Mesos 0.16 first and then upgraded to 0.18 to see if it makes a difference.

Thank you for any ideas on getting this resolved.

Sharma