Currently the reconciliation phase runs only in register and reregister, so the scheduler learns the real status of its tasks only when one of these methods is called. When a task is launched, no explicit reconciliation is performed for it right away. In addition, the reconciliation phase blocks the thread until the process has completed, and under certain circumstances it can sleep for too long.
I propose a non-blocking reconciliation phase based on an exponential backoff of time intervals, handled according to one of these options:

First option: Reconcile in register, reregister and resourceOffers. Even if reconciliation is invoked, nothing is reconciled unless the current backoff interval has elapsed. Every phase of the backoff performs an explicit reconciliation except the last one, which is implicit. The explicit reconciliations cover all the tasks the Myriad scheduler knows about. These tasks are stored in a structure inside the reconciliation service and removed from it as their status messages arrive through the update method. If another reconciliation is needed while some tasks have still not been reconciled, only those missing tasks are reconciled again.

Example:
  2m + method call delay (explicit)
  4m + method call delay (explicit)
  8m + method call delay (implicit)

Explicit phase:
  if taskReconcile == empty { taskReconcile = scheduler.state; reconcile(taskReconcile) }
  else { reconcile(taskReconcile) }  // not empty: only the tasks still pending

Implicit phase:
  taskReconcile = emptyList; reconcile(taskReconcile)

Second option: The reconciliation class does not keep track of the tasks. Reconciliations are simply launched whenever the backoff allows it, and the scheduler is responsible for implementing (in the future) a behavior that detects tasks that do not reconcile properly. I see no point in having a mechanism inside ReconcileService to know whether a reconciliation ended well.

  2m + method call delay (explicit)
  4m + method call delay (explicit)
  8m + method call delay (implicit)

  if maxIterationBackoff { reconcile(emptyList) }  // last phase: implicit
  else { reconcile(stateScheduler.state) }         // earlier phases: explicit

Third option: A dedicated thread responsible for reconciliation. It can be implemented on top of option 1 or option 2 (I prefer 2).

  2m (explicit)
  4m (explicit)
  8m (implicit)

Fourth option: Make a more complex reconciliation.
In special circumstances that may require an adjustment, or that are caused by an error in the system (launching a new task, register, reregister), provide an ad hoc reconciliation mechanism for each circumstance. We would also keep launching reconciliations with backoff.

In all the options, an implicit reconciliation is made in the last phase of the backoff to discover any tasks in an unknown state.

Some questions I have:
- Does it make sense to make the reconciliation interval configurable?
- Can we propose an explicit reconciliation for a task that has just been launched, or is an explicit reconciliation of everything sufficient by default?
- Is it necessary to wait for reconciliation to end before launching a task?
- Maybe we should force an implicit reconciliation in register and reregister?
- Could we move the reconciliation phase to our own thread and not depend on the driver's calls (register, reregister, etc.)?
- Could we remove the synchronization (taskReconcile) and just make the call to reconcile? Does it make sense to track whether reconciliation has completed if we do nothing different either way?

What do you think?

Many thanks,
JuanP
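
P.S. To make the first option more concrete, here is a rough Java sketch of the backoff gate and the pending-task bookkeeping. It is only an illustration under several assumptions: the class BackoffReconciler and all of its members are hypothetical names that do not exist in Myriad, each interval is measured from the previous reconciliation, and the backoff simply stops after the implicit phase until reset() is called. The Consumer stands in for the Mesos driver call (SchedulerDriver.reconcileTasks), where an empty collection requests an implicit reconciliation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/** Hypothetical sketch of option 1; not existing Myriad code. */
final class BackoffReconciler {
    private final long baseIntervalMs;   // first interval, e.g. 2 minutes
    private final int maxPhases;         // e.g. 3: explicit, explicit, implicit
    private final LongSupplier clock;    // injected so the gate is testable
    private final Consumer<Collection<String>> reconcileCall; // stand-in for the driver

    private final Set<String> pending = new HashSet<>(); // tasks not yet confirmed
    private int phase = 0;
    private long lastRunMs;

    BackoffReconciler(long baseIntervalMs, int maxPhases, LongSupplier clock,
                      Consumer<Collection<String>> reconcileCall) {
        this.baseIntervalMs = baseIntervalMs;
        this.maxPhases = maxPhases;
        this.clock = clock;
        this.reconcileCall = reconcileCall;
        this.lastRunMs = clock.getAsLong();
    }

    /**
     * Meant to be called from register, reregister and resourceOffers.
     * Never blocks; returns true only if a reconciliation request was sent.
     */
    synchronized boolean maybeReconcile(Collection<String> schedulerTaskIds) {
        if (phase >= maxPhases) {
            return false;                          // backoff exhausted (assumption)
        }
        long interval = baseIntervalMs << phase;   // 2m, 4m, 8m, ...
        if (clock.getAsLong() - lastRunMs < interval) {
            return false;                          // interval not yet elapsed
        }
        if (phase == maxPhases - 1) {
            // Implicit phase: an empty list asks for every task the master knows.
            pending.clear();
            reconcileCall.accept(new ArrayList<>());
        } else {
            // Explicit phase: start from the scheduler state, or retry leftovers.
            if (pending.isEmpty()) {
                pending.addAll(schedulerTaskIds);
            }
            reconcileCall.accept(new ArrayList<>(pending));
        }
        phase++;
        lastRunMs = clock.getAsLong();
        return true;
    }

    /** Meant to be called from the status update callback. */
    synchronized void onStatusUpdate(String taskId) {
        pending.remove(taskId);
    }

    /** Restarts the backoff, e.g. after a new registration (assumption). */
    synchronized void reset() {
        phase = 0;
        pending.clear();
        lastRunMs = clock.getAsLong();
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

The second option would be the same gate without the pending set: explicit phases would always pass the full scheduler state. The third option could run the same check from a scheduled thread instead of from the driver callbacks.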