Re: [Xenomai-core] [RFC] Fixes for domain migration races
On 2011-07-27 20:44, Gilles Chanteperdrix wrote:
> On 07/19/2011 08:44 AM, Jan Kiszka wrote:
>> Hi,
>>
>> I've just uploaded my upstream queue that mostly deals with the
>> various races I found in the domain migration code.
>>
>> One of my concerns raised earlier turned out to be unfounded: We do
>> not allow Linux to wake up a task that has TASK_ATOMICSWITCH set. So
>> the deletion race can indeed be fixed by the patch I sent earlier.
>
> So, I still have the same question: is not the solution of
> synchronizing with the gatekeeper as soon as we get out from schedule
> in secondary mode better than waiting for the task_exit callback? It
> looks more correct, and it avoids gksched.

Yes, I was on the wrong track wrt wakeup races during the early
migration phase.

It is a possible and valid scenario that the task returns from
schedule() without being migrated. That can only happen if a signal was
queued in the meantime. The task will not be woken up again, that is
prevented by ATOMICSWITCH, but it will check for pending signals itself
before falling asleep. In that case it will enter TASK_RUNNING again and
return either before the gatekeeper could run or, on SMP, may continue
in parallel on a different CPU.

What saves us now from the fatal scenario, in which the task runs under
Linux while the gatekeeper also resumes its Xenomai part, is that the
task has left the TASK_INTERRUPTIBLE state. And if we wait for the
gatekeeper to realize this like you suggested, we ensure that neither
the object is deleted too early nor TASK_INTERRUPTIBLE is re-entered by
doing Linux work.

I've cleaned up my queue correspondingly and just pushed it.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
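The signal case described in the reply above can be modelled in plain
user-space C. This is only a sketch of the reasoning, not code from the
patch queue: every name below (model_task, model_request_migration,
model_gatekeeper, model_may_continue_linux_work) is hypothetical, and
the two-state enum merely stands in for TASK_RUNNING and
TASK_INTERRUPTIBLE. It assumes the gatekeeper resumes the Xenomai side
only if the task is still asleep with the switch flag set, and that the
task waits for the gatekeeper before doing further Linux work.

```c
/* User-space model only; none of these names are Xenomai or Linux APIs. */
#include <assert.h>
#include <stdbool.h>

enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

struct model_task {
    enum model_state state;
    bool atomicswitch;   /* stands in for TASK_ATOMICSWITCH */
    bool signal_pending; /* a Linux signal was queued in the meantime */
    bool gk_done;        /* the gatekeeper has looked at the request */
    bool in_primary;     /* the Xenomai side was resumed */
};

/* Task side: announce the migration, then "call schedule()". */
static void model_request_migration(struct model_task *t)
{
    t->state = MODEL_INTERRUPTIBLE;
    t->atomicswitch = true;
    t->gk_done = false;
    /* The task checks for pending signals itself before sleeping. */
    if (t->signal_pending)
        t->state = MODEL_RUNNING;
}

/* Gatekeeper side: resume the Xenomai part only if still asleep. */
static void model_gatekeeper(struct model_task *t)
{
    if (t->state == MODEL_INTERRUPTIBLE && t->atomicswitch)
        t->in_primary = true;
    t->atomicswitch = false;
    t->gk_done = true;
    /* The fatal scenario: running under Linux and resumed by Xenomai. */
    assert(!(t->in_primary && t->state == MODEL_RUNNING));
}

/* Task side, back from schedule() in secondary mode: wait for this. */
static bool model_may_continue_linux_work(const struct model_task *t)
{
    return t->gk_done;
}
```

With signal_pending set, the model leaves in_primary false, and
model_may_continue_linux_work() only returns true once the gatekeeper
has run, which is the ordering suggested in the question quoted above.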
Re: [Xenomai-core] [RFC] Fixes for domain migration races
On 07/19/2011 08:44 AM, Jan Kiszka wrote:
> Hi,
>
> I've just uploaded my upstream queue that mostly deals with the
> various races I found in the domain migration code.
>
> One of my concerns raised earlier turned out to be unfounded: We do
> not allow Linux to wake up a task that has TASK_ATOMICSWITCH set. So
> the deletion race can indeed be fixed by the patch I sent earlier.

So, I still have the same question: is not the solution of
synchronizing with the gatekeeper as soon as we get out from schedule
in secondary mode better than waiting for the task_exit callback? It
looks more correct, and it avoids gksched.

-- 
Gilles.
[Xenomai-core] [RFC] Fixes for domain migration races
Hi,

I've just uploaded my upstream queue that mostly deals with the various
races I found in the domain migration code.

One of my concerns raised earlier turned out to be unfounded: We do not
allow Linux to wake up a task that has TASK_ATOMICSWITCH set. So the
deletion race can indeed be fixed by the patch I sent earlier.

However, we do not synchronize setting and testing of TASK_ATOMICSWITCH
(because we cannot hold the rq lock), thus we still face a small race
window that allows premature wakeups, at least in theory. That's now
addressed by patch 3.

Besides another race around set/clear_task_nowakeup, there appears to
have been a window during early migration to RT where we silently
swallowed Linux signals. Closed by patch 4, hopefully also fixing our
spurious gdb lockups on SMP boxes - time will tell.

Please review carefully.

Jan