Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-19 17:41, Gilles Chanteperdrix wrote: Merged your whole branch, but took the liberty to change it a bit (replacing the commit concerning unlocked context switches with comment changes only, and changing the commit about xntbase_tick). What makes splmax() redundant for the unlocked context switch case? IMO that bug is still present. We can clean up xnintr_clock_handler a bit after the changes, will follow up with a patch. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [PULL] native: Fix msendq fastlock leakage
On 2011-06-19 15:00, Gilles Chanteperdrix wrote: On 06/19/2011 01:17 PM, Gilles Chanteperdrix wrote: On 06/19/2011 12:14 PM, Gilles Chanteperdrix wrote: I am working on this ppd cleanup issue again, I am asking for help to find a fix in -head for all cases where the sys_ppd is needed during some cleanup. The problem is that when the ppd cleanup is invoked: - we have no guarantee that current is a thread from the Xenomai application; - if it is, current->mm is NULL. So, associating the sys_ppd to either current or current->mm does not work. What we could do is pass the sys_ppd to all the other ppds cleanup handlers, this would fix cases such as freeing mutex fastlocks, but that does not help when the sys_ppd is needed during a thread deletion hook. I would like to find a solution where simply calling xnsys_ppd_get() will work, where we do not have an xnsys_ppd_get for each context, such as for instance xnsys_ppd_get_by_mm/xnsys_ppd_get_by_task_struct, because it would be too error-prone. Any idea anyone? The best I could come up with: use a ptd to store the mm currently being cleaned up, so that xnshadow_ppd_get continues to work, even in the middle of a cleanup. In order to also get xnshadow_ppd_get to work in task deletion hooks (which is needed to avoid the issue at the origin of this thread), we also need to set this ptd upon shadow mapping, so it is still there when reaching the task deletion hook (where current->mm may be NULL). Hence the patch: diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c index b243600..6bc4210 100644 --- a/ksrc/nucleus/shadow.c +++ b/ksrc/nucleus/shadow.c @@ -65,6 +65,11 @@ int nkthrptd; EXPORT_SYMBOL_GPL(nkthrptd); int nkerrptd; EXPORT_SYMBOL_GPL(nkerrptd); +int nkmmptd; +EXPORT_SYMBOL_GPL(nkmmptd); + +#define xnshadow_mmptd(t) ((t)->ptd[nkmmptd]) +#define xnshadow_mm(t) ((struct mm_struct *)xnshadow_mmptd(t)) xnshadow_mm() can now return a no longer existing mm. So no user of xnshadow_mm should ever dereference that pointer.
Thus we had better change all users to treat the return value as a void pointer, e.g. struct xnskin_slot { struct xnskin_props *props; @@ -1304,6 +1309,8 @@ int xnshadow_map(xnthread_t *thread, xncompletion_t __user *u_completion, * friends. */ xnshadow_thrptd(current) = thread; + xnshadow_mmptd(current) = current->mm; + rthal_enable_notifier(current); if (xnthread_base_priority(thread) == 0 @@ -2759,7 +2766,15 @@ static void detach_ppd(xnshadow_ppd_t * ppd) static inline void do_cleanup_event(struct mm_struct *mm) { + struct task_struct *p = current; + struct mm_struct *old; + + old = xnshadow_mm(p); + xnshadow_mmptd(p) = mm; + ppd_remove_mm(mm, detach_ppd); + + xnshadow_mmptd(p) = old; I don't have the full picture yet, but that feels racy: If the context over which we clean up that foreign mm is also using xnshadow_mmptd, other threads in that process may dislike this temporary change. } RTHAL_DECLARE_CLEANUP_EVENT(cleanup_event); @@ -2925,7 +2940,7 @@ EXPORT_SYMBOL_GPL(xnshadow_unregister_interface); xnshadow_ppd_t *xnshadow_ppd_get(unsigned muxid) { if (xnpod_userspace_p()) - return ppd_lookup(muxid, current->mm); + return ppd_lookup(muxid, xnshadow_mm(current) ?: current->mm); return NULL; } @@ -2960,8 +2975,9 @@ int xnshadow_mount(void) sema_init(&completion_mutex, 1); nkthrptd = rthal_alloc_ptdkey(); nkerrptd = rthal_alloc_ptdkey(); + nkmmptd = rthal_alloc_ptdkey(); - if (nkthrptd < 0 || nkerrptd < 0) { + if (nkthrptd < 0 || nkerrptd < 0 || nkmmptd < 0) { printk(KERN_ERR "Xenomai: cannot allocate PTD slots\n"); return -ENOMEM; } diff --git a/ksrc/skins/posix/mutex.c b/ksrc/skins/posix/mutex.c index 6ce75e5..cc86852 100644 --- a/ksrc/skins/posix/mutex.c +++ b/ksrc/skins/posix/mutex.c @@ -219,10 +219,6 @@ void pse51_mutex_destroy_internal(pse51_mutex_t *mutex, xnlock_put_irqrestore(&nklock, s); #ifdef CONFIG_XENO_FASTSYNCH - /* We call xnheap_free even if the mutex is not pshared; when - this function is called from pse51_mutexq_cleanup, the - sem_heap is destroyed, or not the one to which the fastlock - belongs, xnheap will simply return an error. */ I think this comment is not completely obsolete. It still applies w.r.t. shared/non-shared. xnheap_free(&xnsys_ppd_get(mutex->attr.pshared)->sem_heap, mutex->synchbase.fastlock); #endif /* CONFIG_XENO_FASTSYNCH */ If we can resolve that potential race, this looks like a nice solution. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 06/20/2011 06:43 PM, Jan Kiszka wrote: On 2011-06-19 17:41, Gilles Chanteperdrix wrote: Merged your whole branch, but took the liberty to change it a bit (replacing the commit concerning unlocked context switches with comment changes only, and changing the commit about xntbase_tick). What makes splmax() redundant for the unlocked context switch case? IMO that bug is still present. No, the bug is between my keyboard and chair. On architectures with unlocked context switches, the Linux task switch still happens with irqs off, only the mm switch happens with irqs on. -- Gilles.
Re: [Xenomai-core] [PULL] native: Fix msendq fastlock leakage
On 06/20/2011 07:07 PM, Jan Kiszka wrote: static inline void do_cleanup_event(struct mm_struct *mm) { +struct task_struct *p = current; +struct mm_struct *old; + +old = xnshadow_mm(p); +xnshadow_mmptd(p) = mm; + ppd_remove_mm(mm, detach_ppd); + +xnshadow_mmptd(p) = old; I don't have the full picture yet, but that feels racy: If the context over which we clean up that foreign mm is also using xnshadow_mmptd, other threads in that process may dislike this temporary change. This mmptd is only used by xnshadow_ppd_get (which never dereferences it, by the way). And if the current thread is running the cleanup event, it is not running any other service at the same time. So, this is safe. An alternative implementation would be to use some global per-cpu variable and disable preemption during cleanup. But that would not fix the case of the shadow cleanups where current->mm is already NULL. The remaining problem is an irq interrupting the cleanup and using xnshadow_ppd_get. But I would not expect that to happen. I mean, xnshadow_ppd_get is more a service to be used for the implementation of system calls. diff --git a/ksrc/skins/posix/mutex.c b/ksrc/skins/posix/mutex.c index 6ce75e5..cc86852 100644 --- a/ksrc/skins/posix/mutex.c +++ b/ksrc/skins/posix/mutex.c @@ -219,10 +219,6 @@ void pse51_mutex_destroy_internal(pse51_mutex_t *mutex, xnlock_put_irqrestore(&nklock, s); #ifdef CONFIG_XENO_FASTSYNCH -/* We call xnheap_free even if the mutex is not pshared; when - this function is called from pse51_mutexq_cleanup, the - sem_heap is destroyed, or not the one to which the fastlock - belongs, xnheap will simply return an error. */ I think this comment is not completely obsolete. It still applies w.r.t. shared/non-shared. If we apply this patch after the one changing the ppd cleanup order, everything falls into place (I can even put a warning in xnheap_destroy if the heap has some bytes still in use when destroyed). If we can resolve that potential race, this looks like a nice solution.
It may look a bit overkill, since it uses an Adeos ptd. On the other hand, who else uses them anyway ;-) -- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-20 19:33, Gilles Chanteperdrix wrote: On 06/20/2011 06:43 PM, Jan Kiszka wrote: On 2011-06-19 17:41, Gilles Chanteperdrix wrote: Merged your whole branch, but took the liberty to change it a bit (replacing the commit concerning unlocked context switches with comment changes only, and changing the commit about xntbase_tick). What makes splmax() redundant for the unlocked context switch case? IMO that bug is still present. No, the bug is between my keyboard and chair. On architectures with unlocked context switches, the Linux task switch still happens with irqs off, only the mm switch happens with irqs on. Then why do we call xnlock_get_irqsave in xnsched_finish_unlocked_switch? Why not simply xnlock_get if irqs are off anyway? Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 06/20/2011 09:38 PM, Jan Kiszka wrote: On 2011-06-20 19:33, Gilles Chanteperdrix wrote: On 06/20/2011 06:43 PM, Jan Kiszka wrote: On 2011-06-19 17:41, Gilles Chanteperdrix wrote: Merged your whole branch, but took the liberty to change it a bit (replacing the commit concerning unlocked context switches with comment changes only, and changing the commit about xntbase_tick). What makes splmax() redundant for the unlocked context switch case? IMO that bug is still present. No, the bug is between my keyboard and chair. On architectures with unlocked context switches, the Linux task switch still happens with irqs off, only the mm switch happens with irqs on. Then why do we call xnlock_get_irqsave in xnsched_finish_unlocked_switch? Why not simply xnlock_get if irqs are off anyway? Because of the Xenomai task switch, not the Linux task switch. -- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-20 21:41, Gilles Chanteperdrix wrote: On 06/20/2011 09:38 PM, Jan Kiszka wrote: On 2011-06-20 19:33, Gilles Chanteperdrix wrote: On 06/20/2011 06:43 PM, Jan Kiszka wrote: On 2011-06-19 17:41, Gilles Chanteperdrix wrote: Merged your whole branch, but took the liberty to change it a bit (replacing the commit concerning unlocked context switches with comment changes only, and changing the commit about xntbase_tick). What makes splmax() redundant for the unlocked context switch case? IMO that bug is still present. No, the bug is between my keyboard and chair. On architectures with unlocked context switches, the Linux task switch still happens with irqs off, only the mm switch happens with irqs on. Then why do we call xnlock_get_irqsave in xnsched_finish_unlocked_switch? Why not simply xnlock_get if irqs are off anyway? Because of the Xenomai task switch, not the Linux task switch. --verbose please. Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 06/20/2011 09:41 PM, Jan Kiszka wrote: On 2011-06-20 21:41, Gilles Chanteperdrix wrote: On 06/20/2011 09:38 PM, Jan Kiszka wrote: On 2011-06-20 19:33, Gilles Chanteperdrix wrote: On 06/20/2011 06:43 PM, Jan Kiszka wrote: On 2011-06-19 17:41, Gilles Chanteperdrix wrote: Merged your whole branch, but took the liberty to change it a bit (replacing the commit concerning unlocked context switches with comment changes only, and changing the commit about xntbase_tick). What makes splmax() redundant for the unlocked context switch case? IMO that bug is still present. No, the bug is between my keyboard and chair. On architectures with unlocked context switches, the Linux task switch still happens with irqs off, only the mm switch happens with irqs on. Then why do we call xnlock_get_irqsave in xnsched_finish_unlocked_switch? Why not simply xnlock_get if irqs are off anyway? Because of the Xenomai task switch, not the Linux task switch. --verbose please. There are two kinds of task switches: switches between Linux tasks, handled by the Linux kernel function/macro/inline asm switch_to(), and those between Xenomai tasks, handled by the function/macro/inline asm xnarch_switch_to(). Since a Linux kernel context switch may still be interrupted by a (primary mode) interrupt which could decide to switch context, it cannot happen with interrupts enabled, due to the way it works (spill the registers at a location relative to the SP, then change the SP, non-atomically). The Xenomai context switches have no such risk, so they may happen with irqs on, completely. In case of relax, the two halves of the context switch are not of the same kind.
The first half of the context switch is a Xenomai switch, but the second half is the epilogue of a Linux context switch (which, by the way, is why we need to skip all the housekeeping in __xnpod_schedule in that case, and also why we go to all the pain of keeping the two context switches compatible), hence, even on machines with unlocked context switches, irqs are off at this point. Hope this is clearer. -- Gilles.
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-20 21:51, Gilles Chanteperdrix wrote: On 06/20/2011 09:41 PM, Jan Kiszka wrote: On 2011-06-20 21:41, Gilles Chanteperdrix wrote: On 06/20/2011 09:38 PM, Jan Kiszka wrote: On 2011-06-20 19:33, Gilles Chanteperdrix wrote: On 06/20/2011 06:43 PM, Jan Kiszka wrote: On 2011-06-19 17:41, Gilles Chanteperdrix wrote: Merged your whole branch, but took the liberty to change it a bit (replacing the commit concerning unlocked context switches with comment changes only, and changing the commit about xntbase_tick). What makes splmax() redundant for the unlocked context switch case? IMO that bug is still present. No, the bug is between my keyboard and chair. On architectures with unlocked context switches, the Linux task switch still happens with irqs off, only the mm switch happens with irqs on. Then why do we call xnlock_get_irqsave in xnsched_finish_unlocked_switch? Why not simply xnlock_get if irqs are off anyway? Because of the Xenomai task switch, not the Linux task switch. --verbose please. There are two kinds of task switches: switches between Linux tasks, handled by the Linux kernel function/macro/inline asm switch_to(), and those between Xenomai tasks, handled by the function/macro/inline asm xnarch_switch_to(). xnarch_switch_to is the central entry point for everyone. It may decide to branch to switch_to or __switch_to, or it simply handles all on its own - that's depending on the arch. Since a Linux kernel context switch may still be interrupted by a (primary mode) interrupt which could decide to switch context, it cannot happen with interrupts enabled, due to the way it works (spill the registers at a location relative to the SP, then change the SP, non-atomically). The Xenomai context switches have no such risk, so they may happen with irqs on, completely. In case of relax, the two halves of the context switch are not of the same kind.
The first half of the context switch is a Xenomai switch, but the second half is the epilogue of a Linux context switch (which, by the way, is why we need to skip all the housekeeping in __xnpod_schedule in that case, and also why we go to all the pain of keeping the two context switches compatible), hence, even on machines with unlocked context switches, irqs are off at this point. Hope this is clearer. That's all clear. But it's unclear how this maps to our key question. Can you point out where in those paths irqs are disabled again (after entering xnarch_switch_to) and left off for each of the unlocked switching archs? I'm still skeptical that the need for disabled irqs during thread switch on some archs also leads to unconditionally disabled hard irqs when returning from xnarch_switch_to. But even if that's all the case today, we had better set this requirement in stone: diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c index f2fc2ab..c4c5807 100644 --- a/ksrc/nucleus/pod.c +++ b/ksrc/nucleus/pod.c @@ -2273,6 +2273,8 @@ reschedule: xnpod_switch_to(sched, prev, next); + XENO_BUGON(NUCLEUS, !irqs_disabled_hw()); + #ifdef CONFIG_XENO_OPT_PERVASIVE /* * Test whether we transitioned from primary mode to secondary [ just demonstrating, would require some cleanup ] Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 06/20/2011 10:41 PM, Jan Kiszka wrote: xnarch_switch_to is the central entry point for everyone. It may decide to branch to switch_to or __switch_to, or it simply handles all on its own - that's depending on the arch. No, the Linux kernel does not know anything about xnarch_switch_to, so the schedule() function continues to use switch_to happily. xnarch_switch_to is only used to switch from xnthread_t to xnthread_t, by __xnpod_schedule(). Now, that some architecture (namely x86) decides that xnarch_switch_to should use switch_to (or more likely an inner __switch_to) when the xnthread_t has a non-NULL user_task member is an implementation detail. Can you point out where in those paths irqs are disabled again (after entering xnarch_switch_to) They are not disabled again after xnarch_switch_to, they are disabled when starting switch_to. and left off for each of the unlocked switching archs? I'm still skeptical that the need for disabled irqs during thread switch on some archs also leads to unconditionally disabled hard irqs when returning from xnarch_switch_to. But even if that's all the case today, we had better set this requirement in stone: diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c index f2fc2ab..c4c5807 100644 --- a/ksrc/nucleus/pod.c +++ b/ksrc/nucleus/pod.c @@ -2273,6 +2273,8 @@ reschedule: xnpod_switch_to(sched, prev, next); + XENO_BUGON(NUCLEUS, !irqs_disabled_hw()); + You misunderstand me: only after the second half of the context switch, in the case of xnshadow_relax, are the interrupts disabled. Because this second half-switch started as a switch_to and not an xnarch_switch_to, so, started as: #define switch_to(prev,next,last) \ do { \ local_irq_disable_hw_cond(); \ last = __switch_to(prev,task_thread_info(prev), task_thread_info(next)); \ local_irq_enable_hw_cond(); \ } while (0) (on ARM, for instance). But that is true, we could assert this in the shadow epilogue case. -- Gilles.
Re: [Xenomai-core] [PULL] native: Fix msendq fastlock leakage
On 2011-06-20 19:46, Gilles Chanteperdrix wrote: On 06/20/2011 07:07 PM, Jan Kiszka wrote: static inline void do_cleanup_event(struct mm_struct *mm) { + struct task_struct *p = current; + struct mm_struct *old; + + old = xnshadow_mm(p); + xnshadow_mmptd(p) = mm; + ppd_remove_mm(mm, detach_ppd); + + xnshadow_mmptd(p) = old; I don't have the full picture yet, but that feels racy: If the context over which we clean up that foreign mm is also using xnshadow_mmptd, other threads in that process may dislike this temporary change. This mmptd is only used by xnshadow_ppd_get (which never dereferences it, by the way I know that the current code is safe. Avoiding mm_struct types is about keeping it safe in the future. Why track the type if it's just an opaque key and should be handled as such? ). And if the current thread is running the cleanup event, it is not running any other service at the same time. So, this is safe. Ah, xnshadow_mmptd is per-thread, not per-process as I incorrectly assumed. Then it's safe. An alternative implementation would be to use some global per-cpu variable and disable preemption during cleanup. But that would not fix the case of the shadow cleanups where current->mm is already NULL. The remaining problem is an irq interrupting the cleanup and using xnshadow_ppd_get. But I would not expect that to happen. I mean, xnshadow_ppd_get is more a service to be used for the implementation of system calls. Yes, that kind of usage would be a bug in the first place.
diff --git a/ksrc/skins/posix/mutex.c b/ksrc/skins/posix/mutex.c index 6ce75e5..cc86852 100644 --- a/ksrc/skins/posix/mutex.c +++ b/ksrc/skins/posix/mutex.c @@ -219,10 +219,6 @@ void pse51_mutex_destroy_internal(pse51_mutex_t *mutex, xnlock_put_irqrestore(&nklock, s); #ifdef CONFIG_XENO_FASTSYNCH - /* We call xnheap_free even if the mutex is not pshared; when - this function is called from pse51_mutexq_cleanup, the - sem_heap is destroyed, or not the one to which the fastlock - belongs, xnheap will simply return an error. */ I think this comment is not completely obsolete. It still applies w.r.t. shared/non-shared. If we apply this patch after the one changing the ppd cleanup order, everything falls into place (I can even put a warning in xnheap_destroy if the heap has some bytes still in use when destroyed). We call xnheap_free even if the mutex is not pshared. This still applies and is unrelated to the ppd changes. Jan
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix interrupt handler tails
On 2011-06-20 22:52, Gilles Chanteperdrix wrote: On 06/20/2011 10:41 PM, Jan Kiszka wrote: xnarch_switch_to is the central entry point for everyone. It may decide to branch to switch_to or __switch_to, or it simply handles all on its own - that's depending on the arch. No, the Linux kernel does not know anything about xnarch_switch_to, so the schedule() function continues to use switch_to happily. xnarch_switch_to is only used to switch from xnthread_t to xnthread_t, by __xnpod_schedule(). Now, that some architecture (namely x86) decides that xnarch_switch_to should use switch_to (or more likely an inner __switch_to) when the xnthread_t has a non-NULL user_task member is an implementation detail. Can you point out where in those paths irqs are disabled again (after entering xnarch_switch_to) They are not disabled again after xnarch_switch_to, they are disabled when starting switch_to. and left off for each of the unlocked switching archs? I'm still skeptical that the need for disabled irqs during thread switch on some archs also leads to unconditionally disabled hard irqs when returning from xnarch_switch_to. But even if that's all the case today, we had better set this requirement in stone: diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c index f2fc2ab..c4c5807 100644 --- a/ksrc/nucleus/pod.c +++ b/ksrc/nucleus/pod.c @@ -2273,6 +2273,8 @@ reschedule: xnpod_switch_to(sched, prev, next); + XENO_BUGON(NUCLEUS, !irqs_disabled_hw()); + You misunderstand me: only after the second half of the context switch, in the case of xnshadow_relax, are the interrupts disabled. Because this second half-switch started as a switch_to and not an xnarch_switch_to, so, started as: #define switch_to(prev,next,last) \ do { \ local_irq_disable_hw_cond(); \ last = __switch_to(prev,task_thread_info(prev), task_thread_info(next)); \ local_irq_enable_hw_cond(); \ } while (0) (on ARM, for instance). OK, that's now clear, thanks. But that is true, we could assert this in the shadow epilogue case.
I've queued a patch. Jan