Re: [Xenomai-core] [BUG] rt_task_delete kills caller
On Fri, 2006-07-21 at 17:31 +0200, Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Hi,
>>>>
>>>> I stumbled over a strange behaviour of rt_task_delete for a
>>>> created, set periodic, but non-started task. The process gets
>>>> killed on invocation,
>>>
>>> More precisely:
>>>
>>>     (gdb) cont
>>>     Program received signal SIG32, Real-time event 32.
>>>
>>> Weird. No kernel oops, BTW.
>>>
>>>> but only if rt_task_set_periodic was called with a non-zero start
>>>> time. Here is the demo code:
>>>>
>>>>     #include <stdio.h>
>>>>     #include <sys/mman.h>
>>>>     #include <native/task.h>
>>>>
>>>>     int main(void)
>>>>     {
>>>>         RT_TASK task;
>>>>
>>>>         mlockall(MCL_CURRENT | MCL_FUTURE);
>>>>         printf("rt_task_create=%d\n",
>>>>                rt_task_create(&task, "task", 8192*4, 10, 0));
>>>>         printf("rt_task_set_periodic=%d\n",
>>>>                rt_task_set_periodic(&task, rt_timer_read()+1, 10));
>>>>         printf("rt_task_delete=%d\n", rt_task_delete(&task));
>>>>     }
>>>>
>>>> Once you skip rt_task_set_periodic or call it like
>>>> rt_task_set_periodic(&task, TM_NOW, 10), everything is fine.
>>>> Tested over trunk, but I guess older versions suffer as well.
>>>>
>>>> I noticed that the difference seems to be related to the
>>>> xnpod_suspend_thread call in xnpod_set_thread_periodic. That
>>>> suspend is not called when idate == XN_INFINITE. What is it for
>>>> then, specifically as
>>>> xnpod_suspend_thread(thread, xnpod_get_time()+period, period)
>>>> should have the same effect as
>>>> xnpod_suspend_thread(thread, 0, period)?
>>
>> That difference is clear to me now: set_periodic with a start date
>> != XN_INFINITE means "suspend the task immediately until the
>> provided release date" (RTFM...), while date == XN_INFINITE means
>> "keep the task running and schedule the first release at
>> now+period".
>>
>> The actual problem seems to be related to sending SIGKILL to the
>> dying thread on rt_task_delete. This happens only in the failing
>> case. When xnpod_suspend_thread was not called, the thread seems to
>> self-terminate first, so rt_task_delete becomes a nop (no task
>> registered anymore at that point). I think we had this issue
>> before. Was it solved? [/me querying the archive now...]
>
> The termination may be just a symptom. There is more likely a bug in
> the cross-task set_periodic code. I just ran this code with
> XENO_OPT_DEBUG on:
>
>     #include <stdio.h>
>     #include <sys/mman.h>
>     #include <native/task.h>
>
>     void thread(void *arg)
>     {
>         printf("thread started\n");
>         while (1)
>             rt_task_wait_period(NULL);
>     }
>
>     int main(void)
>     {
>         RT_TASK task;
>
>         mlockall(MCL_CURRENT | MCL_FUTURE);
>         printf("rt_task_create=%d\n",
>                rt_task_create(&task, "task", 0, 10, 0));
>         printf("rt_task_set_periodic=%d\n",
>                rt_task_set_periodic(&task, rt_timer_read()+100, 100));
>         printf("rt_task_start=%d\n", rt_task_start(&task, &thread, NULL));
>         printf("rt_task_delete=%d\n", rt_task_delete(&task));
>     }
>
> The result (trunk rev. #1369):
>
> [EMAIL PROTECTED]:/root# /tmp/task-delete
> rt_task_create=0
> rt_task_set_periodic=0
> c1187f38 c01335c2 0004 c75116a8 c75115a0 5704ea7c 0006 0008ca33
> 0001 0001 002176c4 c1186000 c11c3360 c75115a0 c1187f4c c013e530
> 0010 0010 c1187f54 c013e9ad c1187f74 c013e68c
> Call Trace:
>  c013e530 xnshadow_harden+0x94/0x14a
> Xenomai: fatal: Hardened thread task[989] running in Linux domain?!
> (status=0xc00084, sig=0, prev=task-delete[987])
> CPU PID PRI TIMEOUT STAT NAME
> 0 0 10 001400080 ROOT
> 0 00 00082 timsPipeReceiver
> 0 989 10 000c00180 task
> Timer: oneshot [tickval=1 ns, elapsed=27273167087]
> c116df04 c02aa242 c02bcd0a c116df40 c013da60 c02af4d3 c1144000 00c00084
> c7511090 c02de300 c02de300 0282 c116df74 c0133938 0022 c02de300
> c75115a0 c75115a0 0022 c02de288 0001
> Call Trace:
>  c0103835 show_stack_log_lvl+0x86/0x91
>  c0103862 show_stack+0x22/0x27
>  c013da60 schedule_event+0x1aa/0x2ee
>  c0133938 __ipipe_dispatch_event+0x5e/0xdd
>  c02998e0 schedule+0x426/0x632
>  c01030c7 work_resched+0x6/0x1c
> I-pipe tracer log (30 points):
>  func    0  ipipe_trace_panic_freeze+0x8 (schedule_event+0x143)
>  func   -4  schedule_event+0xe (__ipipe_dispatch_event+0x5e)
>  func   -6  __ipipe_dispatch_event+0xe (schedule+0x426)
>  func   -9  __ipipe_stall_root+0x8 (schedule+0x197)
>  func  -11  sched_clock+0xa (schedule+0x112)
>  func  -12  profile_hit+0x9 (schedule+0x69)
>  func  -13  schedule+0xe (work_resched+0x6)
>  func  -15  __ipipe_stall_root+0x8 (syscall_exit+0x5)
>  func  -17  irq_exit+0x8 (__ipipe_sync_stage+0x107)
>  func  -19  __ipipe_unstall_iret_root+0x8
Re: [Xenomai-core] [BUG] rt_task_delete kills caller
Philippe Gerum wrote:
> The second bug causing spurious RT32 signals to be notified is more
> of a GDB issue.

When running the attached program inside gdb, the RT32 signal seems to
be used as the asynchronous cancellation signal. Or at least when
running with libthread_db.so.

--
Gilles Chanteperdrix.

    #include <pthread.h>

    void *routine(void *cookie)
    {
        pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
        for (;;)
            ;
        return cookie;
    }

    int main(int argc, const char *argv[])
    {
        pthread_t tid;

        pthread_create(&tid, NULL, routine, NULL);
        pthread_cancel(tid);
        pthread_join(tid, NULL);
        return 0;
    }

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [BUG] rt_task_delete kills caller
On Sun, 2006-07-30 at 18:48 +0200, Gilles Chanteperdrix wrote:
> Philippe Gerum wrote:
> > The second bug causing spurious RT32 signals to be notified is more
> > of a GDB issue.
>
> When running the attached program inside gdb, the RT32 signal seems
> to be used as the asynchronous cancellation signal. Or at least when
> running with libthread_db.so.

Mm, ok. This would also correlate with GDB going south when ptracing
the asynchronous cancellation handler defined by the pthread library.

> plain text document attachment (cancel.c)
> [...]

--
Philippe.
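[Editor's note: if RT32 is indeed the pthread implementation's internal cancellation signal, a common debugger-side workaround (a generic gdb technique, not something proposed in this thread) is to tell gdb to pass the signal through without stopping:]

```
(gdb) handle SIG32 nostop noprint pass
```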
Re: [Xenomai-core] [BUG] rt_task_delete kills caller
Jan Kiszka wrote:
> Hi,
>
> I stumbled over a strange behaviour of rt_task_delete for a created,
> set periodic, but non-started task. The process gets killed on
> invocation,

More precisely:

    (gdb) cont
    Program received signal SIG32, Real-time event 32.

Weird. No kernel oops, BTW.

> but only if rt_task_set_periodic was called with a non-zero start
> time. Here is the demo code:
>
>     #include <stdio.h>
>     #include <sys/mman.h>
>     #include <native/task.h>
>
>     int main(void)
>     {
>         RT_TASK task;
>
>         mlockall(MCL_CURRENT | MCL_FUTURE);
>         printf("rt_task_create=%d\n",
>                rt_task_create(&task, "task", 8192*4, 10, 0));
>         printf("rt_task_set_periodic=%d\n",
>                rt_task_set_periodic(&task, rt_timer_read()+1, 10));
>         printf("rt_task_delete=%d\n", rt_task_delete(&task));
>     }
>
> Once you skip rt_task_set_periodic or call it like
> rt_task_set_periodic(&task, TM_NOW, 10), everything is fine. Tested
> over trunk, but I guess older versions suffer as well.
>
> I noticed that the difference seems to be related to the
> xnpod_suspend_thread call in xnpod_set_thread_periodic. That suspend
> is not called when idate == XN_INFINITE. What is it for then,
> specifically as xnpod_suspend_thread(thread, xnpod_get_time()+period,
> period) should have the same effect as
> xnpod_suspend_thread(thread, 0, period)?

Jan

signature.asc
Description: OpenPGP digital signature
Re: [Xenomai-core] [BUG] rt_task_delete kills caller
Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Hi,
>>>
>>> I stumbled over a strange behaviour of rt_task_delete for a
>>> created, set periodic, but non-started task. The process gets
>>> killed on invocation, [...]
>>
>> [...]
>
> That difference is clear to me now: set_periodic with a start date !=
> XN_INFINITE means "suspend the task immediately until the provided
> release date" (RTFM...), while date == XN_INFINITE means "keep the
> task running and schedule the first release at now+period".
>
> The actual problem seems to be related to sending SIGKILL to the
> dying thread on rt_task_delete. This happens only in the failing
> case. When xnpod_suspend_thread was not called, the thread seems to
> self-terminate first, so rt_task_delete becomes a nop (no task
> registered anymore at that point). I think we had this issue before.
> Was it solved? [/me querying the archive now...]

The termination may be just a symptom. There is more likely a bug in
the cross-task set_periodic code. I just ran this code with
XENO_OPT_DEBUG on:

    #include <stdio.h>
    #include <sys/mman.h>
    #include <native/task.h>

    void thread(void *arg)
    {
        printf("thread started\n");
        while (1)
            rt_task_wait_period(NULL);
    }

    int main(void)
    {
        RT_TASK task;

        mlockall(MCL_CURRENT | MCL_FUTURE);
        printf("rt_task_create=%d\n",
               rt_task_create(&task, "task", 0, 10, 0));
        printf("rt_task_set_periodic=%d\n",
               rt_task_set_periodic(&task, rt_timer_read()+100, 100));
        printf("rt_task_start=%d\n", rt_task_start(&task, &thread, NULL));
        printf("rt_task_delete=%d\n", rt_task_delete(&task));
    }

The result (trunk rev. #1369):

    [EMAIL PROTECTED]:/root# /tmp/task-delete
    rt_task_create=0
    rt_task_set_periodic=0
    c1187f38 c01335c2 0004 c75116a8 c75115a0 5704ea7c 0006 0008ca33
    0001 0001 002176c4 c1186000 c11c3360 c75115a0 c1187f4c c013e530
    0010 0010 c1187f54 c013e9ad c1187f74 c013e68c
    Call Trace:
     c013e530 xnshadow_harden+0x94/0x14a
    Xenomai: fatal: Hardened thread task[989] running in Linux domain?!
    (status=0xc00084, sig=0, prev=task-delete[987])
    CPU PID PRI TIMEOUT STAT NAME
    0 0 10 001400080 ROOT
    0 00 00082 timsPipeReceiver
    0 989 10 000c00180 task
    Timer: oneshot [tickval=1 ns, elapsed=27273167087]
    c116df04 c02aa242 c02bcd0a c116df40 c013da60 c02af4d3 c1144000 00c00084
    c7511090 c02de300 c02de300 0282 c116df74 c0133938 0022 c02de300
    c75115a0 c75115a0 0022 c02de288 0001
    Call Trace:
     c0103835 show_stack_log_lvl+0x86/0x91
     c0103862 show_stack+0x22/0x27
     c013da60 schedule_event+0x1aa/0x2ee
     c0133938 __ipipe_dispatch_event+0x5e/0xdd
     c02998e0 schedule+0x426/0x632
     c01030c7 work_resched+0x6/0x1c
    I-pipe tracer log (30 points):
     func    0  ipipe_trace_panic_freeze+0x8 (schedule_event+0x143)
     func   -4  schedule_event+0xe (__ipipe_dispatch_event+0x5e)
     func   -6  __ipipe_dispatch_event+0xe (schedule+0x426)
     func   -9  __ipipe_stall_root+0x8 (schedule+0x197)
     func  -11  sched_clock+0xa (schedule+0x112)
     func  -12  profile_hit+0x9 (schedule+0x69)
     func  -13  schedule+0xe (work_resched+0x6)
     func  -15  __ipipe_stall_root+0x8 (syscall_exit+0x5)
     func  -17  irq_exit+0x8 (__ipipe_sync_stage+0x107)
     func  -19  __ipipe_unstall_iret_root+0x8 (restore_raw+0x0)
     func  -25  preempt_schedule+0xb (try_to_wake_up+0x12d)
     func  -26  __ipipe_restore_root+0x8 (try_to_wake_up+0xf6)
     func