Hi Philippe,

I had a bit of "fun" today trying to get some of our robotic hardware running with the latest Xenomai / I-pipe, also in order to test the recent RTDM fixes. It turned out that the head-optimised variant easily creates the infamous stalled Xenomai domain, e.g. like this one:
> :   fn   -212+   3.323  sched_clock+0xd (schedule+0x112)
> :   fn   -209+   2.045  __ipipe_stall_root+0x8 (schedule+0x18e)
> :  *fn   -207+   1.428  deactivate_task+0x9 (schedule+0x21e)
> :  *fn   -205+   4.417  dequeue_task+0xa (deactivate_task+0x1a)
> :  *fn   -201+   2.635  recalc_task_prio+0xd (schedule+0x317)
> :  *fn   -198+   2.345  effective_prio+0x9 (recalc_task_prio+0x108)
> :  *fn   -196+   3.443  requeue_task+0xa (schedule+0x344)
> :  *fn   -192+   2.582  __ipipe_dispatch_event+0xe (schedule+0x412)
> :  *fn   -190!  11.808  schedule_event+0xd (__ipipe_dispatch_event+0x5e)
> :| *fn   -178+   8.135  __switch_to+0xc (schedule+0x4fe)
> :  *fn   -170+   3.714  __ipipe_unstall_root+0x8 (schedule+0x536)
> :   fn   -166+   2.105  finish_wait+0xa (xnpipe_read+0x17c)
> :   fn   -164+   1.368  __ipipe_test_and_stall_root+0x8 (finish_wait+0xae)
> :  *fn   -163+   1.203  __ipipe_restore_root+0x8 (finish_wait+0x70)
> :  *fn   -161+   6.210  __ipipe_unstall_root+0x8 (__ipipe_restore_root+0x2b)
> :| * fn  -155+   1.706  fput+0x8 (sys_read+0x5d)
> :| * fn  -153+   2.413  __ipipe_stall_root+0x8 (syscall_exit+0x5)
> :  **fn  -151+   1.984  do_notify_resume+0x9 (work_notifysig+0x13)
> :  **fn  -149+   1.894  do_signal+0x11 (do_notify_resume+0x2f)
> :  **fn  -147+   1.330  get_signal_to_deliver+0xe (do_signal+0x4a)
> :  **fn  -146+   2.022  __ipipe_stall_root+0x8 (get_signal_to_deliver+0x24)
> :  **fn  -144+   2.060  dequeue_signal+0xb (get_signal_to_deliver+0xe9)
> :  **fn  -142+   2.030  __dequeue_signal+0xe (dequeue_signal+0x21)
> :  **fn  -140+   1.902  next_signal+0x9 (__dequeue_signal+0x1c)

This does not happen when I switch off Xenomai's head-optimisation.
I took this trace by patching shadow.c like this:

--- ksrc/nucleus/shadow.c	(revision 1074)
+++ ksrc/nucleus/shadow.c	(working copy)
@@ -1096,6 +1096,8 @@ static inline int do_hisyscall_event(uns
     xnthread_t *thread;
     u_long sysflags;
 
+    if (test_bit(IPIPE_STALL_FLAG, &rthal_domain.cpudata[0].status))
+        ipipe_trace_freeze(0);
     if (!nkpod || testbits(nkpod->status, XNPIDLE))
         goto no_skin;

You can reproduce the problem without special hardware by loading the tims.ko module of our RACK framework [1], then starting tims_msg_client (main/tims/router), and finally terminating it with ^C. The issue seems to be somehow related to the pipe usage of TiMS.

Besides this bad news, there is fortunately also a lot of light: the RTDM fixes and reorganisation did not cause any regressions (phew...). And our RACK framework (+ various in-house extensions) runs really smoothly over Xenomai. In particular, terminating and reloading applications at runtime, which used to be a nightmare with /other RT extensions/, works fine and causes neither latency spikes nor any worse effects. I did some benchmarking on a production system today with "latency -p 1000 -f" and got about 130 us worst-case jitter (266 MHz Pentium-MMX, tracer enabled) for this highest-priority task. And all this while various RT and non-RT jobs (e.g. the cache calibrator) + xeno_16550A (2 ports, one at 500 kbit/s) were running in the background. =8)

Jan

[1] http://developer.berlios.de/projects/rack
_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core