Re: [Xenomai-core] [bug] broken PI-chain

2006-08-26 Thread Jan Kiszka
Philippe Gerum wrote:
 On Fri, 2006-08-25 at 20:33 +0200, Philippe Gerum wrote:
 On Fri, 2006-08-25 at 18:56 +0200, Jan Kiszka wrote:
 Hi,

 the following (academic?) scenario is not handled as expected by the
 nucleus:

  Prio   ^
 |  -- T2
 | /
 |  T0 - T1 - M1
 | (M0)  M0  (M1)
 +

 That means: Task T0 at, e.g., prio 1 holds mutex M0. T1, also at prio 1,
 happens to interrupt T0 (round-robin?). T1 already holds M1 and now
 tries to acquire M0. As both are at the same prio level, no boosting
 takes place, all fine. Then T2 comes in play. It's at prio 2 and tries
 to gain access to M1. T1 gets therefore boosted to prio 2 as well, but
 T0 will stay at prio 1 under Xenomai.

 Ok, confirmed. The way the PI chain is handled is very academically
 broken.

 
 xnsynch_renice_sleeper() is the culprit. It does not boost the current
 owner of an unclaimed resource when moving down the PI chain, albeit it
 should. The main patch below fixes the issue here.
 
 Btw, the test code needs likely needs fixing too:
 
 - for (i = 1; i  MAX; i++)
 + for (i = 1; i  MAX; i++) {
   rt_task_spawn(task[i], NULL, 0, (i == MAX-1) ? 2 : 1, 0, 
 task_func, (void *)i);
 + rt_task_yield();
 + }

True for kernel-space and simulator. It's in my model as well but I
forgot to port it to Real Life. User-space does this yield implicitly
due to the Linux thread creation underneath.

 
 AFAICS, since the main thread (T0) is shadowed with priority 1,
 the sequence is going to be:
 
 T0, T2, T0 ... rt_task_inquire()
 
 With T1 remaining in limbo since there is no reason to preempt T0,
 in which case T0's priority should remain 1, since there is no
 reason to boost it in the first place (i.e. T1 did not run).
 With the added yield, the test works as expected, and T0 properly
 undergoes a priority boost (=2).
 
 I hacked this scenario in the attached demo. Set MAX to 2 to let it
 work, leave it 3 and it breaks.
 Ouch. It does indeed...

 The problem looks on first sight like this test [1]: the claiming
 relation is only establish if there is a priority delta on entry, but
 that breaks when the claiming thread's prio changes later. Can we simply
 remove this test?
 It seems that there is even more than this, and I'm wondering now if
 something fishy is not happening with the CLAIMED bit handling too, when
 an attempt is made to clear the priority boost.
 http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/synch.c?v=SVN-trunk#L265

 
 Ok, this one was a red herring. The CLAIMED bit is properly handled.
 
 I've slightly adapted your test code so that my friend the simulator
 helps us, and a segfault is raised upon exit of task_func(), which goes
 south, likely trying to flush the pend queue of some synch object it
 should not still be waiting for.
 
 Red herring again. Wrong simulation libraries. Sorry for the noise.
 
 Index: ksrc/nucleus/synch.c
 ===
 --- ksrc/nucleus/synch.c  (revision 1509)
 +++ ksrc/nucleus/synch.c  (working copy)
 @@ -116,8 +116,8 @@
   xnsynch_renice_sleeper(thread);
   else if (thread != xnpod_current_thread() 
testbits(thread-status, XNREADY))
 - /* xnpod_resume_thread() must be called for runnable threads
 -but the running one. */
 + /* xnpod_resume_thread() must be called for runnable
 +threads but the running one. */
   xnpod_resume_thread(thread, 0);
  
  #if defined(__KERNEL__)  defined(CONFIG_XENO_OPT_PERVASIVE)
 @@ -210,8 +210,8 @@
   else
   __setbits(synch-status, XNSYNCH_CLAIMED);
  
 + insertpqf(owner-claimq, synch-link, thread-cprio);
   insertpqf(synch-pendq, thread-plink, thread-cprio);
 - insertpqf(owner-claimq, synch-link, thread-cprio);
   xnsynch_renice_thread(owner, thread-cprio);
   } else
   insertpqf(synch-pendq, thread-plink, thread-cprio);
 @@ -262,8 +262,8 @@
   int downprio;
  
   removepq(lastowner-claimq, synch-link);
 + downprio = lastowner-bprio;
   __clrbits(synch-status, XNSYNCH_CLAIMED);
 - downprio = lastowner-bprio;
  
   if (emptypq_p(lastowner-claimq))
   __clrbits(lastowner-status, XNBOOST);
 @@ -295,19 +295,35 @@
  void xnsynch_renice_sleeper(xnthread_t *thread)
  {
   xnsynch_t *synch = thread-wchan;
 + xnthread_t *owner;
  
 - if (testbits(synch-status, XNSYNCH_PRIO)) {
 - xnthread_t *owner = synch-owner;
 + if (!testbits(synch-status, XNSYNCH_PRIO))
 + return;
  
 - removepq(synch-pendq, thread-plink);
 - insertpqf(synch-pendq, thread-plink, thread-cprio);
 + owner = synch-owner;
 + removepq(synch-pendq, thread-plink);
 + insertpqf(synch-pendq, thread-plink, thread-cprio);
  
 -

[Xenomai-core] Buildbot: Small glitch in simulator

2006-08-26 Thread Niklaus Giger
Hi

http://ngiger.dyndns.org/buildbot/sim/builds/18/step-check_sim/0
reports a small error in the simulator.

Some problems were reported in:
posix-tmq:: starting POSIX services.
vxworks-t010823-2:t010823-2.c:163: Expected sequence: SEQ(Test2,2003); got 
SEQ(Test2,2004)
vxworks-t010823-2: t010823-2.c:166, test finished: 1 failures/ 33 tests
program finished with exit code 3

-- 
Niklaus Giger

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core