Re: [Xenomai-core] [PATCH] Fix host IRQ propagation

2009-05-17 Thread Philippe Gerum
On Thu, 2009-05-14 at 15:10 +0200, Gilles Chanteperdrix wrote: 
 Philippe Gerum wrote:
  On Thu, 2009-05-14 at 14:52 +0200, Gilles Chanteperdrix wrote:
  Philippe Gerum wrote:
  On Thu, 2009-05-14 at 12:20 +0200, Jan Kiszka wrote:
  Philippe Gerum wrote:
  On Wed, 2009-05-13 at 18:10 +0200, Jan Kiszka wrote:
  Philippe Gerum wrote:
  On Wed, 2009-05-13 at 17:28 +0200, Jan Kiszka wrote:
  Philippe Gerum wrote:
  On Wed, 2009-05-13 at 15:18 +0200, Jan Kiszka wrote:
  Gilles Chanteperdrix wrote:
  Jan Kiszka wrote:
  Hi Gilles,
 
  I'm currently facing a nasty effect with switchtest over latest
  git head (only tested this so far): running it inside my test VM
  (i.e. with frequent excessive latencies) I get a stalled Linux
  timer IRQ quite quickly. System is otherwise still responsive,
  Xenomai timers are still being delivered, other Linux IRQs too.
  switchtest complained about
 
  Warning: Linux is compiled to use FPU in kernel-space.
 
  when it was started. Kernels are 2.6.28.9/ipipe-x86-2.2-07 and
  2.6.29.3/ipipe-x86-2.3-01 (LTTng patched in, but unused), both
  show the same effect.
 
  Seen this before?
  The warning about Linux being compiled to use FPU in kernel-space
  means that you enabled soft RAID or compiled for K7, Geode, or any
  other configuration using 3DNow for such simple operations as
  memcpy. It is harmless, it simply means that switchtest can not
  use fpu in kernel-space.
 
  RAID is on (ordinary server config).
 
  The bug you have is probably the same as the one described here,
  which I am able to reproduce on my atom:
  https://mail.gna.org/public/xenomai-help/2009-04/msg00200.html
 
  Unfortunately, I for one am working on ARM issues and am not
  available to debug x86 issues. I think Philippe is busy too...
  OK, looks like I got the same flu here.
 
  Philippe, did you find out any more details in the meantime? Then
  I'm afraid I have to pick this up.
  No, I did not resume this task yet. Working from the powerpc side
  of the universe here.
  Hoho, don't think this rain here over x86 would never have made it
  down to ARM or PPC land! ;)
 
  Martin, could you check if this helps you, too?
 
  Jan
 
  (as usual, ready to be pulled from 'for-upstream')
 
  -
 
  Host IRQs may not only be triggered from non-root domains.
  Are you sure of this? I can't find any spot where this assumption
  would be wrong. host_pend() is basically there to relay RT timer
  ticks and device IRQs, and this only happens on behalf of the
  pipeline head. At least, this is how rthal_irq_host_pend() should
  be used in any case. If you did find a spot where this interface
  is being called from the lower stage, then this is the root bug
  to fix.
  I haven't studied the I-pipe trace w.r.t. this in detail yet, but
  I could imagine that some shadow task is interrupted in primary
  mode by the timer IRQ and then leaves the handler in secondary
  mode due to whatever events between schedule-out and in at the end
  of xnintr_clock_handler.
 
  You need a thread context to move to secondary, I just can't see
  how such a scenario would be possible.
  Here is the trace of events:
 
  = Shadow task starts migration to secondary
  = in xnpod_suspend_thread, nklock is briefly released before
 xnpod_schedule
  Which is the root bug. Blame on me; this recent change in -head breaks a
  basic rule a lot of code is based on: a self-suspending thread may not
  be preempted while scheduling out, i.e. suspension and rescheduling must
  be atomically performed. xnshadow_relax() counts on this too.
  Actually, I think the idea was mine in the first place... Maybe we
  can specify a special flag to xnpod_suspend_thread to ask for the
  atomic suspension (maybe reuse XNATOMIC?).
 
  
  I don't think so. We really need the basic assumption to hold in any
  case, because this is expected by most of the callers, and this
  micro-optimization is not worth the risk of introducing a race if
  misused.
 
 Well, I tend to disagree. The assumption that the thread is suspended
 from the point of view of the scheduler still holds even when the nklock
 is released, and it is what callers like rt_cond_wait are expecting. The
 assumptions of xnshadow_relax do not seem to me like a common assumption.
 

The assumption is that the thread has been suspended _and_ scheduled out
atomically, not merely put in a suspended state, which is quite different
when considered from an interrupt context. I'm worried by the fact that
re-enabling interrupts in the middle of this critical transition breaks
the unspoken rule that sched->curr may not be seen as bearing any block
bit in its status word from anywhere in the code executed from the local
CPU but xnpod_suspend_thread().

Another issue may arise in the SMP case, where xnpod_suspend_thread()
would block a thread running on a remote CPU; in theory, re-enabling
interrupts before the IPIs are sent from xnpod_schedule() - to kick the
remote 

Re: [Xenomai-core] Xenomai standalone

2009-05-17 Thread Philippe Gerum
On Thu, 2009-05-14 at 21:44 +0200, Patrick wrote:
 Hello all,
 
 I would like to know if it's possible to simply separate Xenomai
 (nucleus) from Linux and adeos ? My goal is to have a simple RTOS based
 on Xenomai (only native skin in kernel space).
 
 I have done a quick look at the source and it seems ok to remove adeos
 by editing the hal. About Linux, Xenomai seems to need only MM, timers,
 and some parts of irq management. Is that right?
 
 So do you think that it would be possible to run Xenomai nucleus as a
 standalone RTOS ?
 

It is, and has already been done actually.

You may want to track how the nucleus and skins are moved on top of the
event-driven simulator running in user-space: i.e. include/asm-sim,
__XENO_SIM__ define. The simulator is basically a C++ library providing
co-routines, and a set of building blocks to mimic typical RTOS
resources (synchs, threads, interrupts, etc.).

Two more hints:
- the simulator is a good analogy for your problem, because the
simulation engine cannot provide any Linux kernel services, since it is
fully based on userland resources (in this case: from the glibc).
Therefore, if you don't have Linux underneath, this applies as well to
your case.
- track the xnarch_* interface from asm-sim/ (and elsewhere), how it is
implemented, what set of services is defined here. This is key to your
problem.

 Thanks in advance for any help
 
 Patrick
 
 
 
 ___
 Xenomai-core mailing list
 Xenomai-core@gna.org
 https://mail.gna.org/listinfo/xenomai-core
-- 
Philippe.





[Xenomai-core] Allow break statement in RTDM_EXECUTE_ATOMICALLY()

2009-05-17 Thread Philippe Gerum

Jan,

Would you consider allowing break for leaving an atomic code block as
illustrated below?

I understand this would create a potential issue with newer RTDM drivers
relying on this feature when backported to former RTDM implementations,
but only having if/else constructs to control the execution flow within
an atomic block may be painful sometimes and lead to uselessly hairy
code, especially for error handling.
Maybe adding another helper macro with the desired behavior, and
documenting it separately from RTDM_EXECUTE_ATOMICALLY() would mitigate
the portability issue?

Additionally, I would definitely shadow/hide the spl value from the code
block in one way or another, since using s as a socket identifier in
RTDM-based protocol drivers is not that unusual.

e.g.

diff --git a/include/rtdm/rtdm_driver.h b/include/rtdm/rtdm_driver.h
index 058a9f8..18b6001 100644
--- a/include/rtdm/rtdm_driver.h
+++ b/include/rtdm/rtdm_driver.h
@@ -595,14 +595,16 @@ int rtdm_select_bind(int fd, rtdm_selector_t *selector,
LEAVE_ATOMIC_SECTION  \
 }
 #else /* This is how it really works */
-#define RTDM_EXECUTE_ATOMICALLY(code_block)\
-{  \
+#define RTDM_ATOMIC_BLOCK(code_block)  \
+do {   \
spl_t s;\
-   \
xnlock_get_irqsave(nklock, s); \
-   code_block; \
+   do {\
+ code_block;   \
+   } while(0); \
xnlock_put_irqrestore(nklock, s);  \
-}
+} while(0)
+#define RTDM_EXECUTE_ATOMICALLY(code_block) RTDM_ATOMIC_BLOCK(code_block)
 #endif
 /** @} Global Lock across Scheduler Invocation */
 
-- 
Philippe.


