Re: [Xenomai-core] Support for 2.6.22/x86

2007-06-30 Thread Philippe Gerum
On Sat, 2007-06-30 at 16:26 +0200, Philippe Gerum wrote:
> On Sat, 2007-06-30 at 13:02 +0200, Philippe Gerum wrote:
> > On Sat, 2007-06-30 at 09:48 +0200, Jan Kiszka wrote:
> > > Philippe Gerum wrote:
> > > > Our development trunk now contains the necessary support for running
> > > > Xenomai over 2.6.22/x86. This work boils down to enabling Xenomai to use
> > > > the generic clock event device abstraction that comes with newest
> > > > kernels. Other archs / kernel versions still work the older way, until
> > > > all archs eventually catch up with clockevents upstream.
> > > > 
> > > > This support won't be backported to 2.3.x, because it has some
> > > > significant impact on the nucleus. Tested as thoroughly as possible here
> > > > on low-end and mid-range x86 boxen, including SMP.
> > > > 
> > > > Please give this hell.
> > > > 
> > > > http://download.gna.org/adeos/patches/v2.6/i386/adeos-ipipe-2.6.22-rc6-i386-1.9-00.patch
> > > > 
> > > 
> > > Running some tests, the gate to hell just opened:
> > > 
> > > [  210.247006] BUG: sleeping function called from invalid context at
> > > kernel/sched.c:3941
> > > [  210.248171] in_atomic():1, irqs_disabled():1
> > > [  210.248828] no locks held by frag-ip/881.
> > > [  210.249494]  [] show_trace_log_lvl+0x1f/0x34
> > > [  210.250523]  [] show_trace+0x17/0x19
> > > [  210.257778]  [] dump_stack+0x1b/0x1d
> > > [  210.258070]  [] __might_sleep+0xda/0xe1
> > > [  210.258365]  [] wait_for_completion+0x1f/0xc3
> > > [  210.258688]  [] set_cpus_allowed+0x77/0x95
> > > [  210.258992]  [] lostage_handler+0x75/0x201 [xeno_nucleus]
> > > [  210.259551]  [] rthal_apc_handler+0x5c/0x89
> > > [  210.259869]  [] __ipipe_sync_stage+0x13a/0x147
> > > [  210.260204]  [] __ipipe_syscall_root+0x1a6/0x1c8
> > > [  210.260536]  [] system_call+0x29/0x41
> > > 
> > > Setup is latest SVN + a "few" patches (the well-known ones), CONFIG_SMP,
> > > qemu -smp 2, RTnet in loopback mode, just terminating the frag-ip example.
> > > 
> > > However, this gremlin looks like it is /far/ older than 2.6.22 support.
> > > Calling set_cpus_allowed() from atomic lostage_handler is simply bogus,
> > > I'm afraid. :-/
> > > 
> > 
> > Confirmed, this is an old bug. Just adding a might_sleep() statement
> > even in UP config inside the lostage handler would trigger the warning.
> 
> Ok, found it. It's an I-pipe issue. Working on a fix.

Well, it wasn't an I-pipe issue after all. Whether we should also
simulate the hardirq context (irq_enter/exit) when running a virtual
IRQ, rather than treating those as Linux softirqs, is open to
discussion; still, the logic is correct.

Attempting to track CPU affinities over the lostage handler from the
nucleus was the wrong thing, and beyond that, the way it was done was
logically flawed, and pointless. #2687 should be better.

> 
> > 
> > > Jan
> > > 
-- 
Philippe.



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Support for 2.6.22/x86

2007-06-30 Thread Philippe Gerum
On Sat, 2007-06-30 at 13:02 +0200, Philippe Gerum wrote:
> On Sat, 2007-06-30 at 09:48 +0200, Jan Kiszka wrote:
> > Philippe Gerum wrote:
> > > Our development trunk now contains the necessary support for running
> > > Xenomai over 2.6.22/x86. This work boils down to enabling Xenomai to use
> > > the generic clock event device abstraction that comes with newest
> > > kernels. Other archs / kernel versions still work the older way, until
> > > all archs eventually catch up with clockevents upstream.
> > > 
> > > This support won't be backported to 2.3.x, because it has some
> > > significant impact on the nucleus. Tested as thoroughly as possible here
> > > on low-end and mid-range x86 boxen, including SMP.
> > > 
> > > Please give this hell.
> > > 
> > > http://download.gna.org/adeos/patches/v2.6/i386/adeos-ipipe-2.6.22-rc6-i386-1.9-00.patch
> > > 
> > 
> > Running some tests, the gate to hell just opened:
> > 
> > [  210.247006] BUG: sleeping function called from invalid context at
> > kernel/sched.c:3941
> > [  210.248171] in_atomic():1, irqs_disabled():1
> > [  210.248828] no locks held by frag-ip/881.
> > [  210.249494]  [] show_trace_log_lvl+0x1f/0x34
> > [  210.250523]  [] show_trace+0x17/0x19
> > [  210.257778]  [] dump_stack+0x1b/0x1d
> > [  210.258070]  [] __might_sleep+0xda/0xe1
> > [  210.258365]  [] wait_for_completion+0x1f/0xc3
> > [  210.258688]  [] set_cpus_allowed+0x77/0x95
> > [  210.258992]  [] lostage_handler+0x75/0x201 [xeno_nucleus]
> > [  210.259551]  [] rthal_apc_handler+0x5c/0x89
> > [  210.259869]  [] __ipipe_sync_stage+0x13a/0x147
> > [  210.260204]  [] __ipipe_syscall_root+0x1a6/0x1c8
> > [  210.260536]  [] system_call+0x29/0x41
> > 
> > Setup is latest SVN + a "few" patches (the well-known ones), CONFIG_SMP,
> > qemu -smp 2, RTnet in loopback mode, just terminating the frag-ip example.
> > 
> > However, this gremlin looks like it is /far/ older than 2.6.22 support.
> > Calling set_cpus_allowed() from atomic lostage_handler is simply bogus,
> > I'm afraid. :-/
> > 
> 
> Confirmed, this is an old bug. Just adding a might_sleep() statement
> even in UP config inside the lostage handler would trigger the warning.

Ok, found it. It's an I-pipe issue. Working on a fix.

> 
> > Jan
> > 
-- 
Philippe.





Re: [Xenomai-core] [PATCH-STACK] more timer rework, group-based access control

2007-06-30 Thread Philippe Gerum
On Sat, 2007-06-30 at 07:56 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> >> rt-caps-group.patch
> > 
> > Good idea, but totally untested on my side (did I already mention my
> > laziness?). Will merge and backport to 2.3.x, XENO_OPT_SECURITY_ACCESS
> > was too cheap to be good anyway.
> 
> What about moving those scheduling parameter settings into the
> trampoline? Some legacy RTOS libs would still have to be adapted.
> 

We could do that since the caller is always synchronized with the new
thread and blocks on the completion flag until xnshadow_map() has run,
so this would solve potential priority issues. This would also keep the
existing workaround for a silly bug from some Linuxthreads
implementations which just did not consider the scheduling policy
setting passed to pthread_create(), and would leave the spawned thread
in SCHED_NORMAL mode in any case. However, this would still not work
with pthread libs overeagerly checking for root permissions.

> > 
> >> ---
> >>
> >> See dedicated posts earlier on this list.
> >>
> >>
> >> refactor-timer-modes.patch
> >> --
> >>
> >> xntimers are now of three kinds: XNTM_MONOREL, XNTM_MONOABS, or
> >> XNTM_REALABS. This mode is passed on xntimer_start or on invocation of
> >> higher services (xnpod_suspend_thread e.g.). Users were widely
> >> automatically converted and may lack optimisation for the new scheme.
> >> Please review carefully for regressions!
> >>
> > 
> > The only reason for me to whine about this one so far concerns naming
> > issues. Why do things need to be that picky?
> > 
> > I mean, we have only three modes, relative, absolute monotonic and
> > absolute realtime and we discussed at great length why we won't have
> > more than those. Given that "relative non-monotonic" would make no sense
> > here, and that "realtime" carries a strong notion of absoluteness in
> > relation to the wallclock, let's keep "relative", "absolute" and
> > "realtime". In any case, "mono" is not widespread enough for people to
> > immediately catch "monotonic" anyway, so in case of doubt: I'd say
> > _RTFM_.
> > 
> > Additionally, having XN_INFINITE and XNTM_RELATIVE/ABSOLUTE/REALTIME is
> > a clear sign of discrepancy. Please, keep my laziness intact, and let's
> > keep XN_ for general macros. I know that this is somehow in
> > contradiction with the current practice for the native skin, but I'm
> > full of contradictions anyway, so...
> 
> Hmm, everything has its pros and cons, but mostly you are right. Here
> is a new proposal, please tell me if it's acceptable so that I can
> rebase the rest:
> 
> --- xenomai.orig/include/nucleus/types.h
> +++ xenomai/include/nucleus/types.h
> @@ -62,8 +62,13 @@ typedef int (*xniack_t)(unsigned irq);
> 
>  #define XN_INFINITE   (0)
>  #define XN_NONBLOCK   ((xnticks_t)-1)
> -#define XN_RELATIVE   0
> -#define XN_ABSOLUTE   1
> +
> +/* Timer modes */
> +typedef enum xntmode {
> +   XN_RELATIVE,
> +   XN_ABSOLUTE,
> +   XN_REALTIME
> +} xntmode_t;
> 
>  #define XN_APERIODIC_TICK  0
>  #define XN_NO_TICK ((xnticks_t)-1)
> 

Looks ok, provided we can always represent the mode as a set of discrete
values. This said, switching from an enum to a plain integer would be a
no-brainer if needed.
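
For illustration only, here is a minimal user-space sketch of how a caller
might dispatch on the three discrete modes from the hunk above. The helper
name, its parameters and the xnticks_t typedef are assumptions made for this
example; they are not taken from the nucleus sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for the nucleus tick type. */
typedef uint64_t xnticks_t;

/* Timer modes, as in the proposed types.h hunk. */
typedef enum xntmode {
	XN_RELATIVE,
	XN_ABSOLUTE,
	XN_REALTIME
} xntmode_t;

/*
 * Hypothetical helper: convert a (value, mode) pair into an absolute
 * date on the monotonic clock.
 *   now_mono         - current monotonic time
 *   wallclock_offset - wallclock time minus monotonic time
 * The caller is assumed to pass a realtime value >= wallclock_offset.
 */
static xnticks_t sketch_to_monotonic_date(xnticks_t value, xntmode_t mode,
					   xnticks_t now_mono,
					   xnticks_t wallclock_offset)
{
	switch (mode) {
	case XN_RELATIVE:
		/* Delay counted from now. */
		return now_mono + value;
	case XN_REALTIME:
		/* Absolute wallclock date, rebased on the monotonic clock. */
		return value - wallclock_offset;
	case XN_ABSOLUTE:
	default:
		/* Already an absolute monotonic date. */
		return value;
	}
}
```

Since the modes are plain enumerators, switching to an integer type later
would only change the parameter type, not the switch statement.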

> 
> Last issue pending is the xintr locking thing. Will try to have a closer
> look again and kick Dmitry for some more explanations of his patch.
> 
> Jan
> 
-- 
Philippe.





Re: [Xenomai-core] Support for 2.6.22/x86

2007-06-30 Thread Philippe Gerum
On Sat, 2007-06-30 at 09:48 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > Our development trunk now contains the necessary support for running
> > Xenomai over 2.6.22/x86. This work boils down to enabling Xenomai to use
> > the generic clock event device abstraction that comes with newest
> > kernels. Other archs / kernel versions still work the older way, until
> > all archs eventually catch up with clockevents upstream.
> > 
> > This support won't be backported to 2.3.x, because it has some
> > significant impact on the nucleus. Tested as thoroughly as possible here
> > on low-end and mid-range x86 boxen, including SMP.
> > 
> > Please give this hell.
> > 
> > http://download.gna.org/adeos/patches/v2.6/i386/adeos-ipipe-2.6.22-rc6-i386-1.9-00.patch
> > 
> 
> Running some tests, the gate to hell just opened:
> 
> [  210.247006] BUG: sleeping function called from invalid context at
> kernel/sched.c:3941
> [  210.248171] in_atomic():1, irqs_disabled():1
> [  210.248828] no locks held by frag-ip/881.
> [  210.249494]  [] show_trace_log_lvl+0x1f/0x34
> [  210.250523]  [] show_trace+0x17/0x19
> [  210.257778]  [] dump_stack+0x1b/0x1d
> [  210.258070]  [] __might_sleep+0xda/0xe1
> [  210.258365]  [] wait_for_completion+0x1f/0xc3
> [  210.258688]  [] set_cpus_allowed+0x77/0x95
> [  210.258992]  [] lostage_handler+0x75/0x201 [xeno_nucleus]
> [  210.259551]  [] rthal_apc_handler+0x5c/0x89
> [  210.259869]  [] __ipipe_sync_stage+0x13a/0x147
> [  210.260204]  [] __ipipe_syscall_root+0x1a6/0x1c8
> [  210.260536]  [] system_call+0x29/0x41
> 
> Setup is latest SVN + a "few" patches (the well-known ones), CONFIG_SMP,
> qemu -smp 2, RTnet in loopback mode, just terminating the frag-ip example.
> 
> However, this gremlin looks like it is /far/ older than 2.6.22 support.
> Calling set_cpus_allowed() from atomic lostage_handler is simply bogus,
> I'm afraid. :-/
> 

Confirmed, this is an old bug. Just adding a might_sleep() statement
even in UP config inside the lostage handler would trigger the warning.
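
To make the failure mode concrete, here is a toy user-space model of the
check that fires in the trace above. Every name in it is hypothetical; it
only mimics the idea that a might_sleep()-style check complains whenever a
potentially blocking call is reached while the caller is atomic, and it is
not the kernel or nucleus code.

```c
#include <assert.h>

/* Toy state: non-zero while we pretend to run in atomic context. */
static int sketch_atomic_depth;
/* Number of "sleeping function called from invalid context" reports. */
static int sketch_warnings;

/* Models might_sleep(): report if reached from atomic context. */
static void sketch_might_sleep(void)
{
	if (sketch_atomic_depth > 0)
		sketch_warnings++;
}

/* Models a call that may block, such as an affinity change that
 * waits for a completion. */
static void sketch_set_affinity(void)
{
	sketch_might_sleep();
	/* ... would block here until the migration is done ... */
}

/* Models the handler that issues the blocking call. */
static void sketch_lostage_handler(void)
{
	sketch_set_affinity();
}

/* Models the APC dispatch path, which runs handlers atomically. */
static void sketch_run_apc(void (*handler)(void))
{
	sketch_atomic_depth++;
	handler();
	sketch_atomic_depth--;
}
```

Running sketch_lostage_handler() through sketch_run_apc() bumps the warning
counter, while calling it directly does not, which is why the report does not
depend on SMP in this model.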

> Jan
> 
-- 
Philippe.





Re: [Xenomai-core] Support for 2.6.22/x86

2007-06-30 Thread Philippe Gerum
On Sat, 2007-06-30 at 09:48 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > Our development trunk now contains the necessary support for running
> > Xenomai over 2.6.22/x86. This work boils down to enabling Xenomai to use
> > the generic clock event device abstraction that comes with newest
> > kernels. Other archs / kernel versions still work the older way, until
> > all archs eventually catch up with clockevents upstream.
> > 
> > This support won't be backported to 2.3.x, because it has some
> > significant impact on the nucleus. Tested as thoroughly as possible here
> > on low-end and mid-range x86 boxen, including SMP.
> > 
> > Please give this hell.
> > 
> > http://download.gna.org/adeos/patches/v2.6/i386/adeos-ipipe-2.6.22-rc6-i386-1.9-00.patch
> > 
> 
> Running some tests, the gate to hell just opened:
> 
> [  210.247006] BUG: sleeping function called from invalid context at
> kernel/sched.c:3941
> [  210.248171] in_atomic():1, irqs_disabled():1
> [  210.248828] no locks held by frag-ip/881.
> [  210.249494]  [] show_trace_log_lvl+0x1f/0x34
> [  210.250523]  [] show_trace+0x17/0x19
> [  210.257778]  [] dump_stack+0x1b/0x1d
> [  210.258070]  [] __might_sleep+0xda/0xe1
> [  210.258365]  [] wait_for_completion+0x1f/0xc3
> [  210.258688]  [] set_cpus_allowed+0x77/0x95
> [  210.258992]  [] lostage_handler+0x75/0x201 [xeno_nucleus]
> [  210.259551]  [] rthal_apc_handler+0x5c/0x89
> [  210.259869]  [] __ipipe_sync_stage+0x13a/0x147
> [  210.260204]  [] __ipipe_syscall_root+0x1a6/0x1c8
> [  210.260536]  [] system_call+0x29/0x41
> 
> Setup is latest SVN + a "few" patches (the well-known ones), CONFIG_SMP,
> qemu -smp 2, RTnet in loopback mode, just terminating the frag-ip example.
> 
> However, this gremlin looks like it is /far/ older than 2.6.22 support.
> Calling set_cpus_allowed() from atomic lostage_handler is simply bogus,
> I'm afraid. :-/

Btw, you should have a look at a critical change in the way raw I-pipe
spinlocks are now manipulated (include/linux/spinlock.h wrappers).
In short, to solve a deadly bug in all previous implementations, a set
of dedicated helpers is now used to stall/unstall the current stage for
the spin_lock_irq* forms, the way it has to be, i.e. touching both the
real and virtual IRQ masks.

Such a bug would accidentally clear the hardware IRQ mask, which would
lead to a recursive lock attempt whenever an interrupt is caught at the
wrong time on the same CPU, e.g.:

mask_and_ack_8259A
local_irq_save_hw()+spinlock
printk("spurious IRQ #...")
printk() ->vprintk()
...
spin_lock_irqsave()
spin_unlock_irqrestore()
local_irq_enable_hw()
 -> mask_and_ack_8259A

The way to solve this is to make sure that the stall bit for the current
domain always reflects the state of the hardware mask when operating raw
I-pipe locks.

As a consequence, you may no longer assume that calling
spin_unlock() + local_irq_restore_hw() in sequence has the same
effect as calling spin_unlock_irqrestore() on an ipipe_spinlock_t
lock. The former would have the very undesirable side-effect of leaving
the virtual IRQ mask in stalled mode. I already fixed an issue of this
kind in the tracer code (__ipipe_global_path_unlock), precisely caught
after getting a might_sleep() warning when reading
/proc/ipipe/trace/{max,frozen}.

So you may want to double-check whether some constructs of this kind
might exist in any of your local patches. I did not find any in the
vanilla code, but another round of verifications may be useful.
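
The difference between the two unlock sequences can be shown with a small
user-space model. This is only a sketch under stated assumptions: the struct
and the sketch_* functions are invented for the example and merely mimic the
described behaviour (the irqsave/irqrestore forms touch both the hardware and
the virtual mask, while the _hw helper touches the hardware mask only). It is
not the I-pipe implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state. */
struct sketch_cpu {
	bool hw_irqs_off;	/* models the real (hardware) IRQ mask */
	bool stalled;		/* models the virtual IRQ mask (stall bit) */
	bool lock_held;
};

/* Models spin_lock_irqsave() on a raw I-pipe lock: saves and sets
 * both masks, then takes the lock. */
static unsigned long sketch_lock_irqsave(struct sketch_cpu *c)
{
	unsigned long flags = (c->hw_irqs_off ? 1UL : 0UL) |
			      (c->stalled ? 2UL : 0UL);

	c->hw_irqs_off = true;
	c->stalled = true;
	c->lock_held = true;
	return flags;
}

/* Models spin_unlock_irqrestore(): restores both masks. */
static void sketch_unlock_irqrestore(struct sketch_cpu *c, unsigned long flags)
{
	c->lock_held = false;
	c->stalled = (flags & 2UL) != 0;
	c->hw_irqs_off = (flags & 1UL) != 0;
}

/* Models plain spin_unlock(): releases the lock, masks untouched. */
static void sketch_unlock(struct sketch_cpu *c)
{
	c->lock_held = false;
}

/* Models local_irq_restore_hw(): hardware mask only. */
static void sketch_restore_hw(struct sketch_cpu *c, unsigned long flags)
{
	c->hw_irqs_off = (flags & 1UL) != 0;
}
```

In this model, sketch_unlock() followed by sketch_restore_hw() re-enables
hardware interrupts but leaves the stall bit set, which is the construct worth
grepping for in local patches.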

-- 
Philippe.





Re: [Xenomai-core] Support for 2.6.22/x86

2007-06-30 Thread Philippe Gerum
On Sat, 2007-06-30 at 09:48 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > Our development trunk now contains the necessary support for running
> > Xenomai over 2.6.22/x86. This work boils down to enabling Xenomai to use
> > the generic clock event device abstraction that comes with newest
> > kernels. Other archs / kernel versions still work the older way, until
> > all archs eventually catch up with clockevents upstream.
> > 
> > This support won't be backported to 2.3.x, because it has some
> > significant impact on the nucleus. Tested as thoroughly as possible here
> > on low-end and mid-range x86 boxen, including SMP.
> > 
> > Please give this hell.
> > 
> > http://download.gna.org/adeos/patches/v2.6/i386/adeos-ipipe-2.6.22-rc6-i386-1.9-00.patch
> > 
> 
> Running some tests, the gate to hell just opened:
> 
> [  210.247006] BUG: sleeping function called from invalid context at
> kernel/sched.c:3941
> [  210.248171] in_atomic():1, irqs_disabled():1
> [  210.248828] no locks held by frag-ip/881.
> [  210.249494]  [] show_trace_log_lvl+0x1f/0x34
> [  210.250523]  [] show_trace+0x17/0x19
> [  210.257778]  [] dump_stack+0x1b/0x1d
> [  210.258070]  [] __might_sleep+0xda/0xe1
> [  210.258365]  [] wait_for_completion+0x1f/0xc3
> [  210.258688]  [] set_cpus_allowed+0x77/0x95
> [  210.258992]  [] lostage_handler+0x75/0x201 [xeno_nucleus]
> [  210.259551]  [] rthal_apc_handler+0x5c/0x89
> [  210.259869]  [] __ipipe_sync_stage+0x13a/0x147
> [  210.260204]  [] __ipipe_syscall_root+0x1a6/0x1c8
> [  210.260536]  [] system_call+0x29/0x41
> 
> Setup is latest SVN + a "few" patches (the well-known ones), CONFIG_SMP,
> qemu -smp 2, RTnet in loopback mode, just terminating the frag-ip example.
> 
> However, this gremlin looks like it is /far/ older than 2.6.22 support.
> Calling set_cpus_allowed() from atomic lostage_handler is simply bogus,
> I'm afraid. :-/

Why did we never get this migration case before? I'm running with all
debug knobs on too, and never hit this issue. Anyway... The APC
dispatcher does explicitly unlock the APC serialization lock. However,
the I-pipe syncer would stall the stage before calling the dispatcher,
so we need to bracket the dispatch loop within an unstall/stall block.
This said, I'm still wondering why the preemption is disabled here.

Do you happen to run with the tracer on when testing?

> 
> Jan
> 
-- 
Philippe.





Re: [Xenomai-core] [RFC][PATCH] shirq locking rework

2007-06-30 Thread Dmitry Adamushko
Hello Jan,

I apologize for the huge reply latency.

>
> Yeah, that might explain why already trying to parse it manually
> failed: What is xnintr_sync_stat_references? :)

yeah.. it was supposed to be xnintr_sync_stat_refs()


> > 'prev = xnstat_get_current()' reference is also tracked as reference
> > accounting becomes a part of the xnstat interface (not sure we do
> > need it though).
>
> Mind to elaborate on _why_ you think we need this, specifically if it
> adds new atomic counters?

Forget about it, it was the wrong approach. We do reschedule in
xnintr_*_handler(), and if 'prev->refs' is non-zero and a newly
scheduled thread calls xnstat_runtime_synch() (which is possible, at
least in theory, with this interface) before deleting the first thread..
oops. So this 'referencing' scheme is bad anyway.

Note that if a real reschedule took place in xnpod_schedule(), we
actually don't need to _restore_ 'prev' when we get control back: it
must already have been restored by xnpod_schedule() when the preempted
thread ('prev' is normally the thread in whose context the interrupt
occurred) gets the CPU back. Unless I'm missing something?

...
if (--sched->inesting == 0 && xnsched_resched_p())
        xnpod_schedule();

(*) <-- 'sched->current_account' should already be == 'prev' here in
        case xnpod_schedule() took place

xnltt_log_event(xeno_ev_iexit, irq);
xnstat_runtime_switch(sched, prev);
...

A simpler scheme for xnstat_ accounting would be to account only the
time spent in intr->isr() to the corresponding intr->stat[cpu].account.
This way, all accesses to the latter would be inside
xnlock_{get,put}(&xnirqs[irq].lock) sections [*].

It's precision (although that's arguable to some extent) vs.
simplicity (e.g. no need for any xnintr_sync_stat_references()). I
would still prefer this approach :-)
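
A rough user-space sketch of that simpler scheme follows, assuming a fake
clock and invented sketch_* names. Only the time spent inside the ISR
callback is charged to the per-interrupt account, and the comments mark where
the xnlock_get/xnlock_put pair would sit in the real handler.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_ticks_t;

/* Stands in for intr->stat[cpu].account. */
struct sketch_account {
	sketch_ticks_t total;	/* time spent in the ISR */
	unsigned long hits;	/* number of ISR invocations */
};

struct sketch_intr {
	int (*isr)(struct sketch_intr *intr);
	struct sketch_account account;
};

/* Fake time source so the example is deterministic. */
static sketch_ticks_t sketch_clock;

static sketch_ticks_t sketch_now(void)
{
	return sketch_clock;
}

/* Charge only the ISR run time to the interrupt's own account. */
static int sketch_run_isr(struct sketch_intr *intr)
{
	sketch_ticks_t start;
	int ret;

	/* xnlock_get(&xnirqs[irq].lock) would be taken here. */
	start = sketch_now();
	ret = intr->isr(intr);
	intr->account.total += sketch_now() - start;
	intr->account.hits++;
	/* xnlock_put(&xnirqs[irq].lock) would be released here. */

	return ret;
}

/* Demo ISR that "consumes" 7 ticks of the fake clock. */
static int sketch_demo_isr(struct sketch_intr *intr)
{
	(void)intr;
	sketch_clock += 7;
	return 0;
}
```

Time that elapses between two invocations is never charged to the account in
this model, which is the loss of precision being traded for simplicity.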

Otherwise, so far I don't see any solution much nicer than the one
illustrated by your first patch.


> Uhh, be careful, I burned my fingers with similar things recently as
> well. You have to make sure that all types are resolvable for _all_
> includers of that header. Otherwise, I'm fine with cleanups like this.
> But I think there was once a reason for #define.

yeah.. now I recall it as well :-)


>
> Thanks,
> Jan
>

-- 
Best regards,
Dmitry Adamushko



Re: [Xenomai-core] Support for 2.6.22/x86

2007-06-30 Thread Jan Kiszka
Philippe Gerum wrote:
> Our development trunk now contains the necessary support for running
> Xenomai over 2.6.22/x86. This work boils down to enabling Xenomai to use
> the generic clock event device abstraction that comes with newest
> kernels. Other archs / kernel versions still work the older way, until
> all archs eventually catch up with clockevents upstream.
> 
> This support won't be backported to 2.3.x, because it has some
> significant impact on the nucleus. Tested as thoroughly as possible here
> on low-end and mid-range x86 boxen, including SMP.
> 
> Please give this hell.
> 
> http://download.gna.org/adeos/patches/v2.6/i386/adeos-ipipe-2.6.22-rc6-i386-1.9-00.patch
> 

Running some tests, the gate to hell just opened:

[  210.247006] BUG: sleeping function called from invalid context at
kernel/sched.c:3941
[  210.248171] in_atomic():1, irqs_disabled():1
[  210.248828] no locks held by frag-ip/881.
[  210.249494]  [] show_trace_log_lvl+0x1f/0x34
[  210.250523]  [] show_trace+0x17/0x19
[  210.257778]  [] dump_stack+0x1b/0x1d
[  210.258070]  [] __might_sleep+0xda/0xe1
[  210.258365]  [] wait_for_completion+0x1f/0xc3
[  210.258688]  [] set_cpus_allowed+0x77/0x95
[  210.258992]  [] lostage_handler+0x75/0x201 [xeno_nucleus]
[  210.259551]  [] rthal_apc_handler+0x5c/0x89
[  210.259869]  [] __ipipe_sync_stage+0x13a/0x147
[  210.260204]  [] __ipipe_syscall_root+0x1a6/0x1c8
[  210.260536]  [] system_call+0x29/0x41

Setup is latest SVN + a "few" patches (the well-known ones), CONFIG_SMP,
qemu -smp 2, RTnet in loopback mode, just terminating the frag-ip example.

However, this gremlin looks like it is /far/ older than 2.6.22 support.
Calling set_cpus_allowed() from atomic lostage_handler is simply bogus,
I'm afraid. :-/

Jan


