Re: [Xenomai-core] Overcoming the foreign stack

2010-10-07 Thread Philippe Gerum
On Wed, 2010-10-06 at 11:20 +0200, Jan Kiszka wrote: 
 Am 05.10.2010 16:21, Gilles Chanteperdrix wrote:
  Jan Kiszka wrote:
  Am 05.10.2010 15:50, Gilles Chanteperdrix wrote:
  Jan Kiszka wrote:
  Am 05.10.2010 15:42, Gilles Chanteperdrix wrote:
  Jan Kiszka wrote:
  Am 05.10.2010 15:15, Gilles Chanteperdrix wrote:
  Jan Kiszka wrote:
  Hi,
 
  quite a few limitations and complications of using Linux services over
  non-Linux domains relate to potentially invalid current and
  thread_info. The non-Linux domain could maintain their own kernel
  stacks while Linux tends to derive current and thread_info from the stack
  pointer. This is not an issue anymore on x86-64 (both states are stored
  in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
  stack and may continue to do so.
 
  I just looked into this thing again as I'm evaluating ways to exploit
  the kernel's tracing framework also under Xenomai. Unfortunately, it
  does a lot of fiddling with preempt_count and need_resched, so patching
  it for Xenomai use would become a maintenance nightmare.
 
  An alternative, also for other use cases like kgdb and probably perf, is
  to get rid of our dependency on home-grown stacks. I think we are on
  that way already as in-kernel skins have been deprecated. The only
  remaining user after them will be RTDM driver tasks. But I think those
  could simply become in-kernel shadows of kthreads which would bind their
  stacks to what Linux provides. Moreover, Xenomai could start updating
  current and thread_info on context switches (unless this already
  happens implicitly). That would give us proper contexts for system-level
  tracing and profiling.
 
  My key question is currently if and how much of this could be realized
  in 2.6. Could we drop in-kernel skins in that version? If not, what
  about disabling them by default, converting RTDM tasks to a
  kthread-based approach, and enabling tracing etc. only in that case?
  However, this might be a bit fragile unless we can establish
  compile-time or run-time requirements negotiation between Adeos and its
  users (Xenomai) about the stack model.
  A stupid question: why not make things the other way around: patch the
  current and current_thread_info functions to be made I-pipe aware and
  use an ipipe_current pointer to the current thread task_struct. Of
  course, there are places where the current or current_thread_info macros
  are implemented in assembly, so it may be not simple as it sounds, but
  it would allow to keep 128 Kb stacks if we want. This also means that we
  would have to put a task_struct at the bottom of every Xenomai task.
  First of all, overhead vs. maintenance. Either every access to
  preempt_count() would require a check for the current domain and its
  foreign stack flag, or I would have to patch dozens (if that is enough)
  of code sites in the tracer framework.
  No. I mean we would dereference a pointer named ipipe_current. That is
  all, no other check. This pointer would be maintained elsewhere. And we
  modify the current macro, like:
 
  #ifdef CONFIG_IPIPE
  extern struct task_struct *ipipe_current;
  #define current ipipe_current
  #endif
 
  Any call site gets modified automatically. Or current_thread_info, if
  it is current_thread_info which is obtained using the stack pointer mask
  trick.
  The stack pointer mask trick only works with fixed-sized stacks, not a
  guaranteed property of in-kernel Xenomai threads.
  Precisely the reason why I propose to replace it with a global variable
  reference, or a per-cpu variable for SMP systems.
 
  Then why is Linux not using this in favor of the stack pointer approach
  on, say, ARM?
 
  For sure, we can patch all Adeos-supported archs away from stack-based
  to per-cpu current and thread_info, but I don't feel comfortable with
  this somewhat invasive approach either. Well, maybe it's just my
  personal misperception.
   
  It is as invasive as modifying local_irq_save/local_irq_restore.
  The real question about the global pointer approach is: if it is so
  much less efficient, how does Xenomai, which uses this scheme, manage
  to have good performance on ARM?
 
 Xenomai has no heavily-used preempt_disable/enable that is built on top
 of thread_info. But I also have no numbers on this.
 
 I looked closer at the kernel dependencies on a fixed stack size.
 Besides current and thread_info, further features that make use of this
 are stack unwinding (boundary checks) and overflow checking. So while we
 can work around the dependency for some tracing requirements, I really
 see no point in heading for this long-term. It just creates more subtle
 patching needs in Adeos, and it also requires work on the Xenomai side. I
 really think it's better to provide a compatible context to reduce
 maintenance efforts.
 
 So I played a bit with converting RTDM tasks to in-kernel shadows. It
 works but needs more fine-tuning. My proposal for 

Re: [Xenomai-core] Overcoming the foreign stack

2010-10-07 Thread Jan Kiszka
Am 07.10.2010 19:08, Philippe Gerum wrote:
 On Wed, 2010-10-06 at 11:20 +0200, Jan Kiszka wrote: 
 Am 05.10.2010 16:21, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:50, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:42, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:15, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Hi,

 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tends to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.

 I just looked into this thing again as I'm evaluating ways to exploit
 the kernel's tracing framework also under Xenomai. Unfortunately, it
 does a lot of fiddling with preempt_count and need_resched, so patching
 it for Xenomai use would become a maintenance nightmare.

 An alternative, also for other use cases like kgdb and probably perf, is
 to get rid of our dependency on home-grown stacks. I think we are on
 that way already as in-kernel skins have been deprecated. The only
 remaining user after them will be RTDM driver tasks. But I think those
 could simply become in-kernel shadows of kthreads which would bind their
 stacks to what Linux provides. Moreover, Xenomai could start updating
 current and thread_info on context switches (unless this already
 happens implicitly). That would give us proper contexts for system-level
 tracing and profiling.

 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.
 A stupid question: why not make things the other way around: patch the
 current and current_thread_info functions to be made I-pipe aware and
 use an ipipe_current pointer to the current thread task_struct. Of
 course, there are places where the current or current_thread_info macros
 are implemented in assembly, so it may be not simple as it sounds, but
 it would allow to keep 128 Kb stacks if we want. This also means that we
 would have to put a task_struct at the bottom of every Xenomai task.
 First of all, overhead vs. maintenance. Either every access to
 preempt_count() would require a check for the current domain and its
 foreign stack flag, or I would have to patch dozens (if that is enough)
 of code sites in the tracer framework.
 No. I mean we would dereference a pointer named ipipe_current. That is
 all, no other check. This pointer would be maintained elsewhere. And we
 modify the current macro, like:

 #ifdef CONFIG_IPIPE
 extern struct task_struct *ipipe_current;
 #define current ipipe_current
 #endif

 Any call site gets modified automatically. Or current_thread_info, if
 it is current_thread_info which is obtained using the stack pointer mask
 trick.
 The stack pointer mask trick only works with fixed-sized stacks, not a
 guaranteed property of in-kernel Xenomai threads.
 Precisely the reason why I propose to replace it with a global variable
 reference, or a per-cpu variable for SMP systems.

 Then why is Linux not using this in favor of the stack pointer approach
 on, say, ARM?

 For sure, we can patch all Adeos-supported archs away from stack-based
 to per-cpu current and thread_info, but I don't feel comfortable with
 this somewhat invasive approach either. Well, maybe it's just my
 personal misperception.

 It is as invasive as modifying local_irq_save/local_irq_restore.
 The real question about the global pointer approach is: if it is so
 much less efficient, how does Xenomai, which uses this scheme, manage
 to have good performance on ARM?

 Xenomai has no heavily-used preempt_disable/enable that is built on top
 of thread_info. But I also have no numbers on this.

 I looked closer at the kernel dependencies on a fixed stack size.
 Besides current and thread_info, further features that make use of this
 are stack unwinding (boundary checks) and overflow checking. So while we
 can work around the dependency for some tracing requirements, I really
 see no point in heading for this long-term. It just creates more subtle
 patching needs in Adeos, and it also requires work on the Xenomai side. I
 really think it's better to provide a compatible context to reduce
 maintenance efforts.

 So I played a bit with converting RTDM tasks to in-kernel shadows. It
 works but needs more fine-tuning. My proposal for 2.6 now looks like this:

  - add mm-less shadow 

Re: [Xenomai-core] Overcoming the foreign stack

2010-10-07 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
 I'm on a wait and see stance about generalizing the use of the ftrace
 framework for our needs; like Gilles saw with ARM, I must admit that I
 did notice a massive overhead on low-end ppc as well when we moved the
 pipeline tracer over it. I'm aware of the mcount optimizations that
 should be there when cycles really matter, and that ftrace does branch
 directly to the trace function when only a single one exists, but this
 may not be easy to keep after the generalization has taken place.
 Anyway, I'll wait for more data to make my opinion.
 
 As I said, ftrace is more than a simple mcount-tracer. And it's standard,
 distros start to enable it in their production kernels these days
 (except for the function tracer).
 
 If the overhead of the ftrace's mcount is too high on low-end platforms
 (I personally haven't tried it there yet), it would probably be a good
 idea to develop some optimizations or allow some variant that does not
 suffer that much - but upstream then.

I can talk about ARM on that subject: the fix is to use dynamic ftrace
(I did not get it working, so I am not sure whether it still has one
indirection layer too many, but it looks like it does not). But the
patches to get dynamic ftrace working on ARM, though known since March,
have not been merged yet, so they will not be in the upcoming 2.6.36. I
suspect other architectures such as Blackfin also lag behind x86. So, in
the meantime, if we want to get the I-pipe tracer working with a
reasonable overhead, we have to make our own version, and from that
perspective, the version where mcount directly calls the I-pipe tracer
is much simpler than importing the whole dynamic ftrace stuff. So, I
vote for keeping some #ifdefs or Kconfig options in the I-pipe tracer
code to be able to use a standalone tracer, as it will also simplify
getting it to work with architectures which lag even further behind x86
than ARM or Blackfin, for instance MicroBlaze, Nios, or SPARC.

-- 
Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Overcoming the foreign stack

2010-10-07 Thread Jan Kiszka
Am 07.10.2010 20:12, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 I'm on a wait and see stance about generalizing the use of the ftrace
 framework for our needs; like Gilles saw with ARM, I must admit that I
 did notice a massive overhead on low-end ppc as well when we moved the
 pipeline tracer over it. I'm aware of the mcount optimizations that
 should be there when cycles really matter, and that ftrace does branch
 directly to the trace function when only a single one exists, but this
 may not be easy to keep after the generalization has taken place.
 Anyway, I'll wait for more data to make my opinion.

 As I said, ftrace is more than a simple mcount-tracer. And it's standard,
 distros start to enable it in their production kernels these days
 (except for the function tracer).

 If the overhead of the ftrace's mcount is too high on low-end platforms
 (I personally haven't tried it there yet), it would probably be a good
 idea to develop some optimizations or allow some variant that does not
 suffer that much - but upstream then.
 
 I can talk about ARM on that subject: the fix is to use dynamic ftrace
 (I did not get it working, so I am not sure whether it still has one
 indirection layer too many, but it looks like it does not). But the
 patches to get dynamic ftrace working on ARM, though known since March,
 have not been merged yet, so they will not be in the upcoming 2.6.36. I
 suspect other architectures such as Blackfin also lag behind x86. So, in
 the meantime, if we want to get the I-pipe tracer working with a
 reasonable overhead, we have to make our own version, and from that
 perspective, the version where mcount directly calls the I-pipe tracer
 is much simpler than importing the whole dynamic ftrace stuff. So, I
 vote for keeping some #ifdefs or Kconfig options in the I-pipe tracer
 code to be able to use a standalone tracer, as it will also simplify
 getting it to work with architectures which lag even further behind x86
 than ARM or Blackfin, for instance MicroBlaze, Nios, or SPARC.
 

No question, even if we already had an ipipe tracer replacement for
ftrace, the existing generic bits would not be removed as long as we
have users or there are unacceptable limitations.

Jan





Re: [Xenomai-core] Overcoming the foreign stack

2010-10-06 Thread Jan Kiszka
Am 05.10.2010 16:21, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:50, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:42, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:15, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Hi,

 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tends to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.

 I just looked into this thing again as I'm evaluating ways to exploit
 the kernel's tracing framework also under Xenomai. Unfortunately, it
 does a lot of fiddling with preempt_count and need_resched, so patching
 it for Xenomai use would become a maintenance nightmare.

 An alternative, also for other use cases like kgdb and probably perf, is
 to get rid of our dependency on home-grown stacks. I think we are on
 that way already as in-kernel skins have been deprecated. The only
 remaining user after them will be RTDM driver tasks. But I think those
 could simply become in-kernel shadows of kthreads which would bind their
 stacks to what Linux provides. Moreover, Xenomai could start updating
 current and thread_info on context switches (unless this already
 happens implicitly). That would give us proper contexts for system-level
 tracing and profiling.

 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.
 A stupid question: why not make things the other way around: patch the
 current and current_thread_info functions to be made I-pipe aware and
 use an ipipe_current pointer to the current thread task_struct. Of
 course, there are places where the current or current_thread_info macros
 are implemented in assembly, so it may be not simple as it sounds, but
 it would allow to keep 128 Kb stacks if we want. This also means that we
 would have to put a task_struct at the bottom of every Xenomai task.
 First of all, overhead vs. maintenance. Either every access to
 preempt_count() would require a check for the current domain and its
 foreign stack flag, or I would have to patch dozens (if that is enough)
 of code sites in the tracer framework.
 No. I mean we would dereference a pointer named ipipe_current. That is
 all, no other check. This pointer would be maintained elsewhere. And we
 modify the current macro, like:

 #ifdef CONFIG_IPIPE
 extern struct task_struct *ipipe_current;
 #define current ipipe_current
 #endif

 Any call site gets modified automatically. Or current_thread_info, if
 it is current_thread_info which is obtained using the stack pointer mask
 trick.
 The stack pointer mask trick only works with fixed-sized stacks, not a
 guaranteed property of in-kernel Xenomai threads.
 Precisely the reason why I propose to replace it with a global variable
 reference, or a per-cpu variable for SMP systems.

 Then why is Linux not using this in favor of the stack pointer approach
 on, say, ARM?

 For sure, we can patch all Adeos-supported archs away from stack-based
 to per-cpu current and thread_info, but I don't feel comfortable with
 this somewhat invasive approach either. Well, maybe it's just my
 personal misperception.
 
 It is as invasive as modifying local_irq_save/local_irq_restore.
 The real question about the global pointer approach is: if it is so
 much less efficient, how does Xenomai, which uses this scheme, manage
 to have good performance on ARM?

Xenomai has no heavily-used preempt_disable/enable that is built on top
of thread_info. But I also have no numbers on this.
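
For illustration only, the pointer-based scheme debated above might be
sketched in user-space C as follows. Everything here is hypothetical
(struct task_sketch, ipipe_current_sketch, the *_sketch helpers); it is
not kernel, Adeos, or Xenomai code, and it leaves out the barriers and
real per-cpu accessors an actual implementation would need. It only shows
that the preemption counter is reached through a maintained pointer
rather than through the stack pointer.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 2   /* hypothetical CPU count for the demo */

/* Hypothetical, minimal stand-in for task_struct plus thread_info. */
struct task_sketch {
    const char *comm;
    int preempt_count;
};

/* One "current" pointer per CPU, maintained by whoever switches contexts. */
struct task_sketch *ipipe_current_sketch[NR_CPUS_SKETCH];

/* Would be called from the context-switch path of either domain. */
void switch_to_sketch(int cpu, struct task_sketch *next)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    ipipe_current_sketch[cpu] = next;
}

/* Replacement for the stack-derived lookup: a plain pointer load. */
struct task_sketch *current_sketch(int cpu)
{
    return ipipe_current_sketch[cpu];
}

/* preempt_disable/enable analogues: one pointer load plus one
 * read-modify-write, independent of the stack the caller runs on. */
void preempt_disable_sketch(int cpu)
{
    current_sketch(cpu)->preempt_count++;
}

void preempt_enable_sketch(int cpu)
{
    current_sketch(cpu)->preempt_count--;
}
```

Whether the extra load matters on a given architecture is exactly the
open question here; the sketch does not answer it.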

I looked closer at the kernel dependencies on a fixed stack size.
Besides current and thread_info, further features that make use of this
are stack unwinding (boundary checks) and overflow checking. So while we
can work around the dependency for some tracing requirements, I really
see no point in heading for this long-term. It just creates more subtle
patching needs in Adeos, and it also requires work on the Xenomai side. I
really think it's better to provide a compatible context to reduce
maintenance efforts.

So I played a bit with converting RTDM tasks to in-kernel shadows. It
works but needs more fine-tuning. My proposal for 2.6 now looks like this:

 - add mm-less shadow support to the nucleus (changes in
   xnarch_switch_to and xnshadow_map)
 - convert RTDM tasks to in-kernel shadows
 - switch 

[Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Jan Kiszka
Hi,

quite a few limitations and complications of using Linux services over
non-Linux domains relate to potentially invalid current and
thread_info. The non-Linux domain could maintain their own kernel
stacks while Linux tends to derive current and thread_info from the stack
pointer. This is not an issue anymore on x86-64 (both states are stored
in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
stack and may continue to do so.
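
To make the stack-pointer derivation concrete, here is a minimal
user-space sketch of the masking idea. It assumes a fixed, power-of-two
THREAD_SIZE of 8192 bytes and uses made-up names (thread_info_sketch,
thread_info_from_sp); the real kernel macros differ per architecture.
The second helper shows why a stack of a different size defeats the
lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed fixed, power-of-two stack size (hypothetical value). */
#define THREAD_SIZE 8192u

/* Hypothetical stand-in for the structure kept at the stack base. */
struct thread_info_sketch {
    int preempt_count;
    void *task;   /* would point at the task_struct */
};

/* Mask trick: round an address inside the stack down to the
 * THREAD_SIZE-aligned base, where thread_info is expected to live. */
struct thread_info_sketch *thread_info_from_sp(uintptr_t sp)
{
    return (struct thread_info_sketch *)(sp & ~((uintptr_t)THREAD_SIZE - 1));
}

/* Returns 1: on an aligned stack of exactly THREAD_SIZE bytes, any
 * address inside it masks down to the stack base. */
int mask_trick_works_on_linux_stack(void)
{
    void *stack = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
    assert(stack != NULL);
    uintptr_t sp = (uintptr_t)stack + THREAD_SIZE - 64;
    int ok = thread_info_from_sp(sp) == (struct thread_info_sketch *)stack;
    free(stack);
    return ok;
}

/* Returns 1: on a larger "foreign" stack, an address near the top masks
 * down to somewhere in the middle of the stack, not to its base. */
int mask_trick_fails_on_foreign_stack(void)
{
    size_t foreign_size = 4 * THREAD_SIZE;
    void *stack = aligned_alloc(THREAD_SIZE, foreign_size);
    assert(stack != NULL);
    uintptr_t sp = (uintptr_t)stack + foreign_size - 64;
    int wrong = thread_info_from_sp(sp) != (struct thread_info_sketch *)stack;
    free(stack);
    return wrong;
}
```

Both helpers only compare addresses and never dereference the computed
pointer, so the sketch is safe to run as an ordinary program.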

I just looked into this thing again as I'm evaluating ways to exploit
the kernel's tracing framework also under Xenomai. Unfortunately, it
does a lot of fiddling with preempt_count and need_resched, so patching
it for Xenomai use would become a maintenance nightmare.

An alternative, also for other use cases like kgdb and probably perf, is
to get rid of our dependency on home-grown stacks. I think we are on
that way already as in-kernel skins have been deprecated. The only
remaining user after them will be RTDM driver tasks. But I think those
could simply become in-kernel shadows of kthreads which would bind their
stacks to what Linux provides. Moreover, Xenomai could start updating
current and thread_info on context switches (unless this already
happens implicitly). That would give us proper contexts for system-level
tracing and profiling.

My key question is currently if and how much of this could be realized
in 2.6. Could we drop in-kernel skins in that version? If not, what
about disabling them by default, converting RTDM tasks to a
kthread-based approach, and enabling tracing etc. only in that case?
However, this might be a bit fragile unless we can establish
compile-time or run-time requirements negotiation between Adeos and its
users (Xenomai) about the stack model.

Comments, ideas?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
 Hi,
 
 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tends to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.

On ARM, the vanilla tracing capabilities have way too much overhead to be
really usable. I have tested a standalone version of mcount which
directly calls the I-pipe tracer, and I get less overhead, something like
20% less.

The philosophy of ftrace is well explained by the following sentence,
extracted from the ftrace design text file: "Also keep in mind that this
mcount function will be called *a lot*, so optimizing for the default
case of no tracer will help the smooth running of your system when
tracing is disabled."

When using the I-pipe tracer, we are interested in precisely the reverse
optimization: we do not care about the overhead of mcount when the
tracer is not enabled, since we will not keep the tracer enabled in the
configuration when not tracing anyway, but we really want mcount to have
as little overhead as possible when the tracer is enabled.
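
The two optimization targets can be contrasted with a small C sketch.
This is only an illustration: the *_sketch names are invented, and the
real mcount entry points are written in per-architecture assembly. The
generic variant tests a function pointer against a stub so that the
"no tracer" case returns early, then pays an indirect call when tracing;
the standalone variant assumes tracing is on and calls the tracer
directly.

```c
#include <assert.h>

typedef void (*trace_fn_sketch)(unsigned long ip, unsigned long parent_ip);

/* Counts tracer invocations so the behaviour can be observed. */
unsigned long trace_hits;

/* Default "no tracer" callback: does nothing. */
void trace_stub_sketch(unsigned long ip, unsigned long parent_ip)
{
    (void)ip;
    (void)parent_ip;
}

/* Stand-in for the tracer that records an entry. */
void ipipe_trace_sketch(unsigned long ip, unsigned long parent_ip)
{
    (void)ip;
    (void)parent_ip;
    trace_hits++;
}

/* Currently registered tracer; the stub means "tracing disabled". */
trace_fn_sketch trace_function_sketch = trace_stub_sketch;

void register_tracer_sketch(trace_fn_sketch fn)
{
    assert(fn != 0);
    trace_function_sketch = fn;
}

/* Generic style: cheap when disabled, pointer test plus indirect call
 * when enabled. */
void mcount_generic_sketch(unsigned long ip, unsigned long parent_ip)
{
    trace_fn_sketch fn = trace_function_sketch;

    if (fn == trace_stub_sketch)
        return;
    fn(ip, parent_ip);
}

/* Standalone style: no test, no indirection, always traces. */
void mcount_direct_sketch(unsigned long ip, unsigned long parent_ip)
{
    ipipe_trace_sketch(ip, parent_ip);
}
```

The direct variant only makes sense if the tracer is compiled out
entirely when not tracing, which is the usage described above.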

Still on ARM, the perf code requires handling an interrupt when the
performance counters overflow. So, getting this to work with the I-pipe
would mean ironing this irq handler and all the functions it calls, and
all the spinlocks that are involved.

What I am trying to say is that trying to use the vanilla
infrastructures will probably cause many more problems than just the
stack issue, and if we look at the I-pipe tracer example again, it is
really not obvious what the vanilla infrastructure brings us: only the
ftrace_register/ftrace_unregister services, at the expense of 20% more
overhead.

-- 
Gilles.



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Jan Kiszka
Am 05.10.2010 11:56, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Hi,

 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tends to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.
 
 On ARM, the vanilla tracing capabilities have way too much overhead to be
 really usable. I have tested a standalone version of mcount which
 directly calls the I-pipe tracer, and I get less overhead, something like
 20% less.
 
 The philosophy of ftrace is well explained by the following sentence,
 extracted from the ftrace design text file: "Also keep in mind that this
 mcount function will be called *a lot*, so optimizing for the default
 case of no tracer will help the smooth running of your system when
 tracing is disabled."
 
 When using the I-pipe tracer, we are interested in precisely the reverse
 optimization: we do not care about the overhead of mcount when the
 tracer is not enabled, since we will not keep the tracer enabled in the
 configuration when not tracing anyway, but we really want mcount to have
 as little overhead as possible when the tracer is enabled.

There are use cases for both, also on smaller embedded targets. But I
agree that it can be worth continuing to optimize mcount separately on
some archs. Still, mcount is only a smaller part of ftrace these days,
and I'm actually way more interested in event tracing (as a potential
substitute for LTTng adaptations).

 
 Still on ARM, the perf code requires handling an interrupt when the
 performance counters overflow. So, getting this to work with the I-pipe
 would mean ironing this irq handler and all the functions it calls, and
 all the spinlocks that are involved.

On x86, perf triggers NMIs, so the code is at least conceptually
prepared for any context. We just need to provide a few states
independent of what Xenomai does. Feasible I would say.

 
 What I am trying to say is that trying to use the vanilla
 infrastructures will probably cause many more problems than just the
 stack issue, and if we look at the I-pipe tracer example again, it is
 really not obvious what the vanilla infrastructure brings us: only the
 ftrace_register/ftrace_unregister services, at the expense of 20% more
 overhead.

 - event tracing
 - function graph tracing
 - stack tracing (including user land)
 - $your-whatever-tracer
 - consistent view on the system (if current is always valid, there are
   no more confusions between current Linux vs. Xenomai task)
 - full-blown management interface (debugfs)
 - growing user land tools (specifically Kernelshark, which has an
   interesting plugin concept)
 - reduced maintenance (hopefully)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
  - consistent view on the system (if current is always valid, there are
no more confusions between current Linux vs. Xenomai task)

That is inherently incompatible with the co-kernel approach. Xenomai
will always be able to preempt Linux at a place where the state is not
consistent.


-- 
Gilles.



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Jan Kiszka
Am 05.10.2010 12:32, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
  - consistent view on the system (if current is always valid, there are
no more confusions between current Linux vs. Xenomai task)
 
 That is inherently incompatible with the co-kernel approach. Xenomai
 will always be able to preempt Linux at a place where the state is not
 consistent.

Depends on what state the trace examines and what it derives from it.
The current task or the current preemption counter are definitely not
critical and can easily be provided in a way that makes tracer output
consistent. In the end, shadow tasks are Linux tasks in a special mode.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
 Am 05.10.2010 12:32, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
  - consistent view on the system (if current is always valid, there are
no more confusions between current Linux vs. Xenomai task)
 That is inherently incompatible with the co-kernel approach. Xenomai
 will always be able to preempt Linux at a place where the state is not
 consistent.
 
 Depends on what state the trace examines and what it derives from it.
 The current task or the current preemption counter are definitely not
 critical and can easily be provided in a way that makes tracer output
 consistent. In the end, shadow tasks are Linux tasks in a special mode.

To summarize what I was trying to say: the more you want to use Linux
infrastructures, the more you will want to have consistent state, the
more you will need to patch things around. This will incur overhead, to
the point where the modifications have influence on the results, and
this will also mean spaghetti-like troubles.

-- 
Gilles.



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.

Just to clarify about 2.6. 2.6 is supposed to have a short-lived
unstable state, where we implement fixes for the few problems on the
2.5 branch which require breaking the ABI. So, removing kernel-space
threads for other skins than RTDM is not the plan. However, does it
really matter whether other skins than RTDM use kernel-space threads for
changing the way kernel-space thread stacks are allocated?

-- 
Gilles.



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Jan Kiszka
Am 05.10.2010 12:59, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.
 
 Just to clarify about 2.6. 2.6 is supposed to have a short-lived
 unstable state, where we implement fixes for the few problems on the
 2.5 branch which require breaking the ABI. So, removing kernel-space
 threads for other skins than RTDM is not the plan. However, does it
 really matter whether other skins than RTDM use kernel-space threads for
 changing the way kernel-space thread stacks are allocated?

Not if we break the API for all in-kernel skins (not only RTDM) by
freezing the stack size (that will be the side effect of moving over to
kthreads). I have no problem doing this for RTDM, but in-kernel
applications may have different stack footprints than ordinary RTDM
drivers, thus could break subtly. Therefore the consideration to do a
hard cut then, forcing remaining users to port now (instead of fixing
once again and port later on).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Jan Kiszka
Am 05.10.2010 13:06, Jan Kiszka wrote:
 Am 05.10.2010 12:59, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.

 Just to clarify about 2.6. 2.6 is supposed to have a short-lived
 unstable state, where we implement fixes for the few problems on the
 2.5 branch which require breaking the ABI. So, removing kernel-space
 threads for other skins than RTDM is not the plan. However, does it
 really matter whether other skins than RTDM use kernel-space threads for
 changing the way kernel-space threads stacks are allocated?
 
 Not if we break the API for all in-kernel skins (not only RTDM) by
 freezing the stack size (that will be the side effect of moving over
 kthreads). I've no problem to do this for RTDM,

I totally forgot that RTDM does not expose any interface to specify the
task stack size. So the user will not notice the difference directly
(but the available stack will shrink).

 but in-kernel
 applications may have different stack footprints than ordinary RTDM
 drivers, thus could break subtly. Therefore the consideration to do a
 hard cut then, forcing remaining users to port now (instead of fixing
 once again and port later on).
 

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
 Hi,
 
 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tend to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.
 
 I just looked into this thing again as I'm evaluating ways to exploit
 the kernel's tracing framework also under Xenomai. Unfortunately, it
 does a lot of fiddling with preempt_count and need_resched, so patching
 it for Xenomai use would become a maintenance nightmare.
 
 An alternative, also for other use cases like kgdb and probably perf, is
 to get rid of our dependency on home-grown stacks. I think we are on
 that way already as in-kernel skins have been deprecated. The only
 remaining user after them will be RTDM driver tasks. But I think those
 could simply become in-kernel shadows of kthreads which would bind their
 stacks to what Linux provides. Moreover, Xenomai could start updating
 current and thread_info on context switches (unless this already
 happens implicitly). That would give us proper contexts for system-level
 tracing and profiling.
 
 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.

A stupid question: why not do things the other way around: patch the
current and current_thread_info functions to make them I-pipe aware and
use an ipipe_current pointer to the current thread's task_struct. Of
course, there are places where the current or current_thread_info macros
are implemented in assembly, so it may not be as simple as it sounds, but
it would allow us to keep 128 KB stacks if we want. This also means that
we would have to put a task_struct at the bottom of every Xenomai task.

-- 
Gilles.



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Jan Kiszka
Am 05.10.2010 15:15, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Hi,

 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tend to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.

 I just looked into this thing again as I'm evaluating ways to exploit
 the kernel's tracing framework also under Xenomai. Unfortunately, it
 does a lot of fiddling with preempt_count and need_resched, so patching
 it for Xenomai use would become a maintenance nightmare.

 An alternative, also for other use cases like kgdb and probably perf, is
 to get rid of our dependency on home-grown stacks. I think we are on
 that way already as in-kernel skins have been deprecated. The only
 remaining user after them will be RTDM driver tasks. But I think those
 could simply become in-kernel shadows of kthreads which would bind their
 stacks to what Linux provides. Moreover, Xenomai could start updating
 current and thread_info on context switches (unless this already
 happens implicitly). That would give us proper contexts for system-level
 tracing and profiling.

 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.
 
 A stupid question: why not make things the other way around: patch the
 current and current_thread_info functions to be made I-pipe aware and
 use an ipipe_current pointer to the current thread task_struct. Of
 course, there are places where the current or current_thread_info macros
 are implemented in assembly, so it may be not simple as it sounds, but
 it would allow to keep 128 Kb stacks if we want. This also means that we
 would have to put a task_struct at the bottom of every Xenomai task.

First of all, overhead vs. maintenance. Either every access to
preempt_count() would require a check for the current domain and its
foreign stack flag, or I would have to patch dozens (if that is enough)
of code sites in the tracer framework.
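
To illustrate the first option, here is a minimal user-space sketch of
what such a checked accessor could look like. All names
(domain_sketch, foreign_stack, private_info, preempt_count_ptr_sketch)
are hypothetical and only model the idea; this is not Adeos or Linux
code, and where the count would actually live for a foreign stack is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread state Linux keeps in thread_info. */
struct thread_info_sketch {
    int preempt_count;
};

/* Hypothetical per-domain state; foreign_stack marks home-grown stacks. */
struct domain_sketch {
    bool foreign_stack;
    struct thread_info_sketch private_info; /* used when the stack carries none */
};

/* The extra branch that every preempt_count access would have to pay. */
static int *preempt_count_ptr_sketch(struct domain_sketch *dom,
                                     struct thread_info_sketch *stack_info)
{
    assert(dom != NULL);
    if (dom->foreign_stack)
        return &dom->private_info.preempt_count;
    assert(stack_info != NULL);
    return &stack_info->preempt_count;
}
```

The point of the sketch is only the conditional: it sits on a path that
the tracer hits very frequently, which is the overhead side of the
trade-off.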

And, second, this would prevent aligning current/thread_info with the
currently running shadow, the nice add-on I would like to gain with this
rework.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
 Am 05.10.2010 15:15, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Hi,

 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tend to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.

 I just looked into this thing again as I'm evaluating ways to exploit
 the kernel's tracing framework also under Xenomai. Unfortunately, it
 does a lot of fiddling with preempt_count and need_resched, so patching
 it for Xenomai use would become a maintenance nightmare.

 An alternative, also for other use cases like kgdb and probably perf, is
 to get rid of our dependency on home-grown stacks. I think we are on
 that way already as in-kernel skins have been deprecated. The only
 remaining user after them will be RTDM driver tasks. But I think those
 could simply become in-kernel shadows of kthreads which would bind their
 stacks to what Linux provides. Moreover, Xenomai could start updating
 current and thread_info on context switches (unless this already
 happens implicitly). That would give us proper contexts for system-level
 tracing and profiling.

 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.
 A stupid question: why not make things the other way around: patch the
 current and current_thread_info functions to be made I-pipe aware and
 use an ipipe_current pointer to the current thread task_struct. Of
 course, there are places where the current or current_thread_info macros
 are implemented in assembly, so it may be not simple as it sounds, but
 it would allow to keep 128 Kb stacks if we want. This also means that we
 would have to put a task_struct at the bottom of every Xenomai task.
 
 First of all, overhead vs. maintenance. Either every access to
 preempt_count() would require a check for the current domain and its
 foreign stack flag, or I would have to patch dozens (if that is enough)
 of code sites in the tracer framework.

No. I mean we would dereference a pointer named ipipe_current. That is
all, no other check. This pointer would be maintained elsewhere. And we
modify the current macro, like:

#ifdef CONFIG_IPIPE
extern struct task_struct *ipipe_current;
#define current ipipe_current
#endif

Any call site gets modified automatically. The same goes for
current_thread_info, if it is current_thread_info which is obtained
using the stack pointer mask trick.

 
 And, second, this would prevent aligning current/thread_info with the
 currently running shadow, the nice add-on I would like to gain with this
 rework.

We would put a task_struct at the bottom of every Xenomai kthread. So,
yes, we align on the current/thread_info stuff.

I am not convinced that the stack pointer trick is really that big a
performance gain.

-- 
Gilles.



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Jan Kiszka
Am 05.10.2010 15:42, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:15, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Hi,

 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tend to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.

 I just looked into this thing again as I'm evaluating ways to exploit
 the kernel's tracing framework also under Xenomai. Unfortunately, it
 does a lot of fiddling with preempt_count and need_resched, so patching
 it for Xenomai use would become a maintenance nightmare.

 An alternative, also for other use cases like kgdb and probably perf, is
 to get rid of our dependency on home-grown stacks. I think we are on
 that way already as in-kernel skins have been deprecated. The only
 remaining user after them will be RTDM driver tasks. But I think those
 could simply become in-kernel shadows of kthreads which would bind their
 stacks to what Linux provides. Moreover, Xenomai could start updating
 current and thread_info on context switches (unless this already
 happens implicitly). That would give us proper contexts for system-level
 tracing and profiling.

 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.
 A stupid question: why not make things the other way around: patch the
 current and current_thread_info functions to be made I-pipe aware and
 use an ipipe_current pointer to the current thread task_struct. Of
 course, there are places where the current or current_thread_info macros
 are implemented in assembly, so it may be not simple as it sounds, but
 it would allow to keep 128 Kb stacks if we want. This also means that we
 would have to put a task_struct at the bottom of every Xenomai task.

 First of all, overhead vs. maintenance. Either every access to
 preempt_count() would require a check for the current domain and its
 foreign stack flag, or I would have to patch dozens (if that is enough)
 of code sites in the tracer framework.
 
 No. I mean we would dereference a pointer named ipipe_current. That is
 all, no other check. This pointer would be maintained elsewhere. And we
 modify the current macro, like:
 
 #ifdef CONFIG_IPIPE
 extern struct task_struct *ipipe_current;
 #define current ipipe_current
 #endif
 
 Any call site gets modified automatically. Or current_thread_info, if
 it is current_thread_info which is obtained using the stack pointer mask
 trick.

The stack pointer mask trick only works with fixed-size stacks, which is
not a guaranteed property of in-kernel Xenomai threads.
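
A small user-space sketch of the mask trick may make the constraint
concrete. The size THREAD_SIZE_SKETCH and all function and struct names
are assumptions chosen for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed fixed stack size; the mask only works for a power of two. */
#define THREAD_SIZE_SKETCH ((size_t)8192)

/* Hypothetical stand-in for thread_info, placed at the lowest stack address. */
struct thread_info_sketch {
    int preempt_count;
    void *task;
};

/* Allocate a stack whose base is aligned on THREAD_SIZE_SKETCH. */
static void *alloc_stack_sketch(size_t size)
{
    assert(size % THREAD_SIZE_SKETCH == 0);
    return aligned_alloc(THREAD_SIZE_SKETCH, size);
}

/* Mask the low bits of any address inside the stack to find its base. */
static struct thread_info_sketch *info_from_sp_sketch(const void *sp)
{
    uintptr_t base = (uintptr_t)sp & ~((uintptr_t)THREAD_SIZE_SKETCH - 1);
    return (struct thread_info_sketch *)base;
}
```

For a stack of exactly THREAD_SIZE_SKETCH bytes, any interior address
masks back to the base. For a stack four times that size, an address in
the last quarter masks to the start of that quarter rather than to the
real base, so the lookup would return garbage.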

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
 Am 05.10.2010 15:42, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:15, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Hi,

 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tend to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.

 I just looked into this thing again as I'm evaluating ways to exploit
 the kernel's tracing framework also under Xenomai. Unfortunately, it
 does a lot of fiddling with preempt_count and need_resched, so patching
 it for Xenomai use would become a maintenance nightmare.

 An alternative, also for other use cases like kgdb and probably perf, is
 to get rid of our dependency on home-grown stacks. I think we are on
 that way already as in-kernel skins have been deprecated. The only
 remaining user after them will be RTDM driver tasks. But I think those
 could simply become in-kernel shadows of kthreads which would bind their
 stacks to what Linux provides. Moreover, Xenomai could start updating
 current and thread_info on context switches (unless this already
 happens implicitly). That would give us proper contexts for system-level
 tracing and profiling.

 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.
 A stupid question: why not make things the other way around: patch the
 current and current_thread_info functions to be made I-pipe aware and
 use an ipipe_current pointer to the current thread task_struct. Of
 course, there are places where the current or current_thread_info macros
 are implemented in assembly, so it may be not simple as it sounds, but
 it would allow to keep 128 Kb stacks if we want. This also means that we
 would have to put a task_struct at the bottom of every Xenomai task.
 First of all, overhead vs. maintenance. Either every access to
 preempt_count() would require a check for the current domain and its
 foreign stack flag, or I would have to patch dozens (if that is enough)
 of code sites in the tracer framework.
 No. I mean we would dereference a pointer named ipipe_current. That is
 all, no other check. This pointer would be maintained elsewhere. And we
 modify the current macro, like:

 #ifdef CONFIG_IPIPE
 extern struct task_struct *ipipe_current;
 #define current ipipe_current
 #endif

 Any call site gets modified automatically. Or current_thread_info, if
 it is current_thread_info which is obtained using the stack pointer mask
 trick.
 
 The stack pointer mask trick only works with fixed-sized stacks, not a
 guaranteed property of in-kernel Xenomai threads.

Precisely the reason why I propose to replace it with a global variable
reference, or a per-cpu variable for SMP systems.
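
As a rough user-space model of the per-cpu variant: one pointer slot per
CPU, written only on context switch and read by whatever replaces the
current macro. NR_CPUS_SKETCH, task_sketch, switch_to_sketch and
current_on_sketch are hypothetical names, and the CPU id is passed in
explicitly here because the sketch has no real per-cpu accessor.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4 /* hypothetical CPU count */

/* Hypothetical stand-in for task_struct. */
struct task_sketch {
    int pid;
};

/* One slot per CPU; only the context-switch path writes to it. */
static struct task_sketch *ipipe_current_sketch[NR_CPUS_SKETCH];

/* Called on every context switch, whatever stack the next task runs on. */
static void switch_to_sketch(unsigned int cpu, struct task_sketch *next)
{
    assert(cpu < NR_CPUS_SKETCH);
    ipipe_current_sketch[cpu] = next;
}

/* Replacement for the stack-derived lookup: a single indexed load. */
static struct task_sketch *current_on_sketch(unsigned int cpu)
{
    assert(cpu < NR_CPUS_SKETCH);
    return ipipe_current_sketch[cpu];
}
```

The lookup no longer depends on the stack size or alignment at all,
which is the property we need for variable-size stacks.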

-- 
Gilles.



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Jan Kiszka
Am 05.10.2010 15:50, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:42, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:15, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Hi,

 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tend to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.

 I just looked into this thing again as I'm evaluating ways to exploit
 the kernel's tracing framework also under Xenomai. Unfortunately, it
 does a lot of fiddling with preempt_count and need_resched, so patching
 it for Xenomai use would become a maintenance nightmare.

 An alternative, also for other use cases like kgdb and probably perf, is
 to get rid of our dependency on home-grown stacks. I think we are on
 that way already as in-kernel skins have been deprecated. The only
 remaining user after them will be RTDM driver tasks. But I think those
 could simply become in-kernel shadows of kthreads which would bind their
 stacks to what Linux provides. Moreover, Xenomai could start updating
 current and thread_info on context switches (unless this already
 happens implicitly). That would give us proper contexts for system-level
 tracing and profiling.

 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.
 A stupid question: why not make things the other way around: patch the
 current and current_thread_info functions to be made I-pipe aware and
 use an ipipe_current pointer to the current thread task_struct. Of
 course, there are places where the current or current_thread_info macros
 are implemented in assembly, so it may be not simple as it sounds, but
 it would allow to keep 128 Kb stacks if we want. This also means that we
 would have to put a task_struct at the bottom of every Xenomai task.
 First of all, overhead vs. maintenance. Either every access to
 preempt_count() would require a check for the current domain and its
 foreign stack flag, or I would have to patch dozens (if that is enough)
 of code sites in the tracer framework.
 No. I mean we would dereference a pointer named ipipe_current. That is
 all, no other check. This pointer would be maintained elsewhere. And we
 modify the current macro, like:

 #ifdef CONFIG_IPIPE
 extern struct task_struct *ipipe_current;
 #define current ipipe_current
 #endif

 Any call site gets modified automatically. Or current_thread_info, if
 it is current_thread_info which is obtained using the stack pointer mask
 trick.

 The stack pointer mask trick only works with fixed-sized stacks, not a
 guaranteed property of in-kernel Xenomai threads.
 
 Precisely the reason why I propose to replace it with a global variable
 reference, or a per-cpu variable for SMP systems.

Then why is Linux not using this in favor of the stack pointer approach
on, say, ARM?

For sure, we can patch all Adeos-supported archs away from stack-based
to per-cpu current and thread_info, but I don't feel comfortable with
this somewhat invasive approach either. Well, maybe it's just my personal
misperception.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Overcoming the foreign stack

2010-10-05 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
 Am 05.10.2010 15:50, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:42, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Am 05.10.2010 15:15, Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
 Hi,

 quite a few limitations and complications of using Linux services over
 non-Linux domains relate to potentially invalid current and
 thread_info. The non-Linux domain could maintain their own kernel
 stacks while Linux tend to derive current and thread_info from the stack
 pointer. This is not an issue anymore on x86-64 (both states are stored
 in per-cpu variables) but other archs (e.g. x86-32 or ARM) still use the
 stack and may continue to do so.

 I just looked into this thing again as I'm evaluating ways to exploit
 the kernel's tracing framework also under Xenomai. Unfortunately, it
 does a lot of fiddling with preempt_count and need_resched, so patching
 it for Xenomai use would become a maintenance nightmare.

 An alternative, also for other use cases like kgdb and probably perf, is
 to get rid of our dependency on home-grown stacks. I think we are on
 that way already as in-kernel skins have been deprecated. The only
 remaining user after them will be RTDM driver tasks. But I think those
 could simply become in-kernel shadows of kthreads which would bind their
 stacks to what Linux provides. Moreover, Xenomai could start updating
 current and thread_info on context switches (unless this already
 happens implicitly). That would give us proper contexts for system-level
 tracing and profiling.

 My key question is currently if and how much of this could be realized
 in 2.6. Could we drop in-kernel skins in that version? If not, what
 about disabling them by default, converting RTDM tasks to a
 kthread-based approach, and enabling tracing etc. only in that case?
 However, this might be a bit fragile unless we can establish
 compile-time or run-time requirements negotiation between Adeos and its
 users (Xenomai) about the stack model.
 A stupid question: why not make things the other way around: patch the
 current and current_thread_info functions to be made I-pipe aware and
 use an ipipe_current pointer to the current thread task_struct. Of
 course, there are places where the current or current_thread_info macros
 are implemented in assembly, so it may be not simple as it sounds, but
 it would allow to keep 128 Kb stacks if we want. This also means that we
 would have to put a task_struct at the bottom of every Xenomai task.
 First of all, overhead vs. maintenance. Either every access to
 preempt_count() would require a check for the current domain and its
 foreign stack flag, or I would have to patch dozens (if that is enough)
 of code sites in the tracer framework.
 No. I mean we would dereference a pointer named ipipe_current. That is
 all, no other check. This pointer would be maintained elsewhere. And we
 modify the current macro, like:

 #ifdef CONFIG_IPIPE
 extern struct task_struct *ipipe_current;
 #define current ipipe_current
 #endif

 Any call site gets modified automatically. Or current_thread_info, if
 it is current_thread_info which is obtained using the stack pointer mask
 trick.
 The stack pointer mask trick only works with fixed-sized stacks, not a
 guaranteed property of in-kernel Xenomai threads.
 Precisely the reason why I propose to replace it with a global variable
 reference, or a per-cpu variable for SMP systems.
 
 Then why is Linux not using this in favor of the stack pointer approach
 on, say, ARM?
 
 For sure, we can patch all Adeos-supported archs away from stack-based
 to per-cpu current and thread_info, but I don't feel comfortable with this
 in some way invasive approach as well. Well, maybe it's just my personal
 misperception.

It is as invasive as modifying local_irq_save/local_irq_restore.
The real question about the global pointer approach is: if it is so much
less efficient, how does Xenomai, which uses this scheme, manage to have
good performance on ARM?

-- 
Gilles.
